ROBIN - Open Source Mesh Network users community forum

Back in the saddle......

 
oojoshua
Moderator

Joined: 27 Aug 2009
Posts: 132

Posted: Thu Nov 11, 2010 7:50 pm    Post subject: Back in the saddle......

OK, I am back from an extended trip into the bowels of our truly screwed up global financial system. Just an FYI -- no one out there is lending, so if you need additional funds, your best bet is an equity offering.

Good luck finding folks who are rich enough to back you up!

We are now in the process of adding another 100 backhauls to our Lawrence network. We have found three items with our ROBIN version that need to be addressed:

1. httpd redirect.
2. madwifi drivers.
3. ext2 file writes.

PROBLEM:

1. We are using an HTTP redirect to send users to the local cgi-bin directory. Unfortunately, we have to do this using:

Code:

<META HTTP-EQUIV="REFRESH" CONTENT="0; url=http://prizm.lawrencefreenet.org/cgi-bin/gateway/index.cgi">


So the user types "slate.com", their port 80 call gets redirected by iptables to the local httpd on the ROBIN box.

The local httpd loads index.html and sends the user to the LOCAL ROBIN NODE /cgi-bin/gateway/index.cgi

PROBLEM: Some browsers cache this, so every time the end user tries to go to "slate.com" their browser sends them to prizm.lawrencefreenet.org.

We have tried the following code, which expires the entry, sets no-cache, uses JavaScript to do the redirect, etc. Still no love. Users' entries are cached locally and they have to clear their history or hit "Ctrl+F5".

Code:

<HTML>
<HEAD>
    <META HTTP-EQUIV="EXPIRES" CONTENT="0">
    <META HTTP-EQUIV="PRAGMA" CONTENT="NO-CACHE">
    <META HTTP-EQUIV="CACHE-CONTROL" CONTENT="NO-CACHE">
    <META HTTP-EQUIV="REFRESH" CONTENT="0; url=http://prizm.lawrencefreenet.org/cgi-bin/gateway/index.cgi">
    <script type="text/javascript">
    window.location.replace('http://prizm.lawrencefreenet.org/cgi-bin/gateway/index.cgi')
    </script>
</HEAD>
<BODY>
    You are being re-directed to the payment portal.  If you are not re-directed please <a href="http://prizm.lawrencefreenet.org/cgi-bin/gateway/index.cgi">click here</a>.
</BODY>
</HTML>


We need to be able to send a 302 header message with the index.html file or via a local .htaccess entry.
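
One way to get a real 302 is to let a small CGI script answer the redirected request instead of a static index.html. The following is only a sketch: it assumes the local httpd runs shell CGI scripts and turns a leading "Status:" line into the HTTP status line (worth verifying on the ROBIN image), and the function name emit_redirect is made up for illustration.

```shell
#!/bin/sh
# Hypothetical CGI helper: print a 302 response with caching disabled.
# Assumes the web server converts the "Status:" line into the HTTP status line.
emit_redirect() {
    target="$1"
    printf 'Status: 302 Found\r\n'
    printf 'Location: %s\r\n' "$target"
    printf 'Cache-Control: no-store, no-cache, must-revalidate\r\n'
    printf 'Pragma: no-cache\r\n'
    printf 'Expires: 0\r\n'
    printf '\r\n'
}

# In the CGI script itself this would be called once, for example:
# emit_redirect "http://prizm.lawrencefreenet.org/cgi-bin/gateway/index.cgi"
```

A 302 response is only cached when it carries explicit freshness headers, so sending it with the no-cache headers above should avoid the stale entry that a cached META refresh page leaves behind.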

2. We are losing all kinds of traffic via the madwifi drivers. 20%+ in some cases. I have no idea what the cause is, but the result is a network where the connection is unusable past 2 hops.

This has cost us AT LEAST 200 customers since June ($5,000/Month) and we really need to work on a fix now that we have funds/time to deal with the problem.

3. Our dashboard is set up so that it only provides a response in the following circumstances:

a. If the last update was more than 10 min ago.
b. or the "RR" flag is set to "1" in the ROBIN request.
c. or the response string has changed.

This being said, the ROBIN code writes the following files every time it sends an update:

Code:

-rw-r--r--    1 root     root           90 Nov 11 13:17 ./etc/config/allowed.gateways
-rw-r--r--    1 root     root          121 Nov 11 13:17 ./etc/config/allowed.repeaters
-rw-r--r--    1 root     root          260 Nov 11 13:17 ./etc/update/general
-rw-r--r--    1 root     root           15 Nov 11 13:17 ./etc/update/madwifi
-rw-r--r--    1 root     root           29 Nov 11 13:17 ./etc/update/node
-rw-r--r--    1 root     root          290 Nov 11 13:17 ./etc/update/nodes
-rw-r--r--    1 root     root          215 Nov 11 13:17 ./etc/update/management
-rw-r--r--    1 root     root           27 Nov 11 13:17 ./etc/update/mesh
-rw-r--r--    1 root     root           21 Nov 11 13:17 ./etc/update/ra_switch
-rw-r--r--    1 root     root           19 Nov 11 13:17 ./etc/update/batman
-rw-r--r--    1 root     root           20 Nov 11 13:17 ./etc/update/radio
-rw-r--r--    1 root     root           75 Nov 11 13:17 ./etc/update/wireless
-rw-r--r--    1 root     root           78 Nov 11 13:17 ./etc/update/iprules
-rw-r--r--    1 root     root           46 Nov 11 13:17 ./etc/update/secondary
-rw-r--r--    1 root     root           15 Nov 11 13:17 ./etc/update/acl
-rw-r--r--    1 root     root           21 Nov 11 13:17 ./etc/update/cp_switch
-rw-r--r--    1 root     root          154 Nov 11 13:17 ./etc/update/egregious
-rw-r--r--    1 root     root           48 Nov 11 13:17 ./etc/update/installation
-rw-r--r--    1 root     root           92 Nov 11 13:17 ./etc/update/deauth
-rw-r--r--    1 root     root           98 Nov 11 13:17 ./etc/update/extreport


Unfortunately, these files are in the /dev/hda2 file system which is mounted ext2.

This means that we are writing to the file system every 5 min. That is 12/hour, 288/day, 100,000 every 347 days.
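
The arithmetic can be checked with a few lines of shell. This is a worst-case sketch: it assumes every check-in rewrites the same flash cells once and ignores any wear leveling done by the card itself. The function names are invented for the example.

```shell
#!/bin/sh
# Writes per day for a given check-in interval (in minutes).
writes_per_day() {
    interval_min="$1"
    echo $(( 24 * 60 / $interval_min ))
}

# Whole days until a rated number of write cycles is used up
# (integer arithmetic, so the result is rounded down).
days_until_worn() {
    rated_cycles="$1"
    interval_min="$2"
    echo $(( $rated_cycles / (24 * 60 / $interval_min) ))
}

writes_per_day 5            # prints 288
days_until_worn 100000 5    # prints 347
```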

Since the CF is only rated to 100,000 write cycles, this file system will become corrupted in less than 1 year. We need to work to keep this file system mounted read-only unless the radio receives an actual, new update string.
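
The read-only idea could look roughly like the sketch below. It assumes the file system can be switched with "mount -o remount" and that nothing else holds files open for writing when it goes back to read-only. The names with_rw and MOUNT_CMD are invented for this example.

```shell
#!/bin/sh
# Hypothetical wrapper: remount a file system read-write, run one command,
# then remount it read-only again. MOUNT_CMD can be overridden for dry runs.
MOUNT_CMD="${MOUNT_CMD:-mount}"

with_rw() {
    mnt="$1"
    shift
    $MOUNT_CMD -o remount,rw "$mnt" || return 1
    "$@"
    rc=$?
    $MOUNT_CMD -o remount,ro "$mnt"
    return $rc
}

# Example (needs root on the node):
# with_rw / cp /tmp/allowed.gateways /etc/config/allowed.gateways
```

The wrapper returns the exit status of the wrapped command, so a caller can still tell whether the write itself succeeded.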

_________________
Joshua Montgomery
Community Wireless Communications Co.

546 Soekris 4801 Based Mesh Nodes (SR2 Cards)
800+ PePWave CPE Devices
8 WBD-500 Open-Mesh Nodes - (Outdoor Hardened)
3 ECB-3500 Indoor Nodes
3,600 Unique Users/Day
600 GB of Traffic/Day
oojoshua
Moderator

Joined: 27 Aug 2009
Posts: 132

Posted: Fri Nov 12, 2010 2:47 am    Post subject: Solved Item 3

I solved item 3 above to the best of my limited ability.

The problem was simply that the update scripts are writing their output to /etc/update instead of putting the results in a temp directory.

To solve the problem I moved most of the update files to the path /tmp/update


I am going to do a couple reboots, run it overnight and see if it works but here are the steps.

1. On boot create the file folder /tmp/update
2. Change $WDIR to /tmp/update in /lib/robin/checkin-functions.sh
3. Change $WDIR/uci-files to /etc/update/uci-files in /lib/robin/checkin-functions.sh

4. Re-write the following in /usr/sbin/update-nodes.sh

Code:

#!/bin/sh
# /usr/sbin/update-nodes.sh

CALLER=$1
CALLER="${CALLER:-1}"
WDIR=/tmp/update

ALLOWED_GW="/etc/config/allowed.gateways"
ALLOWED_RP="/etc/config/allowed.repeaters"
TMP_ALLOWED_GW="/tmp/allowed.gateways"
TMP_ALLOWED_RP="/tmp/allowed.repeaters"
rm -f $TMP_ALLOWED_GW $TMP_ALLOWED_RP

k_strict_mesh=$(uci get management.enable.strict_mesh)
IP_mesh=$(uci get node.general.IP_mesh)
Current_Host_Name=$(uci get system.@system[0].hostname)

MY_HOST_NAME=""

while read r ; do
        [ -n "$r" ] && {
                NODE_ROLE=$(echo $r |awk '{print $1}')
                NODE_IP=$(echo $r |awk '{print $2}')
                HOST_NAME=$(echo $r |awk '{print $3}'|tr -d '\n\r\v' |tr '*' '_' |sed  s/"'"//g )
                NODE_MAC=$(echo $r  |awk '{print $4}' |tr A-Z a-z)

                # Write the updates to the temp files
                case $NODE_ROLE in
                        G) echo -e "$NODE_IP \t $NODE_MAC" >> $TMP_ALLOWED_GW ;;
                        R) echo -e "$NODE_IP \t $NODE_MAC" >> $TMP_ALLOWED_RP ;;
                esac

                # Remember the hostname listed for this node, so the check
                # after the loop does not depend on the last line read
                [ "$IP_mesh" = "$NODE_IP" ] && MY_HOST_NAME=$HOST_NAME
        }

done < $WDIR/nodes

# Only write to the disk if the temp file does not match the config file
# (diff also returns non-zero when the config file does not exist yet)
if [ -f "$TMP_ALLOWED_GW" ] ; then
   diff -q "$TMP_ALLOWED_GW" "$ALLOWED_GW" >/dev/null 2>&1 || mv "$TMP_ALLOWED_GW" "$ALLOWED_GW"
fi

if [ -f "$TMP_ALLOWED_RP" ] ; then
   diff -q "$TMP_ALLOWED_RP" "$ALLOWED_RP" >/dev/null 2>&1 || mv "$TMP_ALLOWED_RP" "$ALLOWED_RP"
fi

# Only touch the hostname if this node was listed and its name has changed
if [ -n "$MY_HOST_NAME" ] && [ "$Current_Host_Name" != "$MY_HOST_NAME" ] ; then
   uci set system.@system[0].hostname="$MY_HOST_NAME"
   uci commit system
   echo "$MY_HOST_NAME" > /proc/sys/kernel/hostname
fi

[ 1 -eq "$CALLER" -a "$k_strict_mesh" -eq 1 ] && /lib/robin/strict-mesh.sh
#



I might add a little logic that nukes off $ALLOWED_GW and $ALLOWED_RP if the update didn't get any nodes in the update.


5. Change /etc/update to /tmp/update in the files under /usr/sbin/

Code:

cd /usr/sbin
sed -i 's/etc\/update/tmp\/update/g' *


6. Make a change to the RSSI update script.

/lib/robin/checkin/50_RSSI

change /etc/update/nodes to /tmp/update/nodes
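
The "only write when something changed" part of the steps above can be factored into one small helper. This is a sketch, not part of ROBIN: the name install_if_changed is invented here, and it assumes diff is available on the node (the script in step 4 already relies on it).

```shell
#!/bin/sh
# Hypothetical helper: move a freshly generated file over its persistent
# counterpart only when the contents differ or the target is missing.
# Returns 0 if the target was written, 1 if it was left untouched.
install_if_changed() {
    src="$1"
    dst="$2"
    [ -f "$src" ] || return 1
    if [ -f "$dst" ] && diff -q "$src" "$dst" >/dev/null 2>&1 ; then
        rm -f "$src"      # identical: drop the temp copy, no flash write
        return 1
    fi
    mv "$src" "$dst"
}

# Example:
# install_if_changed /tmp/allowed.gateways /etc/config/allowed.gateways
```

Because the temp copy lives in /tmp (RAM), an unchanged node list costs no write to the CF card at all.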

foxtroop11
Service Provider

Joined: 22 Mar 2009
Posts: 1168
Location: Ansbach, Germany and sometimes the States

Posted: Fri Nov 12, 2010 4:43 am

Thanks for the examples. I've always wondered why so much was being written in the firmware, all the uci commits and stuff as well on each boot. This seems like a very good solution and thanks for sharing.

Do you think one should be as concerned when it's writing to the router's flash compared to your CF card?
codyc1515
Moderator

Joined: 31 May 2010
Posts: 1752
Location: New Zealand

Posted: Fri Nov 12, 2010 5:10 am

foxtroop11 wrote:
Thanks for the examples. I've always wondered why so much was being written in the firmware, all the uci commits and stuff as well on each boot. This seems like a very good solution and thanks for sharing.

Do you think one should be as concerned when it's writing to the router's flash compared to your CF card?

I was fairly sure that writing to the internal flash would be worse than writing to an external card.

foxtroop11
Service Provider

Joined: 22 Mar 2009
Posts: 1168
Location: Ansbach, Germany and sometimes the States

Posted: Fri Nov 12, 2010 12:16 pm

It must not be too bad, since people have devices that are probably several years old still running ROBIN :)
foxtroop11
Service Provider

Joined: 22 Mar 2009
Posts: 1168
Location: Ansbach, Germany and sometimes the States

Posted: Fri Nov 12, 2010 1:49 pm

Just curious, if the conditions are not met, are you having the dashboard send a zero length reply? It's my understanding that ROBIN can handle this; support for it was added some time ago.
oojoshua
Moderator

Joined: 27 Aug 2009
Posts: 132

Posted: Fri Nov 12, 2010 8:45 pm    Post subject: CF Wear - Zero Length Reply

CF Wear

On the CF wear side, the router image is a JFFS image. This file system type is designed for flash memory and has built-in wear leveling. SquashFS is also well suited to flash (it is a read-only file system, so it causes no write wear - perfect for CF).

On the Soekris load, the file system type is ext2, which is not designed for flash. As a result the only wear leveling is whatever the CF vendor built into the physical card. If you create a file and write to it every 5 min, eventually the sectors that the file is stored on get corrupted and you end up needing to change out the CF card. $10 - no big deal - unless you have to change out 500 of them which takes 3 months of full time work in a bucket truck - $15,000 to $20,000.

We have been seeing these CF cards come back with errors on them and have lost 20+ CF cards. We don't KNOW that this is the problem, but it is a potential culprit, so we are going to work to limit write operations to reboots/rare events in order to make sure that these don't get written to each time an update is run.

Zero Length Reply

I thought that this would solve the problem with writes as well, so I wrote some dashboard code so that updates are only sent if there are changes, if the node requests them or if no update has been sent for 600 seconds.

Unfortunately, this doesn't appear to work. If the update script receives a zero length reply, the update script uses /etc/update/received.old and processes the response just as though it got a fresh response from the dashboard.

This causes write operations to all of the files in /etc/update/ as well as /etc/config/allowed.gateways and /etc/config/allowed.repeaters.

I looked at mounting /etc/update as a RAMdisk, but decided it is cleaner to move all of the scripts to the /tmp directory which is already on a RAMdisk.
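
For the zero length reply itself, a guard along these lines would skip processing instead of falling back to received.old. It is a sketch of one possible behaviour, not what the ROBIN update script does today; the function name and the example path of the fresh reply are assumptions.

```shell
#!/bin/sh
# Hypothetical guard: treat a missing or zero-length reply as "nothing to do"
# instead of re-processing the previous reply.
reply_needs_processing() {
    reply="$1"
    [ -s "$reply" ]     # true only if the file exists and is not empty
}

# Example (path of the fresh reply is an assumption):
# reply_needs_processing /tmp/update/received || exit 0
```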

Antonio (isleman)
Site Admin

Joined: 10 Feb 2008
Posts: 2323
Location: Toscana, Italy

Posted: Fri Nov 12, 2010 10:40 pm

ext2 is mounted with noatime just to limit I/O operations. Anyway, $10 for a CF card suggests not-so-good hardware. I do not want to name manufacturers (at least in a public forum), but I can assure you that other brands of CF (used in Alix-2C2 and MikroTik) are really rock solid.

But I definitely agree about the processing of the received.old file. Unfortunately, that is the safest way to be sure that the settings sent by the dashboard are really applied. I have been working on that problem for some weeks: the idea is to process the whole received.old file without splitting it into n small files (if you know the code then you may understand what I mean).

As you have already posted, I must say that the development of r3xxx is frozen so that only the needed fixes are added, not new features which may introduce new bugs. In my opinion, we are on the right track to get a pretty stable r3xxx version (currently r3625).
oojoshua
Moderator

Joined: 27 Aug 2009
Posts: 132

Posted: Fri Nov 12, 2010 11:38 pm    Post subject: Since you is......

Antonio,

Since you are here.....

Where are the uci settings written to in the Soekris version?

Also: was there a reason for choosing OpenWRT 1.11 and not 1.17?

codyc1515
Moderator

Joined: 31 May 2010
Posts: 1752
Location: New Zealand

Posted: Fri Nov 12, 2010 11:42 pm

Antonio (isleman) wrote:
As you have already posted, I must say that the development of r3xxx is frozen so that only the needed fixes are added, not new features which may introduce new bugs. In my opinion, we are on the right track to get a pretty stable r3xxx version (currently r3625).

Excellent stuff, here's hoping for a stable release.

oojoshua
Moderator

Joined: 27 Aug 2009
Posts: 132

Posted: Sat Nov 20, 2010 6:06 pm

FYI for everyone. The settings for uci are written to small files in /etc/config.

They are only overwritten if a change has been made.

Hope this helps someone else. I was looking for a single large file and hadn't made any changes, so the write operation wasn't changing any file dates.

Thanks,

Antonio (isleman)
Site Admin

Joined: 10 Feb 2008
Posts: 2323
Location: Toscana, Italy

Posted: Sat Nov 20, 2010 7:14 pm

Quote:
They are only overwritten if a change has been made.

How do you detect whether a change has been made?

Let me try to clarify the 'original' issue.
You could compare the received reply against the old reply and they may be the same: in this case you may think that no change happened, so dashboard and node settings are in sync... but are you really sure that the settings in the old reply were actually applied to the node?
That's the problem! What if a crash/reboot happened during that operation?
By just comparing the new reply against the old reply, we risk node settings and dashboard settings being out of sync. That is why I always compare the dashboard settings against the node settings and - unfortunately - at least for now I need to write smallish files in the /etc/update directory at every check-in.
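
Comparing a dashboard value against the node's own setting does not by itself require a write, since reading UCI is read-only. A minimal sketch of that comparison, assuming uci supports the quiet flag -q; the function name setting_differs is invented and this is not how the current check-in code is organised:

```shell
#!/bin/sh
# Hypothetical check: does the value the dashboard wants differ from what
# the node currently has in UCI? Reading with "uci get" does not write.
setting_differs() {
    key="$1"
    wanted="$2"
    current=$(uci -q get "$key")
    [ "$wanted" != "$current" ]
}

# Example: only set and commit when the stored value is really different.
# if setting_differs 'system.@system[0].hostname' "$HOST_NAME" ; then
#     uci set "system.@system[0].hostname=$HOST_NAME"
#     uci commit system
# fi
```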
foxtroop11
Service Provider

Joined: 22 Mar 2009
Posts: 1168
Location: Ansbach, Germany and sometimes the States

Posted: Sat Nov 20, 2010 7:47 pm

Why can't it be written in /tmp like he has suggested? Then it would not be a big deal to process each time. Maybe you could make it process a no reply result from the dash in /tmp but a real result in flash so it's there on a reboot to process once.

How are the nodes working that have updates written to /tmp btw?
oojoshua
Moderator

Joined: 27 Aug 2009
Posts: 132

Posted: Sat Nov 20, 2010 7:49 pm

I understand your point.

The radio should run an update with RR=1 on reboot and continue running with RR=1 until it has received and successfully processed an update.
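
A rough sketch of that RR logic, assuming a marker file in /tmp that is created only after an update has been processed successfully. The marker path and the function name rr_flag are made up for the example.

```shell
#!/bin/sh
# Hypothetical: pick the RR value for the next check-in. The marker lives
# in /tmp (RAM), so it disappears on every reboot and RR goes back to 1.
MARKER="${MARKER:-/tmp/update/processed}"

rr_flag() {
    if [ -f "$MARKER" ] ; then
        echo 0
    else
        echo 1
    fi
}

# After an update has been received and processed successfully:
# touch "$MARKER"
```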

Alternatively we could simply write an MD5 Hash of the reply to a file if the update is complete:

Code:

if [ UPDATE COMPLETE SUCCESSFUL ] ; then

httpd -m "$(cat /etc/update/received.old)" > /etc/update/success

fi


Then just check periodically to make sure that the string in received.old matches the string in /etc/update/success

Code:

if [ "$(httpd -m "$(cat /etc/update/received.old)")" == "$(cat /etc/update/success)" ] ; then

            exit;

fi

# If we got here, go ahead and process the update.
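
The same idea as a runnable sketch. Note the swap: it uses md5sum rather than "httpd -m", on the assumption that md5sum is available on the node. "httpd -m" is meant for generating password hashes and, depending on the BusyBox version, its output may include a random salt, which would make the comparison fail. The function names are invented for the example.

```shell
#!/bin/sh
# Hypothetical helpers using md5sum (an assumption; the post uses "httpd -m").
reply_hash() {
    md5sum < "$1" | awk '{print $1}'
}

# Record the hash of a reply that was applied completely.
mark_applied() {
    reply_hash "$1" > "$2"
}

# Succeeds if the reply's hash matches the recorded one.
already_applied() {
    [ -f "$2" ] && [ "$(reply_hash "$1")" = "$(cat "$2")" ]
}

# Example:
# already_applied /etc/update/received.old /etc/update/success && exit 0
```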
Antonio (isleman)
Site Admin

Joined: 10 Feb 2008
Posts: 2323
Location: Toscana, Italy

Posted: Sat Nov 20, 2010 7:59 pm

Quote:
Alternatively we could simply write an MD5 Hash of the reply to a file if the update is complete

or just a flag somewhere... but that might not be sufficient.
The update sequence involves the scripts in /usr/sbin: one of these scripts may fail without breaking the sequence.
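
One way to cover that case would be to track the result of every step and only record success when all of them passed. A sketch under the assumption that each step is a shell script returning non-zero on failure; the runner name is invented and the real update sequence may be wired differently.

```shell
#!/bin/sh
# Hypothetical runner: execute each update script in order with sh, keep
# going after a failure, and return non-zero if any of them failed.
run_update_scripts() {
    failed=0
    for script in "$@" ; do
        if ! sh "$script" ; then
            echo "update step failed: $script" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}

# Example: write the success flag only when every step passed.
# run_update_scripts /usr/sbin/update-nodes.sh && touch /tmp/update/processed
```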
codyc1515
Moderator

Joined: 31 May 2010
Posts: 1752
Location: New Zealand

Posted: Sat Nov 20, 2010 9:05 pm

foxtroop11 wrote:
Why can't it be written in /tmp like he has suggested? Then it would not be a big deal to process each time. Maybe you could make it process a no reply result from the dash in /tmp but a real result in flash so it's there on a reboot to process once.

I may be wrong, so please feel free to correct me, but the open-mesh dashboard will only send the reply once, regardless of the RR setting that ROBIN provides, i.e. it only sends it when you have changed the settings.

oojoshua
Moderator

Joined: 27 Aug 2009
Posts: 132

Posted: Sat Nov 20, 2010 10:17 pm

Writing these files to /tmp is working fine. I am going to push the update to the entire network tonight.

Cody - Don't know, we gave up on the open-mesh dashboard when it became clear that the source was not going to be available to us in the time frame we required.
foxtroop11
Service Provider

Joined: 22 Mar 2009
Posts: 1168
Location: Ansbach, Germany and sometimes the States

Posted: Sat Nov 20, 2010 10:38 pm

I'll have to go back and look at your patches again. You might have said, but is /etc/update/received.old written at least once to the flash? I seem to recall the received file being processed during boot as well, and thought that if everything was in /tmp it would of course not be there after a reboot. Sorry if you answered already, I'm looking at so many things that I easily get mixed up.
oojoshua
Moderator

Joined: 27 Aug 2009
Posts: 132

Posted: Sun Nov 21, 2010 11:29 pm

Yeah, it wouldn't be there on boot, however, the /etc/config/ files for the UCI are there, so it comes up with the correct settings.

This is working fine on our network. The only failure case I see is the one that Antonio pointed out (i.e. a reboot half-way through an update) and I don't see that as a huge issue.

I am far more concerned about the 100% chance that our CF cards are going to go bad in the next 12 months unless we do something.

All times are GMT + 1 Hour