StarV3 1.0.0(beta-14) build 906 (WAR Edition) is ready for testing
The new release is online and ready for public testing. Please download it from http://www.star-os.com/starvx.
There are release notes in the firmware archive which will list the steps needed to upgrade.
Most settings are preserved during the upgrade. For the list of ones that are not, please refer to the release notes.
Any and all comments are welcome.
Changes since the last beta:
*) cbq should now behave as expected, and no longer prevent traffic from flowing through the system
Release Caveats:
*) see previous beta notes
Some common Q&A answers regarding the new release:
Q: Is cloaking supported in this BETA?
A: Yes, cloaking is supported as of beta-7
Q: What kind of performance should we expect?
A: If you disable connection tracking, the performance while routed should be roughly the same as, if not better than, a bridged StarVX system.
Q: Does this beta contain WEP?
A: Yes, WEP is fully supported as of beta-12
Q: What interface will StarV3 use?
A: It will use SSH, much like StarOS. The current interface borrows much from StarOS, but is enhanced for the WAR platform.
Q: Will the same StarVX atheros features be present in StarV3 beta?
A: Most of the basic settings are present. Once we complete the new Atheros configuration dialog, the rest will follow.
Q: Will there be firewall support?
A: Yes, firewall, NAT and CBQ are all available in this release.
Q: Is there WDS support?
A: The first BETA does not have WDS support, so you need to convert your StarVX network to a routed design before upgrading, or wait until this feature is available.
Q: Is there any way to secure a wireless connection in the first BETA?
A: Yes, we have included VDS support which supports AES encryption and can be used to secure your connection, or act as a WDS alternative.
Q: Will there be tcpdump, or some other way to see system traffic?
A: Yes, we have included beacon, a real-time traffic monitor which can be used to see the active connections in your system.
Q: Will the system be field upgradeable?
A: Quite definitely, the system is fully field upgradeable.
Please feel free to post any additional questions you may have.
Updated and testing with CBQ now. Also, that kernel-generated error message has gone away.
Keep us informed as to your results. The new version should work much better now.
I have had no CBQ problems today.
I have a much more alarming problem, though. My WAR2 running beta14 just pegged at 100% CPU and stopped passing traffic to the wireless clients on its atheros card.
A couple of hours ago it did this, too, but only about 4 or 5 clients lost connectivity. It actually said the 4 or 5 clients disassociated in the log and it would not accept any new associations until I rebooted. The clients were all different, D-Link G810s and WRAPs, nothing similar about them. When this happened, the association list showed one client was in a state "N" rather than the normal "C" and it was also showing "0" for r-tx.
Just now when the WAR2 went to 100% CPU (took 30+ seconds to login to it) and stopped passing traffic, 11 out of the 22 associations were showing the "N" state. There was absolutely nothing interesting in the syslog. Two of the associations in the "N" state were showing "0" for r-tx.
Thank you. Do you know if a card activate resolves this problem, or only a reboot? Also, do you know how long after a reboot this occurs?
Before you activate changes, go into the card configuration dialog and hit 'Ok' without modifying anything. At this time, you can activate changes and have that card be reset. Please let us know if it occurs again after that, or starts behaving from that point forward. This information will help us identify the problem quicker.
Thanks!
An activate had no effect on the problem, only a reboot.
It was up for about 24 hours before it happened the first time. The second time, which was more serious (all clients rather than just 4 - 5 stopped passing traffic), happened about 2 hours after boot.
I believe I can reproduce it by leaving an upload going for a while. I haven't had this problem until today when I had a long upload going. Then after the first time I had to reboot, I left the upload going for 1 - 2 hours and suddenly it did that. The upload was only going about 30K - 40K/sec.
Thank you, I'll try and duplicate the problem.
Well, while you are looking into problems, here is another one. I upgraded a link that had been on beta 5 to beta 14 last night. After reboots at both ends, the link did not come back up. A site survey showed the beacon from the remote end, and the logs showed repeated failed attempts to associate with the AP. Since I wasn't sure what the problem was, I reconfigured the end I could get to as the AP with the most basic settings I could. I then made the trip to the other end, configured it as a client, and it linked immediately. I reconfigured everything again and was back up and running. The only thing that seemed odd was that the default distance came up as 35 miles, and when I went to save the config it complained that that was too far for the mode it was in. I dropped it down to 10 miles and everything saved fine. I don't know what happened, but a guess might be that the default distance after the upgrade was wrong and the timing was off enough that it did not successfully associate even though it could see the other end. I don't have another test subject I can use where the remote end isn't a real pain in the butt to get to, so I am reluctant to try this upgrade twice.
The distance feature has been revamped between beta-5 and beta-14. Before you upgrade, ensure they are set to something sane, such as 10 miles, especially if you have links that require the distance option set.
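As a rough illustration of why the distance setting can matter for timing, the sketch below computes the round-trip propagation delay for a given link distance. It is only a back-of-the-envelope calculation at the speed of light. How StarV3 or the Atheros driver actually turns the distance option into timing values is not stated in this thread, and the function name is hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre
METERS_PER_MILE = 1609.344            # international mile


def round_trip_delay_us(distance_miles: float) -> float:
    """Round-trip free-space propagation delay in microseconds.

    Assumption: only propagation time is counted; any fixed overhead
    the radio adds on top is ignored.
    """
    one_way_m = distance_miles * METERS_PER_MILE
    return 2 * one_way_m / SPEED_OF_LIGHT_M_PER_S * 1e6


if __name__ == "__main__":
    for miles in (10, 35):
        print(f"{miles} miles: {round_trip_delay_us(miles):.1f} us round trip")
```

Under these assumptions a 10 mile link has a round trip of roughly 107 microseconds and a 35 mile link roughly 376 microseconds, so two ends configured for very different distances would be expecting quite different timing.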
Yep, that was what I guessed after the fact. Another quibble is that the speed shown does not seem to reflect the cloaking feature.
Thanks, Tony
This does raise a question for the upgrade of a few remaining units that are still on StarVX. Will I run into a problem going from vx1.06 to v3.b14?
The cloaking and proper distance options should be preserved during the StarVX upgrade, assuming you are going directly to beta-14. I would read through the readme file that comes with beta-14 to make sure nothing else will jump at you during the upgrade.
lonnie
03-26-2006, 03:11 PM
I am not sure what you mean by a quibble with the test. Please explain that one.
The speed test has no knowledge of the mode and simply reports what it achieves. What benefit would a speed test be that adjusted results for certain modes?
At the moment, the cloaking option does not scale the rates that are shown, like it does in StarVX apConfig. This is purely cosmetic, and does not affect the operation at all.
lonnie, the speed reported on the association display does not appear to reflect the cloaking. For instance x2 and a rate of 48.
lonnie
03-26-2006, 06:31 PM
OK, we had a couple of people worried about the speed test in that it gave numbers that suggested non cloaking.
As Tony says the display is cosmetic only. In a sense this current display is actually better since it uses the same reference as non cloaked and therefore is more easily compared. At any rate it will get repaired but since it is not a show stopper issue we prefer to leave it until we have a few other important things worked out.
I sort of see your point on the display, but without some indication that cloaking is in effect it would be very misleading. The lack of priority is why I said it was a quibble.
kadanem
03-27-2006, 02:21 AM
I have the same problem ... in 1.0.6 and StarV3 1.0.0.
I use a Linux server to reboot the WAR2 board (it pings 2 IPs and uses starutil to reboot) :-)
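A watchdog along the lines described above might look like the following sketch. It is not kadanem's actual script. It assumes the Linux `ping` utility (`-c` count, `-W` timeout in seconds), and it assumes a reboot should happen only when both monitored addresses stop answering, which the post does not specify. The reboot command is supplied by the caller, and the starutil arguments are deliberately not shown because they are not given in this thread.

```python
import subprocess
from typing import Callable, Sequence


def host_is_up(ip: str) -> bool:
    """Send one ICMP echo with the Linux ping utility; True if it replied."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def should_reboot(ips: Sequence[str],
                  probe: Callable[[str], bool] = host_is_up) -> bool:
    """Assumed policy: reboot only when every monitored address is silent."""
    return not any(probe(ip) for ip in ips)


def watchdog_once(ips: Sequence[str],
                  reboot_cmd: Sequence[str],
                  probe: Callable[[str], bool] = host_is_up,
                  run=subprocess.run) -> bool:
    """Run one check; invoke reboot_cmd if needed. Returns True if it fired."""
    if should_reboot(ips, probe):
        run(list(reboot_cmd), check=False)
        return True
    return False
```

Run from cron or a loop with a sleep, this checks the two addresses and fires the caller's reboot command at most once per pass. Passing `probe` and `run` as parameters keeps the policy testable without touching the network.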
Thank you kadanem,
From the sound of your post, and the versions you list, I would suspect what you are encountering may be unrelated. Can you tell me what version of StarV3 you are running (build number), and what you saw when you logged in?
From internal testing, this 'N' and CPU load problem appears to be related to the Atheros card being overworked, due to severe noise spikes in the channel being used. We are still investigating, however.
Bossman
03-28-2006, 04:14 PM
We had 2 issues last night which I'm not sure are related to anything.
1. Upgraded this box from Vx1.06 to the stepup and then to VV 1.0.0 b14. Everything was fine... continued on and did another with no problems. After getting back to the office I was doing some adjustments and switching the rest of the links over to the new backhaul... everything good. Just before going home I noticed a wrong subnet mask on Ether2. I changed it and the link died (I was talking to it via wpci1). After a drive and a climb, I took the box down and it's back to Factory defaults. Everything from the dns name to the channels etc. NOTHING from the original working config survived. I still had the old link running so by 5:00am I had it all working again.
2. Attempted my 5th box upgrading from Vx1.06 to the stepup and then to VV 1.0.0 b14. As it was UPLOADING the stepup (not flashing) the connection to the box died. I'm not sure what I have yet, but thus far no IP connectivity works. All other stepups worked fine with very little to adjust after the fact.
Again, not sure if any of this relates to software or just the fact that the universe hates me, but I thought I'd share it with everyone.
Bossman
03-28-2006, 09:54 PM
Now it looks like I have additional problems....
I reprogrammed the box from #1 above (location H) and put it up. Its feed from the NOC (location V) works fine and performs well. The partner down the road (location E) links up but does not show an IP in the wireless association. There is a WRAP at location E which can ping the WAR it's connected to, but cannot SSH in. When I run tcpdump on the WRAP, I can see requests from time to time, but it's like it's dead for anything but a ping on the Ethernet side.
Am I alone here?
lonnie
03-28-2006, 10:35 PM
What are you using for power supply?
Bossman
03-28-2006, 10:38 PM
All have 24V 1amp power supplies. Bench systems ran for over a week with no problems. I initially had a 12V 1a at location H but thought it was causing my odd RSSI numbers.... that ended up being a VERY bad dish alignment at location E
Are you trying to SSH into the WAR from the WRAP system, or from a Windows system? If you are trying to get in from the WRAP, you will need StarOS v2.11.0 which will allow you to log into a WAR properly.
Bossman
03-29-2006, 02:44 PM
That could be the problem. We're currently running 2.01.1 on that box and I've not adjusted routes to get to it from a Windows box. I'll have to decide if I want to chance an upgrade over the wire or make the drive. Thanks!
nelson05
03-29-2006, 07:01 PM
Just had something similar happen to my WAR that Tog reported though I didn't think clearly enough at the time to look at the CPU usage before I hit restart. A large number of my clients went to N (around 27 out of 61) for their status in the association table and stopped passing data. I applied changes to see if that would bring things back to life, but they were still inaccessible.
I then told the system to restart via the SSH menu, but the WAR did not come back up (I couldn't ping it through its ethernet interface). I had to power cycle the box remotely to revive it. Once I did, everything came back up and has been steady since, for about 45 minutes now. The box ran fine for hours and would probably have run since last night, except that I crashed it testing the save feature (I documented that issue in the V3 section).
nelson05
03-29-2006, 08:56 PM
Ouch- maybe there is a greater load on the AP this evening, but we had another instance where a large number of our clients are unable to connect to the network and show up in the client list display as N. I took a little extra time this round and grabbed a little more information and some more screenshots. CPU usage does not appear to be a factor in this, as it never gets much past 15%.
http://www.springvillewireless.com/images/WARAssoc1.png
http://www.springvillewireless.com/images/WARAssoc2.png
http://www.springvillewireless.com/images/WARAssoc3.png
The first screenshot is of the association window with the clients that have turned to N status and are unreachable. The second is a section of the ARP table and the third is another capture of the system's ARP table. The incomplete lines appear after I attempt to ping one of the "N" clients and the unit does not respond. Before I ping, the ARP table contains the correct MAC address; the WAR just stops being able to talk to them. After I ping, the good MAC is purged and the incomplete entry appears as shown in screenshot 3. I'm sure this is by design, but I just wanted to show as much of what I am seeing as I can.
An activate changes does not help- only a restart (or power cycle in my case, since I am currently having trouble restarting) cures it.
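For anyone who wants to spot the incomplete ARP entries described above without comparing screenshots, here is a minimal sketch. It assumes the stock Linux `/proc/net/arp` text layout, where an entry without the "completed" flag (`0x2`) is incomplete. Whether the StarV3 ARP screen exposes the same layout is an assumption, and the addresses in the sample are made up.

```python
from typing import List

ATF_COM = 0x02  # "completed entry" flag from Linux <linux/if_arp.h>


def incomplete_arp_entries(proc_net_arp_text: str) -> List[str]:
    """Return IPs whose ARP entry lacks the completed flag."""
    ips = []
    for line in proc_net_arp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # blank or malformed line
        ip, flags = fields[0], int(fields[2], 16)
        if not flags & ATF_COM:
            ips.append(ip)
    return ips


# Hypothetical sample in the /proc/net/arp layout (addresses invented).
SAMPLE = """\
IP address       HW type     Flags       HW address            Mask     Device
192.168.1.10     0x1         0x2         00:0b:6b:00:00:01     *        wpci1
192.168.1.11     0x1         0x0         00:00:00:00:00:00     *        wpci1
"""

if __name__ == "__main__":
    print(incomplete_arp_entries(SAMPLE))
```

On a Linux host the same function could be fed the contents of `/proc/net/arp` before and after pinging an affected client to confirm the entry flips from complete to incomplete.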
lonnie
03-30-2006, 12:47 AM
Is there anything common with the units that turn to N and are not reachable? Are you forcing G mode? Do you have the AP set for 24 mbps tx rate?
nelson05
03-30-2006, 01:26 AM
Sorry for not posting my config in this section as well... I didn't want to be redundant and post the same info in two different sections of the board. Here is a link to the thread where I explain my config in more detail:
http://forums.star-os.com/showthread.php?t=5245
And to specifically answer your questions--
I am forcing g mode only on the V3 AP and also have the transmit rate set to 24.
I can't see anything consistent between the clients that drop off and become "N". All are on a variety of versions of StarOS V2 (as you've mentioned previously, I don't break them if they don't need fixing), with some at the absolute latest version. All are scattered across different distances, with some being as close as .5 mile and others as far away as 4 miles. Additionally, the same clients don't always seem to go to "N". Sometimes one will be fine while others shift to N while another time it will be part of the affected group. Also, all seem to shift to N around the same time. It doesn't appear to be a gradual thing where each client individually moves to N status and the inability to pass data.
Tog- have you done any more testing on this?
Yes, my experiences have been the same as yours, but I have a few old Netgear WGE101s, WRAP/v2 clients and D-Link G810s. I have been talking to Tony privately trying to convey as much information as possible about the problem.
The only interesting observation I have to add is I just got my big box of WAR boards today so I bench tested a couple WAR4s tonight. I transferred 4GB files across my LAN from WAR AP to WAR client. With a 40-bit WEP key, I stopped being able to pass traffic for minutes about three times. Twice was between 1GB - 2GB of steady full-speed data transfer in 2x cloaking.
I disabled WEP and successfully transferred the 4GB file twice.
Bossman
03-30-2006, 04:13 AM
I've noticed what appears to be a routing issue.
I was changing a far end link to run over the new gear. There was still some funny routing taking place where it would go from WAR to WAR and then on one hop flip back to the WRAPs.
I shut off all the RIP2 on all the systems between here and there and manually entered the routes for that far subnet.... still nothing. I ended up having to reboot the WAR board in the middle hop in order for the routes to work as I had entered them.
All systems are running on V1.0.0 b4. 1 System still had connection tracking turned on. All systems are running on 802.11a with 2x cloaking, short preamble and super AG.
I've also noticed latency that wasn't there in the early betas with the AP config. I had tested a WAR-WAR link with a throughput test running at 17Mbps and had 8ms average with no big blips. Now on that exact same link I'm seeing about 13ms average with blips up to 150ms or more and only basic user traffic in the pipe. I also see a few -300ms or similar pings from Windows (and possibly from the WAR system too).
Finally, as I type this I tried to log into the new war on the far side, and as soon as I hit the enter key on the password, all traffic died. This last one could be just my bad luck though.
----- Update ------- I drove out, and the box, although still powered up, was dead to pings etc. I power cycled it and all is well again, working as it was before the freeze.
For those experiencing the 'N' problem, where people can no longer associate, can you please post the following information:
1.) What build are you running?
2.) What mode (802.11a or 802.11g, etc.) / channel are you using?
3.) Are you using WEP?
4.) How many times has this happened on the same system?
5.) Is this a WAR-2 or WAR-4 system?
Thanks!
nelson05
03-30-2006, 08:29 AM
1) Beta-14
2) 802.11G only, 2412 on the CM9 that clients are connecting to. The other three CM9s are in AP mode running in 802.11A on different channels, but have nothing connected to them currently.
3) Yes. I have two 104 bit keys programmed with key 1 checked as the default key and shared authentication set.
4) It has happened around five times in a two hour window. What is strange is that the system ran fine for most of the day and only started to exhibit the issue in the evening. I assume it was the increased use though traffic did not appear to be an issue. I ended up having to convert clients back to our overworked Cisco AP as we started getting complaints. While it isn't as fast as the WAR, the clients have been stable since.
5) This is a WAR-4 system
lonnie
03-30-2006, 08:33 AM
Can you try with b/g selected?
nelson05
03-30-2006, 10:06 AM
The WAR board stopped responding on what I think is an unrelated issue- even a power cycle won't bring it up. I'll have to swap it out with a WAR-2 until my new WAR-4s arrive....
I'm a little hesitant to put customers on the new unit and test the 802.11b/g mode you suggested, but I guess that is the only real way we can get a good test. As I mentioned previously, the problem didn't show up in the lab and actually didn't appear for a number of hours when the unit was in production with all of the clients. As Tog suggested earlier, it seems to rear its head when the AP is being pushed.
I'll see what I can do.
It looks like the primary cause of the problem you and Tog are having may be related to WEP. We are currently investigating this closely.
lonnie
03-30-2006, 11:00 AM
It is not simply when an AP gets pushed, since we have all sorts of high use AP units that do not do it. You are using WEP and in fact have a number of different clients, whereas we mostly do not use WEP and when we do it is to our own client software.
All right, Tony is on it like white on rice.
1.) What build are you running?
2.) What mode (802.11a or 802.11g, etc.) / channel are you using?
3.) Are you using WEP?
4.) How many times has this happened on the same system?
5.) Is this a WAR-2 or WAR-4 system?
1) You know from our PMs
2) 802.11g only AND 802.11b/g both reproduce it, no effect
3) Yes, a single 104-bit key of random digits, shared-key
4) Too many to count, sometimes it's fine for 6 - 12 hours, sometimes it does this twice an hour. It doesn't necessarily have anything to do with load on the AP, but you do have to have some clients on it of course to reproduce this particular behavior. It happens a lot even when the 20+ clients are pretty much sitting there idle.
5) WAR2, one CM9, huge 24V jameco brick
nelson05
03-30-2006, 11:49 AM
Actually Lonnie, as I posted previously, all of the clients are WRAPs running StarOS v2 so aside from using WEP, I am pretty much in the same boat as you in that I am running your client software exclusively for the units associated to this v3 test AP. I left the subscribers with Cisco gear on the Cisco AP at this tower site as well as any StarOS clients that I felt did not have a great signal. I didn't want to bog down the test with marginal signals or introduce another layer of complexity with clients from another vendor.
Out of curiosity, what is the maximum number of clients you have associated to one V3 radio? I plan to add another sector, once I am confident in V3's stability, but thought I would really give it a high load now, especially since this is exclusively an 802.11g AP. Am I pushing it with 60 clients in g mode?
Thank you for the info. We will get to the bottom of this right away.
nelson05,
There is a maximum limit of 200 or so associations (and several more in non-authenticated 'N' state). The practical limit one should impose for best performance is around 50-60 clients per radio.
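To make the sizing guidance above concrete, the following sketch estimates how many radios a site would need if each radio is kept at the suggested practical limit. The default of 50 clients per radio is taken from the low end of the 50-60 range quoted above. The function name is hypothetical and the calculation is simple arithmetic, not anything built into StarV3.

```python
import math

MAX_ASSOCIATIONS = 200  # approximate hard limit per radio quoted above


def radios_needed(clients: int, per_radio: int = 50) -> int:
    """Number of radios needed to keep each at or below per_radio clients."""
    if clients < 0 or not 0 < per_radio <= MAX_ASSOCIATIONS:
        raise ValueError("clients must be >= 0 and 0 < per_radio <= 200")
    return math.ceil(clients / per_radio)


if __name__ == "__main__":
    for target in (50, 60):
        print(f"61 clients at {target} per radio: {radios_needed(61, target)} radios")
```

With 61 clients, either end of the suggested range works out to two radios, which fits the plan of adding another sector.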
Bossman
03-30-2006, 02:49 PM
I'm curious what's up with the dead WAR. I'd like to know if it is in line with what I've seen and A: what triggered it B: what state it is in now.
nelson05
04-02-2006, 11:29 PM
An update on the dead Quad WAR....
Not really sure what triggered it. I did have to power cycle it a number of times after it would freeze when saving (a problem which beta 15 thankfully fixed!). After one of the power cycles I wasn't able to get into it any longer.
Its current status is that it has reset to factory defaults. It would be no big deal except that the ethernet 1 port went out shortly before I put it into service and now I obviously can't get in to reconfigure an IP on any interface. The new shell option available at the console does allow you to see and do a lot. For example, I used the SSH client to SSH into itself (192.168.1.1) and was able to see the familiar main status screen. However, I couldn't activate any of the menu options to configure anything.
At this point I'm stuck with a useless board until I can figure out some way of assigning an IP to an interface other than Ethernet 1. It's a bummer, because I was afraid of something like this happening before I deployed the board and even suggested to Valemount that another IP for ethernet 2 be built into the factory default option. I wasn't expecting it in the next beta, but I guess I should have waited until another mechanism existed for gaining access. At least the board isn't completely dead.
Bossman
04-04-2006, 01:49 AM
I had the one that went back to factory, but I was ok there other than the down time.
The unit that died during the stepup.rom upload is now just booting in circles. It gives the Valemount BIOS and then the serial number and WAR 4 listing and then says it failed booting.... disabling Ethernet & rebooting... and so on... and so on.
You may have already done this, but if you send the system back for RMA, we'll be able to recover it for you.
Thanks!
nelson05
04-04-2006, 09:28 AM
Tony, do I need to do the same with my board or do you have any other ideas for recovery since I can boot into the console? As I mentioned previously, I can even SSH into the device, I just can't activate any of the menus.