StarV3 v1.2.9b build 2407 is ready for testing


tony
07-29-2007, 09:25 AM
The BETA has been released, and is available on http://files.star-os.com/

This BETA is considered stable, but use it with caution and provide as much feedback as possible. Before upgrading to this release, make a configuration backup first (starutil -d option).

While the change log is not complete at the moment, some changes are listed in this thread: http://forums.star-os.com/showthread.php?t=6714, and in previous BETA postings (1.2.1b and newer).

1.2.x client notes:
For best results, please use 1.2.x clients with 1.2.x APs. If you wish to inter-mix with V3 1.1.x or other APs, please disable the SuperA/G features on the client first, to prevent incompatibilities that can cause connectivity issues.

Changes since v1.2.8b build 2396
*) Fixed-rate support for non-managed systems has been restored. Only managed links will have the MAX rate ability.
*) Atheros channel ACLs are now functional once again.
*) Small performance gain on startup; will only output bootup messages if there is a terminal connected to the serial port.

Known deficiencies in this release:
*) Layer-7 support is not functional in this release.

Some operational differences include (mainly related to regulatory requirements):
*) Atheros channel and country code list no longer contains an 'All Channel' setting (## and #!).
*) U1 country code (US + FCC3) no longer contains the FCC3 channels; however, country UZ can be used in its place for systems that require them.
*) Due to the nature of our unique rate control, the rate specified is now a MAX rate, and not a FIXED rate.
*) The upcoming 1.3.0 release will contain both a World and FCC version. The World version will support the 'all channel' option.

For best operation, please ensure your client and AP use the same country code.

For those wanting to try the live upgrade feature via starutil, simply add the -a (apply) flag to the command line when uploading the firmware.
This requires that your system is using build 2321 or newer, and starutil 1.16.
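
As a rough illustration of the prerequisite above, the check could be written as a small Python helper. The function name is hypothetical, and reading "starutil 1.16" as a minimum version is an assumption; the release notes do not say whether newer starutil versions also qualify.

```python
def can_live_upgrade(system_build: int, starutil_version: tuple) -> bool:
    """Return True if the stated live-upgrade prerequisites appear to be met.

    Assumptions (not from the release notes): build numbers compare as plain
    integers, and starutil 1.16 is treated as a minimum version.
    """
    min_build = 2321        # "build 2321 or newer"
    min_starutil = (1, 16)  # "starutil 1.16", read here as "1.16 or newer"
    return system_build >= min_build and tuple(starutil_version) >= min_starutil


if __name__ == "__main__":
    print(can_live_upgrade(2407, (1, 16)))  # a 1.2.9b system
    print(can_live_upgrade(2080, (1, 16)))  # a 1.1.13 system
```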

HotSpot notes:

The HotSpot offered is vastly different from the version offered in V2.

To configure the HotSpot, there is a new 'hotspot' pull-down menu entry. The configuration script should be self-explanatory.

The website users log into is now hosted on your own webserver, and not on the StarV3 box itself. We have offered a login webpage template on our files website, though any ChilliSpot login portal will work just as well.

While in the SSH interface, you can press ALT-H to view the on-line hotspot users. If there is a star (*) next to the IP address, the user has authenticated. The rest of the screen should be self-explanatory.

Hotspot user manipulation via utilistar is also supported.

Sample HotSpot configuration steps:

1. Remove all IPs from wpci1 (this will be the AP the users log into).
2. Make sure the DNS server listed in "advanced->dns server list" is valid, as the hotspot service requires it.
3. Update RIP and OSPF (if used) to advertise only the Ethernet interface, so that the hotspot's private IP range is not propagated.
4. Enter the HotSpot configuration script and enter the following commands:
   hotspot enabled
   interface wpci1
   radius 1.2.3.4 <-- change to your radius server
   radius_secret MySecret <-- change
   login_server http://my-auth.server.com/ <-- change to the url where you installed the login script.
   login_server_secret 5d8cp1fr9ua <-- change to match the one specified in your login script
   dhcp_network 192.168.57.0/24
   dhcp_dns 1.2.3.4 <-- change to your DNS server
5. Enter this in your NAT script: masq from 192.168.57.0/24 to dev ether1
6. Activate Changes, and you are done.

If you use the default domain (hotspot.star-os.com) and the default IP range of 192.168.57.0/24, then your customers can use the "exit" keyword in the Internet Explorer address bar to go to the logout page if the popup window is closed. You can set up similar keywords for your own network if you do not use the default settings.
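
For anyone rolling this out to several boxes, the sample configuration can be generated from a few parameters. The following Python sketch only builds the text of the commands listed in the steps; the helper names are hypothetical, it does not talk to a StarV3 system, and the output still has to be pasted into the HotSpot and NAT scripts by hand.

```python
def render_hotspot_config(radius_ip, radius_secret, login_url, login_secret,
                          dns_ip, interface="wpci1",
                          dhcp_network="192.168.57.0/24"):
    """Build the hotspot script lines from the sample steps (text only)."""
    return "\n".join([
        "hotspot enabled",
        f"interface {interface}",
        f"radius {radius_ip}",
        f"radius_secret {radius_secret}",
        f"login_server {login_url}",
        f"login_server_secret {login_secret}",
        f"dhcp_network {dhcp_network}",
        f"dhcp_dns {dns_ip}",
    ])


def render_nat_rule(dhcp_network="192.168.57.0/24", wan_dev="ether1"):
    """Build the masquerade line for the NAT script."""
    return f"masq from {dhcp_network} to dev {wan_dev}"


if __name__ == "__main__":
    # Placeholder values copied from the sample steps; replace with your own.
    print(render_hotspot_config("1.2.3.4", "MySecret",
                                "http://my-auth.server.com/",
                                "5d8cp1fr9ua", "1.2.3.4"))
    print(render_nat_rule())
```

Keeping the DHCP network as a single parameter makes it harder for the hotspot range and the NAT rule to drift apart.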

Notes regarding the Login script:

This is the web-based login prompt your HotSpot users will see when they try to access the Internet.

The Login script requires only a webserver and a stock install of PHP. (Perl login scripts are also available.)

The only purpose of the login script is to collect a username and password from the user and forward them back to the V3 hotspot system for authentication. It can be hosted anywhere.

pananix
07-29-2007, 02:13 PM
Tony,

Something I noticed on 1.2.8b WRT vds that is also present on 1.2.9b: I get short periods (6 seconds or so) where the vds seems to stop, then start. On a continuous ping, I lose 6 packets (approx), then it goes back to running smoothly. Total packet loss approaches 2% on 1000 pings, but when working through the link, it is annoying. Running a side-by-side from the public IP to the public IP, I get no loss, only on the vds connection. Clocks are within seconds of each other.

Also, do I understand from the above you changed the unmanaged stations to FIXED if other than auto is present in the connection rate?


David-
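
To put the figures above in perspective, here is a small Python sketch of the arithmetic. It assumes one ping per second and uses the approximate numbers from the post (6 packets per stall, about 2% loss over 1000 pings); the function names are illustrative only.

```python
def loss_percent(lost: int, sent: int) -> float:
    """Packet loss as a percentage of packets sent."""
    return 100.0 * lost / sent


def stalls_for_loss(target_percent: float, sent: int,
                    packets_per_stall: int) -> float:
    """How many stalls of a given size add up to the target loss."""
    return target_percent / 100.0 * sent / packets_per_stall


if __name__ == "__main__":
    # Three stalls of 6 packets each in 1000 pings.
    print(loss_percent(18, 1000))
    # Number of 6-packet stalls needed to reach 2% of 1000 pings.
    print(round(stalls_for_loss(2.0, 1000, 6), 2))
```

In other words, a loss figure just under 2% corresponds to roughly three such stalls per 1000 pings.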

tony
07-29-2007, 02:17 PM
Yes, it will be a fixed rate if it's not set to auto.

With regards to VDS, can you check the logs and see if the VDS connection gets restarted causing some pings to fail?

pananix
07-29-2007, 02:49 PM
"Yes, it will be a fixed rate if it's not set to auto."

"With regards to VDS, can you check the logs and see if the VDS connection gets restarted causing some pings to fail?"

Nothing on the VDS, but I have these every couple of minutes:
br0: neighbor 8000.00:0d:b9:02:24:08 lost on port1(eth0)
br0: topology change detected, propagating

I also see a lot of:
eth0: received packet with own address as source address

tony
07-29-2007, 03:22 PM
Topology changes in the bridge would cause brief interruptions for sure.
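
To see how often these events line up with the ping stalls, the log can be tallied with a short script. This is a minimal Python sketch that only recognises the three message shapes quoted above; the sample text is copied from that post, and the assumption that the messages appear one per line in a plain-text log is mine.

```python
import re
from collections import Counter

# Sample lines copied from the post above.
SAMPLE_LOG = """\
br0: neighbor 8000.00:0d:b9:02:24:08 lost on port1(eth0)
br0: topology change detected, propagating
eth0: received packet with own address as source address
"""

# No start-of-line anchors, so a timestamp prefix would not break matching.
PATTERNS = {
    "neighbor_lost": re.compile(r"\w+: neighbor \S+ lost on port\d+\(\w+\)"),
    "topology_change": re.compile(r"\w+: topology change detected"),
    "own_source_address": re.compile(
        r"\w+: received packet with own address as source address"),
}


def count_events(log_text: str) -> Counter:
    """Count how many lines match each known bridge message."""
    counts = Counter()
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    for name, count in count_events(SAMPLE_LOG).items():
        print(name, count)
```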

DrLove73
07-29-2007, 08:09 PM
I tested the Sync button and found errors with syncing, but only in some cases.
The systems in question are both X86-PC, signal -77 on cloaking 1x.

I used 11a, channel 42, distance 10. When I changed to cloaking 4x and then tried to switch back to 1x, the distance changed to 9.99 and sync stopped working. Then I switched to 2x with success, but switching to 1x did not work.

In all that testing I also used channel 48 and corrected the distance to 10, so I cannot claim that the flow of testing was exact, but one of the combinations had a problem. It's 04:00h, so I'm off to sleep.

If unable to replicate problem, I'll test more and post exact settings.

jeff
07-29-2007, 09:58 PM
I have seen this several times in the last few days using both 1.2.8 and 1.2.9. These are backhaul links between two wraps using cm9's at each end with identical settings except for ap/client. The links will work fine for hours and then the idle time on the ap will start to climb and no traffic is flowing. A reboot of the ap clears the problem. I happened to be on the other side of this earlier and a site survey shows the ap with plenty of signal strength, but no traffic. You cannot ping the ap from the client and rebooting the client didn't fix it.

This is a show stopper until I figure it out.

Any help would be appreciated.

tony
07-29-2007, 10:06 PM
Are the software versions the same on both the AP and Client?

jeff
07-30-2007, 02:11 AM
"Are the software versions the same on both the AP and Client?"

The versions are identical. I just had it happen again and going into the wireless configuration and hitting ok without any actual changes and then activating changes was enough to get it running again.

Since I upgraded a bunch of wraps to v3 and am now having this problem does anyone have a link to download wrap v3 1.1.11 so I can take them back to a previous stable until this is figured out?

mimbach
07-30-2007, 02:33 AM
Jeff I uploaded it for you to http://www.rigidtech.com/jeff/
If you need any others let me know.

Mimbach

DrLove73
07-30-2007, 03:31 AM
Latest stable V3 is 1.1.13 build 2080, link is starv3-1.1.13-2080.X86-WRAP.zip (http://files.star-os.com/index.php?dir=Latest%20Firmware/Release/firmware-zips/&file=starv3-1.1.13-2080.X86-WRAP.zip)
Just have to login to site first to download.

pananix
07-30-2007, 06:37 AM
"Yes, it will be a fixed rate if it's not set to auto."

Too bad that's not selectable. I liked the MAX rate. With the heterogeneous network I have, I will have to set all APs with Tranzeos back to auto, but then I guess I can expect quality issues with the WAR clients -- or just run 1.2.8b. That or get occasional disconnects from the Tranzeos. Aggghh.

David-

lonnie
07-30-2007, 09:26 AM
David,

We liked the MAX rate concept but way too many people complained. Personally I feel that MAX was way better but people hate change.

We are now putting our energy into making OUR units better, as in the new managed system to have AP and Client better sync'ed. I am tired of trying to make another product work as good as ours. I feel we have wasted too much time and the result is simply not as good as our units talking to our units (even without the custom features).

We would have had a better product than we do now, if we had simply kept developing things that made us better. Expect more StarOS only features to begin showing up.

I am aware of the complaints of a few that it will make their "other" gear useless. It has served well, but it is time to move forward and begin to innovate again. Use the old stuff on less critical segments. Sell it on eBay. Whatever.

jeff
07-30-2007, 09:30 AM
Lack of sleep kept me from realizing that, while each of the troubled links has a wrap v3 as the ap, one of them is feeding a war. As I mentioned, the firmware and settings are the same between the two.

soulmata
07-30-2007, 09:59 AM
I would much prefer to see more StarOS specific features over more generic "compatibility mode" features. If I wanted generic features, I'd pick from the dozen+ free firmwares that exist. When I want reliability and performance, I'll pick StarOS.

nelson05
07-30-2007, 10:27 AM
Agreed and we wanted to emphasize how impressed we are that Valemount has been able to rollout these great new features (again, we LOVE sync) to all platforms. While some may say, they are abandoning legacy products and compatibility, we're beyond pleased that the 150+ WRAPs we bought with v2 over three years ago are now happily running alongside WAR1+2s as cloaked V3 CPEs.

With the improvements made to starutil that allow live firmware upgrades and the new managed client concept, it is looking easier and easier to manage our network.


tony
07-30-2007, 10:57 AM
While we will continue to work on compatibility with 3rd party and legacy devices, a good portion of our time will be spent adding and refining StarV3-specific enhancements.

mimbach
07-30-2007, 11:30 AM
Guys, I have been running 1.2.7b and it has been running well. Yesterday I upgraded most of the network to 1.2.9b.
I have an 802.11g ptp link that is 5 miles, -57dBm, 18mb data rate set, 100% quality.
This morning my latency on the link skyrocketed, with not much traffic or anything. So I rebooted both ends and changed the distance to 10 miles; latency was still at least 300ms, up to 8000ms.
So I downgraded to 1.2.7b, with a latency of about 100-300ms. Next step was 1.1.13 2080; now my latency is under 2ms.
If there is any feedback I can give you, let me know.

Mimbach

DrLove73
07-30-2007, 12:05 PM
While I also think it is natural to favor features that make an all-StarV3 network better than a mixed network, I hope and expect that in future releases you "do not break" compatibility with other systems and clients.

It will definitely seem pocket change to most of you, but I invested $410 in your software for my small network (100+ clients), and planning $220 more in next months, in belief that Star-OS (now StarV3) is general purpose wireless AP. I had a choice then, illegal MT's ($0) or legal Star-OS that cost me cheap POP site so far (I do not regret it one bit!!!).

I also directed an investment of ~$2000 (and ~$2900 planned in the next months) in your direction from a network I administrate 350Km from my location. I also recommend StarV3 to anyone that asks me about wireless POP's, and even suggest people select WAR's over Avila's. We might even finally start a small reseller operation in my country.

I never saw any indication on your site that read "FOR OUR HARDWARE ONLY", so we expect that STARV3 will be good AP for general wireless clients (at least most of them that follow standards). Larger part of the world are poor countries, with limited client budget. That means no WAR1's, PC's (PII/PIII) instead of METRO's and cheap client units in plastic boxes or even PCI Wi-Fi cards.

I remodeled my network for All-Star-OS POP's and plan to continue to develop in that fashion, and every network I'm given design authority will be STARV3.

But clients will be like a "Tropical-mix juice", all flavors mixed together. I do not ask miracles, just to stay compatible with them, and I am sure you will be most-selling wireless software in the world for a long time.

As for the MAX rate vs. FIXED, the best thing for both you and us would be a checkbox next to the rate field that would say "Max on", just like "custom" for freq. If you offer just one option, never mind which, somebody will complain and drag you thru the mud.

But if you squeeze a checkbox in, since you already have both algorithms and you spent so much time on their development and perfection, you will become heroes for both factions, with just one "if..then..else" statement.

Do not consider this post as an attack on you, you've done tremendous job, especially from 1.1.x to 1.3, and since I can not remember any remark to 1.2.x (well, except for an association reports by starutil), but rather startled customer seeking reassurance that his path in your shadow will not be stopped when you start your way to the stars.

mimbach
07-30-2007, 12:48 PM
We just added a new vlan interface and then assigned an ip to it. When we activated changes we lost connectivity. We waited 10 minutes for it and it stayed down.
We had to restart ospf manually, and then in about 30 seconds it came back to life.
Is ospf being restarted correctly in 1.2.9b after an interface/vlan ip change/add when changes are activated? Maybe a pause of a few seconds before restarting ospf could help.
Lonnie/Tony what do you think?

Mimbach

mimbach
07-30-2007, 01:35 PM
Running 1.2.9b, we set up a new vlan interface. We have to set all devices that run through that vlan interface to a max mtu of 1450. Is there a way to get around this?

Mimbach

DLNoah
07-30-2007, 02:21 PM
One of our employees has a PtP shot to his house, about 7 miles away from the office. We upgraded both ends of this link from X86-WRAP 1.1.13 2080 to 1.2.9b 2407 today, and after the upgrade, computers connected to the domain from behind the client for this link were not pingable from computers on the main domain, or from the access point, though they were reachable from the client (all with <1ms ping times). Also, when we initially did the upgrade, the client was not pingable from my computer until after I logged into the access point, pinged the client, ssh'd into the client via the shell mode, and then pinged my computer from the client. Only after that point could my computer ping the client. Neither problem replicated after downgrading to 1.1.13.

Configuration for both sides:
802.11a channel in the 5.3 range
Cloaking 2x
Super A/G
Distance 9.00
Transmit auto

1x CM9 in a V3-WRAP on both sides, wpci0 bridged to eth0

tony
07-30-2007, 04:07 PM
Are you using WDS? If you are, your clients will have to send traffic (any traffic not originating from the CPE itself) before the AP knows the CPE is WDS capable. Once this is done, then the traffic will flow.

DLNoah
07-30-2007, 04:39 PM
In a situation where we have a V3 CPE with the ether/wpci bridged together, should DHCP requests be sufficient to trigger the WDS, or will the client need to be able to pass other traffic?

Otherwise, what I understand your post to say is... if I have any traffic from behind the client device, then the AP will realize it needs WDS mode for the client and I will be able to see all the computers behind the client device?

tony
07-30-2007, 05:12 PM
"In a situation where we have a V3 CPE with the ether/wpci bridged together, should DHCP requests be sufficient to trigger the WDS, or will the client need to be able to pass other traffic?"

Yes, as long as the CPE itself isn't causing the DHCP requests (doesn't use WDS), then yes, it should do the trick.

"Otherwise, what I understand your post to say is... if I have any traffic from behind the client device, then the AP will realize it needs WDS mode for the client and I will be able to see all the computers behind the client device?"

That is correct. Until this is done, you will be able to talk with the CPE (as that does not use WDS), but communications to the systems behind the CPE can only occur once the AP realizes this is a WDS-capable CPE.
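
The behaviour described above can be pictured with a toy model. The Python sketch below is not StarV3 code; the class, method names, and MAC addresses are made up, and it only captures the rule as stated in this thread: the AP marks a CPE as WDS-capable once it sees traffic whose source is not the CPE itself.

```python
class ToyAccessPoint:
    """Toy model of an AP learning which CPEs are WDS-capable."""

    def __init__(self):
        self.wds_capable = set()

    def receive_from_cpe(self, cpe_mac: str, source_mac: str) -> None:
        # Traffic not originating from the CPE itself reveals that
        # there are hosts bridged behind it.
        if source_mac != cpe_mac:
            self.wds_capable.add(cpe_mac)

    def can_reach(self, cpe_mac: str, target_mac: str) -> bool:
        # The CPE itself is always reachable (it does not use WDS);
        # hosts behind it only once the CPE is known to be WDS-capable.
        if target_mac == cpe_mac:
            return True
        return cpe_mac in self.wds_capable


if __name__ == "__main__":
    ap = ToyAccessPoint()
    cpe, pc = "00:00:00:00:00:01", "00:00:00:00:00:02"  # hypothetical MACs
    print(ap.can_reach(cpe, pc))   # before the PC has sent anything
    ap.receive_from_cpe(cpe, pc)   # e.g. a DHCP request from the PC
    print(ap.can_reach(cpe, pc))   # after the AP has seen the PC's traffic
```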

DrLove73
07-30-2007, 06:05 PM
I tried to make starutil pull association reports from AP's. Is this currently possible, and if not, for what time/build is it planned?

tony
07-30-2007, 06:10 PM
No, association information will not be available via starutil, but it is available via SNMP.

jeff
07-31-2007, 04:50 AM
I have now seen the problem on 3 separate ap's. The symptoms are the same on each. The ap stops listening to the client. The client can see the ap using the site survey, but eventually the ap completely ignores it. All it takes is for the card to be reset and the link comes back up. At first I was rebooting the ap, but now I find I can just go to the wireless config, click ok without making any changes, and then apply changes, and it comes right back up. Each end is running exactly the same version (either 1.2.8 or 1.2.9) and the ap's are all wraps. Two of the clients are wars. I don't know if it is relevant, but one of the ap's has a 2.4 backhaul and a 5.8, and I just had to reset the 5.8 but not the 2.4. The radio card at each end of all the links is a cm9 running 2x cloaked.

Since there have been no suggestions and I can't seem to find any combination of settings that work I'm going to roll back to 1.1.11 and see if that fixes it.

I have 1.2.9 running on 3 aps multipoint to around 100 clients at 2.4ghz with no apparent problems. It only seems to affect the 5ghz p2p connections, and so far only if it is a wrap ap. So far the war ap's don't seem to be affected.

eoinok
07-31-2007, 05:02 AM
Hi guys, just want to report that we have deployed this on three x86 WRAPs, which are serving as AP's and one backhaul link. We are 2x cloaking at 5GHz and have seen no problems.
Our CPE's (about 25 in total) are osbridges and are working fine. We even noticed in some cases after the upgrade that we had a 2-3db gain in signal strength on the CPE's at the AP, so good work in rewriting the atheros drivers!

tony
07-31-2007, 07:54 AM
Jeff, I have sent you a PM.

oscarBravo
07-31-2007, 08:27 AM
"While we will continue to work on compatibility with 3rd party and legacy devices..."

On that note: I finally got to test, and WPA-PSK still doesn't work with Tranzeo CPQ clients.

To recap: we have 850-odd Tranzeos connected to WRAP V2 APs. Nothing would please me more than to upgrade the APs to V3 (which would mean I can replace some of the busier ones with WAR-2s or Metros), but WPA-PSK stopped talking to Tranzeos in 2.10.1b2, still didn't work in 2.11.0, and hasn't worked in any version of V3 that I've tested.

There was a release of the Tranzeo firmware that didn't work well with 2.10.0, but I was able to work with them to get it fixed and it's been behaving since. Is it possible to figure out what broke between 2.10.0 and 2.10.1b2, and never got fixed?

On another note, I've moved one point-to-point link to 1.2.8b, and I'm very pleased with the performance on what had been quite a marginal link up until then.

tony
07-31-2007, 08:33 AM
There is nothing wrong with the StarV3 WPA-PSK implementation (fully standards based); however, you may need to tweak the cipher suite used a little, depending on what the Tranzeo systems actually support.

You may also need to enable WPA-only (disable WPA2/RSN) in case that is confusing your clients.

I am pleased to hear the 1.2.8b release is working well.

DLNoah
07-31-2007, 09:03 AM
Tried upgrading the same link again, same problem. Before I upgraded, I remoted into a server behind the client to start a ping back to the main server, which is on my side of the AP.

Upgraded both sides to 1.2.9b at the same time, when they came back up the client shows * under A,Q,U,C,F,W,M but I cannot even ping the client from the AP. Client's ether0 is static assigned x.x.x.248, AP's ether0 is static assigned x.x.x.249.

If you would like to poke around, let me know and I can arrange it.

After downgrading the AP to 1.1.13, all traffic passed normally (client was still on 1.2.9b).

lonnie
07-31-2007, 09:08 AM
PM either myself or Tony with the login details.

mimbach
07-31-2007, 12:42 PM
Guys,
I have posted 4 separate issues and have not received a "you are doing it wrong" or "we are working on it".
If I am expecting too much, please let me know. I simply want to help in the beta process so I have a product that works for others and myself.
I am begging you (Lonnie and Tony) to please respond so I don't sit here questioning myself whether I have actually found a problem that causes my customers heartache.

1. snmp works great on the ether interfaces, and is hit and miss on the extended OIDs
2. ospf when adding ips/vlan ips to interfaces sometimes stops routing and doesn't start routing again until ospf has been restarted
3. mtu problem through vlans (I searched the forum but never found a definite answer)
4. I have an 802.11g ptp link that is 5 miles, -57dBm, 18mb data rate set, 100% quality.
This morning my latency on the link skyrocketed, with not much traffic or anything. So I rebooted both ends and changed the distance to 10 miles; latency was still at least 300ms, up to 8,000ms.
So I downgraded to 1.2.7b, with a latency of about 100-300ms. Next step was 1.1.13 2080; now my latency is under 2ms

Mimbach

tony
07-31-2007, 12:51 PM
We got your messages and are currently investigating the issues you are seeing.

#1. The OIDs are dynamic, and if you walk the tree too slowly, they may change. Recommended solution is to use the snmpbulk* programs to get the information you need.

#2. We have been unable to duplicate this issue, however we are still investigating.

#3. This is the first we've heard of this issue, and will have to look into it.

#4. While we have not been able to duplicate this problem, the upcoming 1.2.10b release may help.
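
For #1, a bulk walk can be scripted around the Net-SNMP `snmpbulkwalk` command. The Python sketch below is a minimal wrapper, assuming Net-SNMP is installed and SNMP v2c is enabled on the unit; the host, community string, and OID subtree are placeholders, and the `max_repetitions` default is an arbitrary choice rather than a recommended value.

```python
import subprocess


def build_bulkwalk_cmd(host, community, oid, max_repetitions=25):
    """Build the snmpbulkwalk argument list (SNMP v2c)."""
    return [
        "snmpbulkwalk",
        "-v", "2c",
        "-c", community,
        f"-Cr{max_repetitions}",  # fetch many rows per request
        host,
        oid,
    ]


def bulk_walk(host, community, oid, max_repetitions=25):
    """Run snmpbulkwalk and return its output lines."""
    result = subprocess.run(
        build_bulkwalk_cmd(host, community, oid, max_repetitions),
        capture_output=True, text=True, check=True, timeout=30,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # Placeholder target: documentation address and the generic
    # enterprises subtree; substitute your own unit and OID.
    print(" ".join(build_bulkwalk_cmd("192.0.2.1", "public", "1.3.6.1.4.1")))
```

Fetching many rows per request shortens the walk, which reduces the window in which dynamic OIDs can change underneath it.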

rbolduc
07-31-2007, 01:39 PM
"#4. While we have not been able to duplicate this problem, the upcoming 1.2.10b release may help."

And I was going to start playing with the 1.2.9 tonight.. I'll just wait ;)

oscarBravo
07-31-2007, 03:05 PM
"There is nothing wrong with the StarV3 WPA-PSK implementation (fully standards based), however you may need to tweak the cipher suite used a little, depending on what the Tranzeo systems actually support.

You may also need to enable WPA-only (disable WPA2/RSN) in case that is confusing your clients."

I really don't want to seem argumentative, but WPA stopped working sometime between 2.10.0 and 2.11.0. With identical configurations, one works and the other doesn't. I've tried every combination imaginable of configurations on both Tranzeo and StarV3, but they won't sync up.

I hope you can understand why I keep pushing this back to you. When a new Tranzeo firmware release came out and WPA stopped working, I didn't even raise it here; with identical configurations both ends and no change in the StarOS version, it was clear that the fault lay with Tranzeo. On that basis I kept the pressure on them until they fixed it.

Conversely, from one release to another, StarOS stopped working with the Tranzeos, and hasn't worked since. I can accept that you've rewritten the WPA code, and that it's standards-based, but I can't go to Tranzeo and ask them to solve this problem when I have a working/not working scenario, with the only point of difference being the software at the AP end.

I haven't really made an issue out of this, but hopefully you can see my problem: I can't realistically replace 850 Tranzeos with WAR-1s; I can't realistically remove WPA from them all and replace with easily-hacked WEP - which leaves me with an AP platform it's rapidly getting harder and harder to find spare parts for, because I can't upgrade to a software version that gives me a growth path. Which sucks, because I like WARs and I like v3.

tony
07-31-2007, 04:16 PM
For starters, StarOS and StarV3 have nothing in common regarding WPA, and comparing WPA support between the two is a moot point.

The implementation of WPA / WPA2 in StarV3 is fully compatible with StarOS V2, StarV3, Windows (every wireless vendor you can think of), OSX, PS3, PSP, and a handful of hardware CPEs. The Tranzeo systems are the only ones I have heard of that have a WPA issue.

I do not think there is something wrong with the WPA implementation in the Tranzeo units; however, many vendors tend to implement a subset of the WPA spec and require some tuning on the AP.

At a minimum, setting the AP to tkip-only, and WPAv1 should be enough to get most systems going.

bobbyc
07-31-2007, 05:36 PM
"4. On a 802.11g ptp link I have that is 5 miles, -57dbm, 18mb data rate set, 100% quality. this morning my latency on the link skyrocketed. Not much traffic or anything. So I rebooted both ends, changed the distance to 10 miles still latency at least 300ms up to 8,000ms. So I downgraded to 1.2.7b with a latency of about 100-300ms. Next step was 1.1.13 2080 now my latency is under 2ms"

I have a client whom I upgraded to 1.2.8b over the air. The client never rebooted properly, probably due to a 15VDC power supply. It's a WAR2 with a 50' outdoor shielded ethernet run with 1' stranded jumpers (properly terminated is the point I'm making).

The AP is build 2080 and SR9 x2 AP locked to 12. Client is locked to 12 as well.

I went out to the client today, since it never rebooted properly after the upgrade. All I had to do was unplug the power supply and plug it back in, and they were back online... but pings were in the 1000-4000ms range to the AP.
I rebooted and tried a few different rates, but to no avail. I backed it down to build 2080 from my laptop, and things were back to normal. I gave them a 24V power supply as well, and swapped their antenna and got the signal to -77.
Will be upgrading their radio back to 1.2.9b now that the signal is better, and the power supply is better. Will let you know if it suffers from same ping problem. Other 1.2.8b clients on the same AP didn't suffer from the high pings, but I downgraded them as well for the time being.
Bob C

lonnie
07-31-2007, 06:43 PM
Tony addressed #1.
#2 when you add or change an IP, simply restart OSPF as a work around.
#3 isn't bridging wonderful? A VLAN adds data, and if the rest of the systems cannot adjust MTU size to compensate, you have MTU issues. We allow you to make the MTU larger to handle it, but again all systems in the loop need to be able to handle the increased MTU size. Everything must agree on the size to use.
#4 with 18 mbps data rate you can overload the system and of course latency skyrockets. You have a great signal, so why not go auto and let it use a higher data rate?
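
The MTU point in #3 comes down to simple arithmetic: an 802.1Q VLAN tag adds 4 bytes to each frame. The Python sketch below shows the two ways of accommodating it, assuming a standard 1500-byte Ethernet MTU and one tag; the function names are illustrative, and whether a given device counts the tag against its configured MTU is something to verify per device.

```python
VLAN_TAG_BYTES = 4  # size of one 802.1Q tag


def mtu_needed_on_trunk(payload_mtu: int = 1500, vlan_tags: int = 1) -> int:
    """Link MTU needed so tagged frames still carry the full payload."""
    return payload_mtu + VLAN_TAG_BYTES * vlan_tags


def payload_mtu_available(link_mtu: int = 1500, vlan_tags: int = 1) -> int:
    """Payload MTU left if the link MTU cannot be raised."""
    return link_mtu - VLAN_TAG_BYTES * vlan_tags


if __name__ == "__main__":
    print(mtu_needed_on_trunk())    # raise the MTU everywhere in the path
    print(payload_mtu_available())  # or lower the MTU on the end devices
```

Either approach works only if every system in the path agrees on the resulting size.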


lonnie
07-31-2007, 07:03 PM
WPA stopped working for the Tranzeo units. It has always worked for our units, and we test against reference code from Atheros. If we work with it, then we are fine. If someone else does not work with us, then they are broken. I don't care if they fixed something and then we fixed something and broke them. We made a fix per Atheros recommendations and they did not. I will not break our software to make it work with another unit (not even sorry). They should fix their images. If you have that many units then they should be willing to work it out. You can't expect us to fix their issues. Do they support their older units or are they simply interested in new sales?

We are moving ahead and are now making our stuff work better with our units. The endless job of trying to compensate for all of the other stuff out there is not something we are willing to do. We will do our best to ensure that we remain true to our previous units.

Mark
08-02-2007, 01:39 PM
I, too, have experienced the "ap stops listening to clients" problem using a WRAP at one end and a WAR at the other. The WRAP is an ap (was a client installed long ago, and now we're relaying to another client down the road) and the WAR1 is a client. Both have been using 1.2.8b.

Here's what eventually brought the link stable. First, I started with SR9's on both ends, running X4 cloaking. The link was up, down, up, down for a while, and then stayed down until I rebooted the AP. At X4, it only occasionally passed pings. I even tried no cloaking, and it sort of worked in 11b mode, but was wildly unstable; that could have been noise, though.

At X2, the SR9's would pass pings for a while, then die. Reboot would fix it for an indeterminate period of time (anywhere from seconds to hours).

Finally, I replaced the sr9's with CM9's and instantly it linked.

Again, the link turned out to be up and down constantly. I went back to X2 in order to get any kind of stable link. X4 was up and down randomly. It always APPEARED to be linked, but pings would stop passing.

Since the CPE is also on a 2.4 GHz AP, I have run the channels as far apart as I can, and turned the power on the CM9 down to 11.

I also found that turning off "short preamble" seemed to change the behavior of the link. It associated quicker, was more stable, etc., when in 11b mode.

mimbach
08-02-2007, 01:49 PM
Guys,
I wanted to report that snmp in 1.2.9b is indeed working better. I have more consistent data than ever before for extended oids.
Also thank you for the updates on all the other questions I posted. I have been in the field all week working with new customers. So updates will come this weekend.
Thank you!

Mimbach

I plan on creating templates for this kind of thing for people to download for cacti.
Here is a link to my first signal, noise, and quality graph that I have been playing with in cacti:
http://www.rigidtech.com/staros/graphs/graph_image.php.png

Here is a link to how it used to work before the new 1.2.9b:
http://www.rigidtech.com/staros/graphs/graph_image.oldphp.png

Beebe
08-03-2007, 10:20 AM
v 1.2.9 has generally been excellent with my 2.4 GHz customer links. However, I just updated a 5.8 CM9 link to 1.2.9 and it was unusable. It's about a 6 mile link with Fresnel zone encroachment. I get about a -73 and I keep it locked in at 6 meg with x2 cloaking, and generally it's been very reliable. I get about 5-6 meg throughput consistently.

When I upgraded it, first of all it wouldn't link up until I turned off "Hide SSID". Then it linked up, but throughput was terrible. I fiddled for quite a while to try to improve it, but then I downgraded it to 1.1.13 and everything is back to normal.

Thanks,
Roger

tony
08-03-2007, 10:25 AM
The upcoming release has several improvements with the 2x cloaking, especially with regards to interoperability between the 1.1.x and 1.2.x releases.

lonnie
08-03-2007, 10:38 AM
I recommend that both ends be upgraded at the same time, by uploading to both, applying the firmware and rebooting both at the same time (remote one first). I also advise using auto rate with our new image. The true value comes when you are using the new managed mode.

Beebe
08-03-2007, 11:05 AM
Yes, both ends were upgraded, rate was set both on 6 meg and on auto. Nothing worked until I downgraded. It's a bit strange because all the links I've upgraded before have been improved, usually quite significantly.

I can only think that this link is working by some kind of voodoo magic and that any kind of change will knock it out.

Thanks,
Roger

Beebe
08-03-2007, 11:47 AM
Maybe I spoke too soon. It's messing up under 1.1.13 also. :( Whole network is having trouble. When it first links up I get great throughput, around 5-8 megs, but then when OSPF propagates and traffic starts to flow, it bogs right down. Seems like self interference, but it's always worked in the past.

tog
08-03-2007, 12:54 PM
If you happen to be using a country code and frequency where DFS would be enabled, keep in mind Hide SSID would need to be turned off at the AP in order to use 1.2.x.

Aside from that, it sounds like you may have a physical radio link problem to deal with.

Beebe
08-03-2007, 12:55 PM
Ok, here's what's happening. This link is carrying my main backhaul to feed the network back through my 3 meg bonded T1s.

If I lock it into 6 meg then actual throughput starts to bog down at around 1.5 meg even though a throughput test will show 6 meg when it's not in use.

If I lock it in to 24 meg then it works great, with ping times in the single digits, and it will carry all the traffic I need it to.

Only thing is, in v1.2.9 you can't lock it in to 24 meg and it falls back to a lower speed and falls over itself.

This is a situation where 24 meg actually works better than 6 meg, and falling back results in a worse link not a better one.

Signal levels are at about -70

Thanks,
Roger

tony
08-03-2007, 01:17 PM
If you are using 1.2.9 on both ends (not just one), then the rate will be between 24 and 36Mbps at a -70 signal.

tog
08-03-2007, 01:19 PM
When you had both sides using 1.2.9b and auto, what kinds of rates did it decide to use?

It sounds like you just have a lot of traffic to carry and transmit rates of 6mbit don't cut it. Your backhaul sounds like it's getting pretty busy; if possible you might want to try using no cloaking on that link to get more throughput, or add a second 2x cloaked link and balance your traffic over it.

I had poor performance with 1.2.9b transmitting towards 1.1.x, though tony has led me to believe that running 1.2.x on both ends and having the link in "managed" mode works much better currently.

Anyway, tony just said next release has lots of improvements where 2x cloaking is concerned so you should stick to 1.1.x forced to the best transmit rate you can manage until the next 1.2.x release becomes available.

Tony, did you plan to provide the ability to enable/disable managed links with the 1.2.x series? My thinking is that if auto still ended up working poorly, turning off the managed link part would probably still allow us to enjoy fully statically forced transmit rates (?).

I still have yet to ever see an auto-rate algorithm anywhere created by anybody that can do better than a static forced transmit rate in many situations on outdoor links that are less-than-perfect. Of course I'd love to see one that does work well.

Beebe
08-03-2007, 01:22 PM
It's quite possible I was doing something stupid like locking both ends at 6 meg or something. I'll try upgrading again with the next release and make sure I set it to 24 megs. Maybe that will work.

Thanks,
Roger

Beebe
08-03-2007, 01:44 PM
I decided to give it one more try since everyone is already mad at me anyway. On version 1.2.9 it shows a signal of -83 and locks in at 9 meg.

So I set it back to 1.1.13 and I'm back to -70 and 24 meg.

Thanks,
Roger

lonnie
08-03-2007, 05:47 PM
1.2.9 will show a low signal until you start to pass traffic.

jeff
08-04-2007, 11:23 AM
What is the formula for the interface index in the experimental mib? I have seen quite a few different values and they don't even seem to be in order: wpci0 on 11, wpci1 on 9, etc.

I'm trying to make some templates and would like some idea how the interface indexes are assigned.

tony
08-04-2007, 02:19 PM
The interface index vs. device name can be in any order, and can change from time to time during operation (for example, when you change wireless settings).

jeff
08-04-2007, 03:15 PM
Not to start an argument, but that seems like a remarkably bad choice. I can't think of any monitoring application I have ever used that periodically checks to see if the oid it is using is still valid. This probably explains Mimbach's snmp complaints from earlier this month. It seems to me that you order the interfaces wpci1..whatever. Why can't the interface indexes match this and be fixed, so that if I have 4 radio cards they index at 1..4? This is your mib and your code; why is it necessary for them to be arbitrarily numbered?

tony
08-04-2007, 03:52 PM
You are entitled to your views, however our snmp mibs follow the same device numbering as all other mibs in the system.

"I can't think of any monitoring application I have ever used that periodically checks to see if the oid it is using is still valid."

All monitoring applications read the interface lists to determine the interface index for each interface. When the snmp service has been updated (indicated by the uptime oid), the snmp monitoring software should 'refresh' itself with the new information. (or at least it should, if implemented properly).
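The refresh behaviour described above can be sketched in a few lines of Python. This is only an illustration and not StarOS or cacti code: the class name `IndexCache`, the helper `index_for`, and the plain dicts standing in for an SNMP walk of the interface list are all assumptions made for the example. The index values 11 and 9 are the ones reported earlier in this thread.

```python
# Hypothetical sketch: plain dicts stand in for the result of an SNMP walk
# of the interface list ({index: device name}). No real SNMP library is used.

def index_for(if_table, name):
    """Return the interface index whose name equals `name`, or None."""
    for index, descr in if_table.items():
        if descr == name:
            return index
    return None


class IndexCache:
    """Caches name -> index lookups and drops them when uptime goes backwards."""

    def __init__(self):
        self.last_uptime = None
        self.indexes = {}

    def lookup(self, name, uptime, if_table):
        # An uptime lower than the last one seen suggests the snmp service
        # restarted, so any cached index may now point at the wrong device.
        if self.last_uptime is None or uptime < self.last_uptime:
            self.indexes.clear()
        self.last_uptime = uptime
        if name not in self.indexes:
            self.indexes[name] = index_for(if_table, name)
        return self.indexes[name]


cache = IndexCache()
print(cache.lookup("wpci0", 5000, {9: "wpci1", 11: "wpci0"}))
```

A poller built this way would look the index up by device name before each read instead of hard-coding an oid, which is the approach that breaks when the index moves.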

The problem Mimbach was having is not related to your issue, and was resolved in the 1.2.9b release as stated in his follow up post here (http://forums.star-os.com/showpost.php?p=46147&postcount=44).

Our implementation is correct in its behaviour.

mimbach
08-04-2007, 06:00 PM
1. As of last night, the cacti graphs for the exp oids are broken again. No changes were made to any radios; it just stopped grabbing values. I will be digging into this further and posting detailed reasons.

2. I would like to explain how I use snmp on all my backhaul links. I monitor the signal strength every 5 minutes, and if it varies by more than 20% I get an alarm so I can deal with it before there is an outage. I have been unable to add alarming, and if what was said above is true I now understand why.
It seems Jeff is on the right track. I would like to see the "x" in ...1.2.X.2 correlated to the interface you are grabbing stats for. This cannot change. When trying to find a client on an AP, of course the order will change. However wpci1, wpci2, etc. should always have the same base oid.
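For anyone wanting to reproduce the alarm described in point 2, here is a minimal Python sketch. It assumes the 20% is measured relative to a baseline reading in dBm, which the post does not actually specify; the function name `signal_alarm` and the sample values are made up for illustration.

```python
def signal_alarm(baseline_dbm, current_dbm, threshold=0.20):
    """Return True when the signal moved more than `threshold` from baseline.

    Assumption: the variation is taken as a fraction of the baseline
    reading in dBm. Other definitions are possible.
    """
    if baseline_dbm == 0:
        raise ValueError("baseline must be non-zero")
    change = abs(current_dbm - baseline_dbm) / abs(baseline_dbm)
    return change > threshold


# Checked every 5 minutes against the last known-good reading.
print(signal_alarm(-70, -72))
print(signal_alarm(-70, -90))
```

The check only works if the value read on each poll really belongs to the same radio, which is why a stable base oid per interface matters here.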

I have to run, but I will do more digging on my snmp stuff this weekend and into next week and let you know what I find.

Thanks,
Mimbach

tony
08-04-2007, 06:58 PM
It looks like you guys are onto something. After looking at cacti, it appears as though they require static device indexes.

I will do some investigation, and see about doing a virtualized device index scheme for the custom mibs after I check out cacti's requirements.

jeff
08-04-2007, 07:03 PM
You are entitled to your views, however our snmp mibs follow the same device numbering as all other mibs in the system.

All monitoring applications read the interface lists to determine the interface index for each interface. When the snmp service has been updated (indicated by the uptime oid), the snmp monitoring software should 'refresh' itself with the new information. (or at least it should, if implemented properly).

Our implementation is correct in its behaviour.

This is not a correct statement. There are numerous indexed entities that appear in snmp implementations and the fields can change. There is no reason to expect that each poll of the device will return the connected clients in the same order etc. However, what you are doing is the equivalent of renumbering the ports on a managed switch each time you power it up. This would mean having to read a list of (index 1 = port 20), (index 2 = port 3), (index 3 = port 18), ..... every time the switch is reset and maintaining a mapping as to what that means. I can promise you that snmp is not implemented that way on any device I have used over the 15 years I have done this. If I plug a device into port 1 of a managed switch I can determine the oid for txErrors for that port and it will always be the oid for txErrors for that port.

I believe you are using the standard implementation for the ethernet interfaces. Reboot the device as many times as you want and tell me if the oid for ether0 changes. There will be an arbitrary number of blocks for each device present in the system, but the mapping of base index to physical device will remain the same unless hardware is added or removed.

We are talking about hardware plugged into the pci bus. Its position in an snmp walk should not change any more than the two ethernet ports should randomly change order. One reboot the left connector is ether0 the next reboot the right connector is ether0. There is snmp data that is inherently unordered and can be reported in any order you wish. Physical devices that are fundamental to the operation of the device are not such entities and should be ordered consistently across configuration changes.

My position stands. The implementation of the experimental mib is broken and does not conform to industry practice. Note I did not say the mib. It is the population of the data that is wrong.

This is not an attack. I am simply trying to help you create the best implementation you can. I assume you internally maintain a list of interfaces, since they are consistently reported in the user interface. If you simply tie the base index for the interface to the same numbering scheme rather than dynamically indexing the interfaces, everything will work as it should. This would raise the question of what to do when a card is disabled in the interface, and I would suggest leaving it in the list so that the other devices are not renumbered, but anything is an improvement over the tree I just walked, where wpci0 was index 11 and wpci1 was index 9 and no other index was in use.

jeff
08-04-2007, 07:06 PM
Tony,

You posted while I was writing mine. It is not a function of cacti. It is the difference between data that is allowed and expected to vary and that which should not vary. The description of the difference appears in my previous post.

Thank you for taking this seriously, as always I appreciate your hard work.

jeff
08-04-2007, 07:09 PM
I have also noticed that snmp does not bind to all interfaces and have run into the case where I have changed parameters and subsequently found that the ip address that was being used for snmp polls no longer responds, but one of the other ip addresses on the system does. There should be some consistent convention for where to find snmp data.

It really doesn't matter what you choose, it should just be consistent.

tony
08-04-2007, 08:13 PM
The experimental mib implementation is industry standard, and is comparable with the likes of Cisco. You have to keep in mind that the software you are using (cacti in this case) is designed for static systems that do not change during operation, such as a switch or master gateway server.

With StarV3, most devices are created / destroyed as needed based on the use of the system. If you modify something during operation, then yes, the snmp related information will mold itself to the changes. For example, ppp and vds interfaces come and go, as do wireless interfaces when you enable / disable, or reconfigure the interfaces.

With that said, I will implement a static index mechanism for pieces of software that cannot handle dynamic interfaces.

The proposed interface numbering scheme will be as such:
1000 = wpci0
1100 = wpci1
1200 = wpci2
1300 = wpci3
etc..
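The proposed numbering above works out to 1000 plus 100 per card. A small Python sketch of that mapping follows, for anyone preparing templates ahead of the change. The function name `static_index` and the constants are illustrative only, and what the numbers between each base value would be used for is not stated in this thread.

```python
import re

# Values taken from the proposed scheme: wpci0 = 1000, wpci1 = 1100, ...
BASE_INDEX = 1000
STEP = 100


def static_index(device):
    """Map a device name such as 'wpci2' to its proposed static index."""
    match = re.fullmatch(r"wpci(\d+)", device)
    if match is None:
        raise ValueError("not a wpci device name: %r" % device)
    return BASE_INDEX + STEP * int(match.group(1))


for name in ("wpci0", "wpci1", "wpci2", "wpci3"):
    print(name, static_index(name))
```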

Thank you for bringing up the IP binding issue, I will look into it.

jeff
08-04-2007, 08:38 PM
This statement I can agree with. The difference in my mind continues to be physical vs virtual interfaces. I have never dealt with a system where the physical interfaces re-order arbitrarily. Normally physical interfaces are initialized in a fixed order and do not change as long as the hardware remains the same. I think your proposed solution would work quite well and look forward to seeing it implemented.

p.s. I don't use cacti. We use a combination of custom scripts and zabbix. I know from experience that the current implementation would also make mrtg and several other solutions painful. Also from experience, Cisco and HP will produce consistent indexes for hardware interfaces plugged into a given slot even as virtual interfaces are added. Someone may want to chime in on how MikroTik, etc. handle this. Didn't like them enough to ever integrate them into our infrastructure.

tony
08-05-2007, 12:07 AM
Once the device index changes are made, everything should run lickety-split.