View Full Version : OSPF Over Wireless Interfaces
bradg
01-11-2005, 02:57 PM
OK, I've been tinkering with OSPF over wireless interfaces for a couple of weeks now, and have a configuration that *works* but I'm sure isn't optimal. There are some issues I had and still have with it that puzzle me, and from what I read, there are at least three other people trying to do OSPF over wireless who are having similar experiences.
Hardware setup is three WRAP boards running Star-OS v2.01.1 build 4590 (latest).
Visually, the configuration is as such:
http://www.mnns.com/~bradg/star-ospf.gif
The PRISM cards are to be used as internal AP's eventually - they are currently unconfigured.
The Atheros cards are 5GHz PtP shots with qualities of 27 or more - the link qualities aren't an issue here.
Initially I set the units up with all Atheros advanced options enabled, no encryption, Hide SSID enabled, and with a dead simple OSPF configuration, and it worked - converged almost immediately. It needed no interfaces set to non-broadcast or any neighbor specifications. I then disabled inter-BSS relay as well - all continued to work.
Then, I set to work encrypting the wireless links WPA-AES, and setting up a couple other standard configuration items.
Last week, I went to reconfigure my OSPF backbone to area 0.0.0.0, and when I was done (I wasn't watching the network while I was working on it) - things would not reconverge.
The Star-OS WRAP connected to the ethernet segment established and exchanged OSPF links to the other directly connected gear and got a full set of routes. However, nothing would propagate over the wireless links at all. I could hop to each one from the other, but no full routing.
I fought with this for some time, and nothing seemed to cure it - I turned off the advanced options, un-hid the SSID and allowed inter-BSS relaying thinking (well, knowing actually) it was broadcast related, and so on. I even set the interfaces to non-broadcast, and specified neighbors in the config, and it still didn't converge.
After I specified different timer intervals, and waited several minutes, it suddenly reconverged and has been working ever since.
Below is the configuration I'm currently using and it does work, but convergence time is not nearly what it should be (especially on a broadcast network).
Current configuration:
!
hostname hr-main
password 1234
!
!
!
interface lo
!
interface tunl0
!
interface gre0
!
interface eth0
!
interface wpci0
ip ospf network non-broadcast
ip ospf dead-interval 90
!
interface wpci1
ip ospf network non-broadcast
ip ospf dead-interval 90
!
interface ecb
!
interface ipacct
!
interface beacon
!
interface wlanbr
!
interface cbq
!
router ospf
ospf router-id 10.255.255.6
redistribute kernel
redistribute connected
redistribute static
network x.x.x.0/24 area 0.0.0.0
neighbor x.x.x.246 poll-interval 90
neighbor x.x.x.250 poll-interval 90
!
access-list vtylist permit 127.0.0.1/32
access-list vtylist deny any
!
line vty
access-class vtylist
!
end
Here's what puzzles me - why did it work initially with no special configuration or interface broadcast changes? It could be that I just didn't give it enough time, but I'm hesitant to screw with it again for right now since it is working.
Please, post your experiences and observations here related to OSPF so collectively we might be able to figure out just exactly what issues there are, and what configurations and options, and even workarounds work reliably. As I said in an earlier post, I think it's very important to get OSPF working correctly (configuration or otherwise) in Star-OS over wireless links. IMHO it's a much more flexible and modern autonomous routing protocol than RIP v2 will ever be.
Brad
phendry
01-18-2005, 05:42 AM
I had similar issues. This was due to a combination of broadcast and MTU related issues. After this experience we decided OSPF was too unstable at present as our IGP, so we now use static routes across the core, and edge devices talk BGP across vds tunnels. It seems to work OK and adds a layer of protection, but it would be nice to use a dynamic routing protocol for our internal routes.
When you had the issues did you try running debug? Did the neighbors establish?
oscarBravo
01-18-2005, 01:20 PM
We're having problems here also - I'm working with bminish on his network.
A couple of symptoms seemed significant: first, I regularly see ospf looking disabled (not highlighted) on the main screen. When I select Routing > Advanced, OSPF shows a status of "stopped" for a second, then switches to "started". (I see a similar report here (http://forums.star-os.com/viewtopic.php?t=3780).) Second, the gateway router - the only one with a default gateway - seems to have trouble connecting with its neighbours.
I've set the wpci interfaces to "non-broadcast" and specified neighbours. Everything else seems pretty much vanilla. It should Just Work. But it doesn't: we've had to revert to static routes also.
We need OSPF to work. This network is going to get a lot bigger, with a lot of redundant routes, and static routing isn't going to cut it. RIP doesn't look like it will scale to the level we require.
phendry
01-18-2005, 01:42 PM
Second, the gateway router - the only one with a default gateway - seems to have trouble connecting with its neighbours.
Is your gateway router a Cisco or does it peer with a Cisco? I had this too. When I debugged the OSPF packets I saw the neighbors fail because the MTU sizes on the hello packets were different and the only way around it was to change the MTU size on the Cisco router.
bminish
01-18-2005, 01:59 PM
Is your gateway router a Cisco or does it peer with a Cisco?
No, it's the staros router at the network edge.
Paul didn't mention but all staros routers here are running version 2.01.1 4590
All radio cards involved in the intranode links are Atheros based CM9's and all hardware is WRAP
.Brendan
wwalcher
01-18-2005, 03:29 PM
RIP doesn't look like it will scale to the level we require.
Why does RIP not scale in the same way as OSPF?
bkehoe
01-18-2005, 03:45 PM
We also tried OSPF across a network of approx 30 nodes and had to stop using it within a few days as it was too unstable - as others have said, the process was just stopping (it seemed to happen after an activate changes was selected, but sometimes randomly). A functional, stable OSPF is something I'd love to have, but for the moment I have to make do with static routes and some RIP. RIP isn't suitable for long term plans, though: I try to keep the number of hops in the network to a minimum, but I'm heading towards its limits. :(
bminish
01-18-2005, 03:47 PM
RIP doesn't look like it will scale to the level we require.
Why does RIP not scale in the same way as OSPF?
RIP has a 15 hop limit.
It's slower to converge than OSPF (that is, if OSPF is actually working..)
RIP does not know about link costs and network latency; this is important in a wireless network setting where not all links will offer the same throughput and link quality.
RIP networks are flat.
The network we are building will (we hope) end up quite large and complex, and it will also have a fair bit of redundancy. RIP is probably fine on smaller networks but we will eventually be dealing with hundreds of nodes.
.Brendan
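To put a number on the hop limit mentioned above, here is a minimal Python sketch. It relies on the fact that RIP treats a metric of 16 as "infinity" (unreachable); the function names are made up for illustration and have nothing to do with StarOS itself.

```python
RIP_INFINITY = 16  # RIP treats a metric of 16 as unreachable


def rip_metric(hops):
    """Clamp a hop count to the RIP metric range (16 means unreachable)."""
    if hops < 0:
        raise ValueError("hop count cannot be negative")
    return min(hops, RIP_INFINITY)


def rip_reachable(hops):
    """Return True if a destination this many hops away is usable under RIP."""
    return rip_metric(hops) < RIP_INFINITY


# A destination 15 hops away is the furthest RIP can still reach.
print(rip_reachable(15))
print(rip_reachable(16))
```

So on a network that grows past 15 router hops end to end, the far edge simply drops out of the RIP routing table, which is the scaling concern raised in this post.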
wwalcher
01-19-2005, 12:54 PM
ip ospf dead-interval 90
neighbor x.x.x.250 poll-interval 90
Brad
What do the "dead-interval 90" and the "poll-interval 90" settings do?
oscarBravo
01-19-2005, 07:42 PM
What do the "dead-interval 90" and the "poll-interval 90" settings do?
Those settings are key to the timing of how OSPF recalculates routes. The hello-interval setting determines how often a router sends out "hello" messages to its neighbours - either unicast to those routers explicitly mentioned in "neighbor" clauses, or multicast on broadcast segments such as Ethernet. The poll-interval on a "neighbor" clause is the slower rate at which hellos are still sent to a non-broadcast neighbour that is currently down. The dead-interval setting determines how long a router waits without hearing a hello before it considers a neighbour to have died, and recalculates its routing graph.
The default settings are 10 for hello-interval, and 40 for dead-interval. If you "show ip ospf neighbors" you can see this in action: normally the "dead time" counts down from 0:40, then when it hits 0:30 it reverts to 0:40 as a "hello" message arrives from the neighbour.
bminish
01-19-2005, 08:10 PM
having dead-interval the same or smaller than the poll-interval is probably not a good idea
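A quick way to sanity-check a pair of timers before putting them in a config is a few lines of Python. This is only a sketch: it assumes the interval being compared against the dead-interval is the one at which hellos are actually sent, and the minimum ratio of 2 is an illustrative threshold of my own, not a protocol rule or StarOS behaviour.

```python
def check_timers(hello_interval, dead_interval, min_ratio=2):
    """Return a list of warnings for an OSPF hello/dead timer pair.

    min_ratio is an illustrative threshold (an assumption, not a protocol
    rule): the dead interval should cover at least this many hellos so that
    a single lost packet does not tear the adjacency down.
    """
    warnings = []
    if hello_interval <= 0 or dead_interval <= 0:
        warnings.append("intervals must be positive")
        return warnings
    if dead_interval <= hello_interval:
        warnings.append("dead interval is not longer than the hello interval")
    elif dead_interval < min_ratio * hello_interval:
        warnings.append("dead interval covers fewer than %d hellos" % min_ratio)
    return warnings


print(check_timers(10, 40))  # the 10/40 defaults quoted in this thread
print(check_timers(90, 90))  # equal intervals, the case warned about above
```

The 10/40 defaults pass cleanly, while a 90/90 pair is flagged because a neighbour could be declared dead before the next hello is even due.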
bairdc
01-20-2005, 12:41 PM
Does anyone know if these OSPF issues appeared in just the recent versions of StarOS? The reason I ask is that I've had OSPF running for several months on a small segment of my network, and I haven't noticed any trouble. I did have to set the wireless interfaces as non-broadcast, and I had to specify the neighbors, but I don't think I've had any trouble with OSPF crashing. The boxes in question, however are all running older versions:
2.0.0.2 build 4346
1.12.4 build 3404
1.12.6 build 3494
1.13.2 build 3790
I know that the version of Zebra was changed in the latest version of StarOS, and there are some people who have had trouble with it, so I'm just wondering if maybe the key is to use an older version of StarOS until Tony and Lonnie can fix it.
Craig
bminish
01-20-2005, 02:01 PM
Does anyone know if these OSPF issues appeared in just the recent versions of StarOS?
We are only running the latest version, 2.01.1 build 4590, and we now have, it seems, a stable network that is only using OSPF.
Like you, we are declaring wireless interfaces non-broadcast and specifying neighbours. However, it appears that OSPF broadcast may work OK between CM9 radios (using 802.11a anyway), although we are only trying that on 4 test nodes so far to see what will happen.
Broadcast doesn't seem to work over Prism 2.5 links or over Prism 2.5 to Atheros CM9.
Another possible issue with this version seems to be that the default setting redistributes the kernel routing table, which is generated by OSPF in the first place; turning this off seems to be a requirement for stable configurations.
We suspect that this default behaviour can create a feedback situation when it is on, since OSPF updates the kernel routing table, which updates other nodes via OSPF, which updates the kernel routing table ..
I'm just wondering if maybe the key is to use an older version of StarOS until Tony and Lonnie can fix it.
Craig
Well how are they going to fix it if we don't go out and break it in the current version :D
Actually it may not need that much fixing, just documentation
OscarBravo (Paul) will soon be posting a writeup on how we have it working on this network. We now have a fully OSPF network working exactly as it should across 10 nodes, 19 radio links and some Ethernet segments. We also have some redundancy, so being able to route based on cost is nice.
There is an interface issue that seems to be responsible for OSPF stopping and restarting, but I think it's only going to bite you when you are actually setting it up for the first time on a node.
oscarBravo
01-21-2005, 03:06 AM
OK, here's the skeleton of an operational OSPF configuration as we have it working on the WestNet network:
Current configuration:
!
hostname rta
password ****
...some interfaces omitted...
interface eth0
!
interface eth1
!
interface wpci0
ip ospf network non-broadcast [1]
ip ospf cost 500 [2]
!
interface wpci1
ip ospf network non-broadcast
ip ospf cost 200
...some more interfaces omitted...
router ospf
ospf router-id 10.30.19.13 [3]
network 10.30.2.0/24 area 0.0.0.0
network 10.30.19.12/30 area 0.0.0.0
network 10.30.20.0/24 area 0.0.0.0
neighbor 10.30.2.1 [4]
neighbor 10.30.20.10
!
access-list vtylist permit 127.0.0.1/32
access-list vtylist deny any
!
line vty
access-class vtylist
!
end
Some notes:
[1] The wireless interfaces are set to non-broadcast. As Brendan mentioned, we have had some success with CM9-CM9 broadcasting, but setting a wireless link to non-broadcast and explicitly specifying neighbours always works.
[2] We're still experimenting with link costs. OSPF generally calculates a cost as 100M divided by the interface speed, so 10M Ethernet has a cost of 10. The costs need to be specified on the radio links to approximate the bandwidth available.
[3] The router ID is the address of the first Ethernet port, for consistency. It actually doesn't seem to matter what the router ID is, but this makes them easy to recognise in the OSPF output.
[4] This is where we specify the neighbours for the wireless links. The wired links are broadcast-capable, so we don't bother specifying them.
Notice that there are no "redistribute-" clauses in here at all. We found that leaving them in made each router an ASBR, which is simply not the case. The only exception is the gateway StarOS router, which has a default route; all it needs is "default-information originate" to propagate the default route throughout the network.
This seems to work, and it seems to be stable. It's a minimal configuration, it generates the minimum number of kernel routes needed to make the network function, and it propagates link state updates (like changed link costs) almost instantly.
Obviously this isn't a definitive configuration. It's working for us in a small network with no BGP or any such complications (yet). Comments and feedback welcome.
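For anyone auditing a batch of routers for leftover "redistribute" clauses, a small Python sketch along these lines may help. It assumes you have saved the text of each running config to a string or file yourself; the function name is hypothetical, and the sample text is simply the router ospf section from the first post in this thread.

```python
def redistribute_lines(config_text):
    """Return the 'redistribute ...' statements found in a router config."""
    found = []
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("redistribute "):
            found.append(stripped)
    return found


# The router ospf section from the first post in this thread.
sample = """\
router ospf
 ospf router-id 10.255.255.6
 redistribute kernel
 redistribute connected
 redistribute static
 network x.x.x.0/24 area 0.0.0.0
"""

for statement in redistribute_lines(sample):
    print("found:", statement)
```

An empty result for every router except the gateway would match the minimal configuration described in this post.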
bminish
01-21-2005, 03:32 AM
[1] The wireless interfaces are set to non-broadcast. As Brendan mentioned, we have had some success with CM9-CM9 broadcasting, but setting a wireless link to non-broadcast and explicitly specifying neighbours always works.
I did see one of the test nodes which is doing Prism 2.5 to atheros CM9 that would not come up until the 'use Fw 1.7.4' option was checked but other than that it appears to be working nicely
.Brendan
bminish
01-21-2005, 12:25 PM
[2] We're still experimenting with link costs. OSPF generally calculates a cost as 100M divided by the interface speed, so 10M Ethernet has a cost of 10. The costs need to be specified on the radio links to approximate the bandwidth available.
You can also pick different 'cost base' if you like.
In the above example we are costing 1G = 1
so 10 = 100M
100 = 10M
500 = 2M
1000 = 1M
.Brendan
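The cost arithmetic discussed in this post can be sketched in a few lines of Python. The function name is made up for illustration, and flooring the result with a minimum cost of 1 is how I understand OSPF costs to be derived rather than something confirmed for StarOS; treat it as a rough calculator, not a description of what the router does internally.

```python
def ospf_cost(link_mbps, reference_mbps=100):
    """Cost = reference bandwidth / link bandwidth, floored, minimum 1."""
    if link_mbps <= 0:
        raise ValueError("link speed must be positive")
    return max(1, int(reference_mbps // link_mbps))


# Default base quoted in the thread: 100M reference, 10M Ethernet.
print("10M on 100M base:", ospf_cost(10))

# Base used on this network: 1G = 1.
for mbps in (100, 10, 2, 1):
    print("%dM on 1G base:" % mbps, ospf_cost(mbps, reference_mbps=1000))
```

By the same arithmetic, the cost of 200 on wpci1 in the example config corresponds to roughly 5M of usable bandwidth on a 1G base.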
Steve
01-22-2005, 07:20 AM
Thanks for taking the time to document and post this info. How long have you had this config running without seeing any issues?
bminish
01-22-2005, 10:38 AM
Thanks for taking the time to document and post this info. How long have you had this config running without seeing any issues?
Paul did all the hard work. We have had it running for only a few days, but at this stage I would be reasonably happy to deploy OSPF on a well monitored network since we haven't seen any issues with OSPF that were not down to configuration problems.
This doesn't mean that there aren't issues, just that we haven't seen any yet :D
I would like to see it running for a good while longer if I was going to deploy it on a network I cannot easily monitor and it would certainly be helpful if anyone else finds issues that they be discussed in these forums.
.brendan
lonnie
01-22-2005, 10:52 AM
Brendan - thanks for your help on this. You have spent a lot of time and the results are great. Please check your account for a token of our appreciation.
phendry
01-22-2005, 12:04 PM
Have you tested adding new elements? One of the problems we were seeing was poor route propagation. Also, did you test dropping a link to make sure that routes get removed correctly? I think you're right in that most problems were because of broadcasts on wireless links but still not too happy to deploy it on an active network.
oscarBravo
01-22-2005, 12:09 PM
Yes, we've done some work on adding and removing routers and links, and in the current iteration - which now has an extra stub area, with consolidated routing, in addition to the backbone - it seems to work well. Dropped links disappear after the 40-second dead interval, and new links appear within seconds.
oscarBravo
01-22-2005, 12:31 PM
A followup to the above: it's better than I thought. I did some quick testing (using ping -R) to measure the downtime caused by a broken link, and it was about 15 seconds before an alternative route was calculated. Once the optimal route was re-established (within a minute or so of the link being reconnected) it was resumed seamlessly.
bminish
01-22-2005, 01:03 PM
Have you tested adding new elements?
Yes, and it works OK, although it is sometimes necessary to stop and restart OSPF on the machine you are configuring once you are finished configuring it.
One of the problems we were seeing was poor route propagation.
Also, did you test dropping a link to make sure that routes get removed correctly?
We are having no problems with this. I have had a couple of nodes going up and down due to power issues and have been disabling various test nodes and redundant routes.
I think one of the problems that people are seeing with route propagation could be down to the default configuration behaviour of 'redistribute kernel', which redistributes the kernel routing table.
Disabling this (and any other un-needed redistributes) certainly seems to clear up the problems with route distribution.
I think you're right in that most problems were because of broadcasts on wireless links but still not too happy to deploy it on an active network.
The broadcast stuff seems to work over CM9 to CM9 links, but since specifying non-broadcast and defining the neighbours is no real hardship and definitely works over Prism links too, it's probably the best way to go. Our Ethernet links are broadcast though.
.brendan
wwalcher
01-23-2005, 09:28 PM
I echo Lonnie in thanking you for this information. I am just completing an upgrade to my wireless network. It is comparable in size to yours. I have been having fits with OSPF, with it going up and down. I would make a change on one router and then a router up the line would lose its routes. It got so frustrating that last night I shut it off and switched over to static routes.
So, maybe my problem was that I was redistributing the kernel and it was getting in a loop.
Your post gives me hope that, once I recover from the last few days, I can try OSPF again. It would certainly be beneficial if I could get it to work.
Exactly the same experience for me, removing the redistributes seemed to have got it working. Many Thanks
I am continuing to see problems with soekris connections though, I don't know what it is, but wireless cards on wlan interfaces only seem to show up as being on wpcm ones. I did report this a while back.
If I do a "show ip ospf int" to show all interfaces, the wpcm interfaces are shown as :
wpcm0 is up, line protocol is up
Internet Address 192.168.4.2/24, Area 0.0.0.2
Router ID 2.0.0.5, Network Type NBMA, Cost: 10
Transmit Delay is 1 sec, State Waiting, Priority 1
No designated router on this network
No backup designated router on this network
Timer intervals configured, Hello 10, Dead 40, Wait 40, Retransmit 5
Hello due in 00:00:05
Neighbor Count is 1, Adjacent neighbor count is 0
and the wlan interfaces are shown as
wlanx is down, line protocol is down
ospf is not enabled on this interface
It is as though the wlan/wpcm interfaces for the pcmcia card are working for wpcm but not wlan
bminish
01-24-2005, 11:57 AM
Exactly the same experience for me, removing the redistributes seemed to have got it working. Many Thanks
I am continuing to see problems with soekris connections though, I don't know what it is, but wireless cards on wlan interfaces only seem to show up as being on wpcm ones. I did report this a while back.
I don't have any soekris boards to try out
what does sh running-config look like on the soekris setup?
do you have the wpcm interfaces listed?
I suppose both ends are in the same area (2)
.Brendan
I don't have any soekris boards to try out
what does sh running-config look like on the soekris setup?
do you have the wpcm interfaces listed?
I suppose both ends are in the same area (2)
.Brendan
It looks just the same as the wrap except that on the soekris the pcmcia cards have 2 interfaces listed wlanx & wpcmx, whereas the wraps only have a single wpcix.
No matter what I do I cannot get the wlan interfaces to show OSPF enabled, so I suspect this is why soekri won't do ospf.
both ends are in the same area, and all boards have the 4590 staros release installed
oscarBravo
01-25-2005, 07:43 AM
I assume the snippet you posted earlier was copied from the running router?
wpcm0 is up, line protocol is up
Internet Address 192.168.4.2/24, Area 0.0.0.2
Your OSPF configuration assigns networks to areas, not interfaces. The fact that this interface shows as being in this area implies that you have a statement like "network 192.168.4.0/24 area 2" in your config.
The info I've quoted above indicates that OSPF has associated the address 192.168.4.2 with interface wpcm0. Does it hugely matter what the interface is called? For example, in our experiments with WRAP boards, what StarOS calls wpci1 shows in OSPF as wpci0, but it doesn't change the functionality.
The only real problem I see in your post is:
Neighbor Count is 1, Adjacent neighbor count is 0
For some reason, the router has recognised that it has a neighbour, but isn't establishing adjacency. What does "show ip ospf neighbors" tell you?
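The point about networks, not interfaces, being assigned to areas can be illustrated with a short Python sketch using the standard ipaddress module. The function name is hypothetical, the address 10.0.0.1 is made up to show the "no match" case, and picking the most specific matching prefix is an assumption made for the sketch rather than something confirmed about how StarOS resolves overlaps.

```python
import ipaddress


def area_for_address(address, network_statements):
    """Return the area whose network statement covers address, or None."""
    ip = ipaddress.ip_address(address)
    matches = []
    for prefix, area in network_statements:
        network = ipaddress.ip_network(prefix)
        if ip in network:
            matches.append((network, area))
    if not matches:
        return None
    # Prefer the most specific prefix (tie-breaking assumed for this sketch).
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# The network statement implied by the quoted output above.
statements = [("192.168.4.0/24", "0.0.0.2")]

print(area_for_address("192.168.4.2", statements))
# A made-up address that no network statement covers.
print(area_for_address("10.0.0.1", statements))
```

An interface whose address returns None here is one that OSPF would report as "not enabled on this interface", whatever the interface happens to be called.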
I assume the snippet you posted earlier was copied from the running router? Your OSPF configuration assigns networks to areas, not interfaces. The fact that this interface shows as being in this area implies that you have a statement like "network 192.168.4.0/24 area 2" in your config.
Yes, my config has network 192.168.4.0/24 area 2
The info I've quoted above indicates that OSPF has associated the address 192.168.4.2 with interface wpcm0. Does it hugely matter what the interface is called? For example, in our experiments with WRAP boards, what StarOS calls wpci1 shows in OSPF as wpci0, but it doesn't change the functionality.
The only real problem I see in your post is: For some reason, the router has recognised that it has a neighbour, but isn't establishing adjacency. What does "show ip ospf neighbors" tell you?
There is a neighbor statement in the config, if I do a show ip ospf neigh, it shows NO neighbors. On soekri with ethernet interfaces, ospf works fine. It is only on soekri wlan/wpcm interfaces that nothing works.
I understand what you are saying about the star-os confusion on interface numbering, but there is consistency within ospf, a wpci0 interface shows as a wpci0 neighbor etc. this is something very different, with there being two interface names for the same appliance.
oscarBravo
01-25-2005, 08:50 AM
I'd like to get more information to help you sort this out. I know I was convinced a while ago that there must be something b0rken in the StarOS implementation of OSPF (sorry Lonnie!) but persistence - and an expanded understanding of the OSPF protocol - got it working.
Would you care to post your running config from both routers (the ones that should be neighbours), as well as the output of "sh ip os ne" and "sh ip os da" from both routers?
I've attached the info for a soekris, it is functioning as a relay with two orinoco ruby cards.
I have attached (in order)
sh run
sh ip ospf data
sh ip ospf neigh
sh ip ospf route
sh ip ospf int
show run
=======
hostname walpole
password xxxx
!
!
!
interface eth0
description spare - ip 192.168.1.1/24
!
interface lo
!
interface tunl0
!
interface gre0
!
interface eth1
description spare - ip 192.168.2.1/24
!
interface ecb
!
interface ipacct
!
interface beacon
!
interface wlanbr
!
interface wpcm0
description backlink - ip 192.168.4.1/24
ip ospf network non-broadcast
!
interface wpcm1
description ap - ip 62.72.162.193
ip ospf network non-broadcast
!
interface cbq
!
interface wlan0
description backlink - ip 192.168.4.2/24
ip ospf network point-to-point
!
interface wlan1
description ap . ip 62.72.162.193/24
ip ospf network point-to-point
!
router ospf
ospf router-id 2.0.0.5
passive-interface eth0
passive-interface eth1
network 62.72.162.192/28 area 0.0.0.2
network 192.168.4.0/24 area 0.0.0.2
neighbor 62.72.162.194
neighbor 62.72.162.195
neighbor 192.168.4.1
!
access-list vtylist permit 127.0.0.1/32
access-list vtylist deny any
!
line vty
access-class vtylist
!
end
sh ip ospf data
===========
OSPF Router with ID (2.0.0.5)
Router Link States (Area 0.0.0.2)
Link ID ADV Router Age Seq# CkSum Link count
2.0.0.5 2.0.0.5 1597 0x80000010 0xdaec 2
AS External Link States
Link ID ADV Router Age Seq# CkSum Route
0.0.0.0 2.0.0.5 874 0x8000000f 0x6dd7 E2 0.0.0.0/0 [0x0]
show ip ospf neigh
==============
Neighbor ID Pri State Dead Time Address Interface RXmtL RqstL DBsmL
sh ip ospf route
============
============ OSPF network routing table ============
N 62.72.162.192/28 [10] area: 0.0.0.2
directly attached to wpcm1
N 192.168.4.0/24 [10] area: 0.0.0.2
directly attached to wpcm0
============ OSPF router routing table =============
============ OSPF external routing table ===========
sh ip ospf int
==========
eth0 is up, line protocol is up
OSPF not enabled on this interface
lo is up, line protocol is up
OSPF not enabled on this interface
tunl0 is down, line protocol is down
OSPF not enabled on this interface
gre0 is down, line protocol is down
OSPF not enabled on this interface
eth1 is up, line protocol is up
OSPF not enabled on this interface
ecb is down, line protocol is down
OSPF not enabled on this interface
ipacct is down, line protocol is down
OSPF not enabled on this interface
beacon is down, line protocol is down
OSPF not enabled on this interface
wlanbr is down, line protocol is down
OSPF not enabled on this interface
wpcm0 is up, line protocol is up
Internet Address 192.168.4.2/24, Area 0.0.0.2
Router ID 2.0.0.5, Network Type NBMA, Cost: 10
Transmit Delay is 1 sec, State DR, Priority 1
Designated Router (ID) 2.0.0.5, Interface Address 192.168.4.2
No backup designated router on this network
Timer intervals configured, Hello 10, Dead 40, Wait 40, Retransmit 5
Hello due in 00:00:03
Neighbor Count is 0, Adjacent neighbor count is 0
wpcm1 is up, line protocol is up
Internet Address 62.72.162.193/28, Area 0.0.0.2
Router ID 2.0.0.5, Network Type NBMA, Cost: 10
Transmit Delay is 1 sec, State DR, Priority 1
Designated Router (ID) 2.0.0.5, Interface Address 62.72.162.193
No backup designated router on this network
Timer intervals configured, Hello 10, Dead 40, Wait 40, Retransmit 5
Hello due in 00:00:03
Neighbor Count is 0, Adjacent neighbor count is 0
cbq is down, line protocol is down
OSPF not enabled on this interface
wlan0 is down, line protocol is down
OSPF not enabled on this interface
wlan1 is down, line protocol is down
OSPF not enabled on this interface
oscarBravo
01-25-2005, 10:24 AM
One change I notice from before: the neighbor count shows as zero on both interfaces. Is OSPF running on (say) 192.168.4.1? Can you post the same information from it?
A couple of questions about the configuration - why do you have "ip ospf network point-to-point" specified on wlan0 and wlan1 (not that it affects anything, since OSPF considers those interfaces to be down)? Is there a reason why these routers are in area 2? Do you have an area 0, and is there a router configured as an ABR?
It works
The 192.168.4.1 end did not have a network statement, once I put it in, the neighbor came up right away with sh ip ospf neigh showing it as wpcm0.
One change I notice from before: the neighbor count shows as zero on both interfaces. Is OSPF running on (say) 192.168.4.1? Can you post the same information from it?
I've listed the "sh ip ospf neigh" & "sh ip ospf route" info, all the rest is the same.
Neighbor ID Pri State Dead Time Address Interface RXmtL RqstL DBsmL
2.0.0.1 1 Full/Backup 00:00:39 62.72.160.81 eth1:62.72.160.82 0 0 0
2.0.0.4 1 Init/DROther 00:00:40 62.72.162.250 wpci0:62.72.162.249 0 0 0
2.0.0.5 1 Full/Backup 00:00:32 192.168.4.2 wpci1:192.168.4.1 0 0 0
============ OSPF network routing table ============
N IA 62.72.160.32/27 [30] area: 0.0.0.2
via 62.72.160.81, eth1
N IA 62.72.160.72/30 [20] area: 0.0.0.2
via 62.72.160.81, eth1
N 62.72.160.80/30 [10] area: 0.0.0.2
directly attached to eth1
N 62.72.162.0/25 [30] area: 0.0.0.2
via 62.72.160.81, eth1
N 62.72.162.128/30 [20] area: 0.0.0.2
via 62.72.160.81, eth1
N 62.72.162.192/28 [20] area: 0.0.0.2
via 192.168.4.2, wpci1
N 62.72.162.248/30 [10] area: 0.0.0.2
directly attached to wpci0
N 192.168.4.0/24 [10] area: 0.0.0.2
directly attached to wpci1
============ OSPF router routing table =============
R 0.0.0.2 IA [20] area: 0.0.0.2, ASBR
via 62.72.160.81, eth1
R 2.0.0.1 [10] area: 0.0.0.2, ABR, ASBR
via 62.72.160.81, eth1
R 2.0.0.3 [20] area: 0.0.0.2, ASBR
via 62.72.160.81, eth1
============ OSPF external routing table ===========
N E2 0.0.0.0/0 [20/10] tag: 0
via 62.72.160.81, eth1
A couple of questions about the configuration - why do you have "ip ospf network point-to-point" specified on wlan0 and wlan1 (not that it affects anything, since OSPF considers those interfaces to be down)? Is there a reason why these routers are in area 2? Do you have an area 0, and is there a router configured as an ABR?
The point-to-point was just a random attempt to try everything. It was meant to be a p2p link, hence that's why I put it in.
Area 2: I have about 100 routers in my network, and before I put it live everywhere I am testing it in my village (7 OSPF-enabled routers). My network consists of a 5.8 core backbone into which I plug small community networks. My thinking is that by making each community a separate area and setting the backbone to area 0, the network "should" respond to broken links more rapidly.
The cisco site says that 50 routers should be the limit for a single area, and less depending on topology
OTOH everyone I have spoken to has told me to use a single area, so right now the jury is out, I have a single area (which I have numbered 2) but I am ready to jump in either direction - once I get area 2 working.
I don't have any routers configured as ABR, but the interface configured to connect to area 0 is coming up as an ABR in the "sh ip ospf route" above
oscarBravo
01-26-2005, 09:55 AM
All's well that ends well.
I'd say go with your multiple-area configuration. Our test network had two areas, and it worked perfectly. The important thing is that every area must have a direct connection to a router in area 0. If an area only has a single link to area 0, you can configure it as a stub area - every router in the area must have "area x stub" in the configuration - and summarise the routes in and out of it, assuming your IP addressing is set up to facilitate that. This reduces the size of the routing tables throughout the network.
Paul
Thanks for that help it has been invaluable.
My test area is a stub area, so that means I need to add the area 2 stub statement to EVERY ospf enabled router in that area :
router ospf
ospf router-id 2.0.0.2
passive-interface eth0
network 62.72.160.80/30 area 0.0.0.2
network 62.72.162.248/30 area 0.0.0.2
network 192.168.4.0/24 area 0.0.0.2
area 0.0.0.2 stub
neighbor 62.72.160.81
neighbor 62.72.162.250
neighbor 192.168.4.2
!
But I am not clear about your next statement "and summarise the routes in & out of it"
Do you mean as I have done it above by specifying adjacent neighbors or do you mean something else ?
My next step is to add an alternative route back via a DSL connection, so I can see the requirement for specifying the routes out of the area (with costings) but I'm not sure how to do this, other than by using static routes, in which case I'd have thought I'd need to put the redistribute static statement back in ?
oscarBravo
02-06-2005, 06:26 AM
My test area is a stub area, so that means I need to add the area 2 stub statement to EVERY ospf enabled router in that area :
Yes, I mentioned that in my last post (but not very clearly, sorry).
But I am not clear about your next statement "and summarise the routes in & out of it"
Do you mean as I have done it above by specifying adjacent neighbors or do you mean something else ?
What I mean is that if all the networks in area 2 can be reached through a single subnet specification, then the routes to all those networks don't need to be propagated out of the area. Instead, you can use the "area range" ospf command to summarise the routes. Note that this is probably not a good idea if you have more than one route out of an area.
As an example, our test network used addresses in the 10.30.0.0/16 range for area 0, and 10.40.0.0/16 for area 2. There was only one route from area 2 to the backbone, so the ABR had the command "area 2 range 10.40.0.0/16" in its configuration. The effect of this was simply that all OSPF routers in the backbone area had a single route to area 2 instead of individual routes to all the area 2 networks, all via the ABR. I hope this makes sense.
My next step is to add an alternative route back via a DSL connection, so I can see the requirement for specifying the routes out of the area (with costings) but I'm not sure how to do this, other than by using static routes, in which case I'd have thought I'd need to put the redistribute static statement back in ?
When you say "an alternative route back" I presume you mean an alternative route from area 2 to area 0? If the route via the DSL backup goes directly to a router in area 0, there shouldn't be a problem. If it goes via another area, area 2 is no longer a stub area.
I'd be interested in exploring this a little further with you.
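A minimal sketch of what the ABR side of this might look like in Quagga's ospfd, using the 10.30.0.0/16 (area 0) and 10.40.0.0/16 (area 2) ranges from the example above. The router ID and the two individual subnets are hypothetical, chosen only so the fragment is complete:

```
! ABR between area 0 and area 2 (router ID and subnets are illustrative)
router ospf
 ospf router-id 10.30.1.1
 ! backbone-facing interface
 network 10.30.1.0/24 area 0.0.0.0
 ! area 2-facing interface
 network 10.40.1.0/24 area 0.0.0.2
 ! must also appear on every other router in area 2
 area 0.0.0.2 stub
 ! ABR only: advertise a single summary for area 2 into the backbone
 area 0.0.0.2 range 10.40.0.0/16
!
```

Routers wholly inside area 2 would carry the "area 0.0.0.2 stub" line but not the "range" line, since summarisation happens only at the area border.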
cdavis
07-10-2006, 05:54 AM
Can I see the config for the two neighbors? What info was left out of that config example?
OK, here's the skeleton of an operational OSPF configuration as we have it working on the WestNet network:
Current configuration:
!
hostname rta
password ****
...some interfaces omitted...
interface eth0
!
interface eth1
!
interface wpci0
ip ospf network non-broadcast [1]
ip ospf cost 500 [2]
!
interface wpci1
ip ospf network non-broadcast
ip ospf cost 200
...some more interfaces omitted...
router ospf
ospf router-id 10.30.19.13 [3]
network 10.30.2.0/24 area 0.0.0.0
network 10.30.19.12/30 area 0.0.0.0
network 10.30.20.0/24 area 0.0.0.0
neighbor 10.30.2.1 [4]
neighbor 10.30.20.10
!
access-list vtylist permit 127.0.0.1/32
access-list vtylist deny any
!
line vty
access-class vtylist
!
end
Some notes:
[1] The wireless interfaces are set to non-broadcast. As Brendan mentioned, we have had some success with CM9-CM9 broadcasting, but setting a wireless link to non-broadcast and explicitly specifying neighbours always works.
[2] We're still experimenting with link costs. OSPF generally calculates a cost as 100 Mbit/s divided by the interface speed, so 10 Mbit/s Ethernet has a cost of 10. The costs need to be specified on the radio links to approximate the bandwidth available.
[3] The router ID is the address of the first Ethernet port, for consistency. It actually doesn't seem to matter what the router ID is, but this makes them easy to recognise in the OSPF output.
[4] This is where we specify the neighbours for the wireless links. The wired links are broadcast-capable, so we don't bother specifying them.
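As a worked illustration of note [2], assuming the usual 100 Mbit/s reference bandwidth, the cost is 100,000,000 divided by the link speed in bit/s. The radio throughput figures in the comments are hypothetical, not measurements from this network:

```
! 100 Mbit/s Ethernet:            100,000,000 / 100,000,000 = 1
! 10 Mbit/s Ethernet:             100,000,000 /  10,000,000 = 10
! radio link, ~20 Mbit/s assumed: 100,000,000 /  20,000,000 = 5
! radio link,  ~5 Mbit/s assumed: 100,000,000 /   5,000,000 = 20
interface wpci0
 ip ospf cost 20
!
interface wpci1
 ip ospf cost 5
!
```

A lower cost is preferred, so if both links led to the same destination, wpci1 would carry the traffic and wpci0 would act as the backup.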
Notice that there are no "redistribute-" clauses in here at all. We found that leaving them in made each router an ASBR, which is simply not the case. The only exception is the gateway StarOS router, which has a default route; all it needs is "default-information originate" to propagate the default route throughout the network.
This seems to work, and it seems to be stable. It's a minimal configuration, it generates the minimum number of kernel routes needed to make the network function, and it propagates link state updates (like changed link costs) almost instantly.
Obviously this isn't a definitive configuration. It's working for us in a small network with no BGP or any such complications (yet). Comments and feedback welcome.
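For the other end of one of these wireless links, a sketch along the same lines might look like the following. The hostname, the router ID and the address 10.30.2.2 (standing in for rta's own address on the 10.30.2.0/24 link) are assumptions; the post above does not give them:

```
! hypothetical far-end router for the 10.30.2.0/24 radio link
hostname rtb
!
interface wpci0
 ! use the same network type as the other end of the link
 ip ospf network non-broadcast
!
router ospf
 ! hypothetical router ID
 ospf router-id 10.30.21.1
 network 10.30.2.0/24 area 0.0.0.0
 ! point back at rta across the radio link (assumed address)
 neighbor 10.30.2.2
!
```

Listing the neighbour on each end of a non-broadcast link is the simplest arrangement, since neither side can discover the other through multicast hellos.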
cdavis
07-10-2006, 06:28 AM
Will OSPF start on its own after writing a config? Is there a mechanism to show errors in the config, or will it not allow you to enter a line if it has errors? I have done my best to write a config, and when I run "show ip ospf int" I get the following (I expected to see eth0 up):
beacon is down
OSPF not enabled on this interface
cbq is down
OSPF not enabled on this interface
ecb is down
OSPF not enabled on this interface
eth0 is up
OSPF not enabled on this interface
eth1 is up
OSPF not enabled on this interface
eth2 is up
OSPF not enabled on this interface
gre0 is down
OSPF not enabled on this interface
ipacct is down
OSPF not enabled on this interface
lo is up
OSPF not enabled on this interface
tunl0 is down
OSPF not enabled on this interface
wlanbr is down
OSPF not enabled on this interface
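In Quagga's ospfd, an interface normally reports OSPF as enabled only once one of its addresses falls inside a "network" statement under "router ospf", so output like the above usually suggests that no statement matches any interface address. A minimal sketch, assuming eth0 carries 192.168.1.1/24 (a made-up address; substitute the real one):

```
router ospf
 ! hypothetical router ID
 ospf router-id 192.168.1.1
 ! covers eth0's assumed address, so OSPF is enabled on eth0
 network 192.168.1.0/24 area 0.0.0.0
!
```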
oscarBravo
07-10-2006, 01:02 PM
It might help to paste your config in here.
We're moving away from OSPF to a statically-routed network. OSPF just wasn't stable enough, and was becoming our biggest support headache.
I should chime in and mention that I have happily moved from statically-routed to OLSR. OLSR isn't the ticking time-bomb that OSPF seems to be. Something can reboot or a link can go down and OLSR actually does what it's supposed to everywhere!
dc2005
07-10-2006, 01:43 PM
We're moving away from OSPF to a statically-routed network. OSPF just wasn't stable enough, and was becoming our biggest support headache.
I agree totally with Paul. Unless something has changed in the latest V3 release, OSPF just doesn't work reliably in StarOS. We have seen all the familiar problems already reported, i.e. lost interfaces, lost routes etc. after an "activate changes". We have also recently moved from OSPF to a statically routed network, and the number of network outages (and the downtime and irate customers!) has decreased dramatically. It's a pity, because we do have a network with redundant backup routes but currently cannot be confident that failover will occur in the event of an outage. We're still using Wrap / V2 for most of our backhaul routers and will likely upgrade these to V3 when it's available for x86 but, unless I've missed something, it doesn't look like the new driver in V3 coupled with the latest version of Quagga solves the reported OSPF problems. I know we can try using OLSR with V3, so maybe this will be the ultimate solution to the problem.
cdavis
07-10-2006, 03:00 PM
Doesn't OLSR require V3 which is only available on a WAR board?
cdavis
07-10-2006, 03:12 PM
That's unfortunate. I feel now like I might be considering spending time learning/configuring/using something that may never work right anyway. Is anyone successfully using OSPF with StarOS? I now have the ability to have redundant links back to my headend and would really like to use dynamic routing. Does anyone have any other suggestions? Buying a bunch of WAR boards to replace my WRAP boards (which were heavily suggested as the end to my pains if I would install and route my entire network) just isn't going to happen.
nickwhite
07-10-2006, 05:05 PM
I've been using OSPF on a combination of WAR boards, StarOS V2.11.4759 on both WRAP and desktop, and a Cisco 2600 router. I haven't seen any of the mentioned issues.
Nick
dc2005
07-10-2006, 05:45 PM
Doesn't OLSR require V3 which is only available on a WAR board?
Yes, that's true - but I presume OLSR will soon be available in V3 for x86? Also, you would need to upgrade your V2 / Wrap boards to V3 and this is unlikely to be a free upgrade. Either way, you should think carefully before using OSPF at this point in time - read the previous posts in this topic to get a feel for some of the problems other people have encountered. If OLSR is working well, and the reports so far are positive, then I would be more inclined to go in this direction for now. For what it's worth, I have found the WAR boards with V3 to be an excellent platform and very reliable, both as access points and clients. I also agree with Lonnie that the WAR based solution now represents better value than any Wrap based alternative. However, and I'd be very happy to hear that I'm wrong on this, the OSPF reliability problems still seem to be present in V3.
cdavis
07-11-2006, 02:58 PM
What version of OSPF is on the 4759 build? I wonder if that is why you aren't having any problems? I am still running the non-beta build but I could switch over as I build my ospf network.
I've been using OSPF on a combination of WAR boards, StarOS V2.11.4759 on both WRAP and desktop, and a Cisco 2600 router. I haven't seen any of the mentioned issues.
Nick
nickwhite
07-11-2006, 06:23 PM
What version of OSPF is on the 4759 build? I wonder if that is why you aren't having any problems? I am still running the non-beta build but I could switch over as I build my ospf network.
I believe Quagga 0.99.3, which was beta at the time(?). The current unstable (beta) release of Quagga is 0.99.4 on 2006-05-10 (http://www.quagga.net/news2.php?y=2006&m=5&d=10). Also, not all of my units are running this version. Some are the official stable StarOS V2 and two different versions of StarV3 on several WAR boards - I believe that's three different versions of Quagga in there altogether, along with a Cisco 2600 router.
Nick
totalaccess
07-20-2006, 09:18 AM
I should chime in and mention that I have happily moved from statically-routed to OLSR. OLSR isn't the ticking time-bomb that OSPF seems to be. Something can reboot or a link can go down and OLSR actually does what it's supposed to everywhere!
what is OLSR?
lonnie
07-20-2006, 09:21 AM
http://www.olsr.org/ It is a MESH routing protocol that we have included with V3.
what is OLSR?
cdavis
08-03-2006, 10:13 AM
When I configure the gateway StarOS box, I put in all of my network area definitions, right? When I configure WRAP boards anywhere else, do I define all of the network areas again, just the networks that are directly connected to that WRAP, or none because correct neighbor configuration will propagate all networks?
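As a general OSPF point, each router's "network" statements only need to cover its own directly connected networks; routes to everything else arrive through the link-state exchange with its neighbours. A sketch of the two cases with made-up addresses (none of the subnets or router IDs below come from this thread, and the interface stanzas are omitted):

```
! gateway router: its own connected networks, plus the default route
router ospf
 ospf router-id 10.0.0.1
 network 10.0.0.0/24 area 0.0.0.0
 network 10.0.1.0/30 area 0.0.0.0
 ! advertise the gateway's default route into OSPF
 default-information originate
 ! radio link to the WRAP, assumed to be set non-broadcast
 neighbor 10.0.1.2
!
! downstream WRAP (a separate box): only the networks attached to it
router ospf
 ospf router-id 10.0.2.1
 network 10.0.1.0/30 area 0.0.0.0
 network 10.0.2.0/24 area 0.0.0.0
 neighbor 10.0.1.1
!
```

The two routers share only the 10.0.1.0/30 link; the gateway learns 10.0.2.0/24 from the WRAP, and the WRAP learns 10.0.0.0/24 and the default route from the gateway.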