OSPF Quirkiness


bairdc
02-25-2005, 11:30 AM
Well, over the past week or so, I've migrated about half my network to OSPF from static routes. A couple of nights ago, I disabled all of the static routes except the default.

Now, I'm at the point where I should disable the static default routes, but I have to admit, I'm scared to death to do it. So far, in my experience, OSPF has proven to be "quirky" at best. Once it's running, it seems to run very well. However, right now my network is not having any problems. I have some serious doubts about OSPF's resilience when presented with some real-life network issues.

Basically, my fear stems from my experience in disabling static routes the other night. On at least two occasions, I disabled the static routes on a box and did the required "Activate Changes", only to find that a router two or three hops away had suddenly lost half of its OSPF-learned routes. To fix it, I had to log into the box having trouble (or one of its neighbors) and restart OSPF. The nasty thing is that with all the static routes gone, it's a bit more difficult to get access to a box once it loses its OSPF routes. It pretty much requires logging into a neighbor (assuming the neighbor still has its OSPF routes), and then SSH'ing from there to the affected box.

Anyway, if a simple "Activate Changes" on one box can break OSPF on another box up or downstream, then what is going to happen when a box reboots for some reason, or when a flaky wireless link causes a route to become unstable? I worry that instead of recalculating the routes on the fly and finding a route around the problem, OSPF will simply stop working on some routers, which will require me to log in to the affected routers and restart OSPF. In other words, it will require me to babysit it. However, IMO, that's one of the reasons you run a routing protocol: so you don't have to babysit the network.
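
To make the "babysitting" concrete: the manual check described above amounts to noticing that a router has lost a large share of its OSPF-learned routes and then restarting OSPF on it. Here is a minimal sketch of that decision in Python. Everything in it is an assumption for illustration only: the function names, the 50% threshold, and the idea that route counts per router are collected some other way. It does not talk to a real router or to Zebra.

```python
# Hypothetical sketch: flag routers whose OSPF-learned route count has collapsed.
# Route counts are supplied by the caller; nothing here queries a real device.

def ospf_looks_stuck(baseline_routes, current_routes, loss_threshold=0.5):
    """Return True if at least `loss_threshold` of the baseline routes are gone."""
    if baseline_routes <= 0:
        return False  # no baseline to compare against
    lost = baseline_routes - current_routes
    return lost / baseline_routes >= loss_threshold


def routers_needing_restart(baseline, current, loss_threshold=0.5):
    """Return a sorted list of router names whose route count dropped too far.

    `baseline` and `current` map router name -> number of OSPF-learned routes.
    A router missing from `current` is treated as having zero routes.
    """
    return sorted(
        name
        for name, expected in baseline.items()
        if ospf_looks_stuck(expected, current.get(name, 0), loss_threshold)
    )
```

With a baseline of 20 routes on each of three routers, a router reporting 9 routes and one reporting nothing at all would both be flagged, while one still showing 20 would not. A check like this only automates the noticing; it says nothing about why the routes disappeared.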

As far as I can tell, I've followed all the guidelines posted by those who have had success (many thanks). All my wireless links are non-broadcast with neighbors defined, and I removed all the redistributes from each router. Like I said, OSPF seems to be very stable now, but then my network is currently very stable. I'm very nervous, however, about whether it will work as it should if any network instabilities are thrown into the mix.

I would love to have some input from those of you who have been running it for a while. Have your OSPF implementations actually been faced with any network instabilities, and if so, how did they respond?

Craig

bairdc
03-03-2005, 10:46 AM
I assume the lack of response means that nobody has any suggestions on this. Since I posted last, I have removed some of my static default routes, and things have been fine until today. There were actually a couple of times that one box rebooted, due to either the hardware or ping watchdogs, and OSPF came back without any trouble.

However, today I had an Atheros link that was showing some extremely high latency, so I decided to try a channel change. As soon as I did an "Activate Changes", things went south again. I ended up having to get into a couple of routers to stop and restart OSPF. Once I did that, everything was fine.

So far, it seems fairly consistent that an "Activate Changes" sometimes causes OSPF to go crazy. I'm wondering if the cause could be the ordering of the tasks that run when you click "Activate Changes". Specifically: does "Activate Changes" cause OSPF to stop and restart, and if so, does that happen before or after the interfaces are reset? What I'm thinking is that perhaps Zebra has trouble if an interface is brought down and back up, and requires a restart when this happens. This would explain why it's sometimes necessary to restart OSPF after I do an "Activate Changes".
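
The ordering question can be shown with a toy model. This is only a sketch of the hypothesis, not a description of how Zebra or "Activate Changes" is known to behave: the class, the step names, and the idea of an interface "generation" counter are all made up for illustration. The model assumes a daemon that only works if it was started after the most recent interface reset.

```python
# Toy model of the ordering hypothesis. NOT based on Zebra's actual behavior;
# every name here is hypothetical.

class ToyRoutingDaemon:
    """A daemon that remembers which interface 'generation' it started on."""

    def __init__(self):
        self.bound_generation = None

    def start(self, iface_generation):
        # (Re)starting binds the daemon to the interface as it exists right now.
        self.bound_generation = iface_generation

    def has_working_adjacency(self, iface_generation):
        # Healthy only if the interface has not been reset since the last start.
        return self.bound_generation == iface_generation


def activate_changes(order):
    """Run the steps in `order`; return True if the daemon ends up healthy."""
    iface_generation = 0
    daemon = ToyRoutingDaemon()
    daemon.start(iface_generation)  # daemon was already running before the change
    for step in order:
        if step == "reset_interfaces":
            iface_generation += 1   # interface goes down and comes back up
        elif step == "restart_daemon":
            daemon.start(iface_generation)
        else:
            raise ValueError(f"unknown step: {step}")
    return daemon.has_working_adjacency(iface_generation)
```

In this model, `["reset_interfaces", "restart_daemon"]` leaves the daemon healthy, while `["restart_daemon", "reset_interfaces"]` or a reset with no restart at all leaves it stale until someone restarts it by hand. If the real system behaves anything like this, it would match the symptom of needing a manual OSPF restart after an interface change.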

Lonnie, can you give any insight as to whether this could be a factor?

Thanks!

Craig