Yet more OSPF weirdness


bairdc
05-23-2006, 11:35 PM
Okay, here's a new one. I'm suddenly seeing screwy OSPF stuff on ethernet links. Previously, I had thought that Atheros links were the only places where the OSPF inconsistencies reared their ugly heads. Not so.

I've recently been deploying core routers at many of my tower sites. These core routers consist of a Mini-ITX based PC with a 4-port NIC (routerboard 24). Each port of that 4-port NIC then connects to a WRAP board which either performs as a customer-facing AP, or one side of a backhaul link.

Well, since I started deploying these core routers, I've found the typical inconsistent OSPF behavior on these ethernet links between my 4-port cards and my WRAPs. The symptoms are practically identical to what you see on Atheros links. Now, I've got some other machines also plugged into these 4-port NICs, and interestingly, I've never seen the issue with any of them. It's only with WRAPs. And here's one more interesting tidbit: the WRAPs and these 4-port NICs both happen to use the same ethernet chipset. They're both Natsemi-based:

From the WRAP:

National Semiconductor Corporation DP83815 (MacPhyter) Ethernet Controller

From the 4-port NIC:

National Semiconductor Corporation DP83815 (MacPhyter) Ethernet Controller

However, I have ethernet links where these same 4-port cards are connected to Intel, 3Com, and RTL8139 cards with absolutely no issues.

I have found that in many cases, OSPF neighbors just plain won't even sync up over these links. If you stop/restart OSPF repeatedly on one side or the other, it tends to eventually work, but not consistently.

To fix this issue, I've found that I have to set the interface to non-broadcast, and specify neighbors. Once this is done, things are generally stable, although I still see occasional weirdness.
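
For anyone wanting to try the same workaround, here is a rough sketch of what it might look like in Quagga-style ospfd syntax, generated from Python so the neighbor list is easy to keep per link. The interface name and address are made up for illustration, and StarOS or other platforms may spell these settings differently.

```python
import ipaddress

def nbma_stanza(ifname, neighbors):
    """Render a Quagga-style ospfd snippet that puts one interface into
    non-broadcast mode and lists its OSPF neighbors explicitly."""
    # Validate each neighbor address up front; raises ValueError on a typo.
    addrs = [str(ipaddress.IPv4Address(n)) for n in neighbors]
    lines = [
        f"interface {ifname}",
        " ip ospf network non-broadcast",
        "!",
        "router ospf",
    ]
    lines += [f" neighbor {a}" for a in addrs]
    return "\n".join(lines)

# Hypothetical link: eth1 on the core router facing one WRAP at 10.0.1.2
print(nbma_stanza("eth1", ["10.0.1.2"]))
```

On a non-broadcast network, hellos are sent as unicast to each configured neighbor rather than to the 224.0.0.5 multicast group, which is the main thing this setting changes.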

Strange strange strange strange...

Craig

Beebe
05-24-2006, 05:58 PM
I've found several situations where OSPF was acting weird, and specifying the OSPF router-id as the IP of the Internet-side interface, on every router on the network, has solved the problem several times for me. I have no clue why, because as I understand it the correct way would be to specify the lowest numbered IP on the router.
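
A possible explanation for why pinning the router-id helps (an assumption, not something established in this thread): what OSPF needs is for each router-id to be unique and to stay put across restarts, and it does not have to be any particular address. An id picked automatically from whichever interface addresses happen to be up could change after a reboot. A minimal sketch for sanity-checking a hand-kept list of router-ids for duplicates, with hypothetical hostnames and addresses:

```python
from collections import defaultdict

def duplicate_router_ids(router_ids):
    """Given {hostname: router_id}, return {router_id: [hostnames]} for
    any router-id claimed by more than one router."""
    seen = defaultdict(list)
    for host, rid in router_ids.items():
        seen[rid].append(host)
    return {rid: hosts for rid, hosts in seen.items() if len(hosts) > 1}

# Hypothetical inventory; tower-c was cloned from tower-a's config
routers = {"tower-a": "10.0.0.1", "tower-b": "10.0.0.2", "tower-c": "10.0.0.1"}
print(duplicate_router_ids(routers))
```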

I've built two separate OSPF networks, each consisting of about 10 WRAPs with CM9s, all running OSPF, and both networks are very solid with the exception of one link, which has a very weak signal. That is solid unless I misconfigure something.

In fact I have it set up so that when someone connects with PPPoE, the routing for their single IP is propagated through the network. So you can connect through any of my towers and get the same IP. Works great.

But I don't have any redundant links on my network so maybe that's the difference between me and you?

Thanks,
Roger

bairdc
05-24-2006, 07:28 PM
Redundant links may indeed be the difference. I've put a lot of effort into making much of my backbone redundant. In fact, I've even got two exit points on my network where I connect to the Internet (in two different cities), and I inject a default route from both of them into OSPF. That means that I don't even run a static default route. It's *all* OSPF, including the default route. That way if one of my Internet links ever becomes unreachable, the other default route will take over, and things will flow out the other one. The problem is that I'll randomly lose various routes occasionally, sometimes including the default route. This problem of losing routes is usually preceded by an "activate changes" or a reboot of some backbone router. But occasionally it seems to happen out of the blue.

I wish I could find a silver bullet like setting the router-id to the IP on the Internet-facing interface. It's obviously helped in your situation. My problem is that since my network is redundant, there really isn't a specific interface that faces the Internet. On many of my backbone routers, I have two or three interfaces that *could* get to the Internet.

It's very frustrating putting a lot of time and money into building a redundant network in order to obtain maximum reliability, only to have the opposite effect.

Craig

j0n
06-14-2006, 03:12 PM
Have either of you guys got a config you could share? I've a network with a loop in it, two internet exit points, and three stub networks, and now and again some of the nodes in the middle of the loop drop a few packets. It's enough that customers are noticing and complaining. I hadn't bothered setting the 'router-id' manually at all, so I'm intrigued by your setting to the 'internet side ip' solution. I don't yet know enough about quagga/ospf to try and debug this further.

Regards

bairdc
06-14-2006, 09:42 PM
I find it odd that a packet loss problem would be caused by OSPF. Generally, the OSPF screwiness manifests itself in more than just lost packets. Usually it causes loss of either a few routes, or complete loss of almost all routes on one or more interfaces of the wacked-out router. These route losses usually result in extended outages of entire network segments, and generally tend to last about 30 minutes or so (unless you intervene).

I would be more likely to attribute a packet loss problem to a bad NIC or cable, or in the case of a wireless link, interference.

Having said that, if you're looking for OSPF configs, I'd be happy to share any that I might have, but I'm not totally clear on exactly what sort of configs you might want... There have already been a number of OSPF config examples posted in other threads. I don't know if anything I posted would add to what has already been posted.

Craig

j0n
06-15-2006, 02:56 AM
Sorry, it was late when I posted that last night. The route to the affected network disappears for a very short space of time. This disappearance 'ripples' through the network and then comes back again shortly thereafter, though it seems to have stopped today...hmmm.

Cheers
John

aldo
07-04-2006, 01:01 AM
I agree with this, Craig. We run a very similar configuration to yours, with 4 gateways all running OSPF. We found another weird problem the other day: the subnet 10.54.1.178/29 does not get set correctly in OSPF, yet it is fine in OSPF on FreeBSD 6.0.
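
One thing that may be worth checking here (a guess, not something confirmed in this thread): 10.54.1.178/29 is a host address inside a /29 rather than the base address of that subnet, and it is possible that one OSPF implementation is stricter about that than another. A quick sketch with Python's standard `ipaddress` module shows which subnet that prefix actually falls in:

```python
import ipaddress

prefix = "10.54.1.178/29"

# strict=True (the default) rejects a prefix that has host bits set
try:
    ipaddress.ip_network(prefix)
    host_bits_set = False
except ValueError:
    host_bits_set = True

# strict=False masks the host bits off and returns the enclosing subnet
subnet = ipaddress.ip_network(prefix, strict=False)

print(host_bits_set)  # True
print(subnet)         # 10.54.1.176/29
```

If the setting in question expects a network address, 10.54.1.176/29 would be the value to try.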