sun routing screwed after DVMRP router failure

From: Jan De Luyck (ml@kcore.org)
Date: Mon Jan 05 2004 - 11:17:56 EST


Hello list,

I'm managing a backbone of 50 Sun servers and workstations that handle
incoming and outgoing traffic to the Reuters network (TIBCO).

The boxes are Sun Ultra5 stations, running Solaris 2.6.

The backend sends out a multicast stream containing updates to various
instruments on various exchanges around the world. Clients can subscribe to
this stream.

The stream goes like this:

SUN box --> "MDS" network (where servers picking up the stream are) --> CAM
network (where clients are).
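
For anyone wanting to check from a client box whether the stream actually
arrives on a given network, a minimal sketch of a multicast listener follows.
The group address, port, and function names are placeholders I made up for
illustration, not the values used on our network, and it assumes a host with
Python 3 available.

```python
import socket

# Placeholder values for illustration only; substitute the real group and port.
GROUP = "239.1.1.1"
PORT = 5000


def build_membership_request(group, iface="0.0.0.0"):
    """Pack an ip_mreq structure: group address, then local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def is_multicast(addr):
    """Return True if addr is in the IPv4 multicast range 224.0.0.0/4."""
    first_octet = socket.inet_aton(addr)[0]
    return 224 <= first_octet <= 239


def wait_for_packet(group, port, timeout=5.0):
    """Join the group and wait for one datagram.

    Returns (sender_ip, length) or None if nothing arrives before the timeout.
    """
    if not is_multicast(group):
        raise ValueError("%s is not a multicast address" % group)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        # Joining the group makes the kernel send an IGMP membership report,
        # which is what tells the upstream router to forward the stream here.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        build_membership_request(group))
        sock.settimeout(timeout)
        try:
            data, sender = sock.recvfrom(65535)
        except socket.timeout:
            return None
        return sender[0], len(data)
    finally:
        sock.close()


if __name__ == "__main__":
    result = wait_for_packet(GROUP, PORT)
    if result is None:
        print("no multicast traffic received for %s:%d" % (GROUP, PORT))
    else:
        print("received %d bytes from %s" % (result[1], result[0]))
```

Running this on a host in MDS and on a host in CAM would show on which side
of the router the stream stops.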

Last week, we had a router failure (the router reset its card) on the router
which connects between MDS and CAM. This caused the multicast stream to go
haywire to CAM, but everything was fine on MDS.

The secondary router immediately picked up, all other traffic worked fine,
except our multicast.

We had to physically reboot the servers to get things straightened out again.
This, unfortunately, we figured out only after 3 hours of downtime (and
restarting nearly every process that has something to do with this).

Is this a bug in the Solaris TCP/IP stack? Is there any fix I can apply to
prevent this in the future?

Thanks.

Jan

-- 
When in doubt, mumble; when in trouble, delegate; when in charge, ponder.
		-- James H. Boren