Question about very strange routing problem

From: Reed, Judith (jreed@navisite.com)
Date: Tue Mar 23 2004 - 11:03:17 EST


We have a set of 8 servers, all running Tru64 V5.1A, PK2. They are all
connected to multiple networks, but all share one common network as well:

server1  server2  server3  server4  server5  server6  server7  server8
   |        |        |        |        |        |        |        |
----------------------------------------------------------------------

server1/server2/server3 can all reach server8 and vice-versa.

server4/server5/server6/server7 can *NOT* reach server8, nor can server8
reach them.

server4/server5/server6/server7 can reach server1/server2/server3 and
vice-versa.

server1 through server7 all know to go out the shared network to reach
server8.

server8 knows to go out the shared network to reach server1 through
server3, but tries to go out a different route (its default route) to
reach server4 through server7, even when responding to ping/ssh/tcpdump
packets that came in over the shared common network from the problematic
servers.
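
For anyone trying to picture the symptom: a host normally picks the most
specific (longest-prefix) route that matches a destination and falls back
to the default route only when nothing narrower matches. The sketch below
is a minimal illustration of that lookup, not a diagnosis of server8. Every
address, the /28 mask, and the interface names "shared0" and "other0" are
hypothetical placeholders, not values from these servers. It shows one way
some hosts on a shared wire could fall through to the default route while
their neighbours do not.

```python
import ipaddress

# Hypothetical routing table for a host in server8's position.
# All networks and interface names are made-up placeholders.
ROUTES = [
    # (destination network, outgoing interface)
    (ipaddress.ip_network("10.0.0.0/28"), "shared0"),  # shared net, assumed too-narrow mask
    (ipaddress.ip_network("0.0.0.0/0"), "other0"),     # default route
]

def pick_route(dest, routes=ROUTES):
    """Return the interface of the longest-prefix route containing dest."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, ifname) for net, ifname in routes if addr in net]
    # The default route matches every IPv4 address, so matches is never empty.
    net, ifname = max(matches, key=lambda m: m[0].prefixlen)
    return ifname

# Hypothetical peer addresses: one inside the /28, one outside it.
print(pick_route("10.0.0.3"))   # covered by the shared-net route
print(pick_route("10.0.0.20"))  # only the default route matches
```

In this sketch the second lookup leaves via the default route even though
the peer sits on the same physical network, which resembles the behaviour
described above. Whether anything like this applies to server8 is an open
question.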

server8 also knows the correct MAC address of *all* the servers on the
shared common network, and the correct IPs associated with those MAC
addresses.

There is nothing odd in /etc/routes or "netstat -r" output on *any* of
the servers, either the ones that *can* communicate or the ones that
*cannot*.

I'm baffled. If this weren't a production environment I'd reboot the d*mn
server8, but that's not an option. Does anyone have any suggestions or
insights?

Regards,

Judith Reed
jreed@navisite.com
Service delivery manager, Syracuse Data Center
315-453-2912 x5835


