ge card doesn't seem to broadcast properly.

From: Gilad Kedem (gkedem@mercury.co.il)
Date: Thu Jun 12 2003 - 08:38:49 EDT


Hi everyone.

We have an Enterprise 450 with a ge card running Solaris 8, patch
108528-19. Let's call it bonaparte. At some point we noticed that this
machine fails to ping certain machines in its own subnet, whereas other
machines in the same subnet are pinged successfully. Name resolution
works fine, so the problem lies elsewhere.

For example, while pinging lazarus from bonaparte, I ran tcpdump and got the
following:

bonaparte:/>tcpdump -eSv host lazarus
tcpdump: listening on ge0
13:09:09.550237 8:0:20:90:d1:d0 Broadcast arp 42: arp who-has lazarus.ourdomain.com (Broadcast) tell bonaparte
13:09:10.550183 8:0:20:90:d1:d0 Broadcast arp 42: arp who-has lazarus.ourdomain.com (Broadcast) tell bonaparte
13:09:11.550199 8:0:20:90:d1:d0 Broadcast arp 42: arp who-has lazarus.ourdomain.com (Broadcast) tell bonaparte

This went on for a while. It looks as though bonaparte never receives a
reply with lazarus' MAC address, so it never even sends an echo request.
Lazarus is pingable from other machines in the subnet. Moreover, lazarus
successfully pings bonaparte:

13:29:02.459986 0:50:4:9d:1d:47 Broadcast arp 60: arp who-has bonaparte tell lazarus.ourdomain.com
13:29:02.460075 8:0:20:90:d1:d0 0:50:4:9d:1d:47 arp 42: arp reply bonaparte is-at 8:0:20:90:d1:d0
13:29:02.466051 0:50:4:9d:1d:47 8:0:20:90:d1:d0 ip 74: lazarus.ourdomain.com > bonaparte: icmp: echo request (ttl 128, id 10519, len 60)
13:29:02.466069 8:0:20:90:d1:d0 0:50:4:9d:1d:47 ip 74: bonaparte > lazarus.ourdomain.com: icmp: echo reply (DF) (ttl 255, id 15865, len 60)

After this, bonaparte successfully pings lazarus for a while, since it now
has lazarus' MAC address in its ARP cache. Some time later, ping from
bonaparte to lazarus fails again, because the entry for lazarus has expired
and been removed from bonaparte's ARP table.
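
A capture like the ones above can be checked mechanically for ARP requests
that never get an answer. What follows is only a minimal sketch in Python,
assuming one packet per line in the "tcpdump -e" format shown above. The
function name unanswered_arp_requests and the two regular expressions are
illustrative and are not part of tcpdump.

```python
import re

# Assumed line shapes (tcpdump -e, one packet per line):
#   ... arp who-has <target> [(Broadcast)] tell <asker>
#   ... arp reply <host> is-at <mac>
WHO_HAS = re.compile(r"arp who-has (\S+?)\s*(?:\(Broadcast\))? tell (\S+)")
REPLY = re.compile(r"arp reply (\S+) is-at (\S+)")


def unanswered_arp_requests(lines):
    """Return {target: request_count} for who-has targets that never
    appear as the sender of an ARP reply anywhere in the capture."""
    asked = {}
    answered = set()
    for line in lines:
        match = WHO_HAS.search(line)
        if match:
            target = match.group(1)
            asked[target] = asked.get(target, 0) + 1
            continue
        match = REPLY.search(line)
        if match:
            answered.add(match.group(1))
    return {t: n for t, n in asked.items() if t not in answered}


# The three requests bonaparte sent for lazarus, taken from the capture above.
capture = [
    "13:09:09.550237 8:0:20:90:d1:d0 Broadcast arp 42: arp who-has "
    "lazarus.ourdomain.com (Broadcast) tell bonaparte",
    "13:09:10.550183 8:0:20:90:d1:d0 Broadcast arp 42: arp who-has "
    "lazarus.ourdomain.com (Broadcast) tell bonaparte",
    "13:09:11.550199 8:0:20:90:d1:d0 Broadcast arp 42: arp who-has "
    "lazarus.ourdomain.com (Broadcast) tell bonaparte",
]

print(unanswered_arp_requests(capture))
```

For the three lines above this reports lazarus.ourdomain.com with three
unanswered requests. Fed the second capture, where bonaparte answers
lazarus' request, it reports nothing.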

Another data point: running rup from bonaparte prints only one line, for
bonaparte itself, while running rup from other machines on the subnet prints
a long list of machines on the subnet.

We replaced the cable, the gigabit card and the switch port to try to
isolate the source of the problem, to no avail.

All of this leads me to believe that the problem has to do with
broadcasting.

Finally, we replaced the ge card with an hme card and the problem
disappeared: ping succeeds and rup shows proper information about the
machines on the network.

Any idea why the gigabit card is behaving in such a funny way?

Thanks,
Gilad.

gkedem staying at mercury dddottt co dddottt il.
-------------------------
Gilad Kedem
UNIX System Administrator
Mercury Interactive Ltd.
Tel - +972-3-5399492

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


