When cat comes chasing...

From: WALI (hkhasgiwale@gmail.com)
Date: Fri Mar 23 2007 - 15:18:03 EST


We have a 100 Mbps EoATM link between two office buildings, say A and B.
The servers and the majority of users are in Building A, while a few (about
150) are in Building B. The router on the Building B end is configured for
QoS, as there is also voice traffic flowing across.

The connection between the two buildings was recently upgraded from 10 Mbps
to 100 Mbps. The gigabit interfaces on the routers at both ends are set to
Auto, Full Duplex.

Once every 2-3 days, users in Building B start to complain about slow
network connections to the servers in Building A. The usual ping from B to
A, which normally takes <1ms, increases to 30-40ms. Ethereal shows no
broadcast traffic. Building A users report no such problems. The 100 Mbps
link between the two buildings remains underutilised. I have set up an
'ntop' box in Building A with a mirror port of the router interface on this
side. The maximum on the "network load" graph is 3-4MB at peak time.
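
The latency jump described above could be logged with timestamps, so that
each slow episode can be lined up against the ntop graph afterwards. What
follows is only a minimal sketch, not something already running here. It
assumes a Linux-style ping ("-c 1") on a Building B machine; the target
address 192.0.2.10 and the 10 ms threshold are placeholder assumptions.

```python
import re
import subprocess
import time
from datetime import datetime

# Hypothetical values: replace TARGET with a real server in Building A.
TARGET = "192.0.2.10"
THRESHOLD_MS = 10.0  # normal RTT is <1 ms; slow periods show 30-40 ms

# Matches "time=0.412 ms" (Linux) as well as "time<1ms" (Windows style).
RTT_RE = re.compile(r"time[=<]\s*([\d.]+)\s*ms")


def parse_rtt_ms(ping_output):
    """Return the first round-trip time in ms found in the text, or None."""
    match = RTT_RE.search(ping_output)
    return float(match.group(1)) if match else None


def is_slow(rtt_ms, threshold_ms=THRESHOLD_MS):
    """A missing reply (None) or an RTT above the threshold counts as slow."""
    return rtt_ms is None or rtt_ms > threshold_ms


def ping_once(target):
    """Send one echo request with the system ping and parse its RTT."""
    result = subprocess.run(
        ["ping", "-c", "1", target], capture_output=True, text=True
    )
    return parse_rtt_ms(result.stdout)


def monitor(target=TARGET, interval_s=5):
    """Print a timestamped line whenever the link looks slow."""
    while True:
        rtt = ping_once(target)
        if is_slow(rtt):
            print(f"{datetime.now().isoformat()} slow: rtt={rtt} ms")
        time.sleep(interval_s)
```

Calling monitor() and redirecting the output to a file would give a list of
start and end times for each slow period.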

Crazy solution: if I take out any patch cable and re-insert it, the problem
gets resolved. If I reset any switch, the problem gets resolved. If I
disconnect any uplink cable between the four switches or do an ARP reset
through the command line, the problem gets resolved for a couple of hours
or even days.

And something that I recently observed: even if I do nothing, the problem
resolves itself, and ntop shows the network load suddenly dropping from its
maximum of 3MB.

But where could the problem lie?

I have run Nessus and did find quite a few unpatched Windows machines in
Building B that had lost their connection with WSUS, so I patched them. I
made sure that all the machines are running the latest anti-virus
definitions. I sent a mail to all users asking them to get their laptops
checked for the latest updates (only a few have responded, though).

What else can I do the next time the problem recurs? It is still a mystery.
The switch support provider has upgraded the IOS and says there is nothing
wrong with the switch. The VoIP provider maintains their instruments are
fine. What else can help me here apart from routine wireshark/ethereal?

Yesterday, we forced the switch provider to replace the four switches,
swapping one non-Cisco brand for another (again non-Cisco, but quite
renowned).

My concern is that if the problem recurs, if the cat comes out of the box
again, it will be a big mystery to solve, and one I have no clue about.

Anyone...anything???




This archive was generated by hypermail 2.1.7 : Sat Apr 12 2008 - 10:57:40 EDT