SUMMARY: network timeouts

From: Cohen, Andy (Andy.Cohen@cognex.com)
Date: Fri Jan 27 2006 - 11:55:09 EST


Sorry -- the previous posting was sent prematurely.

SUMMARY
=======

I got some helpful tips on how to get information about the NIC and the
network. I haven't found the cause yet, but I'm still working on it.

I did find a very useful web page about tuning network settings:
http://h30097.www3.hp.com/docs/internet/TITLE.HTM

Other suggestions:

======
1.
See the "lan_config" command to set up autonegotiation parameters
(among other things).
 
2.
# hwmgr get attr -cat network

==========
What sort of switch are you using? Surprisingly few GbE switches can
buffer more than a few msec of data. What are the other systems and
what are they running? Do you have any 100 Mbps systems? Those may be
interesting too.

Most of my testing has been with NFS where I can easily saturate a GbE
link, but a DS20 may be able to do it with a single process. If that's
the case, then the switch may start falling behind.
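
To put "a few msec" of buffering into bytes, here is a rough
back-of-the-envelope sketch in Python. The 2 ms figure, the 100 Mbps
receiver, and the function name are illustrative assumptions, not
measurements of any particular switch.

```python
def bytes_buffered(link_bps: int, buffer_ms: float) -> float:
    """Bytes that arrive on a link running at link_bps during buffer_ms milliseconds."""
    return link_bps * buffer_ms / 8000  # /8 for bits->bytes, /1000 for ms->s

GBE = 1_000_000_000            # 1 Gbps line rate
FAST_ETHERNET = 100_000_000    # 100 Mbps line rate

# Assumed example: a switch that can hold 2 ms of GbE traffic.
buffer_bytes = bytes_buffered(GBE, 2)
print(f"2 ms at GbE = {buffer_bytes:,.0f} bytes")

# If a GbE sender bursts toward a 100 Mbps port, the queue grows at the
# difference between the two rates.
fill_rate = (GBE - FAST_ETHERNET) / 8   # bytes per second
print(f"buffer fills in {buffer_bytes / fill_rate * 1000:.2f} ms")
```

Under these assumptions the buffer is only a few hundred KB, so a
sustained burst from a fast sender can overrun it within milliseconds.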

I'm not sure what the TCP window size is for ftp, rcp, etc. NFS over
UDP can put out a load as great as (number of nfsiod threads + number
of programs accessing NFS files) x 48 KB, the Tru64-to-Tru64 I/O size.
NFS over TCP uses a single connection per mount, but uses a large
window size to try to make up for it.
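
The NFS-over-UDP estimate above can be worked through as a small Python
sketch. The thread and program counts are hypothetical, and 48 KB is
taken here as 48 * 1024 bytes, which is an assumption.

```python
NFS_IO_SIZE = 48 * 1024  # 48 KB Tru64-to-Tru64 I/O size (assumed to mean 48 * 1024 bytes)

def max_nfs_udp_outstanding(nfsiod_threads: int, client_programs: int) -> int:
    """Upper bound on bytes outstanding at once: (threads + programs) * I/O size."""
    return (nfsiod_threads + client_programs) * NFS_IO_SIZE

# Hypothetical client: 7 nfsiod threads and 1 program accessing an NFS file.
outstanding = max_nfs_udp_outstanding(7, 1)
print(f"{outstanding} bytes ({outstanding // 1024} KB) may be in flight at once")
```

With these example numbers that is 8 x 48 KB = 384 KB offered to the
network at once, with no TCP window to pace it.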

========
System defaults are:

  % /sbin/sysconfig -q inet | grep space
  tcp_recvspace = 61440
  tcp_sendspace = 61440
  udp_recvspace = 42240
  udp_sendspace = 9216 [I'm not certain if this is enforced]

NFS does things differently:

  % dbx -k /vmunix
  (dbx) pd nfs_tcpsendspace
  500000
  (dbx) pd nfs_tcprecvspace
  500000
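
As a rough guide to what these buffer sizes mean, a single TCP
connection can move at most one window per round trip. The sketch below
assumes the whole socket buffer is usable as the window; the round-trip
times are made-up examples, not measurements.

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Ceiling for one TCP connection: one window per round trip, in Mbit/s."""
    return window_bytes * 8 / rtt_ms / 1000

DEFAULT_WINDOW = 61440   # tcp_sendspace / tcp_recvspace from sysconfig above
NFS_WINDOW = 500000      # nfs_tcpsendspace / nfs_tcprecvspace from dbx above

for rtt_ms in (0.2, 1.0, 5.0):  # assumed round-trip times
    default_cap = max_tcp_throughput_mbps(DEFAULT_WINDOW, rtt_ms)
    nfs_cap = max_tcp_throughput_mbps(NFS_WINDOW, rtt_ms)
    print(f"RTT {rtt_ms} ms: default <= {default_cap:.0f} Mbit/s, "
          f"NFS <= {nfs_cap:.0f} Mbit/s")
```

If the computed ceiling is above line rate, the window is not the
bottleneck for that round-trip time and the limit lies elsewhere.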

Tcpdump (or snoop or ethereal) traces can be extremely important.
When looking at possible lossage in routers, traces captured on both
sides of the switch are important. Tcpdump is a bit of a pain to get
going on Tru64; yell if you need a hand.

Instead of ftp or rcp, programs like ttcp or netperf
(http://www.netperf.org/netperf/NetperfPage.html) dispense with file
system overhead and give you numbers that you can compare between systems
and between runs.
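
To illustrate the memory-to-memory idea behind ttcp and netperf, here
is a minimal Python sketch that times a TCP transfer with no files
involved. It is not a replacement for either tool; it runs over
loopback, and the function names and sizes are made up for the example.

```python
import socket
import threading
import time

def _sink(server: socket.socket, result: dict) -> None:
    """Accept one connection and count bytes until the sender closes it."""
    conn, _ = server.accept()
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            total += len(chunk)
    result["received"] = total

def memory_to_memory_test(total_bytes=8 * 1024 * 1024, chunk_size=64 * 1024):
    """Send total_bytes over loopback TCP; return (bytes received, Mbit/s)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.listen(1)
    result = {}
    receiver = threading.Thread(target=_sink, args=(server, result))
    receiver.start()

    payload = b"\0" * chunk_size    # data comes from memory, not from disk
    sent = 0
    start = time.perf_counter()
    with socket.create_connection(server.getsockname()) as client:
        while sent < total_bytes:
            n = min(chunk_size, total_bytes - sent)
            client.sendall(payload[:n])
            sent += n
    receiver.join()                 # wait until the sink has drained everything
    elapsed = time.perf_counter() - start
    server.close()
    return result["received"], result["received"] * 8 / elapsed / 1_000_000

if __name__ == "__main__":
    received, mbps = memory_to_memory_test()
    print(f"{received} bytes, {mbps:.1f} Mbit/s")
```

To test a real link, the sink and the sender would run on different
hosts, which is what ttcp and netperf do.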

=========

And to get actual running speed:

hwmgr -get attrib -category network | grep -e "name" -e media_sp -e duplex

Thanks everybody!
Andy

ORIGINAL QUESTION
=================
Hi,

We have a DS20E running T64 5.1 PK6. A year or two ago we installed a
gigabit NIC (I don't know the model off-hand but can get it if need be).
Lately we've noticed that we get timeouts when transferring large files
(several GB) from this machine to some other machine (e.g., ftp, rcp,
legato, etc.). Using collect/collgui I don't see any particular issue
with the NIC/network (collisions, bandwidth %, etc.) and with ifconfig
I can't tell much about the NIC's config. The boot message has this:

vmunix: alt0 at pci0 slot 8
vmunix: alt0: DEGPA (1000BaseT) Gigabit Ethernet Interface, hardware address: 00-60-CF-21-71-90
vmunix: alt0: Driver Rev = V2.0.16 NUMA, Chip Rev = 6, Firmware Rev = 12.4.12
vmunix: alt0: 1000 Mbps full duplex Link Up via autonegotiation

Transfers of smaller files, even a few hundred MB, seem to go just fine.

What sorts of things can I start poking at to see if I can find a
problem? Other machines on the network don't have this problem -- only
this one -- so we suspect some configuration setting. How can I find out
information about the NIC (speed, duplex, etc.) while the OS is
running (i.e., not from the boot prompt)? If I remember correctly this
card can only be set to autonegotiate (usually we force the NICs on our
unix boxes to a certain speed and duplex setting).

Thank you all very much!

Andy


