statd issue

From: sun_question question (sun_question@hotmail.com)
Date: Fri Aug 23 2002 - 13:03:36 EDT


Hello,

The server exports filesystems over NFS to a diskless client. Both run
Solaris 2.6.
After a server crash, both the server and the client hang while rebooting.
Excerpts from the messages files follow.
Server log:
Aug 16 09:53:52 m0ui unix: SUNW,hme0: Link Down - cable problem?
Aug 16 09:53:52 m0ui unix: SUNW,hme0: Using Internal Transceiver
Aug 16 09:53:52 m0ui unix: SUNW,hme0: 100 Mbps half-duplex Link Up
Aug 16 09:54:30 m0ui unix: SUNW,hme0: Link Down - cable problem?
Aug 16 09:54:32 m0ui unix: SUNW,hme0: Using Internal Transceiver
Aug 16 09:54:32 m0ui unix: SUNW,hme0: 100 Mbps half-duplex Link Up
Aug 16 09:55:45 m0ui statd[158]: statd: host m0rt is not responding
Aug 16 11:23:20 m0ui DB OPEN - liven_proc@ngdb[1001]: Error logging for 'DB
OPEN
Aug 16 11:24:22 m0ui unix: SUNW,m64B0 is /pci@1f,0/pci@1/ATY,3DCHARGER@2
Aug 16 11:24:22 m0ui unix: m64#0: 1152x900, 4M mappable, rev 4756.3a

Log at client:

Aug 16 09:55:45 m0rt unix: WARNING: consconfig: Unable to configure input device
Aug 16 09:55:45 m0rt unix: WARNING: consconfig: Using keyboard as input device
Aug 16 09:55:45 m0rt unix: cpu 0 initialization complete - online
Aug 16 09:55:45 m0rt unix: dump on /dev/swap size 65520K
Aug 16 11:25:12 m0rt unix: Centaur Sbus Driver Ver 3.1m (MSize: 0x500000; Bus Clock: 22000KHz).

Question-1)
Can I assume that statd on the server is hung, waiting to talk to statd on
the client, which had the server's filesystems mounted before the crash?
Would the NFS server hang just because statd on the server can't talk to
statd on the client?
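
One way to test this assumption from a third machine is to ask each RPC service on the server whether it answers. The following is only a minimal sketch: the function name check_nfs_services is made up for illustration, and it assumes that rpcinfo exits non-zero when a program does not respond and that the service names (status for statd, nlockmgr for lockd) are listed in /etc/rpc.

```shell
# Hypothetical helper: probe the NFS-related RPC services on one host.
# Assumes "rpcinfo -u" exits non-zero when the program does not answer.
check_nfs_services() {
    host=$1
    # status = statd, nlockmgr = lockd (names as listed in /etc/rpc)
    for prog in nfs mountd status nlockmgr
    do
        if rpcinfo -u "$host" "$prog" >/dev/null 2>&1
        then
            echo "$prog: responding"
        else
            echo "$prog: not responding"
        fi
    done
}
```

Running check_nfs_services m0ui while the server appears hung would show whether only statd is silent or whether nfs and mountd are down as well.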

The diskless client is waiting for the server to come up and start the NFS
server so it can mount its root filesystem and reboot.
This creates a loop, and both machines hang for hours, since in this setup
we always reboot the client whenever the server reboots or crashes.
Assuming statd is hung, I was thinking of removing the diskless client's
host entry from /var/statmon/sm so that the server's statd won't try to talk
to the client's statd during the reboot. Since the server will then reboot
faster, the diskless client will mount its root filesystem and reboot
happily.
To remove the diskless client's entry I was going to edit
/etc/rc2.d/S73nfs.client.
Question-2)
Can I put a small cleanup step before statd starts up again on the server
and the client? Will this clear locks so that statd won't hang anymore?
Do I do this on the client too, removing the server's entry from its sm
directory?
Is it safe to remove the client list like this?
# This is S73nfs.client below ...
case "$1" in
'start')
# Clear stale monitor entries; -f keeps rm quiet when there are none
# and avoids testing a glob that may expand to several files.
rm -f /var/statmon/sm/*

if [ -x /usr/lib/nfs/statd -a -x /usr/lib/nfs/lockd ]
then
/usr/lib/nfs/statd > /dev/console 2>&1
/usr/lib/nfs/lockd > /dev/console 2>&1
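
Since the idea described earlier is to remove only the diskless client's entry, a narrower variant is sketched here. The helper name remove_sm_entry is hypothetical, and the sketch assumes each entry in /var/statmon/sm is a plain file named after the monitored host, which should be checked on the actual system first.

```shell
# Hypothetical helper: drop one host's monitor entry instead of all of them.
# Assumes entries are plain files named after the monitored host.
remove_sm_entry() {
    host=$1
    sm_dir=${2:-/var/statmon/sm}   # second argument overrides the directory
    if [ -f "$sm_dir/$host" ]
    then
        rm -f "$sm_dir/$host"
    fi
}
```

On the server this would be called as remove_sm_entry m0rt before statd starts. Passing a scratch directory as the second argument allows a dry run without touching the real sm directory.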

Question-3)
Aug 16 09:53:52 m0ui unix: SUNW,hme0: Link Down - cable problem?
Aug 16 09:53:52 m0ui unix: SUNW,hme0: Using Internal Transceiver
Aug 16 09:53:52 m0ui unix: SUNW,hme0: 100 Mbps half-duplex Link Up
Aug 16 09:54:30 m0ui unix: SUNW,hme0: Link Down - cable problem?
Aug 16 09:54:32 m0ui unix: SUNW,hme0: Using Internal Transceiver
Aug 16 09:54:32 m0ui unix: SUNW,hme0: 100 Mbps half-duplex Link Up

Why do I see these messages multiple times?
My OpenBoot settings are:
tpe-link-test?=true
diag-switch?=false
boot-device=disk net
diag-device=net

Would setting tpe-link-test? to false and diag-device to disk help?
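
For reference, these variables can be read from a running Solaris system with the eeprom command, without dropping to the ok prompt. This is a minimal sketch; the function name show_boot_settings is made up for illustration, and it only displays the values rather than changing them.

```shell
# Hypothetical helper: print the OpenBoot variables discussed above.
# The quotes stop the shell from treating '?' as a wildcard.
show_boot_settings() {
    for var in 'tpe-link-test?' 'diag-switch?' boot-device diag-device
    do
        eeprom "$var"
    done
}
```

A value could be changed the same way, for example eeprom 'tpe-link-test?=false', but whether that stops the repeated link messages is exactly the open question here.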

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


