mountd dies repeatedly - why?

From: walt zapor (walt@dwkcb.com)
Date: Mon May 24 2004 - 13:06:15 EDT


  Hi, I am new to the Sun Managers list and apologize in advance for any
breaches of netiquette on this list. However, after many years of using
Solaris, I have never seen this problem, and I humbly ask the advice of
the gurus on this list, as I am at my wits' end.

I have had a problem with mountd dying for no apparent reason for the
last few days. We CAN restart it manually, but it dies again within a
couple of hours.
The filesystems are served over NFS with extensive use of automount,
though I see no problem there, and the auto* files have not been changed.
A symptom is:
automountd[207]: quasar server not responding: RPC: Program not registered

I need help with finding CLUES to what is causing it to die.

I am looking at rpcinfo and nfsstat for clues (see below). There is
nothing in /var/adm/messages that relates to this issue. We have not
changed any hardware or software, nor added users.

Note that the system is not heavily loaded and has had a hardware RAID
(100GB, 40% full) attached for several years. We have checked the RAID
and found no hardware issues. Network activity to the server looks OK in
ping -s tests.
We have also checked the switches, and they report no errors.

The configuration is an UltraSPARC server running Solaris 8, with patches
fairly up to date. It serves about 30 Arris users and stores about
100 projects. I am running the following tests on a regular basis.

'NFS TEST'
rpcinfo -T udp quasar nfs
'MOUNTD' TEST
rpcinfo -T udp quasar mountd
'LOCKMGR' TEST
rpcinfo -T udp quasar nlockmgr
'RPCINFO -P' TEST
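
To catch the moment mountd drops out, checks like these could be wrapped in a timestamped loop. What follows is only a rough sketch, not the actual rpctest script: the names HOST, INTERVAL, LOG, check_service and log_status are made up for illustration, and it assumes rpcinfo exits non-zero when a program is not registered.

```shell
#!/bin/sh
# Sketch only: log a timestamped status line for each RPC service.
# HOST, INTERVAL and LOG are hypothetical settings, not from the original script.
HOST=${HOST:-quasar}
INTERVAL=${INTERVAL:-60}
LOG=${LOG:-/var/tmp/rpctest.log}

# check_service HOST SERVICE
# Print "ok" if the rpcinfo probe succeeds, "FAILED" otherwise.
check_service() {
    if rpcinfo -T udp "$1" "$2" >/dev/null 2>&1; then
        echo "ok"
    else
        echo "FAILED"
    fi
}

# log_status HOST
# Print one line per service: timestamp, service name, probe result.
log_status() {
    for svc in nfs mountd nlockmgr; do
        now=`date '+%Y-%m-%d %H:%M:%S'`
        status=`check_service "$1" "$svc"`
        echo "$now $svc $status"
    done
}

# Poll forever only when the script is started with the argument "loop".
if [ "${1:-}" = "loop" ]; then
    while :; do
        log_status "$HOST" >> "$LOG"
        sleep "$INTERVAL"
    done
fi
```

Started as "./rpcwatch loop &" (the script name is also hypothetical), the first FAILED line for mountd in the log would narrow down when the daemon stopped answering.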

quasar# ./rpctest
NFS TEST
program 100003 version 2 ready and waiting
program 100003 version 3 ready and waiting
MOUNTD TEST
program 100005 version 1 ready and waiting
program 100005 version 2 ready and waiting
program 100005 version 3 ready and waiting
LOCKMGR TEST
program 100021 version 1 ready and waiting
program 100021 version 2 ready and waiting
program 100021 version 3 ready and waiting
program 100021 version 4 ready and waiting
LLOCKMGR TEST
rpcinfo: RPC: Program not registered
RPCINFO -P TEST
  program vers proto port service
   100000 4 tcp 111 rpcbind
   100000 3 tcp 111 rpcbind
   100000 2 tcp 111 rpcbind
   100000 4 udp 111 rpcbind
   100000 3 udp 111 rpcbind
   100000 2 udp 111 rpcbind
   100004 2 udp 1023 ypserv
   100004 1 udp 1023 ypserv
   100004 1 tcp 1023 ypserv
   100004 2 tcp 32771 ypserv
   100007 3 udp 32775 ypbind
   100007 2 udp 32775 ypbind
   100007 1 udp 32775 ypbind
   100007 3 tcp 32772 ypbind
   100007 2 tcp 32772 ypbind
   100069 1 udp 32776
   100007 1 tcp 32772 ypbind
   100069 1 tcp 32773
   100009 1 udp 1022 yppasswdd
   100028 1 tcp 32774 ypupdated
   100028 1 udp 32777 ypupdated
   100024 1 udp 32784 status
   100024 1 tcp 32778 status
   100133 1 udp 32784
   100133 1 tcp 32778
   100232 10 udp 32785 sadmind
   100011 1 udp 32787 rquotad
   100002 2 udp 32790 rusersd
   100002 3 udp 32790 rusersd
   100002 2 tcp 32794 rusersd
   100002 3 tcp 32794 rusersd
   100021 1 udp 4045 nlockmgr
   100021 2 udp 4045 nlockmgr
   100021 3 udp 4045 nlockmgr
   100021 4 udp 4045 nlockmgr
   100012 1 udp 32797 sprayd
   100008 1 udp 32800 walld
   100001 2 udp 32803 rstatd
   100001 3 udp 32803 rstatd
   100001 4 udp 32803 rstatd
   100083 1 tcp 32822
   100221 1 tcp 32826
   100235 1 tcp 32830
   100021 1 tcp 4045 nlockmgr
   100021 2 tcp 4045 nlockmgr
   100021 3 tcp 4045 nlockmgr
   100021 4 tcp 4045 nlockmgr
   100068 2 udp 32814
   100068 3 udp 32814
   100068 4 udp 32814
   100068 5 udp 32814
   100229 1 tcp 32852 metad
   100230 1 tcp 32856 metamhd
   100153 1 udp 32818
   100003 2 udp 2049 nfs
   100003 3 udp 2049 nfs
   100227 2 udp 2049 nfs_acl
   100227 3 udp 2049 nfs_acl
   100003 2 tcp 2049 nfs
   100003 3 tcp 2049 nfs
   100227 2 tcp 2049 nfs_acl
   100227 3 tcp 2049 nfs_acl
   150001 1 udp 844 pcnfsd
   150001 2 udp 844 pcnfsd
   150001 1 tcp 845 pcnfsd
   150001 2 tcp 845 pcnfsd
   300598 1 udp 32901
   300598 1 tcp 32901
805306368 1 udp 32901
805306368 1 tcp 32901
   100249 1 udp 32902
   100249 1 tcp 32907
1289637086 5 tcp 32977
1289637086 1 tcp 32977
   100005 1 udp 33701 mountd
   100005 2 udp 33701 mountd
   100005 3 udp 33701 mountd
   100005 1 tcp 35188 mountd
   100005 2 tcp 35188 mountd
   100005 3 tcp 35188 mountd
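
A listing this long is easy to misread, so a small filter can answer the one question that matters here: is program 100005 (mountd) still registered? This is a minimal sketch; the function name mountd_registered is an illustrative assumption, and it only looks at the first column of "rpcinfo -p" output.

```shell
# mountd_registered
# Read "rpcinfo -p" output on stdin; exit 0 if any line lists
# program 100005 (mountd), exit 1 otherwise.
mountd_registered() {
    awk 'BEGIN { found = 0 }
         $1 == 100005 { found = 1 }
         END { exit 1 - found }'
}
```

For example: rpcinfo -p quasar | mountd_registered || echo "mountd not registered"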

quasar# nfsstat

Server rpc:
Connection oriented:
calls      badcalls   nullrecv   badlen     xdrcall    dupchecks
282213     0          0          0          0          39196
dupreqs
0
Connectionless:
calls      badcalls   nullrecv   badlen     xdrcall    dupchecks
831        0          0          0          0          1
dupreqs
0

Server nfs:
calls      badcalls
282554     0
Version 2: (13 calls)
null       getattr    setattr    root       lookup     readlink
13 100%    0 0%       0 0%       0 0%       0 0%       0 0%
read       wrcache    write      create     remove     rename
0 0%       0 0%       0 0%       0 0%       0 0%       0 0%
link       symlink    mkdir      rmdir      readdir    statfs
0 0%       0 0%       0 0%       0 0%       0 0%       0 0%
Version 3: (282553 calls)
null       getattr    setattr    lookup     access     readlink
315 0%     116851 41% 8138 2%    44222 15%  32410 11%  0 0%
read       write      create     mkdir      symlink    mknod
37444 13%  17666 6%   2105 0%    0 0%       0 0%       0 0%
remove     rmdir      rename     link       readdir    readdirplus
1645 0%    0 0%       11 0%      70 0%      5555 1%    9562 3%
fsstat     fsinfo     pathconf   commit
65 0%      224 0%     593 0%     5677 2%

Server nfs_acl:
Version 2: (0 calls)
null       getacl     setacl     getattr    access
0 0%       0 0%       0 0%       0 0%       0 0%
Version 3: (0 calls)
null       getacl     setacl
0 0%       0 0%       0 0%

Client rpc:
Connection oriented:
calls      badcalls   badxids    timeouts   newcreds   badverfs
1361       0          0          0          0          0
timers     cantconn   nomem      interrupts
0          0          0          0
Connectionless:
calls      badcalls   retrans    badxids    timeouts   newcreds
3          1          0          0          0          0
badverfs   timers     nomem      cantsend
0          0          0          0

Client nfs:
calls      badcalls   clgets     cltoomany
1284       1          1284       0
Version 2: (2 calls)
null       getattr    setattr    root       lookup     readlink
0 0%       1 50%      0 0%       0 0%       0 0%       0 0%
read       wrcache    write      create     remove     rename
0 0%       0 0%       0 0%       0 0%       0 0%       0 0%
link       symlink    mkdir      rmdir      readdir    statfs
0 0%       0 0%       0 0%       0 0%       0 0%       1 50%
Version 3: (1272 calls)
null       getattr    setattr    lookup     access     readlink
0 0%       506 39%    9 0%       216 16%    313 24%    0 0%
read       write      create     mkdir      symlink    mknod
71 5%      42 3%      14 1%      2 0%       0 0%       0 0%
remove     rmdir      rename     link       readdir    readdirplus
20 1%      1 0%       3 0%       6 0%       7 0%       10 0%
fsstat     fsinfo     pathconf   commit
5 0%       4 0%       3 0%       40 3%

Client nfs_acl:
Version 2: (1 calls)
null       getacl     setacl     getattr    access
0 0%       0 0%       0 0%       1 100%     0 0%
Version 3: (9 calls)
null       getacl     setacl
0 0%       9 100%     0 0%

-- 
Walt Zapor
DWKCB Inc. Architects
215-368-5806

