NFS Filesystem Disappearing

From: Darryl E. Marsee (fcldem@nersp.nerdc.ufl.edu)
Date: Fri Sep 05 2003 - 14:33:21 EDT


Greetings. I have an odd thing happening on a Netra X1 running
Solaris 8 (latest recommended patch cluster applied). We have a
partition exported using NFS from an AIX 4.3.3 box on another subnet
to this Netra. At boot time, the Netra mounts the partition just
fine, using this vfstab entry:

 xxxxx:/xxxxx - /xxxxx nfs - yes rw,hard,bg,intr,timeo=3000,retrans=10
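For reference, the vfstab entry above corresponds to a manual mount command on Solaris. The sketch below only prints that command and does not run it. The helper name vfstab_to_mount is hypothetical, and it assumes the seven whitespace-separated vfstab fields with no glob characters in them.

```shell
#!/bin/sh
# Turn one NFS vfstab line into the matching manual mount command.
# Hypothetical helper for illustration: it prints the command, it does not run it.
vfstab_to_mount() {
    # vfstab fields, in order: device to mount, device to fsck, mount point,
    # FS type, fsck pass, mount at boot, mount options
    set -- $1
    printf 'mount -F %s -o %s %s %s\n' "$4" "$7" "$1" "$3"
}

vfstab_to_mount 'xxxxx:/xxxxx - /xxxxx nfs - yes rw,hard,bg,intr,timeo=3000,retrans=10'
```

Note that the Solaris timeo option is given in tenths of a second, so timeo=3000 requests a 300-second timeout.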

However, about three days later, that partition suddenly becomes
unavailable, with the following message on the Netra:

 nfs: [ID 664466 kern.notice] NFS getattr failed for server
      xxxxx: error 16 (RPC: Failed (unspecified error))

A df gives the following:

 df: (/xxxxx ) not a block device, directory or mounted resource

If I try to remount it, I get the message:

 NFS fsstat failed for server xxxxx: error 5 (RPC: Timed out)

And the only way to clear it is to reboot the Netra, after which it
mounts just fine for three days and then fails again. The server never
becomes unreachable; you can ping it, ssh to it, do a showmount -e of
it, etc, from the Netra just fine even though the NFS mount fails, and
snooping the network shows packets are going back and forth between
the machines, but I still get this time out message when trying the
mount. We've tried mounting it without the timeo and retrans options,
soft instead of hard, and without the intr. Didn't make any
difference.
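The client-side checks described above might be run roughly as follows. This is a sketch that only prints the commands. The function name nfs_checks is hypothetical, and the rpcinfo line is an added assumption (it was not among the checks listed) that probes the NFS RPC service itself over UDP rather than general reachability.

```shell
#!/bin/sh
# Print (do not run) client-side checks for a given NFS server.
# nfs_checks is a hypothetical helper; the rpcinfo probe is an addition,
# not one of the checks listed in the message.
nfs_checks() {
    server=$1
    echo "ping $server"                        # basic reachability
    echo "showmount -e $server"                # export list via mountd
    echo "rpcinfo -u $server nfs"              # NFS RPC service over UDP
    echo "snoop host $server and port 2049"    # watch NFS traffic on the wire
}

nfs_checks xxxxx
```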

What's really odd about it is that (a) this mount has been working
fine on the Netra for over half a year and has just started this
behavior the past two weeks, and (b) the partition is also exported to
a Sunfire 280R, also running Solaris 8, also on the same subnet as the
Netra, also plugged into the same router, and it has never lost the
connection to that partition. Not once. Only the Netra does, every
three days.

I've searched the archives, and a couple of people posted problems
somewhat similar, but their solutions of (1) export it public, and (2)
just reboot to fix it, are not acceptable. A search of the Sun Patch
site didn't bring up anything that looked directly applicable that
wasn't already in the recommended cluster.

Any ideas are greatly appreciated. I'll summarize once a solution
is found, of course.

Regards,

Darryl Marsee
Florida Center for Library Automation
dmarsee@ufl.edu
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


