SUMMARY: NFS warnings: Directory attributes ... are of the wrong dir

From: Frank Otto (Frank.Otto@pci.uni-heidelberg.de)
Date: Fri Feb 09 2007 - 08:13:59 EST


Hello list,

To recap the problem briefly:
On a Tru64 5.1A system that mounts some TB-sized filesystems via NFS
(where the NFS server uses XFS as the underlying filesystem), we
repeatedly see the warning:

"NFS3 LOOKUP: Directory attributes from <server> are of the wrong dir."

Thanks go to Ric Werme and Ann Majeske, who are both of the
opinion that the warning can safely be ignored, and that the only adverse
effect to be expected is that the corresponding directory attributes
fail to be cached, which might have an impact on system performance.
(Though both state that they are not absolutely certain of this and
that one should probably consult an NFS expert.) Ric Werme also points
out that the issue should be fixed in V5.1B.

More details follow.

Ann Majeske (AM), who has access to the Tru64 NFS code, writes:
AM> [...] I found the message
AM> in the routine nfs3_lookup(), which is described as "Remote file system
AM> operations having to do with directory manipulation.". That routine first
AM> looks for the information it needs in an internal cache, and if it finds the
AM> information in the cache the message will not print out. This probably
AM> accounts for why the message is not printing out regularly. But, there is
AM> also a limit set so that the message will not print out at anything less
AM> than a 10 minute interval. So, I don't think that looking at the timing of
AM> when the message prints out will tell you anything. If the cache lookup
AM> fails, the nfs3_lookup() routine does an rfs3call() to do an
AM> NFSPROC3_LOOKUP. It looks like this is a request for information from the
AM> nfs server. If the fileid field of the attributes passed back from the nfs
AM> server doesn't match what we expect and at least 10 minutes have passed
AM> since the last time the message was printed out, the message is printed out
AM> again. The only potential negative thing I see in the code if the fileid
AM> fields don't match is that the new data might not be cached, so there may be
AM> a performance impact.
AM>
AM> So, it looks like this is an informational message, most likely caused by
AM> the Linux system not setting the fileid field in the attributes to a
AM> consistent value. This could either be because of a bug on the Linux
AM> system, or more likely because the Linux system hijacked part or all of the
AM> fileid field to use it to pass some other type of information.
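
For readers who prefer code, the control flow Ann describes can be
modelled roughly as follows. This is only an illustrative sketch in
Python, not the Tru64 source: the class name, the caches and the
rpc_lookup callable are hypothetical stand-ins, and caching the file's
own attributes even on a mismatch is my assumption, not something the
thread confirms.

```python
import time

WARN_INTERVAL = 600.0  # seconds; the 10-minute limit Ann mentions


class LookupSketch:
    """Illustrative model only; NOT the Tru64 nfs3_lookup() code."""

    def __init__(self, rpc_lookup, clock=time.monotonic):
        self.rpc_lookup = rpc_lookup  # hypothetical stand-in for the LOOKUP RPC
        self.clock = clock            # injectable so the rate limit can be tested
        self.name_cache = {}          # (dir_fileid, name) -> file attributes
        self.dir_attr_cache = {}      # dir_fileid -> directory attributes
        self.last_warning = None      # time of the last printed warning
        self.warnings = []            # stands in for the console log

    def lookup(self, dir_fileid, name):
        key = (dir_fileid, name)
        if key in self.name_cache:
            # Cache hit: no RPC is made, so the warning cannot appear.
            return self.name_cache[key]

        file_attrs, dir_attrs = self.rpc_lookup(dir_fileid, name)
        if dir_attrs["fileid"] == dir_fileid:
            self.dir_attr_cache[dir_fileid] = dir_attrs
        else:
            # Mismatch: the directory attributes are not cached, and the
            # warning is emitted at most once per WARN_INTERVAL.
            now = self.clock()
            if self.last_warning is None or now - self.last_warning >= WARN_INTERVAL:
                self.warnings.append(
                    "NFS3 LOOKUP: Directory attributes are of the wrong dir.")
                self.last_warning = now

        # Assumption: the file's own attributes are cached either way.
        self.name_cache[key] = file_attrs
        return file_attrs
```

In this model, repeated mismatches within ten minutes produce a single
warning, which is consistent with the irregular timing of the message.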

Regarding the attribution of this problem to the Linux server, I have
a different feeling, as Ric Werme (RW) points out that there was a
signed/unsigned issue with the fileid field in V5.1A, and that it
was fixed in V5.1B:

RW> [...] The NFS V3 LOOKUP command passes a file handle for a directory and file
RW> name within that directory. A successful return includes the file handle
RW> for the file and attributes (the stuff that "ls -l" prints) for both directory
RW> and file.
RW>
RW> The message comes from a check added long ago for a PC-based server that
RW> did not always return the right directory attributes. In your case there may
RW> be an issue with XFS returning file IDs (inode numbers, or what "ls -i"
RW> prints) that are bigger than 2^31 and a comparison of fileids is failing.
RW> (Tru64 uses 32 bit fileids). IIRC, changes in V5.1B fix a signed/unsigned
RW> comparison and tuck extra file ID bits in the NFS rnode.

Looking at the inodes that XFS uses on the server, I see exactly this:
Many directories have inode numbers that are bigger than 2^31. (XFS seems
to use unsigned 32-bit numbers for the inodes.)
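
To illustrate why such inode numbers can trip a comparison, here is a
small Python sketch. The exact comparison in the V5.1A code is not
shown in this thread, so naive_match() is only a hypothetical model of
a signed/unsigned mix-up; large_inode_dirs() is a helper one could run
on the server to list directories whose inode number is 2^31 or larger.

```python
import os


def as_signed32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def naive_match(wire_fileid, stored_fileid):
    """Hypothetical buggy check: one side has passed through a signed
    32-bit field, the other is still the unsigned value from the wire."""
    return wire_fileid == as_signed32(stored_fileid)


def large_inode_dirs(root, limit=2**31):
    """Yield (path, inode) for each directory under root whose inode
    number is >= limit."""
    for dirpath, _dirnames, _filenames in os.walk(root):
        ino = os.stat(dirpath).st_ino
        if ino >= limit:
            yield dirpath, ino
```

With this model, naive_match(1000, 1000) is true, whereas
naive_match(2**31 + 5, 2**31 + 5) is false even though both sides refer
to the same directory. Running large_inode_dirs() on the exported
filesystem gives the same information as checking "ls -id" by hand.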

Ric goes on to confirm what Ann also stated:

RW> There should be no real impact on your system, other than not updating
RW> the directory's attribute cache and the annoying message. The system may
RW> limit it to one per 10 minutes. [...]

Ric also suggested that there might be a patch even for V5.1A; however,
I have been unable to find one. Since upgrading to V5.1B is out of the
question for us, I guess we will now simply live with the warning and
not worry about it, especially since NFS I/O performance is not an
issue for the machine in question.

Thanks again to Ann and Ric for their very helpful answers.

Best regards,
Frank


