SUMMARY LONG ATTEMPT: NFS server getting "NFS3 Server servername is not responding" message

From: O'Brien, Pat (pobrien@mitidata.com)
Date: Thu Oct 03 2002 - 09:53:22 EDT


Finding the needle in the hayloft may be easier to describe than attempting
to find a solution to this issue. I am sorry this has taken so long to put
together, though the issue has continuously been addressed since it was
first reported. My issue has not been totally resolved, though it occurs
with less regularity and has migrated from a network issue to a resource
problem. Many may find an NFS issue becomes a performance tuning issue. It
has been suggested my resource problem is disks which are not fast enough,
or a lack of memory (I have 8 GB of RAM). By installing tcpdump and putting
the gigabit network cards into promiscuous mode, we have determined this is
NOT a network issue; rather, the receiving system is receiving data so fast
that it cannot write the data out to disk fast enough. I am going to try to
summarize some of the many investigative steps collected from some on this
list, insiders at Compaq, many past and/or present peers, and many others:
Begin with good generic kernel networking parameters:
socket:
        sb_max = 4194304
        somaxconn = 32767
        sominconn = 32767

inet:
        inifaddr_hsize = 512
        ipport_userreserved = 65535
        tcp_keepcnt = 12
        tcp_recvspace = 131072
        tcp_sendspace = 131072
        udp_recvspace = 131072
        udp_sendspace = 16384
        udp_ttl = 60
vm:
        ubc_maxpercent = 70 (a last-ditch attempt which helped a little)
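On Tru64, parameters like these can be applied at runtime with sysconfig, or
made persistent via stanzas in /etc/sysconfigtab. A rough sketch (the exact
attribute names are the ones listed above; verify them against sys_attrs
reference pages for your release before applying):

```shell
# Runtime changes (lost at reboot); persistent values go in /etc/sysconfigtab
sysconfig -r socket sb_max=4194304 somaxconn=32767 sominconn=32767
sysconfig -r inet tcp_recvspace=131072 tcp_sendspace=131072 \
    udp_recvspace=131072 udp_sendspace=16384
sysconfig -r vm ubc_maxpercent=70
```

Querying current values with "sysconfig -q inet" before changing anything
makes it easy to back out a tweak that does not help.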

Now set your number of nfsd and nfsiod threads to proper values.
From an insider at Compaq:
As we discussed, here are the network tuning parameters we should look at
for an NFS server. First, we usually adjust upward the number of udp and/or
tcp nfsd threads. We'd like to see some idle threads, so monitor thread
usage with "ps -Am -O THREAD | grep nfs". The last number on each entry
is the total cpu time. We want to see some threads (around 25% of the
total) with no cpu time. Since the threads are used in order, if all
threads are showing cpu time, that means that at some point all threads
were busy at the same time. Increase until some are always idle (up to a
max of 128 combined udp and tcp threads). We've got 64 udp and 32 tcp
threads currently. We can increase as necessary until there are some idle
threads. Adjust nfsiod threads on the client in the same manner, looking
for some idle threads.
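The idle-thread check above can be sketched as a small pipeline. The sample
text and column layout below are made-up assumptions standing in for real
"ps -Am -O THREAD | grep nfs" output; the idea is that the last field is
accumulated cpu time, and a thread still showing 0:00.00 has never been
needed:

```shell
# Count nfs threads that have never accumulated cpu time (i.e. idle ones).
sample='nfsd 0:01.23
nfsd 0:00.45
nfsd 0:00.00
nfsd 0:00.00'
idle=$(printf '%s\n' "$sample" | awk '$NF == "0:00.00" { n++ } END { print n + 0 }')
total=$(printf '%s\n' "$sample" | awk 'END { print NR }')
echo "$idle of $total nfs threads idle (want roughly 25% idle)"
```

With this sample, 2 of 4 threads are idle, which satisfies the 25% rule of
thumb; if idle came back 0, that would be the cue to add threads.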

We should also monitor the number of full sockets on the server with:

# netstat -p udp

If the full sockets parameter is incrementing, we should increase the
socket buffer size.
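Pulling the counter out mechanically makes it easy to watch between runs. A
rough sketch; the sample text below is invented, and the field positions are
assumed from typical BSD-style netstat output:

```shell
# Extract the "full sockets" counter from "netstat -p udp"-style output.
sample='udp:
        123456 packets sent
        234567 packets received
        42 full sockets'
full=$(printf '%s\n' "$sample" | awk '/full sockets/ { print $1 }')
echo "udp full sockets: $full"
```

If this number grows between two samples taken under load, that is the
signal to raise the socket buffer sizes (and sb_max to match).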

more notes:

NFS Performance Tuning

NFS Server Problems - "nfsstat -s"

1. Check for "badcalls". A non-zero or high number here may indicate
    rpc calls are timing out. Increase timeo and retrans in the clients'
    mount statements. It could also indicate a user is in too many
    groups, or authentication problems.

2. See if "nullrecv" is greater than 0. If so, the NFS requests aren't
    coming fast enough to keep the nfsd's busy. Reduce the number of
    nfsd's.

3. See if "symlink" is greater than 10%. If so, the clients are making
    excessive use of symbolic links that are on filesystems exported by
    the server. Replace the symbolic links with a directory, and mount
    both the underlying filesystem and the link's target on the client.

4. Look at "getattr". If it is greater than 60%, check for possible
    non-default attribute cache values on NFS clients. A very high
    percentage of getattr requests indicates that the attribute cache
    window has been reduced or set to zero with the 'noac' mount option.

5. If "null" is greater than 1%, the automounter has been configured to
    mount replicated filesystems, but the timeout values for the mount are
    too short. The null procedure calls are made by the automounter to
    locate a server for the filesystem; too many null calls indicates that
    the automounter is retrying the mount frequently. Increase the mount
    timeout parameter on the automounter command line.
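The 60% getattr rule of thumb from step 4 is easy to script. A hedged
sketch; the call counts here are made-up stand-ins for numbers read off
"nfsstat -s" output:

```shell
# Compute getattr's share of total NFS calls and apply the 60% threshold.
total_calls=9000
getattr_calls=6000
pct=$(awk -v g="$getattr_calls" -v t="$total_calls" 'BEGIN { printf "%d", g * 100 / t }')
echo "getattr share: ${pct}%"
if [ "$pct" -gt 60 ]; then
    echo "check client attribute caching (noac mount option?)"
fi
```

The same pattern works for the symlink (10%) and null (1%) thresholds in
steps 3 and 5.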

NFS Client Problems - "nfsstat -c"

1. If "timeouts" is high, the client's RPC requests are timing out
    before the server can answer them, or the requests are not reaching
    the server. Check "netstat -i" output for network problems.

2. If "badxids" is close to being equal to "timeouts", the RPC requests
    that have been retransmitted are being handled by the server, and the
    client is receiving duplicate replies. Increase the 'timeo' parameter
    for this NFS mount to alleviate the request retransmission, or tune
    the server to reduce the average request service time.

3. If "timeouts" is high, but "badxids" is close to zero, this
    indicates that the network is dropping parts of NFS requests or replies
    in between the NFS client and server. Reduce the NFS buffer size
    using the 'rsize' and 'wsize' mount parameters to increase the
    probability that NFS buffers will transit the network intact.

4. If "badcalls" is greater than zero, RPC calls on soft-mounted
    filesystems are timing out. If a server has crashed, then badcalls can
    be expected to increase, but if badcalls grows during "normal"
    operation, then soft-mounted filesystems should use a larger 'timeo' or
    'retrans' value to prevent RPC failures.
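The badxids-versus-timeouts triage in steps 2 and 3 amounts to a simple
comparison. A hedged sketch, with made-up counter values standing in for
real "nfsstat -c" output (the 90%/10% cutoffs are illustrative readings of
"close to equal" and "close to zero"):

```shell
# Classify retransmission behavior from the two client-side RPC counters.
timeouts=1000
badxids=950
diagnosis=inconclusive
if [ "$timeouts" -gt 0 ] && [ "$badxids" -ge $(( timeouts * 9 / 10 )) ]; then
    # Server answers retransmits late: raise timeo, or tune the server.
    diagnosis=server-slow
elif [ "$timeouts" -gt 0 ] && [ "$badxids" -le $(( timeouts / 10 )) ]; then
    # Requests or replies lost in transit: shrink rsize/wsize.
    diagnosis=network-dropping
fi
echo "diagnosis: $diagnosis"
```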

General Network

1. "netstat -s"
    A non-zero count for "fragments dropped after timeout" in the "ip"
    section indicates that IP fragments are being dropped.

    If dropped fragments are a problem, set the size of the NFS read and
    write buffers to 1K or 4K bytes:

    mount -o rsize=1024,wsize=1024 server:/dir /mnt

2. "netstat -I ln0 -s"
    Check for "send failures, reasons include" and "receive failures,
    reasons include:" sections.

3. "netstat -i"
    Check for collisions > 5% of Opkts and for any Ierrs or Oerrs. Any of
    these could indicate a network hardware or configuration problem.

4. Run "rpcinfo -p server" from the client and vice versa from the
    server. Make sure server has mountd and nfs.

5. See if telnet or ping to and from the server and client are slow.
    If so, then it's a network hardware or configuration issue.

6. Try rebooting the client and/or server.

7. Check IP routing tables.

8. Check system load average with "uptime". High system load will
     slow everything, including network transmissions, causing timeouts.

9. Check for duplicate IP addresses.

10. Increase retrans from 4 to 10.

(I would not perform this step or the next one.)

11. Increase timeo from 11 to 20.

12. Check /var/adm/syslog.dated/<date>/daemon.log for errors.

13. Install any Tulip, FDDI, AdvFS or NFS patches that may be applicable.

14. Use a network analyzer or sniffer if netstat shows ierrs, oerrs or
     high collisions.

15. If the NFS client is a Sun system running Solaris V2.5, upgrade to
     at least V2.5.1.

16. Have the clients try mounting the filesystems with the nfsv2 option.
     This is especially critical for PC clients, or UNIX systems not
     supporting NFS Version 3.

17. Make sure it isn't a DNS problem, instead of NFS.

18. Make sure all nodes are either full-duplex or half-duplex. You
     shouldn't mix these modes in the same LAN.

19. Make sure all nodes are either 10M or 100M speed, unless there is
     some hub or bridge device that can handle mixed speeds.
   
20. If the filesystem is AdvFS, a high percentage of fragmentation can
     cause this. This is caused by running short of BMT space.

21. If telnets work, but pings fail, flush the routing tables.
     /sbin/init.d/route stop
     /usr/sbin/route flush
     /sbin/init.d/route start
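Several of the steps above (1, 10, 11, 16) come together on the client's
mount line. A hypothetical example only; the values are illustrative, not
prescriptive, and steps 10 and 11 are the ones the author advises against:

```shell
# Smaller buffers (step 1), longer timeo/retrans (steps 10-11),
# and NFS v2 (step 16) -- adjust one knob at a time and re-test.
mount -o rsize=4096,wsize=4096,timeo=20,retrans=10,nfsv2 server:/dir /mnt
```

Changing one option per test run makes it much easier to tell which knob,
if any, actually moved the "not responding" message rate.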

If using automount:

1. Run "automount -T -T" for debug information.

2. The "Network Administration and Problem Solving" guide, Appendix D,
    page D-7, says this about "server (PID n@mountpoint) not responding
    still trying" error messages:

    Explanation: An NFS request to the automount daemon with PID n serving
    mount point has timed out. The automount daemon might be overloaded or
    not running.

    User Action: If the condition persists, reboot the client. You can
    also do the following:

    1. Exit all processes that are using the automounted directories.
    2. Kill the current automount process.
    3. Restart the automount process from the command line.

3. This is noted in the automount man page, RESTRICTIONS section.

    "Because automount is singlethreaded, any request that is delayed by a
     slow or nonresponding NFS server will delay all subsequent automount
     requests until the delayed request has been completed."
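The three-step recovery from the guide can be sketched as shell commands.
The ps/awk idiom and the /usr/sbin/automount path are assumptions; use your
site's actual automount invocation and flags when restarting:

```shell
# Step 1: first exit all processes using automounted directories, then:
kill $(ps -e | awk '/automount/ { print $1 }')   # step 2: kill the daemon
/usr/sbin/automount                              # step 3: restart it
```

Since automount is single-threaded (per the restriction above), restarting
it also clears any backlog caused by one slow server blocking the queue.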

 

> original question
>
> We have an nfs server with several clients over udp. These clients
> have normally performed heavy 50~80 GB file copies to the nfs server.
> After upgrading to 5.1, we began receiving floods of "nfs3 server
> servername not responding still trying" followed by "nfs3 server
> servername is ok". The server does utilize gigabit ethernet, but a
> client gets messages over 100 fd. All interfaces have been re-checked.
> In the beginning the server was running with 16 nfsd, which have been
> increased to 64. This reduced the quantity a couple notches; however,
> they can still be reproduced with a 3 gb file in a few minutes. The
> client still had the default 7 nfsiod, which has been increased to 16
> and then 24, with the condition getting worse. Resetting the client
> number of nfsiod to 4 does seem to eliminate the issue, and also
> throttles network i/o back. We have reset this back to the default 7.
> netstat is not showing any dropped connections or full sockets or
> reaching peak network threads from netstat -m. nfsstat does continue
> to log a few badxid's and timeo, but below a percentage point or so,
> which I have been led to believe is ok. Reviewing the mount -l output
> data, we see that the nfs read & write memory buffers are larger than
> the prior 4.x version by a factor of 6. We have tried reducing these
> buffers, but this seems to make the issue worse. I am currently
> thinking about increasing udp send & receive buffers to something
> larger but less than sb_max, and/or the nfs mount buffers.
>
> BUT, I have this gnawing feeling I am looking in the wrong area. I am
> wondering if this message means something else, like maybe my nfs
> server disks may not be up to the job. With some testing I see that
> minimally configured (64mb of cache) hsz70 are worse than (256mb
> cache) hsg80. I would expect, though, if this was true, to log
> something for a scsi error, which I am not.
>
>
> Any thoughts or brilliant ideas welcomed.
>
> pmob
>
>



This archive was generated by hypermail 2.1.7 : Sat Apr 12 2008 - 10:48:55 EDT