5.1aPK2: Problems with NLD/rpc.lockd

From: Tobias Burnus (tobias.burnus@physik.fu-berlin.de)
Date: Mon Jun 17 2002 - 17:32:04 EDT


Dear Tru64 Unixers,

We applied PK2 to our TruCluster about a fortnight ago and
immediately encountered problems with ACLs and Autof
(see the email by Wolfram Klaus of 10 June 2002).

Since this morning we have had massive NLD (NFS Locking Daemon)
or rpc.lockd problems. These are very visible with mail programs
such as mutt, which do (fcntl) locking: those programs simply hang. With
the 'nolock' mount option, or when running locally on the TruCluster, the
problems do not occur. (The problem happens with both Linux and Tru64 UNIX
5.1a clients.)
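
To reproduce the hang without a mail program, a small probe along the
following lines may help. It is only a sketch: the names probe_lock and
LockTimeout are made up for illustration, and it assumes a Python
interpreter on the client. It requests the same kind of fcntl lock that
mutt takes and gives up after a timeout instead of hanging. Note that on a
hard NFS mount the process may sit in an uninterruptible wait, in which
case the alarm cannot fire and the probe hangs as well.

```python
import fcntl
import signal


class LockTimeout(Exception):
    """Raised by the alarm handler when the lock attempt takes too long."""


def _on_alarm(signum, frame):
    raise LockTimeout()


def probe_lock(path, timeout_s=10):
    """Try an exclusive fcntl lock on path.

    Returns "ok" if the lock was taken and released, "busy" if another
    process holds it, and "timeout" if the request did not come back
    within timeout_s seconds (what a dead lock daemon tends to look like).
    The file is opened in append mode, so it is created if it is missing.
    """
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout_s)
    try:
        with open(path, "a") as fh:
            try:
                # Non-blocking request; over NFS this still needs lockd.
                fcntl.lockf(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except (BlockingIOError, PermissionError):
                # EAGAIN or EACCES: somebody else holds the lock.
                return "busy"
            fcntl.lockf(fh, fcntl.LOCK_UN)
            return "ok"
    except LockTimeout:
        return "timeout"
    finally:
        signal.alarm(0)
        signal.signal(signal.SIGALRM, old_handler)
```

Running probe_lock() once on a file in the NFS-mounted directory and once
on a local file should show whether only the NFS path is affected.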

Additionally one finds messages of this type in the logs:
[TruCluster] cluserver_lockd[...]: Can't create client handle to [Tru64 Client] NLMv4: RPC: Port mapper failure - RPC: Timed out
[Tru64 client]lockd[...]: Can't create client handle to mail NLMv4: RPC: Program not registered
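
"Program not registered" means that the portmapper on the named host has
no entry for the lock manager. A sketch of how one might check this from a
script follows. It assumes that "rpcinfo -p <host>" is available and
prints the usual columns (program, version, protocol, port, name); the
function names are hypothetical. The RPC program numbers are the standard
ones: 100021 for nlockmgr (rpc.lockd) and 100024 for status (rpc.statd).

```python
import subprocess

NLM_PROG = 100021  # nlockmgr, served by rpc.lockd
NSM_PROG = 100024  # status, served by rpc.statd


def registered_versions(rpcinfo_text):
    """Map each RPC program number to the set of versions registered.

    Expects the text printed by "rpcinfo -p"; the header line and any
    other line not starting with two numeric fields are skipped.
    """
    versions = {}
    for line in rpcinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0].isdigit() and fields[1].isdigit():
            versions.setdefault(int(fields[0]), set()).add(int(fields[1]))
    return versions


def query_host(host, timeout_s=15):
    """Ask the portmapper on host what is registered there."""
    result = subprocess.run(
        ["rpcinfo", "-p", host],
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,
    )
    return registered_versions(result.stdout)


def lockd_problems(versions):
    """Return a list of human-readable findings; empty means all present."""
    problems = []
    if 4 not in versions.get(NLM_PROG, set()):
        problems.append("nlockmgr (100021) version 4 is not registered")
    if NSM_PROG not in versions:
        problems.append("status (100024) is not registered")
    return problems
```

Calling lockd_problems(query_host(name)) for the cluster alias and for one
client would show on which side the NLMv4 registration is missing. A
"Port mapper failure - RPC: Timed out" would instead surface here as a
subprocess timeout or a non-zero exit from rpcinfo.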

Restarting rpc.lockd, rpc.statd, the portmapper, and even the complete
cluster didn't help.

On this list I found the following interesting email, which may point to
the problem:
http://www.xray.mpe.mpg.de/mailing-lists/tru64-unix-managers/2002-04/msg00110.html
| Robert Mulley wrote:
| > "Patch kit 5 should be publicly available soon. Although it does seem
| > to create a problem with caa and cluster_lockd, which another patch in
| > the kit is meant to fix. Wait and see."

This relates to 5.1-PK5, not to 5.1a-PK2, but the Patch Kit
Table shows that both patch kits were released at about the same time
(April vs. May). (If this is indeed the problem, I don't understand why we
did not encounter it until today.)

With warm regards,

Tobias


