again problems with login to cluster

From: Harald Baumgartner (hmb@mpe.mpg.de)
Date: Thu Jun 30 2005 - 11:48:50 EDT


Dear Managers,

Cluster: o21, o22, o23, o24; Tru64 UNIX V4.1a, patch kit 1

Again I had problems logging in, after the following:

Jun 30 08:54:09 o21 CAAD[524944]: `o20_spooler` on `o21` went OFFLINE unexpectedly
Jun 30 08:54:09 o21 CAAD[524944]: `o20_outbound_gw` on `o21` went OFFLINE unexpectedly
Jun 30 08:54:09 o21 CAAD[524944]: `o20_supervisor` on `o21` went OFFLINE unexpectedly
Jun 30 08:54:09 o21 CAAD[524944]: Attempting to stop `o20_supervisor` on member `o21`
Jun 30 08:54:09 o21 CAAD[524944]: Attempting to stop `o20_outbound_gw` on member `o21`
Jun 30 08:54:09 o21 CAAD[524944]: Attempting to stop `o20_spooler` on member `o21`
Jun 30 08:54:12 o21 CAAD[524944]: Stop of `o20_supervisor` on member `o21` succeeded.
Jun 30 08:54:12 o21 CAAD[524944]: Restarting `o20_supervisor` on `o21`
Jun 30 08:54:12 o21 CAAD[524944]: Stop of `o20_outbound_gw` on member `o21` succeeded.
Jun 30 08:54:12 o21 CAAD[524944]: Restarting `o20_outbound_gw` on `o21`
Jun 30 08:54:12 o21 CAAD[524944]: Attempting to start `o20_supervisor` on member `o21`
Jun 30 08:54:12 o21 CAAD[524944]: Stop of `o20_spooler` on member `o21` succeeded.
Jun 30 08:54:12 o21 CAAD[524944]: Restarting `o20_spooler` on `o21`
Jun 30 08:54:12 o21 CAAD[524944]: Attempting to start `o20_outbound_gw` on member `o21`
Jun 30 08:54:12 o21 CAAD[524944]: Attempting to start `o20_spooler` on member `o21`
Jun 30 08:55:13 o21 CAAD[524944]: Start of `o20_supervisor` on member `o21` succeeded.
Jun 30 08:55:13 o21 CAAD[524944]: Successfully restarted `o20_supervisor` on `o21`
Jun 30 08:55:13 o21 CAAD[524944]: Start of `o20_outbound_gw` on member `o21` succeeded.
Jun 30 08:55:13 o21 CAAD[524944]: Successfully restarted `o20_outbound_gw` on `o21`
Jun 30 08:55:13 o21 CAAD[524944]: Start of `o20_spooler` on member `o21` succeeded.
Jun 30 08:55:13 o21 CAAD[524944]: Successfully restarted `o20_spooler` on `o21`
Jun 30 08:56:13 o21 CAAD[524944]: `o20_supervisor` on `o21` went OFFLINE unexpectedly
Jun 30 08:56:13 o21 CAAD[524944]: Attempting to stop `o20_supervisor` on member `o21`
Jun 30 08:56:13 o21 CAAD[524944]: `o20_spooler` on `o21` went OFFLINE unexpectedly
Jun 30 08:56:13 o21 CAAD[524944]: Attempting to stop `o20_spooler` on member `o21`
Jun 30 08:56:13 o21 CAAD[524944]: `o20_outbound_gw` on `o21` went OFFLINE unexpectedly
Jun 30 08:56:13 o21 CAAD[524944]: Attempting to stop `o20_outbound_gw` on member `o21`
Jun 30 08:56:15 o21 CAAD[524944]: Stop of `o20_supervisor` on member `o21` succeeded.
Jun 30 08:56:15 o21 CAAD[524944]: `o20_supervisor` ran out of restarts on `o21`
Jun 30 08:56:15 o21 CAAD[524944]: `o20_supervisor` failed on `o21`, relocating.
Jun 30 08:56:15 o21 CAAD[524944]: Stop of `o20_spooler` on member `o21` succeeded.
Jun 30 08:56:15 o21 CAAD[524944]: `o20_spooler` ran out of restarts on `o21`
Jun 30 08:56:15 o21 CAAD[524944]: Stop of `o20_outbound_gw` on member `o21` succeeded.
Jun 30 08:56:15 o21 CAAD[524944]: `o20_outbound_gw` ran out of restarts on `o21`
Jun 30 08:56:15 o21 CAAD[524944]: `o20_spooler` failed on `o21`, relocating.
Jun 30 08:56:15 o21 CAAD[524944]: `o20_outbound_gw` failed on `o21`, relocating.
Jun 30 08:56:16 o21 CAAD[524944]: Attempting to start `o20_spooler` on member `o22`
Jun 30 08:56:16 o21 CAAD[524944]: Attempting to start `o20_outbound_gw` on member `o22`
Jun 30 08:56:16 o21 CAAD[524944]: Attempting to start `o20_supervisor` on member `o22`
Jun 30 08:57:19 o21 CAAD[524944]: Start of `o20_supervisor` on member `o22` succeeded.
Jun 30 08:57:19 o21 CAAD[524944]: Start of `o20_outbound_gw` on member `o22` succeeded.
Jun 30 08:57:19 o21 CAAD[524944]: Start of `o20_spooler` on member `o22` succeeded.
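
To see how long each resource stays up after a restart, the CAAD lines above can be parsed with a short script. This is only a sketch: the function names (parse_caad, seconds_until_offline) are made up for illustration, the year is assumed because syslog timestamps omit it, and the regular expression is fitted to the message wording quoted in this mail, not to any documented CAAD log format.

```python
import re
from datetime import datetime

# Fitted to the CAAD lines quoted in this mail, for example:
# Jun 30 08:55:13 o21 CAAD[524944]: Successfully restarted `o20_spooler` on `o21`
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+"
    r"CAAD\[\d+\]:\s+(?P<msg>.*)$"
)
NAME_RE = re.compile(r"`([^`]+)`")


def parse_caad(lines, year=2005):
    """Yield (timestamp, resource, message) for every CAAD line.

    The year is an assumption: syslog timestamps do not carry one.
    Month names are parsed with %b, so an English/C locale is assumed.
    """
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a CAAD line
        names = NAME_RE.findall(m.group("msg"))
        if not names:
            continue  # no back-quoted resource name in the message
        ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
        # In the quoted messages the first back-quoted name is the resource.
        yield ts, names[0], m.group("msg")


def seconds_until_offline(lines):
    """Map resource -> seconds from 'Successfully restarted' to the
    next 'went OFFLINE unexpectedly' message for that resource."""
    restarted = {}
    delays = {}
    for ts, resource, msg in parse_caad(lines):
        if msg.startswith("Successfully restarted"):
            restarted[resource] = ts
        elif "went OFFLINE unexpectedly" in msg and resource in restarted:
            delays[resource] = (ts - restarted.pop(resource)).total_seconds()
    return delays


sample = """\
Jun 30 08:55:13 o21 CAAD[524944]: Successfully restarted `o20_supervisor` on `o21`
Jun 30 08:55:13 o21 CAAD[524944]: Successfully restarted `o20_spooler` on `o21`
Jun 30 08:56:13 o21 CAAD[524944]: `o20_supervisor` on `o21` went OFFLINE unexpectedly
Jun 30 08:56:13 o21 CAAD[524944]: `o20_spooler` on `o21` went OFFLINE unexpectedly
""".splitlines()

for resource, delay in seconds_until_offline(sample).items():
    print(resource, delay)
```

For the four sample lines, taken from the log above, both resources go offline 60 seconds after the restart was reported. Whether that interval points to a timeout somewhere is not something the log alone can show.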

The o20_spooler... resources belong to the Advanced Printing System, which is
not really in use; only some users use it.

At login I got:

Last successful login for xyz: Mon Jun 20 10:23:47 MEST 2005 from ds
Last unsuccessful login for xyz: Mon May 30 10:55:05 MEST 2005 from lapyy

/etc/motd is not shown!

After rebooting member 3 (o23) at ~11:50 the login problem was gone, but I still
cannot start o20_spooler... (in the end I stopped it with: caa_stop o20_spooler...):

Jun 30 11:51:44 o22 CAAD[1049266]: Stop of `cluster_lockd` on member `o22` succeeded.
Jun 30 11:51:44 o22 CAAD[1049266]: Attempting to start `cluster_lockd` on member `o21`
Jun 30 11:51:44 o22 CAAD[1049266]: Start of `cluster_lockd` on member `o21` succeeded.
Jun 30 11:52:36 o22 mountd[1270012]: startup
Jun 30 11:52:36 o22 CAAD[1049266]: Resource cluster_lockd is already running on member o21
Jun 30 11:52:36 o22 statd[1268815]: startup
Jun 30 11:52:37 o22 statd[1268815]: rpc.statd: open current directory failed: No such file or directory
Jun 30 11:52:37 o22 lockd[1270666]: Can't create client handle to o22.xray.mpe.mpg.deSTATv1: RPC: Program not registered
Jun 30 11:52:42 o22 lockd[1270666]: rpc.lockd: Cannot contact status monitor!
Jun 30 12:10:07 o22 cluserver_statd[1283837]: startup
Jun 30 12:17:53 o22 CAAD[1049266]: `o20_spooler` on `o22` went OFFLINE unexpectedly
Jun 30 12:17:53 o22 CAAD[1049266]: Attempting to stop `o20_spooler` on member `o22`
Jun 30 12:17:53 o22 CAAD[1049266]: `o20_supervisor` on `o22` went OFFLINE unexpectedly
Jun 30 12:17:53 o22 CAAD[1049266]: `o20_outbound_gw` on `o22` went OFFLINE unexpectedly
Jun 30 12:17:53 o22 CAAD[1049266]: Attempting to stop `o20_outbound_gw` on member `o22`
Jun 30 12:17:53 o22 CAAD[1049266]: Attempting to stop `o20_supervisor` on member `o22`
Jun 30 12:17:55 o22 CAAD[1049266]: Stop of `o20_spooler` on member `o22` succeeded.
Jun 30 12:17:55 o22 CAAD[1049266]: Restarting `o20_spooler` on `o22`
Jun 30 12:17:55 o22 CAAD[1049266]: Attempting to start `o20_spooler` on member `o22`
Jun 30 12:17:55 o22 CAAD[1049266]: Stop of `o20_outbound_gw` on member `o22` succeeded.
Jun 30 12:17:55 o22 CAAD[1049266]: Restarting `o20_outbound_gw` on `o22`
Jun 30 12:17:55 o22 CAAD[1049266]: Attempting to start `o20_outbound_gw` on member `o22`
Jun 30 12:17:55 o22 CAAD[1049266]: Stop of `o20_supervisor` on member `o22` succeeded.
Jun 30 12:17:55 o22 CAAD[1049266]: Restarting `o20_supervisor` on `o22`
Jun 30 12:17:55 o22 CAAD[1049266]: Attempting to start `o20_supervisor` on member `o22`
Jun 30 12:18:57 o22 CAAD[1049266]: Start of `o20_spooler` on member `o22` succeeded.
Jun 30 12:18:57 o22 CAAD[1049266]: Successfully restarted `o20_spooler` on `o22`
Jun 30 12:18:57 o22 CAAD[1049266]: Start of `o20_outbound_gw` on member `o22` succeeded.
Jun 30 12:18:57 o22 CAAD[1049266]: Successfully restarted `o20_outbound_gw` on `o22`
Jun 30 12:18:57 o22 CAAD[1049266]: Start of `o20_supervisor` on member `o22` succeeded.
Jun 30 12:18:57 o22 CAAD[1049266]: Successfully restarted `o20_supervisor` on `o22`
Jun 30 12:19:36 o22 CAAD[1049266]: Attempting to stop `o20_spooler` on member `o22`
Jun 30 12:19:38 o22 CAAD[1049266]: Stop of `o20_spooler` on member `o22` succeeded.
Jun 30 12:19:45 o22 CAAD[1049266]: Attempting to stop `o20_supervisor` on member `o22`
Jun 30 12:19:47 o22 CAAD[1049266]: Stop of `o20_supervisor` on member `o22` succeeded.
Jun 30 12:19:52 o22 CAAD[1049266]: Attempting to stop `o20_outbound_gw` on member `o22`
Jun 30 12:19:54 o22 CAAD[1049266]: Stop of `o20_outbound_gw` on member `o22` succeeded
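
Because the NFS locking messages (mountd, statd, lockd) are interleaved with the CAAD events, it can help to separate the two when reading the log. The following is a minimal sketch under assumptions: the helper name non_caad_messages is hypothetical, and the pattern only covers the "host program[pid]: message" layout seen in the lines quoted above.

```python
import re
from collections import Counter

# Generic "Mon DD HH:MM:SS host program[pid]: message" layout, as quoted above.
SYSLOG_RE = re.compile(
    r"^\w{3}\s+\d+\s+\d\d:\d\d:\d\d\s+(?P<host>\S+)\s+"
    r"(?P<prog>[^\[\s:]+)(?:\[\d+\])?:\s+(?P<msg>.*)$"
)


def non_caad_messages(lines):
    """Return (host, program, message) for every line not logged by CAAD."""
    found = []
    for line in lines:
        m = SYSLOG_RE.match(line.strip())
        if m and m.group("prog") != "CAAD":
            found.append((m.group("host"), m.group("prog"), m.group("msg")))
    return found


sample = """\
Jun 30 11:52:36 o22 mountd[1270012]: startup
Jun 30 11:52:36 o22 CAAD[1049266]: Resource cluster_lockd is already running on member o21
Jun 30 11:52:36 o22 statd[1268815]: startup
Jun 30 11:52:42 o22 lockd[1270666]: rpc.lockd: Cannot contact status monitor!
""".splitlines()

messages = non_caad_messages(sample)
per_program = Counter(prog for _, prog, _ in messages)

for host, prog, msg in messages:
    print(f"{host} {prog}: {msg}")
```

Run over the full log, this leaves only the mountd, statd, lockd and cluserver_statd lines, which makes the locking-daemon errors easier to read in sequence.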

Before the reboot of o23, cluster_lockd (= /usr/sbin/rpc.lockd -c) was
running on o23, but caa_stat -t reported it on o21, o22...

Any suggestions?

Why does o20_spool... crash?

Is the NFS server misconfigured?

                        Thanks, H. Baumgartner

-- 
Harald Baumgartner Max-Planck-Inst. fuer Extraterrestrische Physik
                   Postfach 1312, D-85741, Garching bei Muenchen, Germany
                   Phone :(Country Code 49) 89 30000-3346 or -4346 
                   e-mail: hmb@mpe.mpg.de   (Fax: 49-89/30000-3569)
		   http://www.xray.mpe.mpg.de/~hmb/

