SUMMARY: System panic

From: Jim Fitzmaurice (jpfitz@fnal.gov)
Date: Wed Sep 18 2002 - 16:31:45 EDT


According to the HP/Compaq software and hardware engineers (I talked with
both, before they decided what the problem was), I have a Memory Channel
card on this system that is starting to go bad, and should be replaced.
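
For anyone comparing a panic like this across cluster members, the hex fields in the console messages quoted below can be pulled out with a few lines of Python. This is only a minimal sketch: it assumes each message has been rejoined onto a single line (the mail wrapped them), the function names parse_hex_fields and set_bits are made up for illustration, and none of it comes from HP/Compaq. The messages do not say how bit positions in nodes_to_crash map to cluster member IDs, so the sketch only reports which bits are set.

```python
import re

# Matches "name = 0x..." pairs as they appear in the vmunix messages.
FIELD_RE = re.compile(r"(\w+)\s*=\s*(0x[0-9a-fA-F]+)")


def parse_hex_fields(message):
    """Return a dict mapping each field name to its value as an integer."""
    return {name: int(value, 16) for name, value in FIELD_RE.findall(message)}


def set_bits(mask):
    """Return the positions of the bits set in mask, lowest bit first."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


# One message from the panic, rejoined onto a single line.
line = ("rm_crash_node_mask: caller = 0xfffffc00006e14d0, "
        "nodes_to_crash = 0x10, time = 0x479183d808cb4")

fields = parse_hex_fields(line)
# Prints the mask and the bit positions set in it: 0x10 [4]
print(hex(fields["nodes_to_crash"]), set_bits(fields["nodes_to_crash"]))
```

The same parse_hex_fields call works on the rmerror_int and mcerr lines, which makes it easy to see whether error_type or mcerr differ between occurrences.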

----- Original Question -----

> This is a 4100 running Tru64 v5.1 (PK-5), part of a 3-member cluster
> running TruCluster v5.1. The multiple security patch,
> T64V51B19-C0136901-15143-ES-20020817, was rolled in early yesterday
> morning without significant problems. (I always have a minor problem
> switching because clu_upgrade is not Kerberos friendly, and we run
> Kerberos.) This morning the system experienced the following
> error/panic and rebooted:
>
> Sep 12 07:57:30 d0ola vmunix: rmerror_int: failover: mchan0 error_type = 0xe0000004 error_count = 0x1 time = 0x479183d808cb4
> Sep 12 07:57:30 d0ola vmunix: mcerr = 0x12020008 lcsr = 0xc07b mcport = 0x16440000
> Sep 12 07:57:30 d0ola vmunix: rm_crash_node_mask: caller = 0xfffffc00006e14d0, nodes_to_crash = 0x10, time = 0x479183d808cb4
> Sep 12 07:57:30 d0ola vmunix: panic (cpu 0): rm_lock_global_error: no good rail or can't get locks
> Sep 12 07:57:30 d0ola vmunix: rmerror_int: dismissed because of panic
>
> The strange thing about this is that the cluster is the NFS/NIS server
> for our network, and at the exact same time this system panicked, two
> Linux-based NFS/NIS clients locked up. They had to be hard-booted to get
> them back up; one initially had problems mounting NFS drives, and the
> other came up with its time skewed.
>
> I haven't seen this error before. Has anyone else? And how could it
> affect clients of a 3-member cluster where two of the members are just
> fine?
>
> James Fitzmaurice
> D0 Online Systems Manager
> Fermi National Accelerator Laboratory
> (630) 840-4011
> jpfitz@fnal.gov
>
> UNIX is very user friendly, It's just very particular about who it makes
> friends with.
>
>


