Problems booting 1st member node of cluster after replacing a failed disk.

From: Hill, NM (Nick) (N.M.Hill@rl.ac.uk)
Date: Mon Jun 16 2003 - 13:55:03 EDT


Hi,

I have a cluster question for people (it is an AlphaServer SC, but I believe
this is a basic cluster question).

Background:

6 node ES40 cluster, QSW cluster interconnect, first two nodes have fibre to
HSG80 where the cluster disks are.

On the first node in the cluster, a disk in the ES40 internal bay failed.
After replacing the disk I ran hwmgr -scan scsi to pick up the new drive so I
could swap the device name with the failed one, but hwmgr -show scsi failed
to show the new drive. While I was thinking about this and looking at the
devices on the cluster, the node hung.

After I pushed the reset button on the node, a show device at the console
showed everything I would expect to see, including the replacement disk.
On booting the node, the console shows all the normal boot messages, including
the nodes being added to the cluster, quorum being established and activity
resuming. At the point where it would normally mount the cluster root I get
the message "waiting for cluster mount to complete", after which nothing
else happens. The other nodes in the cluster seem to be functioning OK, with
disk activity on the HSG80 disks now being served by the second node.

I have tried booting off the alternate boot disk and get exactly the same
behaviour. If I boot off the non-cluster Tru64 disk used during installation,
it boots OK to single user, and an hwmgr -show scsi shows all the local disks
and those I should be able to see on the HSG80.

Any ideas anyone?

One thought I had was whether something has happened with the persistent
reservations on the HSG80. All the units on the HSG show as having a
persistent reservation. I had a problem with this feature during the install
process, where the HSG80-served disks all seemed to disappear at some point
due to persistent reservations. I did think about running cleanPR to clear
them but as yet haven't done so.

???

Nick Hill
Rutherford Appleton Labs
e-Science Centre
n.m.hill@rl.ac.uk
01235 445423



