SUMMARY : Problems with 5.1a cluster create/first boot

From: Browett, Darren (dbrowett@coquitlam.ca)
Date: Mon Nov 04 2002 - 12:57:40 EST


First, sorry for the late summary, but it has been a long 3 weeks.

Originally I was having a problem with my 5.1a cluster booting for the
first time. I posted the question (included below) and received replies
from the following:

David Dewolfe (who discussed the issue with me via phone)
Raul Sossa
Jason Orendorf
Paul Thompson
Robert Collins

The replies mostly concerned firmware revs and patch kits.

The problem turned out to be a bad KGPSA, with a very low firmware rev.

As a test one day, I decided to leave the 5.1a system in the state
described below. After a time it caused the SAN/switch/Alpha/Tru64 4.0f
cluster to hang, and the only way out of it was to reboot the SAN, the
switch, and the Alphas.

The lesson for me in this whole mess is to double- and triple-check all
firmware rev levels.

Darren (who now has a two-member 5.1a cluster up and running)

------------------------------------------------------------------------

Original Question

Hi Managers

I am in the process of building a new cluster. So far I have successfully:

1. Configured the HSG for cluster root, usr, var and boot disks
2. Installed all software AND patches on the HSG80
3. Booted from the "install" disk on the HSG80
4. Run clu_create to create the cluster

But when the system boots for the first time, I see disk activity on the
HSG, then it halts at the lines:

ic2 Server Management Hardware Present
DRD Cancelling register against 50 due to expired time - retrying operation

cam_logger :SCSI event packet
cam_logger: hardware_id=50 bus 2 target 1 lun 3
cdisk_handle = pr_ccb

request has been cancelled due to errors
hard error detected
Hardware id=50
DEC HSG80 V86F
active CCB at time of error

I am not sure where to look for a hardware error, as the HSG appears to
be running fine, plus I was able to boot from the install disk in the
first place.
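
When narrowing down where to look, it can help to pull the hardware id and
bus/target/lun out of the cam_logger line so the failing path can be
compared against the adapters and units you expect to see. Below is a
minimal Python sketch that assumes only the line format shown in the
excerpt above. The function name parse_cam_address and the regular
expression are illustrative, not part of any Tru64 tool.

```python
import re

# Matches the device-address line exactly as it appears in the excerpt, e.g.
#   cam_logger: hardware_id=50 bus 2 target 1 lun 3
# Assumption: other cam_logger output uses the same field order and spacing.
ADDR_RE = re.compile(
    r"hardware_id=(?P<hwid>\d+)\s+bus\s+(?P<bus>\d+)"
    r"\s+target\s+(?P<target>\d+)\s+lun\s+(?P<lun>\d+)"
)


def parse_cam_address(text):
    """Return one dict (hwid, bus, target, lun as ints) per matching line."""
    return [
        {key: int(value) for key, value in match.groupdict().items()}
        for match in ADDR_RE.finditer(text)
    ]


if __name__ == "__main__":
    sample = (
        "cam_logger :SCSI event packet\n"
        "cam_logger: hardware_id=50 bus 2 target 1 lun 3\n"
        "cdisk_handle = pr_ccb\n"
    )
    for addr in parse_cam_address(sample):
        print(addr)
```

Lines that do not carry an address are simply skipped, so the whole console
capture can be fed in as one string.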

Will Summarize

------------------------------------------------------------------------
Darren Browett P.Eng
Data Administrator
Information and Communication Technology
City of Coquitlam
P:(604)927 - 3614
E:dbrowett@city.coquitlam.bc.ca

This message was transmitted using 100% recycled electrons
------------------------------------------------------------------------


