SUMMARY: KZPCC and Hardware Component Management

From: Richard Jackson (rjackson@portal.gmu.edu)
Date: Wed Aug 07 2002 - 15:19:59 EDT


Hello,

I received some feedback from friendly tru64-unix-managers list readers and
HP/Compaq/DEC Customer Support Center staff. Thank you.

Selden E Ball Jr <SEB@LNS62.LNS.CORNELL.EDU>
Raul Sossa S. <RSossa@datadec.co.cr>
HP CSC Staff

I have a few questions related to the KZPCC-CE (3-port PCI RAID)
controller and the 'new' Tru64 UNIX 5.x hardware component management.
The KZPCC-CE is installed in an ES45 running Tru64 UNIX 5.1A patch kit #2.

-------------------------------------------------------------------------------
QUESTION:
1. Is it possible to print the KZPCC-CE configuration (e.g., for disaster
recovery)? The KZPAC-CB/KZESC-BA/KZPSC-BA RAID Configuration Utility (RCU)
allowed me to 'print' the configuration to a text file on a floppy to be
printed later. The Compaq StorageWorks KZPCC-CE and KZPCC-AC User Guide,
dated Aug 2001, on page 3-1 states the use of SMOR is restricted to
Compaq AlphaServers with graphics console setting and it is not supported
under a serial console.

ANSWER:
Try using a laptop computer attached to the ES45 serial port while
using KEATERM. Another suggestion is to install the SWCC KZPCC Agent
for Tru64 UNIX on the ES45 and then install the corresponding client in
a Microsoft Windows NT 4.0 or Windows 2000 machine. You'll be able to
print this info and manage the controller. Use the SWCC software at
http://www.compaq.com/alphaserver/products/storage/kzpcc.html or on the
CD included with the controller kit, Compaq Ultra2 Backplane RAID Controller
DS-KZPCC.

I have not yet tried either suggestion.

-------------------------------------------------------------------------------
QUESTION:
2. Is it possible to save the KZPCC-CE configuration (e.g., for disaster
recovery)? The KZPAC-CB/KZESC-BA/KZPSC-BA RAID Configuration Utility (RCU)
allowed me to save the configuration to a floppy.

ANSWER:
I am told it is not possible.

-------------------------------------------------------------------------------
QUESTION:
3. If the KZPCC-CE fails and must be replaced or the configuration is lost,
how do I quickly restore the configuration? If the configuration must be
re-applied via SMOR, doesn't the SMOR 'Set System Config' initialize the
devices (i.e., the user data is lost)?

ANSWER:
Unfortunately, the replacement KZPCC-CE forces an initialize and the user
data must be restored from backup. Ouch!

-------------------------------------------------------------------------------
QUESTION:
4. How in the world is the KZPCC logical device defined in the SRM
console? That is, SMOR may report the RAID devices as ID 0, 1, and 2.
However, SRM may report dza526.0.0.2004.1, dza528.0.0.2004.1, and
dza532.0.0.2004.1. I understand if I have DZXabc, X is the controller
ID. How is the abc defined? For example, I had a RAID 1 (ID 0), JBOD
(ID 1), and RAID 0+1 (ID 2) devices. I purchased another disk drive
and converted the JBOD into RAID 1. SMOR reported the new RAID 1
device as ID 1 (HBA:0 Channel:0 Id:1 LUN:0) (this is good and what I
expected) but the SRM console and operating system treated the new RAID
device as ID/LUN 3. The SMOR ID appears to be ignored by the SRM and
Tru64 UNIX 5.1A.

ANSWER:
Why the SMOR ID appears to be ignored and how the SRM defines the device
name are a mystery.

-------------------------------------------------------------------------------
QUESTION:
5. Tru64 UNIX 5.1A pk #2 dsfmgr man page example 5 has
        /sbin/dsfmgr -R delete hwid 25
Shouldn't this be
        /sbin/dsfmgr -R hwid 25

ANSWER:
Yes, the dsfmgr man page is wrong. I have reported this issue to HP.
As a side note, http://www.tru64unix.compaq.com/docs/updates/V51A/TITLE.HTM,
Compaq Technical Update for Tru64 UNIX Version 5.1A, March 26, 2002: Replacing
SCSI Devices, has incorrect syntax, too.

-------------------------------------------------------------------------------
QUESTION:
6. Under Tru64 UNIX 4.0G or earlier, the DEC/Compaq/HP Field Service Engineers
would replace failed external SCSI tape drives (DLT) while the system was
running (no reboot, no system change). Under Tru64 UNIX 5.1A, must we now
do the following to retain the same device special file?
-------------
replace the tape drive
reboot
hwmgr -delete component -id XX
hwmgr -refresh component
dn_setup -init
dsfmgr -K
reboot
-------------
That is, if the same SCSI target is used, are the hardware component
gymnastics and reboots still necessary?

ANSWER:
The good news is a reboot may not be necessary. However, Hardware Component
Management gymnastics are necessary. The preferred method is probably what is
described in the Tru64 UNIX Version 5.1A System Administrator book, section
5.4.4.11 Replacing a Failed SCSI Device. However, the instructions are
geared for a failed disk drive. I was given these instructions:

. remove old broken device (e.g., tape0)
. install replacement device
. hwmgr -scan scsi (find the new device)
. hwmgr -show scsi (list the devices)
. dsfmgr -K (create device special files, e.g., tape1)
. dsfmgr -e tape1 tape0 (exchange device special files)
. hwmgr -delete scsi -did 54 (delete old tape0 did from hwmgr show)

NOTE: the 'hwmgr -delete' step, burning incense, praying to your deity, and
sacrificing farm animals are all optional.
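
The command steps above can be strung together in a small wrapper. This is
only a sketch, not something I have run on the ES45: the commands are
exactly the ones listed, but the run and replace_tape helper names and the
DRY_RUN, OLD_DSF, NEW_DSF, and OLD_DID variables are my own invention, and
the values tape0, tape1, and 54 are the example values from the list. It
defaults to a dry run that only prints what it would do.

```shell
#!/bin/sh
# Sketch of the tape replacement steps. Run it only AFTER the broken
# drive has been removed and the replacement installed.
# All variable and function names here are hypothetical helpers.
DRY_RUN=${DRY_RUN:-1}      # 1 = only print the commands (default)
OLD_DSF=${OLD_DSF:-tape0}  # device special file of the failed drive
NEW_DSF=${NEW_DSF:-tape1}  # device special file created for the new drive
OLD_DID=${OLD_DID:-54}     # did of the old drive (example value)

# Print the command in dry-run mode; otherwise run it and stop on failure.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

replace_tape() {
    run hwmgr -scan scsi                    # find the new device
    run hwmgr -show scsi                    # list the devices
    run dsfmgr -K                           # create device special files
    run dsfmgr -e "$NEW_DSF" "$OLD_DSF"     # exchange device special files
    run hwmgr -delete scsi -did "$OLD_DID"  # optional: delete the old did
}

replace_tape
```

Check the 'hwmgr -show scsi' output by hand for the real did and device
names before setting DRY_RUN=0; the defaults are placeholders.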

-------------------------------------------------------------------------------
QUESTION:
7. What value does hardware component management add that justifies the
added aggravation for the system administrator? Is the value the
ability to move a device from one target ID to another target ID and
continue to use the same device special file? If so, do system
administrators perform that task more often than replacing hardware?

ANSWER:
The benefit to non-cluster systems (i.e., standalone) is debatable.
Hardware Component Management benefits the cluster environment. Both
Context Dependent Symbolic Links (CDSLs) (e.g., /usr/sbin/cdslinvchk)
and hardware component management are forced upon non-cluster systems
(some may find /cluster/members/member0/tmp annoying, for example). On
the bright side, it is a job security enhancement.

-------------------------------------------------------------------------------

-- 
Regards,						   /~\ The ASCII
Richard Jackson						   \ / Ribbon Campaign
Computer Systems Engineer,				    X  Against HTML
Information Technology Unit, Technology Systems Division   / \ Email! 
Enterprise Servers and Operations Department
George Mason University, Fairfax, Virginia

