Summary: Continuing KZPSC-BA RAID Panic

From: Ron Bramblett (bramblet@fuller.com)
Date: Tue Aug 24 2004 - 11:34:38 EDT


Original Question

I am still seeing the message below on reboot or in
/var/adm/syslog.dated/DATE/kern.log.

The system rebooted automatically on Sunday morning. It was fine when I left
on Friday night.

Aug 22 03:03:13 alfred vmunix: xcr0 at pci0 slot 8
Aug 22 03:03:13 alfred vmunix: re0 at xcr0 unit 0 (unit status = CRITICAL, raid level = 5)
Aug 22 03:03:13 alfred vmunix:  (WRITE BACK cache operation SUPPORTED if battery backup enabled)
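
For anyone who wants to watch for this automatically, here is a minimal Python sketch that scans kern.log text for RAID units whose status is not ONLINE. The pattern is modeled only on the lines quoted above, so other driver or firmware versions may print something different; the function name and the idea that ONLINE is the only healthy status are my assumptions.

```python
import re

# Pattern modeled on the kern.log lines quoted above (assumption: other
# driver or firmware versions may format this line differently).
UNIT_RE = re.compile(
    r"\b(?P<unit>re\d+) at (?P<ctrl>xcr\d+) unit \d+ "
    r"\(unit status = (?P<status>[A-Z]+),\s*raid level = (?P<level>\d+)\)"
)


def find_unhealthy_units(log_text, healthy=("ONLINE",)):
    """Return (unit, controller, status, raid_level) for units not in `healthy`."""
    # Long log lines often get wrapped (as in this mail), so collapse
    # all whitespace before matching.
    flat = " ".join(log_text.split())
    return [
        (m["unit"], m["ctrl"], m["status"], int(m["level"]))
        for m in UNIT_RE.finditer(flat)
        if m["status"] not in healthy
    ]


sample = """Aug 22 03:03:13 alfred vmunix: xcr0 at pci0 slot 8
Aug 22 03:03:13 alfred vmunix: re0 at xcr0 unit 0 (unit status = CRITICAL,
raid level = 5)"""

print(find_unhealthy_units(sample))  # [('re0', 'xcr0', 'CRITICAL', 5)]
```

To use it on a real system, read the kern.log file into a string and pass it in.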

How do I tell what is causing this problem? I have reseated the controller,
restored the data, and failed and rebuilt a drive in the RAID set.

Answers and fixes

Kris Smith -- asked about the age of the battery backup, if it has one.
No battery backup is installed. Thanks anyway.

Warren Strum -- Same on battery backup.

Martin R Andersen -- Battery backup

Blake Brehl -- Check to see if a drive failed
I'm not absolutely sure about this, but I think I've seen CRITICAL when
one drive in a mirror set pair had failed, and the set mounted but was only
writing out to the one drive.  Maybe the 4 drive RAID set is operating on
only 3 good drives?

Try show dev at the >>> SRM prompt; it may show something like
"compromised" on the particular RAID device, which may be what drives the
CRITICAL message instead of ONLINE.

Also check in the RAID utility.  Run >>>ARC, choose Run a Maintenance Utility,
enter A:\RA200RCU (you need the floppy for this), then select View
Configuration and see whether all individual drives are Optimal.

Here is my 2 cents. (Ron Bramblett) --
I agree with Blake, except that I ran sys_check on this system yesterday. In
the swxcr manager properties it tells you which drive failed. You have to
read it carefully, but it is there. A failed drive is why I was running
degraded (actually I still am -- hard drive #0 is now failing).

swxcrmon is supposed to be pretty good, but I haven't gotten any info on it
yet. For now I am keeping a closer watch on my logs, running sys_check, and
looking at the warnings more often until I get back to an online status. I
have asked that we replace the controller card (on our own -- T/S could not
find the one I have in my machine -- they even tried to bring me an EISA card
after I told them I have a PCI card). I have also asked that we buy a battery
backup, but I don't have any answers yet.

Robert Collins (HP Escalation Team) --
Here is how to make a diskette for swxcrmgr from the CD.
You can use this procedure to make a floppy from the Firmware Update CD:

Copy the appropriate files from CD-ROM to Floppy diskette on a PC or Alpha
running Windows NT as described below.
Insert Firmware CD-ROM in drive
Insert formatted floppy diskette in drive A:
Select My Computer
Select CD-ROM drive in one window
Select Floppy drive in the other window
Select the directory \utility\swxcrmgr\ in the CD window
To create the RCU floppy, copy the following files from the
\utility\swxcrmgr\ directory to the floppy:
RA200RCU.EXE
RA200SRL.EXE
RCUREL.TXT
00readme.1st
If you have the V3.5 or earlier CD, the directory structure
is \cdrom\utility\swxcrmgr\
Alpha Firmware CDs V3.8 and earlier use SWXCRMGR.EXE and SRLMGR.EXE.
 
Close the floppy and CD windows.
You now have a Standalone RAID Configuration Utility [RCU] floppy to run the
utility.
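
If you would rather script the copy than drag files between windows, the steps above can be sketched in Python roughly as follows. This is only an illustration: the function name and the drive letters in the example call are assumptions, and the file list is the one given above (older CDs use different file names, as noted).

```python
import shutil
from pathlib import Path

# File names taken from the procedure above; CDs V3.8 and earlier use
# different names, so adjust the list for those.
RCU_FILES = ["RA200RCU.EXE", "RA200SRL.EXE", "RCUREL.TXT", "00readme.1st"]


def copy_rcu_files(cd_dir, floppy_dir, names=RCU_FILES):
    """Copy the RCU files from cd_dir to floppy_dir.

    Returns (copied, missing): the names that were copied and the names
    that were not found in cd_dir.
    """
    src, dst = Path(cd_dir), Path(floppy_dir)
    copied, missing = [], []
    for name in names:
        if (src / name).is_file():
            shutil.copy2(src / name, dst / name)
            copied.append(name)
        else:
            missing.append(name)
    return copied, missing


# Example call; D: (CD-ROM) and A: (floppy) are assumed drive letters.
# copied, missing = copy_rcu_files(r"D:\utility\swxcrmgr", "A:\\")
```

Checking the returned missing list is a quick way to notice that you have an older CD with the other file names.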

Here is how you run it:

Alpha Systems:

Invoke the ARC console menu (refer to your Alpha system information for the
correct method), usually by typing ARC at the P00>>> prompt

Insert the StorageWorks RAID Array 230/Plus Software RAID Configuration
Utility for Alpha Systems diskette into your floppy drive, or insert the Alpha
System Firmware Update CD.
Select the Run a program option from the Alpha Boot menu. The system displays
the Program to run: prompt.
After Program to run: type one of the following:
A:RA200RCU  - floppy
CD:\utility\swxcrmgr\ra200rcu.exe  - V3.9 CD and later
CD:\utility\swxcrmgr\swxcrmgr  - V3.6 thru V3.8 CD (Not used for KZPAC)
CD:\cdrom\utility\swxcrmgr\swxcrmgr.exe  - V3.5 CD and earlier (Not used for KZPAC)
Press Enter. The RCU checks the drives, then displays the Main Menu.
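
The CD entries above amount to a lookup by firmware CD version. Here is a small Python sketch of that lookup, just to make the choice explicit. The function name is made up, and it reads the three entries as V3.9 and later, V3.6 through V3.8, and V3.5 and earlier, which is my interpretation of the list.

```python
def rcu_program_path(cd_version):
    """Return the 'Program to run:' entry for a firmware CD.

    cd_version is a (major, minor) tuple such as (3, 8).
    Per the notes above, the two older entries are not used for KZPAC.
    """
    if cd_version >= (3, 9):
        return r"CD:\utility\swxcrmgr\ra200rcu.exe"
    if cd_version >= (3, 6):
        return r"CD:\utility\swxcrmgr\swxcrmgr"
    return r"CD:\cdrom\utility\swxcrmgr\swxcrmgr.exe"


for version in [(3, 5), (3, 8), (3, 9)]:
    print(version, rcu_program_path(version))
```

When booting from the floppy instead, the entry is simply A:RA200RCU regardless of CD version.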

Thanks, everyone, for helping. I am still fighting this but think I now know
how to fix it. It is starting to ruin my reputation, but when a drive fails
(or even 3 drives fail), what else can we do? All 3 drives were bought at the
same time, 3 years ago.

-- 
Ron Bramblett
Fuller Brush Company
Systems Administrator
Almost 100 years strong and still going ...

