Monitoring disk and system health

From: Eiler, James A. (James.Eiler@alcoa.com)
Date: Tue Jan 24 2006 - 12:02:38 EST


Managers,

I've got a DS25 running Tru64 UNIX V5.1B, PK 4. It has a Smart Array
5300A RAID controller (KZPDC-BE). The system is currently used for
software development, but will ultimately run in a lights-out
environment.

   Question 1: What's the best way to monitor the health of the disk
drives?

   Question 2: What's the best way to monitor the overall health of the
system?

On some of our older systems we're using swxcrmon (for the KZPAC
controller) to send an email when it detects problems. But I haven't
yet found a similar utility for the KZPDC.

I've looked at the documentation for WEBES/System Event Analyzer (SEA),
but SEA doesn't appear to support the KZPDC.

Currently, when the DS25 system boots, a message goes streaming by on
the console that says something like:

   1720 Slot 2 Drive Array - S.M.A.R.T. Hard Drives Detect Imminent Failure:
   SCSI Port 1: SCSI ID 1
   Do not replace drive unless all other drives in the array are on-line!
   Backup data before replacing drives if RAID 0 being used.

Sure would be nice to get automatically notified of such issues.

THANKS!

Jim
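
One possible approach to the notification wished for above, offered only
as a sketch: if the console output is captured to a text file (for
example by a console server, which is an assumption and not something
the post describes), a small script could scan that text for the
S.M.A.R.T. warning and prepare an email. The pattern below is modeled on
the message quoted in the post, which was itself given as "something
like" the real text, so the exact wording may need adjusting. The
function names, host name, and addresses are hypothetical.

```python
import re
from email.message import EmailMessage

# Pattern modeled on the console message quoted in the post. The real
# wording may differ, so this regular expression is an assumption.
# Whitespace is matched loosely so a line-wrapped message still matches.
ALERT_RE = re.compile(
    r"Slot\s+(?P<slot>\d+)\s+Drive\s+Array\s+-\s+"
    r"S\.M\.A\.R\.T\.\s+Hard\s+Drives?\s+Detect\s+Imminent\s+Failure:\s*"
    r"SCSI\s+Port\s+(?P<port>\d+):\s+SCSI\s+ID\s+(?P<scsi_id>\d+)",
    re.IGNORECASE,
)


def find_smart_alerts(console_text):
    """Return (slot, port, scsi_id) tuples for each warning found.

    Only the first "SCSI Port ...: SCSI ID ..." entry after each warning
    header is extracted; a message listing several drives would need a
    more elaborate parser.
    """
    return [
        (int(m.group("slot")), int(m.group("port")), int(m.group("scsi_id")))
        for m in ALERT_RE.finditer(console_text)
    ]


def build_alert_email(host, alerts, sender, recipient):
    """Build (but do not send) an email summarizing the warnings."""
    msg = EmailMessage()
    msg["Subject"] = f"{host}: S.M.A.R.T. imminent drive failure reported"
    msg["From"] = sender
    msg["To"] = recipient
    lines = [
        f"Slot {slot}, SCSI port {port}, SCSI ID {scsi_id}"
        for slot, port, scsi_id in alerts
    ]
    msg.set_content("\n".join(lines))
    return msg
```

To use it, the captured console text would be read from wherever it is
stored and passed to find_smart_alerts(); if the result is non-empty,
the message from build_alert_email() could be handed to a local mail
server with smtplib.SMTP("localhost").send_message(msg). Running the
check periodically (for example from cron) is one option, assuming a
Python 3 interpreter is available on the machine that holds the
captured output.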


