[HPADM] RE: Summary EMS errors

From: Rasheed Tamton (Contractor) (Rasheedt@stc.com.sa)
Date: Sat Sep 07 2002 - 06:58:00 EDT


Please find below the output of the resdata command as asked by Bill.

Thanks
Rasheed.
--------------------------------------
CURRENT MONITOR DATA:

Event Time..........: Sat Sep 7 04:34:24 2002
Severity............: SERIOUS
Monitor.............: ha_disk_array
Event #.............: 100271
System..............: nce_5rs

Summary:
     Storage array controller at hardware path 10/0.1.0 : Device connectivity
     or hardware failure

Description of Error:

     The device aborted the command. The initiator may be able to recover by
     retrying the command.

Probable Cause / Recommended Action:

     The device may have been powered off and may be being powered on.

     Alternatively, one or both of the terminators on the SCSI bus may be
     missing. Install the terminators in their proper locations at the ends of
     the SCSI bus.

     Alternatively, the SCSI cable may have become detached from the device.
     Re-attach the cable.

     Alternatively, the SCSI cable may have failed. Replace it.

     Alternatively, the device may be in a state where it could not process
     this, or any, request. Cycle power to the device.

     Alternatively, there could be more than one device having the same address
     on the SCSI bus. Make all the addresses on the SCSI bus unique.

     Alternatively, the total length of all cable segments on the SCSI bus
     exceeds 25 meters. Replace one or more cable segments until the total
     length is less than this value.

     Alternatively, if all of the above fail to correct the problem, the device
     has experienced a hardware failure. Contact your HP support representative
     to have the device checked.

     Alternatively, if messages corresponding to this condition appear in the
     log for more than one device on the SCSI bus, the device adapter may be in
     a state from which it cannot extract itself. Perform a system shutdown,
     cycle power to the computer and wait for it to reboot.

     If, after reboot, messages corresponding to this condition continue to
     appear in the log for this SCSI bus, contact your HP support
     representative to have the adapter checked.

Additional Event Data:
     System IP Address...: xxx.xx.xx.xx
     Event Id............: 0x3d7957a000000004
     Monitor Version.....: B.01.00
     Event Class.........: I/O
     Client Configuration File...........:
     /var/stm/config/tools/monitor/default_ha_disk_array.clcfg
     Client Configuration File Version...: A.01.00
          Qualification criteria met.
               Number of events..: 1
     Associated OS error log entry id(s):
          0x3d79579b00000003
     Additional System Data:
          System Model Number.............: 9000/800
          OS Version......................: B.10.20
          System Serial Number............: unavailable
          System Software ID..............: 1825387421
          EMS Version.....................: A.03.20
          STM Version.....................: A.29.00
     Latest information on this event:
           http://docs.hp.com/hpux/content/hardware/ems/scsi.htm#100271

v-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v

Component Data:
     Physical Device Path.......: 10/0.1
     Inquiry Vendor ID..........: DGC
     Inquiry Product ID.........: C2300WDR5
     Serial Number..............: 3916a78227
     Firmware Version...........: HP02
     Device Class...............: 13

Product/Device Identification Information:

     Logger ID.........: sdisk
     Product Identifier: C2300 Array
     Product Qualifier.: DGCC2300WDR5
     SCSI Target ID....: 0x01
     SCSI LUN..........: 0x00

I/O Log Event Data:

     Driver Status Code..................: 0x00000005
     Length of Logged Hardware Status....: 22 bytes.
     Offset to Logged Manager Information: 24 bytes.
     Length of Logged Manager Information: 44 bytes.

Hardware Status:

     Raw H/W Status:
          0x0000: 00 00 00 02 70 00 0B 00 00 00 00 0A 00 00 00 00
          0x0010: 00 00 00 00 00 00

     SCSI Status...: CHECK CONDITION (0x02)
          Indicates that a contingent allegiance condition has occurred. Any
          error, exception, or abnormal condition that causes sense data to be
          set will produce the CHECK CONDITION status.
     
SCSI Sense Data:

     Undecoded Sense Data:
          0x0000: 70 00 0B 00 00 00 00 0A 00 00 00 00 00 00 00 00
          0x0010: 00 00
     
     SCSI Sense Data Fields:
          Error Code : 0x70
          Segment Number : 0x00
          Bit Fields:
               Filemark : 0
               End-of-Medium : 0
               Incorrect Length Indicator : 0
          Sense Key : 0x0B
          Information Field Valid : FALSE
          Information Field : 0x00000000
          Additional Sense Length : 10
          Command Specific : 0x00000000
          Additional Sense Code : 0x00
          Additional Sense Qualifier : 0x00
          Field Replaceable Unit : 0x00
          Sense Key Specific Data Valid : FALSE
          Sense Key Specific Data : 0x00 0x00 0x00
                       
          Sense Key 0x0B, ABORTED COMMAND, indicates that the device aborted
          the command. The initiator may have been able to recover by trying
          the command again.
                       
          The combination of Additional Sense Code and Sense Qualifier (0x0000)
          indicates: No additional sense information.
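
For anyone who wants to check the decoded fields against the raw bytes, here is a minimal Python sketch that unpacks the 18 undecoded sense bytes shown above. It assumes the standard SCSI fixed-format sense layout (error code 0x70); the function name and the dictionary keys are made up for illustration and are not part of resdata or any HP tool.

```python
# Sketch only: decode fixed-format SCSI sense data (error code 0x70/0x71).
# Names are illustrative; the byte layout follows the fixed sense format.
SENSE_KEYS = {
    0x0: "NO SENSE", 0x1: "RECOVERED ERROR", 0x2: "NOT READY",
    0x3: "MEDIUM ERROR", 0x4: "HARDWARE ERROR", 0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION", 0x7: "DATA PROTECT", 0x8: "BLANK CHECK",
    0xB: "ABORTED COMMAND", 0xD: "VOLUME OVERFLOW", 0xE: "MISCOMPARE",
}

def decode_fixed_sense(raw):
    if len(raw) < 14:
        raise ValueError("need at least 14 bytes of sense data")
    key = raw[2] & 0x0F
    return {
        "information_valid": bool(raw[0] & 0x80),
        "error_code": raw[0] & 0x7F,
        "segment_number": raw[1],
        "filemark": (raw[2] >> 7) & 1,
        "end_of_medium": (raw[2] >> 6) & 1,
        "incorrect_length": (raw[2] >> 5) & 1,
        "sense_key": key,
        "sense_key_name": SENSE_KEYS.get(key, "UNKNOWN"),
        "information": int.from_bytes(raw[3:7], "big"),
        "additional_sense_length": raw[7],
        "command_specific": int.from_bytes(raw[8:12], "big"),
        "asc": raw[12],
        "ascq": raw[13],
    }

# The "Undecoded Sense Data" bytes from the report above.
sense = bytes.fromhex("70 00 0B 00 00 00 00 0A 00 00 00 00 00 00 00 00 00 00")
fields = decode_fixed_sense(sense)
print(fields["sense_key_name"], hex(fields["asc"]), hex(fields["ascq"]))
```

With ASC and ASCQ both zero, the array gave no further reason for the abort, which is why the report can only list generic bus-level causes.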

SCSI Command Data Block:

     Command Data Block Contents:
          0x0000: 28 00 00 00 01 20 00 00 04 00
     
     Command Data Block Fields (10-byte fmt):
          Command Operation Code...(0x28)..: READ
          Logical Unit Number..............: 0
          DPO Bit..........................: 0
          FUA Bit..........................: 0
          Relative Address Bit.............: 0
          Logical Block Address............: 288 (0x00000120)
          Transfer Length..................: 4 (0x0004)
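
The same kind of check works for the command block. The sketch below decodes the 10 bytes above as a READ(10) command, assuming the SCSI-2 layout in which the top three bits of byte 1 carry the LUN; `decode_read10` is a hypothetical helper name, not something resdata provides.

```python
# Sketch only: decode a 10-byte READ (opcode 0x28) command block,
# assuming the SCSI-2 field layout.
def decode_read10(cdb):
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ (0x28) command block")
    return {
        "opcode": cdb[0],
        "lun": cdb[1] >> 5,            # bits 7-5 of byte 1
        "dpo": (cdb[1] >> 4) & 1,      # disable page out
        "fua": (cdb[1] >> 3) & 1,      # force unit access
        "reladr": cdb[1] & 1,          # relative address bit
        "lba": int.from_bytes(cdb[2:6], "big"),
        "transfer_length": int.from_bytes(cdb[7:9], "big"),
    }

# The "Command Data Block Contents" bytes from the report above.
cdb = bytes.fromhex("28 00 00 00 01 20 00 00 04 00")
read10 = decode_read10(cdb)
print(read10["lba"], read10["transfer_length"])
```

So the aborted command was a small read (4 blocks starting at block 288) on LUN 0, nothing unusual in itself.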

Manager-Specific Information:

     Raw Manager Data:
          0x0000: 00 00 08 00 00 00 00 00 00 3B 78 38 00 00 00 00
          0x0010: 13 36 08 00 13 36 09 00 13 DF D0 00 01 00 12 7B
          0x0020: 00 0A 28 00 00 00 01 20 00 00 04 00

     Manager Specific Data Fields:
          Data Residue...........: 0x00000800
          Sense Status...........: 0x00000000
          Request ID.............: 0x003B7838
          Additional I/O Status..: 0x00000000
          BUSP Struct Pointer....: 0x13360800
          Target Struct Pointer..: 0x13360900
          LUN Struct Pointer.....: 0x13DFD000
          Target ID..............: 0x01
          LUN ID.................: 0x00
          Sense Data Length......: 0x12
          Tag....................: 0x7B
          Retry Count............: 0x00
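
The raw manager data can be lined up with the decoded fields in the same way. The layout used below is only inferred from this one dump (seven big-endian 32-bit words, then single bytes, then the command block); it is not taken from HP documentation, and treating byte 0x21 as a command-block length is a guess based on its value of 0x0A followed by the 10 command bytes.

```python
import struct

# Assumed layout, inferred from this single dump and NOT from HP docs:
# seven big-endian 32-bit words, six single bytes, then the command block.
HEADER = struct.Struct(">7I6B")
NAMES = (
    "data_residue", "sense_status", "request_id", "additional_io_status",
    "busp_ptr", "target_ptr", "lun_ptr",
    "target_id", "lun_id", "sense_data_length", "tag", "retry_count",
    "cdb_length",  # guess: byte 0x21 is not named in the resdata output
)

def parse_manager_data(raw):
    fields = dict(zip(NAMES, HEADER.unpack_from(raw, 0)))
    start = HEADER.size
    fields["cdb"] = raw[start:start + fields["cdb_length"]]
    return fields

# The "Raw Manager Data" bytes from the report above.
raw = bytes.fromhex(
    "00 00 08 00 00 00 00 00 00 3B 78 38 00 00 00 00"
    "13 36 08 00 13 36 09 00 13 DF D0 00 01 00 12 7B"
    "00 0A 28 00 00 00 01 20 00 00 04 00"
)
info = parse_manager_data(raw)
print(hex(info["data_residue"]), info["retry_count"], info["cdb"].hex())
```

Under that assumed layout the trailing bytes are the same READ command as in the command data block section, and the retry count of 0 matches the decoded field.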

-----Original Message-----
From: Rasheed Tamton (Contractor)
Sent: Wednesday, September 04, 2002 9:25 AM
To: 'hpux-admin@dutchworks.nl'
Subject: Summary [HPADM] EMS errors

Thanks to Bill Hassell & Jim McDonald (please see their comments below). I
already sent HP the output of the resdata command a few weeks ago (the system
is under their support), and they have asked me again and again to run the
same command and send them the output. I have done so three or four times
without getting any concrete response, and the message keeps recurring.

As it is a disk array holding a lot of data, we are worried that something
might go wrong. That is why I approached the list to find out whether I can
ignore these messages or not.

Thanks again,
Rasheed.

--
Can't tell a thing. These messages simply tell
  you to run the resdata program as stated in
  the message. It will have the extensive details
  needed to decode the message.
  You'll need to copy the command from the message
  onto the command line and see what it reports.
--
Try doing what it says to do:
Run
/opt/resmon/bin/resdata -R 232194054
-r/storage/events/disk_arrays/High_Availability/10_0.1 -n 146210828 -a
and run
/opt/resmon/bin/resdata -R 71499778
-r/storage/events/tapes/SCSI_tape/56_52.0.0 -n 71499820 -a
This will print a more informative diagnosis; it should include a
reference to an HP error-code web page for more info.
If the boxes are under HP support, take it up with them after running the
commands.
---
My original question:
Hi Admins,
I sometimes get the messages below in my syslog.log file. Can anyone
please advise me whether these are real error messages or not? Is there
anything proactive I have to do from my end? Both are HP-UX 10.20
systems, the first one with Informix and the second with Oracle.
Aug 31 04:34:35 nce_5rs EMS [2231]: ------ EMS Event Notification ------
Value: "SERIOUS (4)" for Resource:
"/storage/events/disk_arrays/High_Availability/10_0.1"     (Threshold:  >= " 3")
Execute the following command to obtain event details:
/opt/resmon/bin/resdata -R 232194054 -r
/storage/events/disk_arrays/High_Availability/10_0.1 -n 146210828 -a
Aug 31 02:10:31 riy10p01 EMS [1091]: ------ EMS Event Notification ------
Value: "CRITICAL (5)" for Resource:
"/storage/events/tapes/SCSI_tape/56_52.0.0"     (Threshold:  >= " 3")
Execute the following command to obtain event details:
/opt/resmon/bin/resdata -R 71499778 -r
/storage/events/tapes/SCSI_tape/56_52.0.0 -n 71499820 -a
Thanks in advance,
Rasheed.
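
Since the advice was to run the resdata command embedded in each notification, here is a rough Python sketch that pulls the severity, resource, and command arguments out of syslog text like the messages above. The regular expression is written against these two sample messages only and may need adjusting for other EMS notification layouts; `find_events` is a made-up helper name, and the sketch only parses the text, it does not run resdata.

```python
import re
import shlex

# Pattern written against the sample notifications in this thread only.
NOTIFICATION = re.compile(
    r'Value:\s*"(?P<severity>[A-Z]+) \((?P<level>\d+)\)"\s+for Resource:\s*'
    r'"(?P<resource>[^"]+)".*?'
    r'(?P<command>/opt/resmon/bin/resdata\s+-R\s+\d+\s+-r\s*\S+\s+-n\s+\d+\s+-a)',
    re.DOTALL,
)

def find_events(log_text):
    """Return one dict per EMS notification found in log_text."""
    events = []
    for m in NOTIFICATION.finditer(log_text):
        events.append({
            "severity": m["severity"],
            "level": int(m["level"]),
            "resource": m["resource"],
            # shlex.split also joins the command where the log wrapped it
            "argv": shlex.split(m["command"]),
        })
    return events

sample = '''Aug 31 04:34:35 nce_5rs EMS [2231]: ------ EMS Event Notification ------
Value: "SERIOUS (4)" for Resource:
"/storage/events/disk_arrays/High_Availability/10_0.1"     (Threshold:  >= " 3")
Execute the following command to obtain event details:
/opt/resmon/bin/resdata -R 232194054 -r
/storage/events/disk_arrays/High_Availability/10_0.1 -n 146210828 -a
'''

events = find_events(sample)
for event in events:
    print(event["severity"], event["resource"])
    print(" ".join(event["argv"]))
```

The resulting `argv` list could be handed to `subprocess.run` on the affected host to collect the full event details, such as the report at the top of this message.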

