Connectivity has been lost for device

From: Notari, Ed (NotariE@usa.redcross.org)
Date: Thu Jun 27 2002 - 11:27:42 EDT


The following EVM logs and SCU EDT have been provided for the following
problem...

Yesterday morning and this morning (1:30 AM), when NETWORKER kicks off its
backup session, there are SCSI errors. The end result of these errors is to
disconnect the tape drive and thus suspend the backup. I haven't changed
anything on the system, hardware or software or NETWORKER, for a few months
now. All has been smooth sailing. I have searched the archives and found
quasi-similar problems, but none that lead to device disconnects.

Yesterday I halted the system for five minutes and physically turned off
everything, including the external DLT7000 in question. After a normal
power-up, I manually restarted the NETWORKER backup, which then completed
the FULL backup that had aborted earlier that morning. This morning the
scheduled backup failed immediately after trying to start.

The Questions...
1) What is the cause/solution?
2) Can a disconnected device be re-connected?

If any other material is needed please let me know.

***************** BEGIN SCU EDT *********************

CAM Equipment Device Table (EDT) Information:

Bus/Target/Lun Device Type ANSI Vendor ID Product ID Revision N/W
-------------- ----------- ------ --------- ---------------- -------- ---
 0 0 0 Direct SCSI-3 IBM DDYS-T18350N S96H W
 0 1 1 Direct SCSI-2 IFT 3101 0223 W
 0 2 0 Direct SCSI-2 SEAGATE ST118273LW 6246 W
 0 3 0 Direct SCSI-3 SEAGATE ST39204LW 0002 W
 1 4 0 Direct SCSI-3 IBM DDYS-T18350N S96H W
 1 6 0 CD-ROM SCSI-2 DEC RRD46 (C) DEC 0557 N
 2 1 0 Direct SCSI-3 IFT SR2000 0315 W
 3 2 0 Direct SCSI-3 IFT SR2000 0315 W

***************** BEGIN EVM LOGS *********************

======================= Binary Error Log event =======================
EVM event name: sys.unix.binlog.hw.scsi

    Binary error log events are posted through the binlogd daemon, and
    stored in the binary error log file, /var/adm/binary.errlog. This
    event is used to report all SCSI device errors, including disk,
    tape, HSZ raid events, and adapter errors.

======================================================================

Formatted Message:
    SCSI event

Event Data Items:
    Event Name : sys.unix.binlog.hw.scsi
    Priority : 200
    Timestamp : 27-Jun-2002 01:30:36
    Host IP address : 10.161.2.64
    Host Name : epi
    Format : SCSI event
    Reference : cat:evmexp.cat:300

Variable Items:
    subid_class (INT32) = 199
    subid_num (INT32) = 0
    subid_unit_num (INT32) = 0
    subid_type (INT32) = 0
    binlog_event (OPAQUE) = [OPAQUE VALUE: 832 bytes]

============================ Translation =============================
Sequence number of error: 43
Time of error entry: 27-Jun-2002 01:30:36
Host name: epi

SCSI CAM ERROR PACKET
SCSI device class: DISK
Bus Number: 0
Target number: 0
Lun Number: 0

Name of routine that logged the event: cdisk_act_mon_thread
Event information: Possible SCSI Bus I/O loading issue, device responses are
exceeding expected times
Informational event: Information Message Detected (recovered)
Event information: Hardware ID = 58
Device Name: IBM DDYS-T18350N S96H
Event information: Active CCB at time of error
Event information: CCB request is in progress

                ############### Entry End ###############

======================================================================

======================= Binary Error Log event =======================
EVM event name: sys.unix.binlog.hw.scsi

    Binary error log events are posted through the binlogd daemon, and
    stored in the binary error log file, /var/adm/binary.errlog. This
    event is used to report all SCSI device errors, including disk,
    tape, HSZ raid events, and adapter errors.

======================================================================

Formatted Message:
    SCSI event

Event Data Items:
    Event Name : sys.unix.binlog.hw.scsi
    Priority : 400
    Timestamp : 27-Jun-2002 01:30:43
    Host IP address : 10.161.2.64
    Host Name : epi
    Format : SCSI event
    Reference : cat:evmexp.cat:300

Variable Items:
    subid_class (INT32) = 199
    subid_num (INT32) = 0
    subid_type (INT32) = 54
    binlog_event (OPAQUE) = [OPAQUE VALUE: 320 bytes]

============================ Translation =============================
Sequence number of error: 45
Time of error entry: 27-Jun-2002 01:30:43
Host name: epi

SCSI CAM ERROR PACKET
SCSI device class: UNKNOWN
Bus Number: 0
Target number: 7
Lun Number: 7

Name of routine that logged the event: itpsa SCSI HBA
Event information: HTH intr. on bus 0, SBCL = 0x2e

                ############### Entry End ###############

======================================================================

======================= Binary Error Log event =======================
EVM event name: sys.unix.binlog.hw.scsi

    Binary error log events are posted through the binlogd daemon, and
    stored in the binary error log file, /var/adm/binary.errlog. This
    event is used to report all SCSI device errors, including disk,
    tape, HSZ raid events, and adapter errors.

======================================================================

Formatted Message:
    SCSI event

Event Data Items:
    Event Name : sys.unix.binlog.hw.scsi
    Priority : 400
    Timestamp : 27-Jun-2002 01:30:43
    Host IP address : 10.161.2.64
    Host Name : epi
    Format : SCSI event
    Reference : cat:evmexp.cat:300

Variable Items:
    subid_class (INT32) = 199
    subid_num (INT32) = 0
    subid_unit_num (INT32) = 32
    subid_type (INT32) = 1
    binlog_event (OPAQUE) = [OPAQUE VALUE: 384 bytes]

============================ Translation =============================
Sequence number of error: 47
Time of error entry: 27-Jun-2002 01:30:43
Host name: epi

SCSI CAM ERROR PACKET
SCSI device class: TAPE
Bus Number: 0
Target number: 4
Lun Number: 0

Name of routine that logged the event: ctape_async
Event information: Bus reset notification
Hardware detected event: Hard Error Detected
Event information: Hardware ID = 63
Device Name: QUANTUM DLT7000 276A
======================================================================

============================ EVM Log event ===========================
EVM event name:
sys.unix.hw.state_change.unavailable.tape._hwcomponent.SCSIWWID04100022QUANTUMDLT7000CX924S9813._hwid.63

    This event is posted by the hardware support subsystem to indicate
    that a component is in the unavailable state. A component in this
    state cannot be reached by the operating system, and it cannot be
    determined if the inability to reach it is due to a problem with
    the component itself or with another component in the access path.

    Action: Contact your service provider.

======================================================================

Formatted Message:
    Component State Change: Component "SCSI-WWID:04100022:"QUANTUM DLT7000
    CX924S9813"" is in the unavailable state (HWID=63)

Event Data Items:
    Event Name :
            sys.unix.hw.state_change.unavailable.tape._hwcomponent.SCSIWWID04100022QUANTUMDLT7000CX924S9813._hwid.63
    Cluster Event : True
    Priority : 500
    PID : 694
    PPID : 673
    Event Id : 256
    Member Id : 0
    Timestamp : 27-Jun-2002 01:30:43
    Host IP address : 10.161.2.64
    Host Name : epi.bionet.org
    Format : Component State Change: Component "$_hwcomponent" is
                        in the unavailable state (HWID=$_hwid)
    Reference : cat:evmexp.cat:800

Variable Items:
    current_state (STRING) = "unavailable"
    previous_state (STRING) = "available"
    category (STRING) = "tape"
    _hwcomponent (STRING) =
            "SCSI-WWID:04100022:"QUANTUM DLT7000 CX924S9813""
    _hwid (UINT64) = 63
    initiator (STRING) = ""

======================================================================

============================ EVM Log event ===========================
EVM event name: sys.unix.hw.no_connections.tape._hwid.63

    This event is posted by the hardware support subsystem to report
    that all connection paths to the device identified in the event
    have been lost, and no access to the device is possible until a
    path is restored.

======================================================================

Formatted Message:
    Connectivity has been lost for device (HWID=63 lid=6 btl=0/4/0)

Event Data Items:
    Event Name : sys.unix.hw.no_connections.tape._hwid.63
    Cluster Event : True
    Priority : 500
    PID : 694
    PPID : 673
    Event Id : 257
    Member Id : 0
    Timestamp : 27-Jun-2002 01:30:43
    Host IP address : 10.161.2.64
    Host Name : epi.bionet.org
    Format : Connectivity has been lost for device (HWID=$_hwid
                        lid=$lid btl=$port_id/$target_id/$lun_id)
    Reference : cat:evmexp.cat:800

Variable Items:
    _hwid (UINT64) = 63
    arch (STRING) = "SCSI"
    lid (UINT32) = 6
    port_id (UINT32) = 0
    target_id (UINT64) = 4
    lun_id (UINT64) = 0

======================================================================
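
In case it helps anyone scripting against these logs, here is a minimal Python sketch that pulls the hardware ID and bus/target/LUN out of the "Connectivity has been lost" message. It assumes the formatted message always follows the Format template shown in the event above; the function name parse_lost_connectivity and the dictionary keys are hypothetical, chosen only for illustration.

```python
import re

# Pattern mirrors the Format template in the no_connections event:
#   Connectivity has been lost for device
#   (HWID=$_hwid lid=$lid btl=$port_id/$target_id/$lun_id)
# Assumption: all five fields are plain decimal integers, as in the log above.
LOST_RE = re.compile(
    r"Connectivity has been lost for device \("
    r"HWID=(?P<hwid>\d+)\s+lid=(?P<lid>\d+)\s+"
    r"btl=(?P<bus>\d+)/(?P<target>\d+)/(?P<lun>\d+)\)"
)


def parse_lost_connectivity(message):
    """Return the numeric fields of a lost-connectivity message, or None."""
    match = LOST_RE.search(message)
    if match is None:
        return None
    # Convert every captured field from text to int.
    return {name: int(value) for name, value in match.groupdict().items()}


# Message text copied from the event above.
sample = "Connectivity has been lost for device (HWID=63 lid=6 btl=0/4/0)"
print(parse_lost_connectivity(sample))
```

Messages that do not match the template yield None, so the function can be run over every "Formatted Message" line in a log dump and only the disconnect events will produce a result.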

Thanks,

Ed Notari
Senior Research Associate
Transmissible Diseases Department
Jerome H. Holland Laboratory
American Red Cross
(301) 738-0646 / FAX (301) 738-0495


