From: Eugene Schmidt (fereug@acute.co.za)
Date: Fri Oct 08 2004 - 19:43:32 EDT
Hi Sun Managers
Hope someone has seen this one and can help please?
Customer has an E4500 running Solaris 8 with two newly attached EVA disk arrays,
connected via two QLogic 2200 SBus HBAs. Testing was 100% successful and fast.
Secure Path 3.0D is loaded for channel failover.
Hangs started today. What had changed? The system was rebooted this
morning; no changes were made prior to the reboot.
Initially no errors in /var/adm/messages, but after a second reboot, errors
started appearing:
Oct 8 11:00:41 proddb scsi: [ID 243001 kern.warning] WARNING: /swsp@0,2/ssd@0,1 (ssd5):
Oct 8 11:00:41 proddb SCSI transport failed: reason 'aborted': retrying command
Oct 8 11:09:00 proddb scsi: [ID 243001 kern.warning] WARNING: /swsp@0,2/ssd@0,0 (ssd4):
Oct 8 11:09:00 proddb SCSI transport failed: reason 'aborted': retrying command
Oct 8 11:58:52 proddb scsi: [ID 243001 kern.warning] WARNING: /swsp@0,2/ssd@0,0 (ssd4):
Oct 8 11:58:52 proddb SCSI transport failed: reason 'aborted': retrying command
Oct 8 12:11:13 proddb scsi: [ID 243001 kern.warning] WARNING: /swsp@0,2/ssd@0,0 (ssd4):
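To see at a glance which ssd instances are affected, the warnings can be tallied per device. The following is only a sketch: the function name and regex are illustrative, the sample text is copied from the excerpt above, and on a live host the input would be read from /var/adm/messages instead.

```python
import re
from collections import Counter

# Sample copied from the log excerpt; the last entry is cut off in the post.
SAMPLE = """\
Oct 8 11:00:41 proddb scsi: [ID 243001 kern.warning] WARNING:
/swsp@0,2/ssd@0,1 (ssd5):
Oct 8 11:00:41 proddb SCSI transport failed: reason 'aborted':
retrying command
Oct 8 11:09:00 proddb scsi: [ID 243001 kern.warning] WARNING:
/swsp@0,2/ssd@0,0 (ssd4):
Oct 8 11:09:00 proddb SCSI transport failed: reason 'aborted':
retrying command
Oct 8 11:58:52 proddb scsi: [ID 243001 kern.warning] WARNING:
/swsp@0,2/ssd@0,0 (ssd4):
Oct 8 11:58:52 proddb SCSI transport failed: reason 'aborted':
retrying command
Oct 8 12:11:13 proddb scsi: [ID 243001 kern.warning] WARNING:
/swsp@0,2/ssd@0,0 (ssd4):
"""

# Matches "WARNING: <device path> (<instance>):" followed by an optional
# syslog prefix and the transport-failure text, on whitespace-flattened input.
FAILURE_RE = re.compile(
    r"WARNING: (\S+) \((\w+)\): (?:\w{3} \d+ [\d:]+ \S+ )?SCSI transport failed"
)

def count_transport_failures(text):
    """Count complete 'SCSI transport failed' warnings per (path, instance)."""
    flat = " ".join(text.split())  # undo mail or terminal line wrapping
    return Counter(FAILURE_RE.findall(flat))

if __name__ == "__main__":
    for (path, instance), n in count_transport_failures(SAMPLE).items():
        print(instance, path, n)
```

Flattening whitespace first means the same function works on both the wrapped text as mailed and the unwrapped lines in the log file.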
Disks c7t0d0 and c7t0d1 are hanging. c6 performs beautifully.
Switch logs and EVA logs show nothing.
No other error messages except those shown above.
Mounting a disk read-only and putting heavy I/O on it reproduces the problem.
Also, iostat shows the disk as 100% busy, with no I/O passing through. The hsx
device (the current path) is in the same hung state:
"9 9 17 66
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 hsx1
....
0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0 100 hsx813
.....
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c0t0d0
0.0 0.8 0.0 0.4 0.0 0.0 0.0 13.9 0 1 c0t1d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c0t6d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c6t0d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c6t0d1
0.0 4.2 0.0 18.6 0.0 0.0 0.0 0.4 0 0 c6t0d2
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c6t0d3
0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0 100 c7t0d0
0.0 0.0 ...
"
Below are the lengthy config files as installed by the install script.
I promise a summary.
Thx
E Schmidt
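
The pattern in the iostat excerpt above (%b at 100, actv above zero, but no reads or writes) is what an outstanding command that never completes looks like. A rough sketch of how such rows could be picked out automatically, assuming the eleven-column extended statistics layout shown in the excerpt; the function name is made up for illustration:

```python
# Rows copied from the iostat excerpt in the message.
SAMPLE = """\
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 hsx1
....
0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0 100 hsx813
0.0 0.8 0.0 0.4 0.0 0.0 0.0 13.9 0 1 c0t1d0
0.0 4.2 0.0 18.6 0.0 0.0 0.0 0.4 0 0 c6t0d2
0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0 100 c7t0d0
0.0 0.0 ...
"""

def find_stalled_devices(iostat_text):
    """Return devices that are 100% busy with queued work but zero throughput."""
    stalled = []
    for line in iostat_text.splitlines():
        fields = line.split()
        if len(fields) != 11:
            continue  # separator or truncated row
        try:
            values = [float(f) for f in fields[:10]]
        except ValueError:
            continue  # header row
        r_s, w_s, _kr, _kw, _wait, actv, _wsvc, _asvc, _pct_w, pct_b = values
        if pct_b >= 100 and actv > 0 and r_s == 0 and w_s == 0:
            stalled.append(fields[10])
    return stalled

if __name__ == "__main__":
    print(find_stalled_devices(SAMPLE))
```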
==========
"spmgr" display shows the following config:
# spmgr display
Server: acproddb10 Report Created: Fri, Oct 08 16:34:46 2004
Command: spmgr display
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Storage: 5000-1FE1-5002-81C0
Load Balance: Off Auto-restore: Off
Path Verify: On Verify Interval: 30
HBAs: qla2200-0 qla2200-2
Controller: P5849D5AAPW01O, Operational
P5849D5AAPW038, Operational
Devices: c6t0d0 c6t0d1 c6t0d2 c6t0d3
TGT/LUN Device WWLUN_ID #_Paths
0/ 0 c6t0d0 6005-08B4-0001-3879-0000-D000-0150-0000 4
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPW01O no
hsx-1-37-1 qla2200-0 no Active
hsx-3655-36-1 qla2200-2 no Available
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPW038 no
hsx-204-38-1 qla2200-0 no Standby
hsx-3858-39-1 qla2200-2 no Standby
TGT/LUN Device WWLUN_ID #_Paths
0/ 1 c6t0d1 6005-08B4-0001-3879-0000-D000-0153-0000 4
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPW01O no
hsx-2-37-2 qla2200-0 no Standby
hsx-3656-36-2 qla2200-2 no Standby
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPW038 no
hsx-205-38-2 qla2200-0 no Active
hsx-3859-39-2 qla2200-2 no Available
TGT/LUN Device WWLUN_ID #_Paths
0/ 2 c6t0d2 6005-08B4-0001-3879-0000-D000-0156-0000 4
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPW01O no
hsx-3-37-3 qla2200-0 no Active
hsx-3657-36-3 qla2200-2 no Available
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPW038 no
hsx-206-38-3 qla2200-0 no Standby
hsx-3860-39-3 qla2200-2 no Standby
TGT/LUN Device WWLUN_ID #_Paths
0/ 3 c6t0d3 6005-08B4-0001-3879-0000-D000-0164-0000 4
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPW01O no
hsx-4-37-4 qla2200-0 no Standby
hsx-3658-36-4 qla2200-2 no Standby
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPW038 no
hsx-207-38-4 qla2200-0 no Active
hsx-3861-39-4 qla2200-2 no Available
Storage: 5000-1FE1-5002-2510
Load Balance: Off Auto-restore: Off
Path Verify: On Verify Interval: 30
HBAs: qla2200-0 qla2200-2
Controller: P5849D5AAPC09X, Operational
P5849D5AAPC09E, Operational
Devices: c7t0d0 c7t0d1 c7t0d2 c7t0d3
TGT/LUN Device WWLUN_ID #_Paths
0/ 0 c7t0d0 6005-08B4-0001-24D1-0000-A000-0193-0000 4
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPC09X no
hsx-813-33-1 qla2200-0 no Standby
hsx-4467-32-1 qla2200-2 no Standby
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPC09E YES
hsx-1016-34-1 qla2200-0 no Active
hsx-4670-35-1 qla2200-2 no Available
TGT/LUN Device WWLUN_ID #_Paths
0/ 1 c7t0d1 6005-08B4-0001-24D1-0000-A000-0196-0000 4
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPC09X no
hsx-814-33-2 qla2200-0 no Active
hsx-4468-32-2 qla2200-2 no Available
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPC09E no
hsx-1017-34-2 qla2200-0 no Standby
hsx-4671-35-2 qla2200-2 no Standby
TGT/LUN Device WWLUN_ID #_Paths
0/ 2 c7t0d2 6005-08B4-0001-24D1-0000-A000-0199-0000 4
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPC09X no
hsx-815-33-3 qla2200-0 no Standby
hsx-4469-32-3 qla2200-2 no Standby
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPC09E YES
hsx-1018-34-3 qla2200-0 no Active
hsx-4672-35-3 qla2200-2 no Available
TGT/LUN Device WWLUN_ID #_Paths
0/ 3 c7t0d3 6005-08B4-0001-24D1-0000-A000-01A7-0000 4
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPC09X no
hsx-816-33-4 qla2200-0 no Active
hsx-4470-32-4 qla2200-2 no Available
Controller Path_Instance HBA Preferred? Path_Status
P5849D5AAPC09E no
hsx-1019-34-4 qla2200-0 no Standby
hsx-4673-35-4 qla2200-2 no Standby
======== END OF OUTPUT ============
Entries in /etc/system:
* Start of CPQhsv edits. DO NOT DELETE THIS LINE
forceload: drv/clone
set maxphys=8388608
set sd:sd_max_throttle=32
set sd:sd_io_time=180
* End of CPQhsv edits. DO NOT DELETE THIS LINE
* Start of HPfcraid edits. DO NOT DELETE THIS LINE
forceload: drv/clone
forceload: drv/ssd
set maxphys=8388608
set sd:sd_max_throttle=32
set sd:sd_io_time=180
set ssd:ssd_max_throttle=32
set ssd:ssd_io_time=180
* End of HPfcraid edits. DO NOT DELETE THIS LINE
set shmsys:shminfo_shmmax=4194304000
------- EOF ---------------
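
The two install scripts (CPQhsv and HPfcraid) each added their own block, so several tunables are set twice in the listing above. A small sketch, not a vendor tool, that lists such repeats so it is easy to confirm the values agree; the function name is illustrative and the sample is the listing above:

```python
from collections import defaultdict

SAMPLE = """\
* Start of CPQhsv edits. DO NOT DELETE THIS LINE
forceload: drv/clone
set maxphys=8388608
set sd:sd_max_throttle=32
set sd:sd_io_time=180
* End of CPQhsv edits. DO NOT DELETE THIS LINE
* Start of HPfcraid edits. DO NOT DELETE THIS LINE
forceload: drv/clone
forceload: drv/ssd
set maxphys=8388608
set sd:sd_max_throttle=32
set sd:sd_io_time=180
set ssd:ssd_max_throttle=32
set ssd:ssd_io_time=180
* End of HPfcraid edits. DO NOT DELETE THIS LINE
set shmsys:shminfo_shmmax=4194304000
"""

def duplicate_settings(system_text):
    """Return {tunable: [values...]} for every tunable set more than once."""
    seen = defaultdict(list)
    for raw in system_text.splitlines():
        line = raw.strip()
        if not line.startswith("set "):
            continue  # comment ('*'), forceload, or blank line
        name, _, value = line[4:].partition("=")
        seen[name.strip()].append(value.strip())
    return {name: vals for name, vals in seen.items() if len(vals) > 1}

if __name__ == "__main__":
    for name, vals in duplicate_settings(SAMPLE).items():
        status = "same value" if len(set(vals)) == 1 else "CONFLICT"
        print(name, vals, status)
```

In this listing every repeated tunable carries the same value both times, so the duplication looks redundant rather than conflicting.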
Entries in /kernel/drv/ssd.conf:
#
# Copyright (c) 1995-1999 by Sun Microsystems, Inc.
# All rights reserved.
#
#ident "@(#)ssd.conf 1.9 99/07/29 SMI"
name="ssd" parent="SUNW,pln" port=0 target=0;
....
name="ssd" parent="SUNW,pln" port=0 target=15;
name="ssd" parent="SUNW,pln" port=1 target=0;
name="ssd" parent="SUNW,pln" port=1 target=1;
.....
ditto port=1 to port=5, with target=0 thru target=15
.....
name="ssd" parent="SUNW,pln" port=5 target=15;
name="ssd" parent="sf" target=0;
name="ssd" parent="fp" target=0;
name="ssd" parent="ifp" target=127;
name="ssd" parent="scsi_vhci" target=0;
---EOF --------------
/kernel/drv/hsx.conf:
#
# Compaq StorageWorks Secure Path
# hsx.conf - Hardware Configuration file for hsx, a Disk Array Block
# SCSI Target driver. Refer to the driver.conf(4) manpage
# for more information on the syntax of this file.
#
# name "hsx" - required
# class "scsi" - required
# target SCSI target-ID
# lun SCSI logical unit number
# qdepth depth of command queue (1,..,64)
# parent restrict parent HBA
# preferred this path is preferred for a controller when load
# balancing is disabled
#
# If no "parent=" qualifier is present, all SCSI-HBA adapters in
# the system will attempt to attach an HSX instance at the indicated
# target/lun on the SCSI bus.
#
# HSX will only attach device instances for Compaq StorageWorks HSx80
# disk array targets. The SD device will also want to claim these
# targets. Explicit use of "parent=" in sd.conf may be required to
# resolve conflicts.
#
# Each HSX instance found will result in a path being provided via
# the misc/path driver.
name="hsx" parent="qla2200" target=37 lun=0 qdepth=32;
name="hsx" parent="qla2200" target=37 lun=1 qdepth=32;
name="hsx" parent="qla2200" target=37 lun=2 qdepth=32;
name="hsx" parent="qla2200" target=37 lun=3 qdepth=32;
name="hsx" parent="qla2200" target=37 lun=4 qdepth=32;
name="hsx" parent="qla2200" target=37 lun=5 qdepth=32;
.... etc,
for targets 32 to 39 (although not in sequence), lun=0 thru 202
============= EOF
Contents of /kernel/drv/qla2300.conf
# Number of times to retry a SCSI queue full error.
# Range: 0 - 255
hba0-queue-full-retry-count=16;
# Amount of time to delay after a SCSI queue full error before
# starting any new I/O commands.
# Range: 0 - 255 seconds
hba0-queue-full-retry-delay=2;
# Maximum fibre channel frame size.
# Range: 512, 1024 or 2048 bytes
hba0-max-frame-length=1024;
# Maximum number of commands queued on each logical unit.
# Range: 1 - 65535
hba0-execution-throttle=16;
# Number of port login retry attempts.
# Range: 0 - 255
hba0-login-retry-count=8;
# Enable/disable the use adapter hard loop ID address on the fibre
# channel bus.
# 0 = disable, 1 = enabled
hba0-enable-adapter-hard-loop-ID=0;
# Adapter hard loop ID address to use on the fibre channel bus.
# Range: 0 - 125
hba0-adapter-hard-loop-ID=0;
# Enable/disable the use LIP reset for loop reset.
# 0 = disable, 1 = enabled
hba0-enable-LIP-reset=0;
# Enable/disable the use LIP full login for loop reset.
# 0 = disable, 1 = enabled
hba0-enable-LIP-full-login=1;
# Enable/disable the use of target reset for loop reset.
# 0 = disable, 1 = enabled
hba0-enable-target-reset=0;
# Amount of time to delay after a loop reset for starting any new
# I/O commands.
# Range: 0 - 255 seconds
hba0-reset-delay=5;
# Number of times to retry a port that is not responding.
# Range: 0 - 255
hba0-port-down-retry-count=90;
# Maximum number of LUNs to scan for, if a device does not
# support SCSI Report LUNs command.
# Range: 1 - 256
hba0-maximum-luns-per-target=8;
# Connection options.
# 0 = loop only
# 1 = point-to-point only
# 2 = loop preferred, otherwise point-to-point
# 3 = point-to-point preferred, otherwise loop
hba0-connection-options=1;
# Fibre Channel tape support enable/disable.
# 0 = disable, 1 = enabled
hba0-fc-tape=1;
# PCI latency timer.
# Range: 0 - 0xF8
# Default: 0x40
hba0-pci-latency-timer=0x40;
# During link down conditions enable/disable the reporting of
# errors.
# 0 = disabled, 1 = enable
hba0-link-down-error=1;
# Amount of time to wait for loop to come up after it has gone down
# before reporting I/O errors.
# Range: 0 - 240 seconds
hba0-link-down-timeout=10;
# Persistent binding only option.
# 0 = Reports to OS discovery of binded and non-binded devices
# 1 = Reports to OS discovery of persistent binded devices only
hba0-persistent-binding-configuration=1;
# Fast error reporting to Solaris, enabled/disabled.
# 0 = disabled, 1 = enable
hba0-fast-error-reporting=0;
# Enable extended logging.
# 0 = disabled, 1 = enable
hba0-extended-logging=0;
#####################################################################
# WARNING: Beginning of Configuration Data stored by the QLogic #
# Applications. Consult documentation before editing #
# any data passed this text. #
#####################################################################
# CPQ installation changes made.
# CPQswsp: start of Secure Path edits. Caution: do not remove! This line is used by pkgadd/pkgrm.
hba0-SCSI-target-id-37-fibre-channel-port-name="50001FE1500281C9";
hba2-SCSI-target-id-37-fibre-channel-port-name="50001FE1500281C9";
hba0-SCSI-target-id-38-fibre-channel-port-name="50001FE1500281CC";
hba2-SCSI-target-id-38-fibre-channel-port-name="50001FE1500281CC";
hba0-SCSI-target-id-36-fibre-channel-port-name="50001FE1500281C8";
hba2-SCSI-target-id-36-fibre-channel-port-name="50001FE1500281C8";
hba0-SCSI-target-id-39-fibre-channel-port-name="50001FE1500281CD";
hba2-SCSI-target-id-39-fibre-channel-port-name="50001FE1500281CD";
hba0-SCSI-target-id-33-fibre-channel-port-name="50001FE150022519";
hba2-SCSI-target-id-33-fibre-channel-port-name="50001FE150022519";
hba0-SCSI-target-id-34-fibre-channel-port-name="50001FE15002251C";
hba2-SCSI-target-id-34-fibre-channel-port-name="50001FE15002251C";
hba0-SCSI-target-id-32-fibre-channel-port-name="50001FE150022518";
hba2-SCSI-target-id-32-fibre-channel-port-name="50001FE150022518";
hba0-SCSI-target-id-35-fibre-channel-port-name="50001FE15002251D";
hba2-SCSI-target-id-35-fibre-channel-port-name="50001FE15002251D";
# CPQswsp: end of Secure Path edits. Caution: do not remove! This line is used by pkgadd/pkgrm.
=========== EOF =====================
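
To relate the persistent-binding target IDs to the two arrays, the entries can be parsed and grouped. This is a sketch under one assumption drawn only from the values in this post: each controller port WWPN shares its first 15 hex digits with the array WWN reported by spmgr. The function names are illustrative, and only four of the sixteen binding lines are included as sample data.

```python
import re

SAMPLE = """\
hba0-SCSI-target-id-37-fibre-channel-port-name="50001FE1500281C9";
hba2-SCSI-target-id-37-fibre-channel-port-name="50001FE1500281C9";
hba0-SCSI-target-id-33-fibre-channel-port-name="50001FE150022519";
hba2-SCSI-target-id-33-fibre-channel-port-name="50001FE150022519";
"""

# Array WWNs as shown on the "Storage:" lines of the spmgr output.
ARRAYS = ["5000-1FE1-5002-81C0", "5000-1FE1-5002-2510"]

BINDING_RE = re.compile(
    r'hba(\d+)-SCSI-target-id-(\d+)-fibre-channel-port-name="([0-9A-Fa-f]{16})";'
)

def parse_bindings(conf_text):
    """Map (hba number, SCSI target ID) to the bound port WWPN."""
    return {
        (int(hba), int(target)): wwpn.upper()
        for hba, target, wwpn in BINDING_RE.findall(conf_text)
    }

def targets_by_array(bindings, array_names):
    """Group target IDs under the array whose WWN shares a 15-digit prefix."""
    result = {name: set() for name in array_names}
    for (_hba, target), wwpn in bindings.items():
        for name in array_names:
            if wwpn[:15] == name.replace("-", "")[:15]:  # assumed relationship
                result[name].add(target)
    return result

if __name__ == "__main__":
    print(targets_by_array(parse_bindings(SAMPLE), ARRAYS))
```

Run against the full binding list, this grouping would be expected to put the target numbers embedded in the spmgr path instance names (for example the 33 in hsx-813-33-1) under the matching array.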
/kernel/drv/swsp.conf
# Compaq StorageWorks Secure Path
# swsp.conf - Configuration file for swsp
#
# use swsp.conf to configure which arrays can be controlled by Secure Path
# add one entry of the following form per array:
# name="swsp" class="root" portid=0 reg=0x0,0x(instance+1),0x1
# instance=(instance #) array-name="ARRAY_WWID";
#
# configurable parameters can be set globally, or on an array basis by
# adding one of path-verify, path-verify-period load-balance or auto-restore
# to the line defining the array instance, or on a line by itself (for global)
#
# path-verify=?
# 1= path-verification enabled
# 0= path-verification disabled
# path-verify-period=X
# X = number of seconds between path verification attempts
#
# load-balance=?
# 1= enabled
# 0= disabled
#
# auto-restore=?
# 1= enabled
# 0= disabled
#
path-verify=1;
name="swsp" class="root" portid=0 reg=0x0,0x1,0x1 instance=0
array-name="5000-1FE1-5002-81C0";
wwlid-0-0="6005-08B4-0001-3879-0000-D000-0150-0000@0,0";
wwlid-0-1="6005-08B4-0001-3879-0000-D000-0153-0000@0,1";
wwlid-0-2="6005-08B4-0001-3879-0000-D000-0156-0000@0,2";
wwlid-0-3="6005-08B4-0001-3879-0000-D000-0164-0000@0,3";
name="swsp" class="root" portid=0 reg=0x0,0x2,0x1 instance=1
array-name="5000-1FE1-5002-2510";
wwlid-1-0="6005-08B4-0001-24D1-0000-A000-0193-0000@0,0";
wwlid-1-1="6005-08B4-0001-24D1-0000-A000-0196-0000@0,1";
wwlid-1-2="6005-08B4-0001-24D1-0000-A000-0199-0000@0,2";
wwlid-1-3="6005-08B4-0001-24D1-0000-A000-01A7-0000@0,3";
======================== EOF ========================================
=====================================================================