[SUMMARY] SCSI warnings on Solaris 8 x86

From: Rami (rami.aubourg@ifrance.com)
Date: Fri Nov 14 2003 - 07:29:37 EST


Rami wrote:
Hello, gurus,

Fast response as usual.

Thanks to:
Ian Pease,
Paul Wilkinson,
DJ

The consensus seems to be that the problem is probably not a disk.
Suggestion: use iostat -En to check the disks. However, the server is
still down, as no one at the remote location has access to the server
room to restart it this morning.
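
Once the box is back up, the iostat -En counters can be compared per disk. The following is only a minimal sketch: it assumes the usual per-device summary line ("Soft Errors: ... Hard Errors: ... Transport Errors: ...") and that the output has been saved to a text file or string. The function name error_counts and the SAMPLE text are made up for illustration; the numbers are not from this server.

```python
import re

# Matches the per-device summary line that `iostat -En` is assumed to print, e.g.
# "c0t0d0   Soft Errors: 0 Hard Errors: 2 Transport Errors: 7"
SUMMARY = re.compile(
    r"^(?P<dev>\S+)\s+Soft Errors:\s*(?P<soft>\d+)\s+"
    r"Hard Errors:\s*(?P<hard>\d+)\s+Transport Errors:\s*(?P<transport>\d+)"
)


def error_counts(iostat_text):
    """Return {device: {"soft": n, "hard": n, "transport": n}} (hypothetical helper)."""
    counts = {}
    for line in iostat_text.splitlines():
        m = SUMMARY.match(line)
        if m:
            counts[m.group("dev")] = {
                key: int(m.group(key)) for key in ("soft", "hard", "transport")
            }
    return counts


# Invented sample for illustration only, NOT output from the server in this thread.
SAMPLE = """\
c0t0d0          Soft Errors: 0 Hard Errors: 2 Transport Errors: 7
c0t1d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 3
"""

if __name__ == "__main__":
    for dev, c in error_counts(SAMPLE).items():
        print(dev, c)
```

Transport errors climbing on several disks at once would point away from a single bad drive, while hard or media errors confined to one disk would point towards it.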

It could be a problem with the controller or the cabling or, more probably,
a cooling problem, since the machine hasn't been dusted in 3 years.
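
To see how the warnings in the quoted log are spread across devices, they can be tallied per device. This is a rough sketch, not a general syslog parser: the regular expressions are assumptions fitted to the messages quoted in this thread, tally is a hypothetical helper name, and the sample is the first three warnings from the log with the mail line-wrapping undone.

```python
import re
from collections import Counter

DEVICE = re.compile(r"\((\w+)\):\s*$")  # e.g. "(sd1):" ending a WARNING line
REASON = re.compile(r"SCSI transport failed: reason '(\w+)'")
FATAL = re.compile(r"Error for Command: (\S+)")


def tally(log_text):
    """Count (device, event) pairs; each event is credited to the last device named."""
    counts = Counter()
    device = None
    for raw in log_text.splitlines():
        # Drop mail quoting ("> ") so wrapped, quoted logs also work.
        line = raw.lstrip("> ").rstrip()
        m = DEVICE.search(line)
        if m:
            device = m.group(1)
            continue
        m = REASON.search(line) or FATAL.search(line)
        if m and device:
            counts[(device, m.group(1))] += 1
    return counts


# First three warnings from the quoted log, unwrapped.
SAMPLE = """\
Nov 13 05:32:37 server scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci9004,53@c/sd@0,0 (sd1):
Nov 13 05:32:37 server SCSI transport failed: reason 'timeout': retrying command
Nov 13 05:32:37 server scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci9004,53@c/sd@1,0 (sd2):
Nov 13 05:32:37 server SCSI transport failed: reason 'reset': retrying command
Nov 13 05:32:37 server scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci9004,53@c/sd@0,0 (sd1):
Nov 13 05:32:37 server SCSI transport failed: reason 'reset': retrying command
"""

if __name__ == "__main__":
    for (dev, event), n in sorted(tally(SAMPLE).items()):
        print(dev, event, n)
```

Both sd1 and sd2 sit behind the same controller path (/pci@0,0/pci9004,53@c), so resets showing up on both are consistent with a shared cause such as the controller, cabling or heat.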

Rami

> Hello, gurus,
>
> I have an Intel LB440GX board with 3 SCSI disks and Solaris 8 for Intel,
> running without a glitch for over 3 years at a remote site.
> Filesystems are mounted ufs, logging.
>
> However, yesterday morning, it was impossible to reach it.
>
> According to the person at the location, the monitor screen was black,
> so the only solution was to reboot, which went fine, and everything was
> back in order until this morning.
>
> Error messages in /var/adm/messages were as follows:
>
> ************************
> Nov 13 05:32:37 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c/sd@0,0 (sd1):
> Nov 13 05:32:37 server SCSI transport failed: reason 'timeout':
> retrying command
> Nov 13 05:32:37 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c/sd@1,0 (sd2):
> Nov 13 05:32:37 server SCSI transport failed: reason 'reset':
> retrying command
> Nov 13 05:32:37 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c/sd@0,0 (sd1):
> Nov 13 05:32:37 server SCSI transport failed: reason 'reset':
> retrying command
> Nov 13 05:33:38 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c (cadp0):
> Nov 13 05:33:38 server timeout: abort request, target=0 lun=0
> Nov 13 05:33:38 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c (cadp0):
> Nov 13 05:33:38 server timeout: abort device, target=0 lun=0
> Nov 13 05:33:38 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c (cadp0):
> Nov 13 05:33:38 server timeout: reset target, target=0 lun=0
> Nov 13 05:33:38 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c (cadp0):
> Nov 13 05:33:38 server timeout: reset bus, target=0 lun=0
> Nov 13 05:33:38 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c (cadp0):
> Nov 13 05:33:38 server timeout: early timeout, target=0 lun=0
> Nov 13 05:33:38 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c/sd@0,0 (sd1):
> Nov 13 05:33:38 server SCSI transport failed: reason 'timeout':
> retrying command
> Nov 13 05:33:51 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c/sd@0,0 (sd1):
> Nov 13 05:33:51 server Error for Command: write(10)
> Error Level: Fatal
> Nov 13 05:33:51 server scsi: [ID 107833 kern.notice] Requested
> Block: 8880882 Error Block: 8880882
> Nov 13 05:33:51 server scsi: [ID 107833 kern.notice] Vendor: SEAGATE
> Serial Number: 3BT1NCBS
> Nov 13 05:33:51 server scsi: [ID 107833 kern.notice] Sense Key: Not
> Ready
> Nov 13 05:33:51 server scsi: [ID 107833 kern.notice] ASC: 0x4
> (<vendor unique code 0x4>), ASCQ: 0x1, FRU: 0x2
> Nov 13 05:37:44 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c/sd@0,0 (sd1):
> Nov 13 05:37:44 server SCSI transport failed: reason 'timeout':
> retrying command
> Nov 13 05:37:44 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c/sd@0,0 (sd1):
> Nov 13 05:37:44 server SCSI transport failed: reason 'reset':
> retrying command
> Nov 13 05:37:44 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c/sd@1,0 (sd2):
> Nov 13 05:37:44 server SCSI transport failed: reason 'reset':
> retrying command
> Nov 13 05:38:07 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c/sd@0,0 (sd1):
> Nov 13 05:38:07 server Error for Command: write(10)
> Error Level: Fatal
> Nov 13 05:38:07 server scsi: [ID 107833 kern.notice] Requested
> Block: 8778130 Error Block: 8778130
> Nov 13 05:38:07 server scsi: [ID 107833 kern.notice] Vendor: SEAGATE
> Serial Number: 3BT1NCBS
> Nov 13 05:38:07 server scsi: [ID 107833 kern.notice] Sense Key: Not
> Ready
> Nov 13 05:38:07 server scsi: [ID 107833 kern.notice] ASC: 0x4
> (<vendor unique code 0x4>), ASCQ: 0x1, FRU: 0x2
> Nov 13 05:38:07 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c/sd@0,0 (sd1):
> Nov 13 05:38:07 server Error for Command: write(10)
> Error Level: Fatal
> Nov 13 05:38:07 server scsi: [ID 107833 kern.notice] Requested
> Block: 8880882 Error Block: 8880882
> Nov 13 05:38:07 server scsi: [ID 107833 kern.notice] Vendor: SEAGATE
> Serial Number: 3BT1NCBS
> Nov 13 05:38:07 server scsi: [ID 107833 kern.notice] Sense Key: Not
> Ready
> Nov 13 05:38:07 server scsi: [ID 107833 kern.notice] ASC: 0x4
> (<vendor unique code 0x4>), ASCQ: 0x1, FRU: 0x2
> Nov 13 05:38:43 server scsi: [ID 107833 kern.warning] WARNING:
> /pci@0,0/pci9004,53@c/sd@0,0 (sd1):
>
> *************************************
>
> I first thought of a disk going bad, but the errors clearly
> involve 2 disks (sd1 and sd2), which would be uncommon. I'm leaning
> more towards a controller problem or the like, or even an electrical problem.
> Besides, after the reboot, everything kept working fine for the whole
> day, until this morning, when the server became unreachable again. I'm a bit
> clueless about this, as I don't know whether I should change a disk or not.
>
> Any ideas would be welcome.
>
>
> Rami
>

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


