fibre channel failures

From: Jarkko Airaksinen (JAiraksinen@zed.com)
Date: Wed Nov 22 2006 - 06:35:18 EST


Good day, Gurus,

I'm experiencing quite a lot of write errors on my fibre channel disks.
The setup is:

- Fujitsu-Siemens PrimePower450, 4 x 1.1 GHz processors, 16 GB main memory
- SunOS superserver 5.8 Generic_117350-33 sun4us sparc FJSV,GPUZC-M
- 2 x QLogic2300 FC adapters with the latest firmware in them
- 2 x Cisco MDS9048 FC switches
- Sun SAN 4.4.6 + mpxio
- EVA8000 disk cabinet with several disks (vxfs) presented to Solaris

Only the disks in the EVA give errors. The most frequent errors I receive
are:

Nov 19 14:50:44 superserver scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci (scsi_vhci0):
Nov 19 14:50:44 superserver /scsi_vhci/ssd@g600508b400104b600000e000015e0000 (ssd15): Command Timeout on path /pci@80,2000/SUNW,qlc@2/fp@0,0 (fp2)
Nov 19 14:50:44 superserver scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci/ssd@g600508b400104b600000e000015e0000 (ssd15):
Nov 19 14:50:44 superserver SCSI transport failed: reason 'timeout': retrying command

Then I might get these:

Nov 19 17:33:57 superserver scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g600508b400104b6000007000018a0000 (ssd7):
Nov 19 17:33:57 superserver Error for Command: write(10) Error Level: Retryable
Nov 19 17:33:57 superserver scsi: [ID 107833 kern.notice] Requested Block: 659148544 Error Block: 659148544
Nov 19 17:33:57 superserver scsi: [ID 107833 kern.notice] Vendor: HP Serial Number: A81000070000
Nov 19 17:33:57 superserver scsi: [ID 107833 kern.notice] Sense Key: Aborted Command
Nov 19 17:33:57 superserver scsi: [ID 107833 kern.notice] ASC: 0x4b (data phase error), ASCQ: 0x0, FRU: 0x0

And in the worst case, these:

Nov 19 17:34:02 superserver scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g600508b400104b6000007000018a0000 (ssd7):
Nov 19 17:34:02 superserver undecodable sense information: 0x9d 0x79 0xb 0x0 0x8e 0xde 0x2 0x7 0x0 0x0 0x0 0x0 0x4b 0x0 0x0 0x0-(assumed fatal)
Nov 19 17:34:02 superserver vxfs: [ID 130881 kern.warning] WARNING: msgcnt 10 vxfs: mesg 038: vx_dataioerr - /dev/dsk/dwh_backup01 file system file data write error in block 329698944
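
For what it's worth, the "undecodable" bytes can be picked apart by hand. The sketch below assumes (and this is only an assumption) that the buffer still follows the SCSI fixed-format sense layout, where the low nibble of byte 2 is the sense key and bytes 12 and 13 are the ASC and ASCQ, even though byte 0 does not carry one of the fixed-format response codes (0x70 or 0x71). The function name and the short sense-key table are made up for illustration.

```python
# Sense bytes copied from the "undecodable sense information" line above.
RAW = "0x9d 0x79 0xb 0x0 0x8e 0xde 0x2 0x7 0x0 0x0 0x0 0x0 0x4b 0x0 0x0 0x0"

# A few sense keys from the SCSI standard (deliberately not exhaustive).
SENSE_KEYS = {
    0x0: "No Sense",
    0x1: "Recovered Error",
    0x2: "Not Ready",
    0x3: "Medium Error",
    0x4: "Hardware Error",
    0x5: "Illegal Request",
    0x6: "Unit Attention",
    0xB: "Aborted Command",
}


def decode_fixed_sense(raw):
    """Decode a sense buffer, ASSUMING the fixed-format byte layout."""
    data = [int(tok, 16) for tok in raw.split()]
    if len(data) < 14:
        raise ValueError("need at least 14 bytes to reach ASC/ASCQ")
    response_code = data[0] & 0x7F  # top bit of byte 0 is the VALID flag
    sense_key = data[2] & 0x0F      # low nibble of byte 2
    return {
        "response_code": response_code,
        # 0x70 (current) and 0x71 (deferred) are the fixed-format codes.
        "recognised_format": response_code in (0x70, 0x71),
        "sense_key": sense_key,
        "sense_key_name": SENSE_KEYS.get(sense_key, "unknown"),
        "asc": data[12],
        "ascq": data[13],
    }


if __name__ == "__main__":
    for field, value in decode_fixed_sense(RAW).items():
        print(field, value)
```

Read this way, byte 0 gives response code 0x1d, which is not a recognised format and would explain the "undecodable" complaint, while the sense key (0xb) and ASC/ASCQ (0x4b/0x0) match the Aborted Command / data phase error reported five seconds earlier. That hints that the header bytes of the sense buffer arrived damaged, but it is a guess from one sample, not a diagnosis.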

Which disk fails is completely random, and so is the adapter in use at
the time. The only patterns I've found are that the failing command is
always "write(10)", that the messages are always kern.warning or
kern.notice, and that the error level is always Retryable (even when the
write ultimately fails).
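
One way to double-check the "random" impression is to count the errors per disk, per path and per command. The following is a rough sketch only: the regular expressions are guesses based on the excerpts in this message, the function name `tally` is made up, and the sample text is just a few of the lines above joined onto single lines.

```python
import re
from collections import Counter

# A few lines from the excerpts above, joined onto single lines.
SAMPLE = """\
Nov 19 14:50:44 superserver /scsi_vhci/ssd@g600508b400104b600000e000015e0000 (ssd15): Command Timeout on path /pci@80,2000/SUNW,qlc@2/fp@0,0 (fp2)
Nov 19 17:33:57 superserver scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g600508b400104b6000007000018a0000 (ssd7):
Nov 19 17:33:57 superserver Error for Command: write(10) Error Level: Retryable
Nov 19 17:34:02 superserver scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g600508b400104b6000007000018a0000 (ssd7):
"""

# Patterns inferred from the excerpts; other message formats are not covered.
DISK_RE = re.compile(r"\((ssd\d+)\)")
PATH_RE = re.compile(r"\((fp\d+)\)")
CMD_RE = re.compile(r"Error for Command: (\S+)")


def tally(text):
    """Count mentions of each disk instance, FC path and failing command."""
    counts = {"disk": Counter(), "path": Counter(), "command": Counter()}
    for line in text.splitlines():
        counts["disk"].update(DISK_RE.findall(line))
        counts["path"].update(PATH_RE.findall(line))
        counts["command"].update(CMD_RE.findall(line))
    return counts


if __name__ == "__main__":
    for kind, counter in tally(SAMPLE).items():
        print(kind, dict(counter))
```

To run it against real data, replace `SAMPLE` with the contents of the system log (on Solaris normally /var/adm/messages). If one disk or one fp instance clearly dominates the counts, the failures are less random than they look.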

Now our data loads and backups fail randomly. I posted another problem
with these disks here a while ago, and it yielded one rather scary
response: according to the poster, Fujitsu-Siemens doesn't give any
guarantee that the QLC adapters work in their servers. However, I'm a bit
reluctant to believe that.

Any ideas would be more than welcome.

Will summarize.

Br,

Jarkko
