T3+ disk errors.

From: Anthony Miller (anthony__miller@yahoo.com)
Date: Sun Jul 09 2006 - 21:53:36 EDT


Hi All,

I'm having headaches over this T3+ unit.

The unit is configured as an 8-disk RAID 5 volume, with
the 9th disk as standby.

First we had multiple errors of the form:

Jul 06 08:29:03 ISR1[1]: N: u1d6 sid 944896 stype 2024 disk error 3

Which would indicate a faulty disk 6.

This disk was replaced and reconstruction started,
followed shortly afterwards by these errors:

Jul 06 11:33:34 ISR1[1]: N: u1d6 sid 944896 stype 2024 disk error 3
Jul 06 11:33:36 LPCT[1]: N: u1d6: Not ready on loop 1
Jul 06 11:33:36 LPCT[1]: N: u1d6: Bypassed on loop 1
Jul 06 11:33:36 LPCT[1]: E: u1d6: Not present
Jul 06 11:33:36 TMRT[1]: E: u1d6: Missing; system shutting down in 30 minutes
Jul 06 11:33:37 ISR1[1]: N: u1d6 sid 944896 stype 2024 disk error 3
Jul 06 11:33:37 ISR1[1]: N: u1d6 sid 944892 stype 2024 disk error 3
Jul 06 11:33:37 LPCT[1]: N: u1d6: Not ready on loop 2
Jul 06 11:33:37 LPCT[1]: N: u1d6: Bypassed on loop 2
Jul 06 11:33:46 LT00[1]: N: u1d6 Reconstruction to standby disk started
Jul 06 11:35:36 LPCT[1]: N: u1d6: Bypassed on loop 1
Jul 06 11:35:36 LPCT[1]: N: u1d6: Bypassed on loop 2
Jul 06 11:35:36 ISR1[1]: N: u1ctr ISP2200[0] Received LIP(f7,f7) async event
Jul 06 11:35:37 ISR1[1]: N: u1ctr ISP2200[1] Received LIP(f7,f7) async event
Jul 06 11:35:39 LPCT[1]: N: u1d6: Not bypassed on loop 1
Jul 06 11:35:40 LPCT[1]: N: u1d6: Not bypassed on loop 2
Jul 06 11:35:50 ISR1[1]: N: u1ctr ISP2200[1] Fatal timeout on u1d1
Jul 06 11:35:50 ISR1[1]: N: u1ctr ISP2200[1] QLCF_ABORT_ALL_CMDS: Command Timeout Pre-Gauntlet Initiated
Jul 06 11:35:50 ISR1[1]: N: u1d1 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d1 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d1 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d1 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d1 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d1 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d1 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d1 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d2 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d2 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d2 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d2 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d2 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d2 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d2 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d3 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d3 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d3 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d3 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d3 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d3 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1d3 SVD_CHECK_ERROR: Cmd Aborted (path = 1)
Jul 06 11:35:50 ISR1[1]: N: u1ctr ISP2200[1] Received LIP(f7,ef) async event
Jul 06 11:35:53 ISR1[1]: N: u1d9 sid 1305262 stype 2024 disk error 3
Jul 06 11:36:01 ISR1[1]: N: u1d9 sid 936401 stype 2024 disk error 3
Jul 06 11:42:21 ISR1[1]: N: u1d9 sid 1017153 stype 2024 disk error 3
Jul 06 20:41:42 ISR1[1]: N: u1d9 sid 1280447 stype 2024 disk error 3
Jul 06 20:41:45 ISR1[1]: N: u1d9 sid 1280448 stype 2024 disk error 3
Jul 06 20:41:47 ISR1[1]: N: u1d9 sid 1280688 stype 2024 disk error 3
Jul 06 20:41:48 ISR1[1]: N: u1d9 sid 1281168 stype 2024 disk error 3
Jul 06 20:41:51 ISR1[1]: W: u1d3 SCSI Disk Error Occurred (path = 0x1)
Jul 06 20:41:51 ISR1[1]: W: Sense Key = 0x3, Asc = 0x11, Ascq = 0x0
Jul 06 20:41:51 ISR1[1]: W: Sense Data Description = Unrecovered Read Error
Jul 06 20:41:51 ISR1[1]: W: Valid Information = 0x12bd255b
Jul 06 20:41:51 ISR1[1]: N: u1d3 SVD_DONE: Command Error = 0x3
Jul 06 20:41:51 ISR1[1]: N: u1d3 sid 2452927 stype 1003 disk error 3
Jul 06 20:41:51 SX11[1]: W: u1ctr read failed during recon stripe scb=126cdb0
Jul 06 20:41:51 SX11[1]: N: u1ctr Internal Command error (Multiple Disk Failed)
Jul 06 20:41:51 SX11[1]: N: u1ctr Internal Command error (Terminated by system)
Jul 06 20:41:51 LNXT[1]: W: u1ctr recon failed in vol (v0)
Jul 06 20:41:54 ISR1[1]: N: sid 1281169 stype 2024 disk error 3
Jul 06 20:41:54 LT00[1]: N: u1d6 Reconstruction to standby drive failed
Jul 06 20:41:54 LT00[1]: W: u1d6 Recon attempt failed
Jul 06 20:41:54 ISR1[1]: N: sid 1282120 stype 2024 disk error 3
Jul 06 20:41:55 ISR1[1]: N: sid 1000363 stype 2024 disk error 3
Jul 06 20:42:02 ISR1[1]: N: sid 1001324 stype 2024 disk error 3
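
To see at a glance which disks the messages above implicate, the log can be tallied per disk. The following is only a rough sketch: it assumes that every entry starts with a timestamp of the form "Jul 06 11:33:34" and that disks are named like u1d6; the helper names join_wrapped and count_by_disk are made up for this illustration and are not part of any Sun tool. It also rejoins lines that a mail client has wrapped.

```python
import re
from collections import Counter

# Assumption: a log entry begins with a timestamp such as "Jul 06 11:33:34".
ENTRY_START = re.compile(r"^[A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2} ")
# Assumption: disk identifiers look like u1d6 (unit 1, disk 6).
DISK_ID = re.compile(r"\bu\d+d\d+\b")


def join_wrapped(lines):
    """Merge mail-wrapped continuation lines back into whole log entries."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            # No timestamp: treat as the tail of the previous entry.
            entries[-1] += " " + line
    return entries


def count_by_disk(entries):
    """Count entries per disk, using the first disk identifier in each entry."""
    counts = Counter()
    for entry in entries:
        match = DISK_ID.search(entry)
        if match:
            counts[match.group()] += 1
    return counts


# A few entries copied from the log above, still in their wrapped form.
sample = """\
Jul 06 11:33:34 ISR1[1]: N: u1d6 sid 944896 stype
2024 disk error 3
Jul 06 11:33:36 LPCT[1]: N: u1d6: Not ready on loop 1
Jul 06 11:35:50 ISR1[1]: N: u1ctr ISP2200[1] Fatal
timeout on u1d1
Jul 06 11:35:53 ISR1[1]: N: u1d9 sid 1305262 stype
2024 disk error 3
""".splitlines()

entries = join_wrapped(sample)
counts = count_by_disk(entries)
for disk, n in sorted(counts.items()):
    print(disk, n)
```

Entries that name no disk (for example the bare "sid ... disk error 3" lines near the end of the log) are simply skipped by this sketch.
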

Disk 6 was replaced again, but the same errors
occurred.

Disk 6 remains disabled.

It would seem that disk 3 is also faulty. However,
until we can get disk 6 enabled, we can't replace disk
3.

My questions are:

1. If disk 6 is replaced, why does an error on disk 3
affect reconstruction, since disk 6 should be
reconstructed from disk 9?

2. Is there any reason why we get errors on disk 6,
disks 1, 2 and 3, and disk 9?

3. How can you tell if disk 9 (standby) is in fact
being used?
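
As background to question 1: in RAID 5 in general, a lost member is rebuilt by XOR-ing the corresponding blocks of all surviving members, and a standby disk is normally the target of that rebuild rather than its source (the log does say "Reconstruction to standby disk"). The toy sketch below illustrates generic RAID 5 parity only; it is not the T3+ firmware's actual implementation, and the block contents and the helper name xor_blocks are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across 8 members: 7 data blocks plus 1 parity block (toy data).
data = [bytes([i, i + 1, i + 2, i + 3]) for i in range(1, 8)]
parity = xor_blocks(data)
stripe = data + [parity]

# Lose member 6 (index 5) and rebuild it from the seven survivors.
lost = 5
survivors = [blk for i, blk in enumerate(stripe) if i != lost]
rebuilt = xor_blocks(survivors)
assert rebuilt == stripe[lost]

# If member 3 (index 2) is also unreadable for this stripe, the XOR of the
# remaining six members no longer yields the lost block.
partial = [blk for i, blk in enumerate(stripe) if i not in (lost, 2)]
print(xor_blocks(partial) == stripe[lost])
```

Under this model, an unrecovered read error on disk 3 while disk 6 is being rebuilt leaves the affected stripe with two missing members, which would be consistent with the "Multiple Disk Failed" message in the log.
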

Thanks for any help or advice.

Anthony
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


