Re: Help : Lost one of our SVM volumes

From: Richard Skelton (Richard.Skelton@infineon.com)
Date: Mon Dec 19 2005 - 09:44:31 EST


Hi Managers

I think something in the FCAL loop overheated and shut down for over six
hours while it cooled down.
I don't think I have a disk problem, so I have run
metareplace -e d80 c4t7d0s2
At around 18:00 I will see whether the volume is OK.
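
One rough way to check the "loop, not disks" theory is to group the components that SVM flagged by controller: if they all sit on the same controller, a loop outage is the more likely cause. The Python sketch below is only an illustration. The name failed_by_controller is made up for this example, and it assumes each log entry is on a single line, as in /var/adm/messages.

```python
import re
from collections import defaultdict

# Matches SVM warnings such as:
#   WARNING: md: d82: /dev/dsk/c4t6d0s2 needs maintenance
MAINT_RE = re.compile(
    r"md: (d\d+): /dev/dsk/(c\d+)(t\d+d\d+s\d+) needs maintenance"
)

def failed_by_controller(lines):
    """Group components flagged 'needs maintenance' by controller (c3, c4, ...)."""
    grouped = defaultdict(set)
    for line in lines:
        m = MAINT_RE.search(line)
        if m:
            volume, ctrl, rest = m.groups()
            grouped[ctrl].add((volume, ctrl + rest))
    # Sort for stable, readable output.
    return {ctrl: sorted(items) for ctrl, items in grouped.items()}

if __name__ == "__main__":
    # Three of the warnings from the log in the original post, unwrapped.
    sample = [
        "Dec 18 06:50:07 ccs001 md_mirror: [ID 104909 kern.warning] WARNING: "
        "md: d82: /dev/dsk/c4t6d0s2 needs maintenance",
        "Dec 18 06:50:11 ccs001 md_mirror: [ID 104909 kern.warning] WARNING: "
        "md: d82: /dev/dsk/c4t7d0s2 needs maintenance",
        "Dec 18 06:50:11 ccs001 md_mirror: [ID 104909 kern.warning] WARNING: "
        "md: d81: /dev/dsk/c4t5d0s2 needs maintenance",
    ]
    for ctrl, items in failed_by_controller(sample).items():
        print(ctrl, items)
```

In the log from the original post, every flagged component is on c4, which fits the picture of the whole loop dropping out rather than three disks failing at once.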

Richard Skelton wrote:

> Hi Managers,
> Over the weekend we lost one of our SVM volumes, d80.
> Looking at the messages file, it looks like the FCAL controller went
> offline for a while and when it came back SVM replaced a disk in
> maintenance with the hot spare.
> I have no more spare drives and SVM still wants me to replace more
> drives.
> The system may have been over temperature on Sunday morning, but I
> have no temperature monitoring on this system.
>
> How can I recover from this situation?
>
> messages:-
>
> Dec 18 06:49:17 ccs001 socal: [ID 403145 kern.info] ID[SUNWssa.socal.link.5010] socal1: port 1: Fibre Channel is OFFLINE
> Dec 18 06:50:03 ccs001 scsi: [ID 243001 kern.warning] WARNING: /sbus@a,0/SUNW,socal@d,10000/sf@1,0 (sf3):
> Dec 18 06:50:03 ccs001 Offline Timeout
> Dec 18 06:50:03 ccs001 scsi: [ID 243001 kern.info] /sbus@a,0/SUNW,socal@d,10000/sf@1,0 (sf3):
> Dec 18 06:50:03 ccs001 target 0x7 al_pa 0xda lun 0 offlined
> Dec 18 06:50:03 ccs001 scsi: [ID 243001 kern.info] /sbus@a,0/SUNW,socal@d,10000/sf@1,0 (sf3):
> Dec 18 06:50:03 ccs001 target 0x4 al_pa 0xe1 lun 0 offlined
> Dec 18 06:50:03 ccs001 scsi: [ID 243001 kern.info] /sbus@a,0/SUNW,socal@d,10000/sf@1,0 (sf3):
> Dec 18 06:50:03 ccs001 target 0x6 al_pa 0xdc lun 0 offlined
> Dec 18 06:50:03 ccs001 scsi: [ID 243001 kern.info] /sbus@a,0/SUNW,socal@d,10000/sf@1,0 (sf3):
> Dec 18 06:50:03 ccs001 target 0x5 al_pa 0xe0 lun 0 offlined
> Dec 18 06:50:03 ccs001 scsi: [ID 107833 kern.warning] WARNING: /sbus@a,0/SUNW,socal@d,10000/sf@1,0/ssd@w21000020376c8948,0 (ssd13):
> Dec 18 06:50:03 ccs001 ssdrestart transport failed (fffffffe)
> Dec 18 06:50:03 ccs001 md_stripe: [ID 641072 kern.warning] WARNING: md: d82: write error on /dev/dsk/c4t6d0s2
> Dec 18 06:50:03 ccs001 scsi: [ID 107833 kern.warning] WARNING: /sbus@a,0/SUNW,socal@d,10000/sf@1,0/ssd@w21000020379862b2,0 (ssd12):
> Dec 18 06:50:03 ccs001 ssdrestart transport failed (fffffffe)
> Dec 18 06:50:03 ccs001 md_stripe: [ID 641072 kern.warning] WARNING: md: d82: read error on /dev/dsk/c4t7d0s2
> Dec 18 06:50:03 ccs001 last message repeated 1 time
> Dec 18 06:50:03 ccs001 md_mirror: [ID 842313 kern.info] NOTICE: md: d82: B_FAILFAST I/O retry
> Dec 18 06:50:03 ccs001 md_stripe: [ID 641072 kern.warning] WARNING: md: d82: write error on /dev/dsk/c4t6d0s2
> Dec 18 06:50:03 ccs001 md: [ID 680156 kern.info] NOTICE: md: d82: B_FAILFAST I/O retry, 2 buf(s) dequeued
> Dec 18 06:50:07 ccs001 md_mirror: [ID 104909 kern.warning] WARNING: md: d82: /dev/dsk/c4t6d0s2 needs maintenance
> Dec 18 06:50:11 ccs001 md_mirror: [ID 104909 kern.warning] WARNING: md: d82: /dev/dsk/c4t7d0s2 needs maintenance
> Dec 18 06:50:11 ccs001 scsi: [ID 107833 kern.warning] WARNING: /sbus@a,0/SUNW,socal@d,10000/sf@1,0/ssd@w210000203747e732,0 (ssd14):
> Dec 18 06:50:11 ccs001 transport rejected (-2)
> Dec 18 06:50:11 ccs001 md_stripe: [ID 641072 kern.warning] WARNING: md: d81: read error on /dev/dsk/c4t5d0s2
> Dec 18 06:50:11 ccs001 md_mirror: [ID 842313 kern.info] NOTICE: md: d81: B_FAILFAST I/O retry
> Dec 18 06:50:11 ccs001 md_stripe: [ID 641072 kern.warning] WARNING: md: d81: read error on /dev/dsk/c4t5d0s2
> Dec 18 06:50:11 ccs001 md_mirror: [ID 104909 kern.warning] WARNING: md: d81: /dev/dsk/c4t5d0s2 needs maintenance
> Dec 18 06:50:11 ccs001 md_mirror: [ID 990438 kern.warning] WARNING: md: d81: /dev/dsk/c4t5d0s2 last erred
> Dec 18 06:50:11 ccs001 md_stripe: [ID 641072 kern.warning] WARNING: md: d81: read error on /dev/dsk/c4t5d0s2
> Dec 18 06:50:11 ccs001 md_mirror: [ID 842313 kern.info] NOTICE: md: d81: B_FAILFAST I/O retry
> Dec 18 06:50:11 ccs001 md_stripe: [ID 641072 kern.warning] WARNING: md: d81: read error on /dev/dsk/c4t5d0s2
> Dec 18 06:50:11 ccs001 last message repeated 1 time
> Dec 18 06:50:11 ccs001 md_mirror: [ID 842313 kern.info] NOTICE: md: d81: B_FAILFAST I/O retry
> Dec 18 06:50:11 ccs001 md_stripe: [ID 641072 kern.warning] WARNING: md: d81: read error on /dev/dsk/c4t5d0s2
> Dec 18 06:50:11 ccs001 md_mirror: [ID 842313 kern.info] NOTICE: md: d81: B_FAILFAST I/O retry
> Dec 18 06:50:11 ccs001 md_stripe: [ID 641072 kern.warning] WARNING: md: d81: read error on /dev/dsk/c4t5d0s2
> Dec 18 06:50:11 ccs001 last message repeated 1 time
> Dec 18 06:50:11 ccs001 md_stripe: [ID 241980 kern.notice] NOTICE: md: d82: hotspared device /dev/dsk/c4t6d0s2 with /dev/dsk/c3t3d0s2
> Dec 18 06:50:12 ccs001 scsi: [ID 107833 kern.warning] WARNING: /sbus@a,0/SUNW,socal@d,10000/sf@1,0/ssd@w2100002037d0a281,0 (ssd15):
> Dec 18 06:50:12 ccs001 transport rejected (-2)
> Dec 18 06:50:12 ccs001 md_mirror: [ID 842313 kern.info] NOTICE: md: d81: B_FAILFAST I/O retry
> Dec 18 06:50:12 ccs001 md_stripe: [ID 641072 kern.warning] WARNING: md: d81: write error on /dev/dsk/c4t5d0s2
> Dec 18 06:50:12 ccs001 last message repeated 1 time
> Dec 18 06:50:12 ccs001 md_mirror: [ID 842313 kern.info] NOTICE: md: d81: B_FAILFAST I/O retry
> Dec 18 06:50:12 ccs001 md_stripe: [ID 641072 kern.warning] WARNING: md: d81: write error on /dev/dsk/c4t5d0s2
> Dec 18 06:50:12 ccs001 last message repeated 1 time
> Dec 18 06:50:12 ccs001 ufs_log: [ID 702911 kern.warning] WARNING: Error writing ufs log
> Dec 18 06:50:12 ccs001 ufs_log: [ID 127457 kern.warning] WARNING: ufs log for /export/work changed state to Error
> Dec 18 06:50:12 ccs001 ufs_log: [ID 616219 kern.warning] WARNING: Please umount(1M) /export/work and run fsck(1M)
> Dec 18 06:50:12 ccs001 md_stripe: [ID 641072 kern.warning] WARNING: md: d81: read error on /dev/dsk/c4t5d0s2
> Dec 18 06:50:12 ccs001 md_mirror: [ID 842313 kern.info] NOTICE: md: d81: B_FAILFAST I/O retry
> Dec 18 06:50:12 ccs001 md_stripe: [ID 641072 kern.warning] WARNING: md: d81: read error on /dev/dsk/c4t5d0s2
>
>
>
> metastat:-
>
> d80: Mirror
> Submirror 0: d81
> State: Needs maintenance
> Submirror 1: d82
> State: Needs maintenance
> Pass: 1
> Read option: roundrobin (default)
> Write option: parallel (default)
> Size: 142245693 blocks
>
> d81: Submirror of d80
> State: Needs maintenance
> Invoke: after replacing "Maintenance" components:
> metareplace d80 c4t5d0s2 <new device>
> Hot spare pool: hsp001
> Size: 142245693 blocks
> Stripe 0: (interlace: 32 blocks)
> Device Start Block Dbase State Hot Spare
> c3t2d0s2 0 No Okay
> c4t5d0s2 2889 No Last Erred
>
> d82: Submirror of d80
> State: Needs maintenance
> Invoke: metareplace d80 c4t7d0s2 <new device>
> Hot spare pool: hsp001
> Size: 142245693 blocks
> Stripe 0: (interlace: 32 blocks)
> Device Start Block Dbase State Hot Spare
> c4t6d0s2 0 No Okay c3t3d0s2
> c4t7d0s2 2889 No Maintenance
>
> hsp001: 1 hot spare
> c3t3d0s2 In use 71127180 blocks
>
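
Once the resync has finished, the component rows of the metastat output can be checked mechanically for anything that is not "Okay". This is a minimal Python sketch, not a supported tool: components_not_okay is a hypothetical helper, and the row layout (device, start block, Dbase, state, optional hot spare) is assumed from the output quoted above.

```python
import re

# One component row of metastat output: device, start block, Dbase flag,
# state (may be two words, e.g. "Last Erred"), optional hot spare.
# A leading "> " quote prefix is tolerated.
ROW_RE = re.compile(
    r"^[>\s]*(c\d+t\d+d\d+s\d+)\s+(\d+)\s+(Yes|No)\s+"
    r"([A-Za-z ]+?)(?:\s+(c\d+t\d+d\d+s\d+))?\s*$"
)

def components_not_okay(metastat_text):
    """Return (device, state, hot_spare) for each row whose state is not 'Okay'."""
    bad = []
    for line in metastat_text.splitlines():
        m = ROW_RE.match(line)
        if m and m.group(4) != "Okay":
            bad.append((m.group(1), m.group(4), m.group(5)))
    return bad

if __name__ == "__main__":
    # Rows taken from the d82 section of the quoted metastat output.
    text = (
        "Device Start Block Dbase State Hot Spare\n"
        "c4t6d0s2 0 No Okay c3t3d0s2\n"
        "c4t7d0s2 2889 No Maintenance\n"
    )
    for device, state, spare in components_not_okay(text):
        print(device, state, spare)
```

An empty result would suggest every component row reports "Okay"; the mirror-level State lines are not parsed here and still need a look.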

-- 
Cheers
Richard Skelton
Richard.Skelton@infineon.com
Infineon Technologies UK Ltd
Infineon House
Great Western Court
Hunts Ground Road
Stoke Gifford
Bristol
BS34 8HP
Tel +44(0)117 9528808

