Sol 2.6 logging metadevice has problems

From: Gerhard den Hollander (gerhard@jasongeo.com)
Date: Tue May 21 2002 - 03:27:58 EDT


Apologies if this gets through twice.
I sent this out over an hour ago and still haven't seen it come back on this
list ...

> situation:
>
> We've got a (hardware) raid kit:
> 8 disks, 1 hot spare, hardware raid5 on that set.
>
> We had a disk error on the raid, replaced a disk and (automatically) had a
> rebuild of the raidset.
>
> Unfortunately, the rebuild did not go 100% correctly and we had some
> corruption on the disks.
>
> This raidset is seen by Solaris (2.6) as 1 big disk,
> I've split the disk in 2,
> a 20G s0
> and a 420G s6
>
> I've created a logging metadevice on top of that.
>
> metastat tells me:
>
> d99: Trans
>     State: Okay
>     Size: 880787456 blocks
>     Master Device: c4t0d0s6
>     Logging Device: c4t0d0s0
>
>         Master Device       Start Block  Dbase
>         c4t0d0s6                      0  No
>
> c4t0d0s0: Logging device for d99
>     State: Okay
>     Size: 2097152 blocks
>
>         Logging Device      Start Block  Dbase
>         c4t0d0s0                  18434  No
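
As a rough sanity check on the sizes reported by metastat, the block counts can be converted to GiB. This is only a sketch: it assumes 512-byte blocks, which the output itself does not state, and the helper name blocks_to_gib is made up for illustration.

```python
# Assumption: metastat counts 512-byte blocks (not stated in the output).
BLOCK_SIZE = 512


def blocks_to_gib(blocks, block_size=BLOCK_SIZE):
    """Convert a block count to GiB (2**30 bytes)."""
    return blocks * block_size / 2**30


master_gib = blocks_to_gib(880787456)  # d99 / c4t0d0s6
log_gib = blocks_to_gib(2097152)       # c4t0d0s0 as used by the log

print(f"master: {master_gib:.2f} GiB")  # master: 419.99 GiB
print(f"log:    {log_gib:.2f} GiB")     # log:    1.00 GiB
```

Under that assumption the master device matches the 420G slice, while the log reports 1 GiB even though s0 is described as 20G. That may simply reflect a limit on how much of the slice the log uses, which is worth checking against the DiskSuite documentation rather than taking from this sketch.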
>
>
> After the disk change, rebuild and fsck I remounted the disk, and the
> machine (and disk) worked fine for a whole day.
> Just after I got home I got a phone call that the server had crashed;
> a quick check showed a syslog entry that mentioned
>
> >
> > unix: panic[cpu1]/thread=0x303bbe80: free: freeing free block, dev:0x1540063, block:37288, ino:23756161, fs:/whopper
> >
>
> (and /whopper is the /dev/md/rdsk/d99 )
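
The dev value in the panic message can be split into major and minor numbers to see which device it refers to. The sketch below assumes the 32-bit Solaris dev_t layout (14-bit major, 18-bit minor); the function name split_dev is hypothetical.

```python
# Assumption: 32-bit Solaris dev_t = 14-bit major followed by 18-bit minor.
MINOR_BITS = 18
MINOR_MASK = (1 << MINOR_BITS) - 1


def split_dev(dev):
    """Return (major, minor) for a packed 32-bit device number."""
    return dev >> MINOR_BITS, dev & MINOR_MASK


major, minor = split_dev(0x1540063)  # dev value from the panic message
print(major, minor)  # 85 99
```

Under that layout the minor number 99 lines up with metadevice d99. Whether major 85 is the md driver on this host is an assumption that can be checked in /etc/name_to_major.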
>
> after the reboot, the machine crashed some 20 minutes later again.
> james.jason.nl unix: panic[cpu1]/thread=0x608e5ba0: fs = /whopper update: ro fs mod
> May 14 18:00:27 james.jason.nl unix: syncing file systems...panic[cpu1]/thread=0x608e5ba0: fs = /whopper update: ro fs mod
>
>
> Now I've mounted the disk read-only (and not shared it)
> and the system is stable.
>
> I've done some basic testing (fsck, checking text files to see if they are
> indeed text files, running some other integrity checks on the on-disk data)
> and it all seems to work fine.
>
>
> Does anyone have any idea what might have caused this ?
> is c4t0d0s0 (the logging device) corrupt ?
>
> Is the whole filesystem corrupt ?
>
> What is the best way to revive ?
>
> Should I run newfs on /dev/md/rdsk/d99?
> Should I run fsck on c4t0d0s6 (the raw device)?
> Should I throw away the metadevice and rebuild?

Kind regards,
--
Gerhard den Hollander                         Phone :+31-10.280.1515
Global IT Support manager                     Direct:+31-10.280.1539
Jason Geosystems BV                           Fax   :+31-10.280.1511
(When calling please note: we are in GMT+1)
gdenhollander@jasongeo.com                    POBox 1573
visit us at http://www.jasongeo.com           3000 BN Rotterdam
JASON.......#1 in Reservoir Characterization  The Netherlands

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

