Failed mirrored drives -- how to rebuild mirror with replacement drive?

From: Cohen, Andy (Andy.Cohen@cognex.com)
Date: Tue Aug 16 2005 - 15:17:24 EDT


Hi,

We have a DS20E running Tru64 5.1. There are two internal disks that
are mirrored by LSM. Overnight one of them failed, which brought the
server down (see my previous email). We've rebooted off the one good
remaining drive. In the meantime we've removed the bad drive and have
received a replacement for it (DS-RZ2ED-LS). I'm told that this drive is
hot-swappable, so I can put it in with the system up and rebuild the
mirror.
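
For reference, here's what I've been using to check what LSM thinks of
each disk (going by the man pages, so correct me if there's a better
way):

#> voldisk list ## the failed disk's partitions should show as failed
#> volprint -ht ## plexes on the bad disk should show NODEVICE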

My questions are:

1) Once the new blank drive is installed, how do I make sure the
system doesn't decide that it's the good drive and resync the surviving
disk from this new, blank one, thereby wiping out the entire system
disk? Is that even a possibility?
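
My reading of the LSM docs (please correct me if I'm wrong) is that a
brand-new disk has no LSM private region, so it can't be picked as a
resync source; it only joins the mirror once I add it and attach the
plexes. Before attaching anything I was planning to double-check which
side LSM considers bad with:

#> volprint -ht | egrep 'NODEVICE|STALE' ## only the dsk0 plexes should match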

2) How do I rebuild the mirror?

Here's the volprint output:

root@odin==> volprint
Disk group: rootdg

TY NAME          ASSOC        KSTATE   LENGTH    PLOFFS STATE    TUTIL0 PUTIL0
dg rootdg        rootdg       -        -         -      -        -      -

dm dsk0f-AdvFS   -            -        -         -      NODEVICE -      -
dm dsk0h         -            -        -         -      NODEVICE -      -
dm dsk1d-AdvFS   dsk1d        -        4267761   -      -        -      -
dm dsk1f-AdvFS   dsk1f        -        12878154  -      -        -      -
dm dsk1h         dsk1h        -        4301507   -      -        -      -
dm root01        -            -        -         -      NODEVICE -      -
dm root02        dsk1a        -        636421    -      -        -      -
dm swap01        -            -        -         -      NODEVICE -      -
dm swap02        dsk1b        -        12354045  -      -        -      -
dm usr01         -            -        -         -      NODEVICE -      -

v  rootvol       root         ENABLED  636421    -      ACTIVE   -      -
pl rootvol-01    rootvol      DISABLED 636421    -      NODEVICE -      -
sd root01-01p    rootvol-01   DISABLED 16        0      NODEVICE -      -
sd root01-01     rootvol-01   DISABLED 636405    16     NODEVICE -      -
pl rootvol-02    rootvol      ENABLED  636421    -      ACTIVE   -      -
sd root02-02p    rootvol-02   ENABLED  16        0      -        -      -
sd root02-02     rootvol-02   ENABLED  636405    16     -        -      -

v  swapvol       swap         ENABLED  12354045  -      ACTIVE   -      -
pl swapvol-02    swapvol      ENABLED  12354045  -      ACTIVE   -      -
sd swap02-02     swapvol-02   ENABLED  12354045  0      -        -      -
pl swapvol-01    swapvol      DISABLED 12354045  -      NODEVICE -      -
sd swap01-01     swapvol-01   DISABLED 12354045  0      NODEVICE -      -

v  usrvol        fsgen        ENABLED  4267761   -      ACTIVE   -      -
pl usrvol-02     usrvol       ENABLED  4267761   -      ACTIVE   -      -
sd dsk1d-01      usrvol-02    ENABLED  4267761   0      -        -      -
pl usrvol-01     usrvol       DISABLED 4267761   -      NODEVICE -      -
sd usr01-01      usrvol-01    DISABLED 4267761   0      NODEVICE -      -

v  vol-dsk0f     fsgen        ENABLED  12878154  -      ACTIVE   -      -
pl vol-dsk0f-02  vol-dsk0f    ENABLED  12878154  -      ACTIVE   -      -
sd dsk1f-01      vol-dsk0f-02 ENABLED  12878154  0      -        -      -
pl vol-dsk0f-01  vol-dsk0f    DISABLED 12878154  -      NODEVICE -      -
sd dsk0f-01      vol-dsk0f-01 DISABLED 12878154  0      NODEVICE -      -

To my untrained eye it looks like dsk0 was the failed drive.
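
If I'm reading that right, these should confirm it:

#> voldisk list ## the dsk0 records should show as failed
#> hwmgr -view devices ## dsk0 should no longer be listed now that it's pulled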

Would the following be the series of commands I would issue to rebuild?

1. Disassociate and remove all plexes associated with the failed disk.

#> volplex -o rm dis rootvol-01 swapvol-01 usrvol-01 vol-dsk0f-01 ## vol-dsk0f-01 is also NODEVICE
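
Then confirm the plexes are really gone (just a sanity check on my part):

#> volprint ## only the surviving *-02 plexes should be left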

2. Remove the failed objects from LSM.

#> voldg rmdisk root01 swap01 usr01 dsk0h ## what about 'dsk0f-AdvFS' ?

#> voldisk rm dsk0a dsk0b dsk0d dsk0f dsk0h ## per volprint, a, b, d, f and h were the LSM partitions
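
And verify that LSM has really forgotten the old disk before swapping
drives:

#> voldisk list ## no dsk0* records should remain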

3. Replace disk and scan.

#> hwmgr -scan scsi

#> dsfmgr -e dskX dsk0 ## where dskX = the newly scanned disk

#> disklabel -rw dsk0
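
Then verify the rename and the label took before mirroring (going by
the dsfmgr(8) and disklabel(8) man pages):

#> dsfmgr -v ## consistency check of the device special files
#> disklabel -r dsk0 ## read the label back to be sure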

4. Remirror the boot disk.

#> volrootmir -a dsk0
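
Then I'd watch the resync before trusting the mirror again (I assume
the new plexes start out stale and come good when the sync completes):

#> volprint -ht ## new plexes should end up ENABLED/ACTIVE once synced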

Anything here incorrect? Did I miss anything?

Many, many thanks!
Andy


