Replacing D1000 disks

compiled by Mike Spooner

The D1000 is just a bunch of disks (JBOD); it does not do "internal volume configuration" or anything like that.

The disks are also addressed geographically, rather than individually (unlike FC-AL drives), so physically replacing a drive is indistinguishable from physically repairing it, as far as VxVM and Solaris are concerned.

The D1000's electrical and power design allows for the removal of spinning, powered-up drives (but see step 5 below), and the insertion of drives into a powered-up chassis.

The D1000 is not smart enough to automatically recognise when a new disk has been inserted (or at least does not automatically report it to Solaris), so when using VxVM, you have to tell VxVM that "the disk has been replaced".

Thus the procedure is:

  1. If you have any part of a non-mirrored/non-RAID5 filesystem on any of the slices of that disk, unmount the affected filesystems.
  2. If you have any processes accessing non-mirrored/non-RAID5 volumes on any of the slices on that disk (e.g. an Oracle database using "raw" slices or "raw" volumes), stop those processes.
  3. If you have any part of non-mirrored/non-RAID5 raw swap volumes on any of the slices of that disk, remove them from the swap pool using "swap -d".
  4. Tell VxVM that you are about to remove the drive: from the vxdiskadm menu, choose "Remove a disk for replacement".
  5. Remove the faulty disk drive (usual spud-bracket technique), but leave it "half-in/half-out" of the slot for 30 seconds, to allow it to spin down. The wait is only really necessary if you are going to reuse the drive later, rather than chucking it in the bin.
  6. Insert the new disk drive (usual spud-bracket technique).
  7. Tell VxVM to scan for attached drives (this allows VxVM to pick up the geometry of the new drive, and spin it up if needed). The command is: "vxdctl enable".
  8. Tell VxVM that the disk has now been replaced: from the vxdiskadm menu, choose "Replace a failed or removed disk".
  9. Add back the "raw" swap areas removed in step 3, using "swap -a".
  10. Restart the processes stopped in step 2.
  11. Remount the filesystems unmounted in step 1.
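
The command-line side of the procedure above can be sketched as a dry-run script. It only prints the commands in order, so they can be reviewed and typed by hand; it executes none of them. The helper name print_plan, the mount point /export/data and the swap volume path /dev/vx/dsk/rootdg/swapvol2 are illustrative assumptions, not anything taken from a real system. It also assumes the filesystem is listed in /etc/vfstab, so that a bare "mount <mountpoint>" works. The vxdiskadm steps remain interactive menu choices, and stopping and restarting processes depends on the application.

```shell
#!/bin/sh
# Dry-run sketch of the D1000 replacement procedure: it only PRINTS the
# commands, in order, so they can be reviewed and typed by hand.
# print_plan is a hypothetical helper, not part of Solaris or VxVM.

print_plan() {
    mnt=$1      # mount point of an affected non-mirrored filesystem
    swapvol=$2  # device path of an affected non-mirrored swap volume
    cat <<EOF
umount $mnt
# (stop any processes using raw slices or volumes on the disk)
swap -d $swapvol
vxdiskadm    # choose "Remove a disk for replacement"
# (pull the old drive, wait 30 seconds, insert the new drive)
vxdctl enable
vxdiskadm    # choose "Replace a failed or removed disk"
swap -a $swapvol
# (restart the processes stopped earlier)
mount $mnt
EOF
}

# Example values are assumptions; substitute your own.
print_plan /export/data /dev/vx/dsk/rootdg/swapvol2
```

If the disk carries no unmirrored filesystems or swap, the umount/mount and swap lines simply do not apply; the vxdiskadm and "vxdctl enable" steps are still needed.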