SLVM broken, it seems

From: Jeff Woolsey (jlw+sun@jlw.com)
Date: Thu Nov 11 2004 - 04:21:58 EST


OK, I'm confused/baffled. I have a colo'ed system whose SLVM
(DiskSuite) database replicas do not survive a reboot.

SunOS newwww 5.9 Generic_117171-07 sun4u sparc SUNW,Ultra-2
JASS has been run on this system.

A week ago, it had two identical 9GB disks in it, one with mounted
filesystems and the other spare. I spent an evening setting up
SLVM on it, and everything set up just fine. A week later a pair
of 73GB disks became available, so I detached half the mirrors,
metacleared the detached halves, and deleted half the metadbs to free
up one disk. I rebooted with one 73GB disk in its place, made slightly
larger filesystems, ran ufsdump|ufsrestore onto the new disk, undid
the rest of the SLVM setup, rebooted, and set up SLVM on the 73GB disk
with new databases and metadevices. Then I replaced the remaining 9GB
disk with the second 73GB disk, cloned the partitions, added metadbs,
made metadevices, ran metaroot, and started syncing things up. When
that was all done, I rebooted again, and the system panicked because
it couldn't mount root. With boot -a and all the defaults, the system
comes up far enough to edit vfstab and point each filesystem back at
its ordinary slice, and then it comes up. metadb reports that there
are no existing databases. Huh?

I took the two 9GB disks home, as I have an identical system there,
and managed to reproduce the problem with them. I thought the cause
was not rebooting between all the metainits and the metaroot. Adding
that reboot worked on the test machine, but not on the colo machine.

Tried many things, including copying off the metadb slice and
comparing that after a reboot (no differences, nothing overwritten),
enabling the meta daemons in inetd.conf (JASS-or-whatever had
disabled everything), forceloading misc/md_stripe etc. (no good),
not doing metaroot (no good). None of the slices overlap. Having
a swap device or not doesn't matter.
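
For anyone repeating the slice comparison mentioned above, here is a
minimal sketch in Python. It assumes the metadb slice was copied to two
ordinary files (for example with dd) before and after the reboot; the
function name and the file names in the usage note are hypothetical and
do not come from the original post.

```python
def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset of the first difference between two files,
    or None if the files are identical.

    If one file is a strict prefix of the other, the offset returned is
    the length of the shorter file.
    """
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                # Locate the exact differing byte inside this chunk.
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        return offset + i
                # No differing byte in the common part: lengths differ.
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                # Both files ended at the same point with no difference.
                return None
            offset += len(chunk_a)
```

Calling first_difference("metadb-before.img", "metadb-after.img") and
getting None would match the "no differences, nothing overwritten"
result described above.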

Somehow the list of where the metadbs are is not available to the
boot process. Comparing against a working system: on the working one,
the device names in /kernel/drv/md.conf have disk labels, while on the
problem system they have what look like WWNs. OpenBoot version is
3.19; however, things worked earlier with the 9GB disks (and nothing
enabled in inetd.conf) on the same OS rev and the same OpenBoot rev.
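
To compare the replica locations recorded on two systems side by side,
a minimal sketch in Python follows. It assumes the replica list sits in
/kernel/drv/md.conf in lines of the form
mddb_bootlistN="driver:minor:block[:devid] ..."; and that layout is an
assumption, not something stated in the post, so it should be checked
against the actual files. The function name parse_bootlists is
hypothetical.

```python
import re

# Assumed line shape: mddb_bootlist1="sd:7:16 sd:7:1050:<devid>";
# Commented-out lines (starting with '#') are deliberately not matched.
BOOTLIST_RE = re.compile(
    r'^[ \t]*(mddb_bootlist\d+)[ \t]*=[ \t]*"([^"]*)"[ \t]*;',
    re.MULTILINE,
)

def parse_bootlists(md_conf_text):
    """Return one dict per replica entry found in mddb_bootlist lines."""
    entries = []
    for name, value in BOOTLIST_RE.findall(md_conf_text):
        for token in value.split():
            # Assumed token layout: driver:minor:block[:devid].
            # maxsplit=3 keeps any colons inside the devid intact.
            parts = token.split(":", 3)
            if len(parts) < 3:
                continue
            if not (parts[1].isdigit() and parts[2].isdigit()):
                continue  # Not the layout this sketch expects; skip it.
            entries.append({
                "list": name,
                "driver": parts[0],
                "minor": int(parts[1]),
                "block": int(parts[2]),
                "devid": parts[3] if len(parts) == 4 else None,
            })
    return entries
```

Running parse_bootlists over the text of md.conf from the working
system and from the problem system, then comparing the "devid" fields,
would show at a glance which entries carry the longer identifiers.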

So, what blindingly obvious thing can I not see?

-- 
Jeff Woolsey {woolsey,jlw}@{jlw,jxh}.com,first.last@gmail.com
"A toy robot!!!!" -unlucky Japanese scientist
"And Leon's getting laaaarrger!"  -Johnny
"Delete! Delete! OK!" -Dr. Bronner on disk space management
"I didn't get a 'Harrumph!' out of _that_ guy." -Gov Le Petomaine