Solaris 9 Volume Manager metadb issues / layout advice / T3s not being seen

From: Shin (shin@solarider.org)
Date: Wed Sep 15 2004 - 20:18:08 EDT


Hi,

Apologies for the convoluted Subject field; just trying to cover a
lot of ground in a small space :-)

Please bear with me as I've included a lot of info and there are a
number of questions inline. Apologies if it's all a bit much; I
wanted to provide as much info as I could, so the questions got a
bit buried.

I have a V240 with a Sun 2Gb PCI Fibre Channel card in it. This is
connected to a fibre switch which has a number of devices, such as a
3510 and three T3s, attached to it. Everything is configured so that
only this V240 sees the three T3s.

The V240 sees five disk devices:

0 - 36GB (hdd0) - the system disk, c0t0d0
1 - 36GB (hdd1) - for future use, mirroring the system disk - c0t1d0
2 - T3 (c2t50020F230000BB5Cd0 <SUN-T300-0200 cyl 34530 alt 2 hd 112 sec 128>
                     /pci@1e,600000/SUNW,qlc@3/fp@0,0/ssd@w50020f230000bb5c,0)
3 - ditto (c2t50020F230000C11Dd0)
4 - ditto (c2t50020F230000C195d0)

I decided to use Solaris Volume Manager (SVM) to manage the T3s by
building a RAID 0 (stripe) across them and then using soft partitions
to break things up from there for actual usage.

Question: Is this the best approach? I.e. a RAID 0 stripe over the
underlying RAID 5 volumes of the T3s, and then soft partitioning on top?

Disk-0 has been partitioned with a 100MB s3 to use for the state
database. Disk-2, disk-3 and disk-4 are partitioned such that they
all have a 100MB s3 for the state database replicas, with the
remainder of each disk on s7.

[The T3s are each configured with the first eight disks in a RAID 5
configuration and disk 9 as a hot spare.]

I proceeded thus:

1. Create the state database (two replicas on the system disk, s3)

metadb -a -f -c2 c0t0d0s3

2. Add a replica on each of the T3s

metadb -a c2t50020F230000BB5Cd0s3
metadb -a c2t50020F230000C11Dd0s3
metadb -a c2t50020F230000C195d0s3
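
To double-check, I believe simply listing the replicas should show
all five (two on c0t0d0s3 and one on each T3 s3 slice); I am only
using this as a sanity check:

# should list five replicas in total
metadb -i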

3. Create a RAID 0 stripe across the three T3s

metainit d3 1 3 c2t50020F230000BB5Cd0s7 c2t50020F230000C11Dd0s7 c2t50020F230000C195d0s7 -i 64k

The metastat output is

root@doyle 67 % metastat d3
d3: Concat/Stripe
    Size: 1484421120 blocks (707 GB)
    Stripe 0: (interlace: 128 blocks)
        Device                   Start Block  Dbase  Reloc
        c2t50020F230000BB5Cd0s7            0  No     Yes
        c2t50020F230000C11Dd0s7            0  No     Yes
        c2t50020F230000C195d0s7            0  No     Yes

Device Relocation Information:
Device                 Reloc  Device ID
c2t50020F230000BB5Cd0  Yes    id1,ssd@w60020f200000bb5c3f6f0e31000b5fb4
c2t50020F230000C11Dd0  Yes    id1,ssd@w60020f200000c11d4145d5df0008e1de
c2t50020F230000C195d0  Yes    id1,ssd@w60020f200000c1954146e7e70002e92a

I had read that it was best to use an interlace that matched the
block size of the underlying devices. According to my docs the T3s
have a block size of 64K, hence I passed -i 64k to metainit.
However, the metastat output above says 128 blocks. Shouldn't this
be 64? After experimenting I found that if I used -i 32k then I got
64 blocks in the output from metastat.
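
If I do the arithmetic on the assumption that metastat reports the
interlace in 512-byte disk blocks (I am not sure that is actually
what it does), the numbers do line up:

# assuming the "blocks" in metastat output are 512-byte disk blocks:
#   -i 64k  ->  65536 / 512 = 128 blocks
#   -i 32k  ->  32768 / 512 =  64 blocks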

Question: Should I be using -i 32k or some other value? What is the
best way to determine this value?

4. Create soft partitions d31 and d32

metainit d31 -p d3 100gb
metainit d32 -p d3 200gb

5. Then I ran newfs on the soft partitions and mounted them up.
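
For completeness, the newfs/mount step was nothing special, roughly
along these lines (the mount points here are just examples):

newfs /dev/md/rdsk/d31
newfs /dev/md/rdsk/d32
mkdir -p /data1 /data2            # example mount points only
mount /dev/md/dsk/d31 /data1
mount /dev/md/dsk/d32 /data2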

So far so good.

However, when I put the machine through a reboot, it stopped with
the following error:

"
Insufficient metadevice database replicas located.

Use metadb to delete databases which are broken.
Ignore any "Read-only file system" error messages.
Reboot the system when finished to reload the metadevice database.
After reboot, repair any broken database replicas which were deleted.

Type control-d to proceed with normal startup,
"

I went in in single-user mode, and on running format I could only
see the two 36GB disks, c0t0d0 and c0t1d0.

From my understanding of SVM, it was complaining because it could
not see more than 50% of the state database replicas, i.e. it could
only see the two on c0t0d0s3.

I figured that this might be because the system could not see the
T3s this early in the boot process, so I tried to fiddle with
/etc/system to force-load the necessary QLogic drivers for the fibre
card and the associated fibre drivers. I ended up with the following:

forceload: drv/ifp
forceload: drv/fp
forceload: drv/fctl
forceload: drv/qlc
forceload: drv/fcp
forceload: drv/ssd
forceload: drv/fcip
forceload: drv/fcsm
forceload: drv/sd
forceload: drv/glm
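
(I assume something like the following would show which of these
modules actually end up loaded, though I have not dug into that much:)

# list the loaded kernel modules related to the FC stack
modinfo | egrep 'qlc|fcp|fctl|ssd'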

However, this still didn't seem to work, so I was either missing the
right driver or misunderstanding something (probably the latter).

Question: How do I get the T3s seen early enough that all the state
database replicas are found, and not just the ones on c0t0d0s3?

Question: Should I just put three state database replicas on
c0t0d0s3 and be done with it?

Also, I want to use the second internal disk, c0t1d0, to mirror the
system disk.

Question: Should I set up the mirroring of the system disk c0t0d0 to
c0t1d0 and stick with the three replicas on c0t0d0s3, as I will
always have a copy on the second disk, or should I use metadb to add
replicas on s3 of c0t1d0 as well and then mirror the remaining
slices? I'm not sure exactly how one does the latter.
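
My rough understanding of the procedure for the latter, pieced
together from the docs (please correct anything that is wrong; the
d0/d10/d20 names are just placeholders), is along these lines for
the root slice, with the other slices done the same way minus the
metaroot step:

# copy the partition table from the boot disk to the second disk
prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c0t1d0s2

# add state database replicas on the second disk as well
metadb -a -c2 c0t1d0s3

# one-way mirror of root for now; attach the second half after a reboot
metainit -f d10 1 1 c0t0d0s0
metainit d20 1 1 c0t1d0s0
metainit d0 -m d10
metaroot d0            # updates /etc/vfstab and /etc/system for root
lockfs -fa
init 6

# after the reboot
metattach d0 d20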

Question: If I do set up mirroring of the system disk, should I do
that first, before creating the RAID 0 stripe and the soft
partitions, or should I do it the other way around?

Question: Is it even sensible to try to keep a state database
replica on each of the T3s?

I guess none of this would be a problem if the replicas on the T3s
were seen properly, so getting them seen early in the boot would solve it?

Also, am I likely to run into the same issue when attempting to use
LUNs from the 3510 on a different host?

How would you do all of the above if you were using SVM with a
similar hardware configuration?

Your advice on how to proceed, general design/layout advice, etc.
will be gratefully received. I will of course send out a summary.

TIA
Shin
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


