SunFire V440 disk replacement problems

From: Gene Beaird (bgbeaird@sbcglobal.net)
Date: Mon Feb 05 2007 - 20:24:45 EST


Hello all,

I seem to be having a run of bad luck with Sunfire V440 servers lately. I
have had to replace disk0 on two different systems in the last two weeks.
Each time, the server has crashed. I don't understand why. They are
supposed to be hot-swappable.

On neither box were the disks hardware mirrored; they were mirrored only via
SVM. I verified this before starting, following the V440 Server Admin and V440
hardware replacement manuals: raidctl indicated that there were no RAID
disks.
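
For reference, a check like the one described above can be sketched as
follows. This is only a sketch, not the exact commands run on these systems.
It prints three read-only Solaris commands so they can be reviewed before
being typed on a production box. The function name show_mirror_checks is made
up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: prints the checks, does not execute them.
show_mirror_checks() {
    echo "raidctl"       # with no arguments, reports hardware RAID volumes (if any)
    echo "metastat -p"   # SVM metadevice layout in md.tab format
    echo "metadb -i"     # SVM state database replicas and the slices holding them
}

show_mirror_checks
```

The metadb output is worth a look because replicas living on the disk to be
pulled are still in use by SVM even when no hardware RAID is configured.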

On both systems, I issued the cfgadm -c unconfigure c1::c1t0d0 command
to remove the disk from system control. After swapping the drive, I tried
issuing cfgadm -c configure c1::c1t0d, but got some sort
of error, something like 'target invalid' or 'target unavailable'.
On both systems I had to issue a devfsadm before the system would 'see' the
drive.
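
For comparison, here is a sketch of the order in which these steps are
commonly described for a disk that is part of an SVM mirror: drop the state
database replicas and detach the submirror before the unconfigure, then
configure, copy the label from the surviving disk, and re-attach. It is a dry
run that only prints commands. The metadevice names (d10, d11), the replica
slice (s7), the surviving disk (c1t1d0) and the function name are all
assumptions for illustration, as is the ap_id form c1::dsk/c1t0d0, which is
how cfgadm -al usually lists SCSI disks. Use the ap_id that cfgadm -al
actually reports on the system.

```shell
#!/bin/sh
# Dry run only: prints a candidate command sequence, executes nothing.
# Every name passed in at the bottom is hypothetical.
print_swap_plan() {
    ctrl=$1; disk=$2; good=$3; mirror=$4; submirror=$5; slice=$6
    ap_id="${ctrl}::dsk/${disk}"   # assumed ap_id form, verify with cfgadm -al

    echo "metadb -d ${disk}${slice}"            # remove replicas on the failing disk
    echo "metadetach ${mirror} ${submirror}"    # detach its submirror
    echo "cfgadm -c unconfigure ${ap_id}"       # release the disk from the OS
    echo "# physically swap the drive here"
    echo "cfgadm -c configure ${ap_id}"         # bring the new disk under OS control
    echo "prtvtoc /dev/rdsk/${good}s2 | fmthard -s - /dev/rdsk/${disk}s2"  # copy the label
    echo "metadb -a ${disk}${slice}"            # recreate a replica
    echo "metattach ${mirror} ${submirror}"     # re-attach and resync
}

print_swap_plan c1 c1t0d0 c1t1d0 d10 d11 s7
```

A real root disk normally has one mirror per slice, so the metadetach and
metattach lines would be repeated for each of them.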

On the first system, the machine crashed immediately upon inserting the new
drive. On system No. 2, it stayed up during the disk swap, but crashed when
I issued the devfsadm command.

On system No. 1, after we rebooted the machine on disk1, I did a devfsadm
and it successfully saw AND pulled the new drive into cfgadm. That is, I
could see it when I ran cfgadm -al. Once I got system No. 2 back up on
disk1, I could see the new drive with cfgadm -al. I guess it had scanned
the device and added the drive to the device tree before crashing. No idea
though.

I will add that on system No. 2, I could not get the thing to successfully
boot on disk1 until we powered off the system and reseated all the system
drives. After that, it came up with no issues. I was able to easily
re-mirror the new drive and it has been happy since.

On System No. 1, after the system booted, I noticed that another disk,
disk3, which is a mirrored data disk, was offline. I have a case open with
Sun to replace that drive. The current plan is to power off the box and
either reseat all the drives, or replace disk3 and reseat all the others
before bringing it back up.

So, has anyone else had such issues with V440s? Is this really a system
that needs to be powered off before replacing disks? What did I miss? I am
really getting tired of 26-hour disk swaps on production systems. Thank
you.

Regards,

Gene Beaird,
Houston, Texas

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

