i/o error when mounting AdvFS filesets

From: kdea@alpine-la.com
Date: Thu Jul 27 2006 - 18:41:02 EDT


Dear Managers,

We had almost exactly the same problem as he had about a year ago. There
was no solution at the time, and we ended up losing all our data. I would
like to see if anyone else has seen this problem since and

I have an ES45 with an attached MSA1000 SAN with approximately 1.7TB of
disk. The machine was given a graceful restart yesterday, and came back up
without mounting any of the AdvFS partitions. We have one large domain,
broken up into 8 different filesets. The only message I get is "i/o
error". This happens with a restart, a "mount -a" or a manual "mount"
command.

We can see the logical drive using disklabel. I can see all the devices
correctly with wwidmgr on the console or hwmgr on the OS; everything shows
up and is configured correctly. The WWID addresses all match up correctly.
In fact, we broke the spare drive off the set, and successfully made a new
set, new disklabel, new AdvFS domain, new AdvFS fileset, mounted it, and
copied data onto it. As far as I can tell, there is no hardware problem.

I've used all the AdvFS tools at my disposal. Running verify gets an i/o
error. advscan and fixfdmn don't work. I'm sure that salvage will
work; unfortunately, we don't have the time and extra disk space to
recover the entire domain.

The MSA1000 is at Firmware 4.48 build 342. It has two controllers,
connected to two HBAs set up for failover. The disk set is configured as
13 disks in one RAID5 set, plus one disk as a spare. The ES45 is at
Firmware 7.0-3, and the operating system is 5.1B-3 (latest patch kit about
March 2006). This had been working stably for a year before this happened.
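
As a rough sanity check on that layout, here is a minimal sketch of the
RAID5 capacity arithmetic. The 13-disk count comes from the description
above; the per-disk size of 146.8 GB is an assumption made purely for
illustration, and the function name raid5_usable_gb is hypothetical.

```python
def raid5_usable_gb(total_disks: int, disk_gb: float) -> float:
    """Usable capacity of one RAID5 set: one disk's worth is spent on parity."""
    if total_disks < 3:
        raise ValueError("RAID5 needs at least 3 disks")
    return (total_disks - 1) * disk_gb


# 13 disks in the RAID5 set (from the message); the spare adds no capacity.
# ASSUMPTION: 146.8 GB per disk, a hypothetical size not stated in the message.
usable_gb = raid5_usable_gb(13, 146.8)
usable_tb = usable_gb / 1000  # decimal terabytes
print(f"{usable_tb:.2f} TB usable")
```

Under that assumed disk size the result is about 1.76 TB, which is in line
with the "approximately 1.7TB" figure quoted above.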

--
Kevin Dea
UNIX System Administrator
Alpine Electronics Research of America

