SUMMARY: AdvFS problems

From: Bryan Dunlap (bcd@pacific.mps.ohio-state.edu)
Date: Tue Apr 08 2003 - 19:04:41 EDT


Many thanks to all the helpful folks on the list, including

Alan Rollow
Dr Thomas Blinn
Hines, Bruce D
Brusche, Johan
Chris Bryant

No doubt others of you replied, but I haven't seen the messages:
the host is also our mail server, and I've had it down today while
moving the filesets to other disks.

The question was:

   I'm having problems on a fileserver, Digital Unix 4.0E on an AlphaServer
   2100. They started early Sunday morning and happened twice more
   Monday night/Tuesday morning. There are many errors similar to the
   following:

     vmunix: AdvFS I/O error:
     vmunix: Volume: /dev/rz16c
     vmunix: Tag: 0xfffffffa.0000
     vmunix: Page: 8404
     vmunix: Block: 147088
     vmunix: Block count: 16
     vmunix: Type of operation: Read
     vmunix: Error: 5

   The volume, page and block vary but repeat; all are error 5 and nearly all
   the operations are read. The three volumes are members of the same domain
   (but all domains effectively become inaccessible).

   Is this likely a disk hardware problem? Perhaps the domain needs to be
   rebuilt? Any advice welcome.

The commands

   /sbin/advfs/verify
   uerf -R -r 199 -o full -f /var/adm/binary.errlog > errs.txt

were the most helpful. verify showed errors on 2 of about 20 filesets
in the domain, and it could not repair them. The uerf command showed
a large number of soft errors and a handful of hard errors on one
volume in the domain, so that disk is definitely failing. Things were
still stable enough to copy the filesets to new domains on newer
drives with vdump/vrestore. About 3 SCSI errors were reported during
the copy of ~53 GB.
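
For anyone facing a similar flood of messages, a quick tally of the
AdvFS I/O errors per volume can help show whether one disk stands out
before digging through the uerf output. What follows is only a rough
sketch in Python, not something I ran on the server: the function names
parse_advfs_errors and errors_per_volume are made up for illustration,
and it assumes the syslog lines keep the "vmunix: Field: value" layout
shown in the question. Error 5 is EIO, the generic Unix I/O error.

```python
import re
from collections import Counter

# Fields that appear in the AdvFS I/O error report quoted in the question.
KNOWN_FIELDS = {
    "Volume", "Tag", "Page", "Block",
    "Block count", "Type of operation", "Error",
}

# Assumed layout: "... vmunix: Field: value" (any syslog prefix is ignored).
FIELD_RE = re.compile(r"vmunix:\s*(?P<key>[^:]+):\s*(?P<value>.*\S)")

# The single report from the original question, used as sample input.
SAMPLE = """\
     vmunix: AdvFS I/O error:
     vmunix: Volume: /dev/rz16c
     vmunix: Tag: 0xfffffffa.0000
     vmunix: Page: 8404
     vmunix: Block: 147088
     vmunix: Block count: 16
     vmunix: Type of operation: Read
     vmunix: Error: 5
"""


def parse_advfs_errors(lines):
    """Group log lines into one dict per 'AdvFS I/O error' report."""
    reports = []
    current = None
    for line in lines:
        if "AdvFS I/O error" in line:
            # A new report starts here; later field lines belong to it.
            current = {}
            reports.append(current)
            continue
        match = FIELD_RE.search(line)
        if match is None or current is None:
            continue
        key = match.group("key").strip()
        if key in KNOWN_FIELDS:
            current[key] = match.group("value").strip()
    return reports


def errors_per_volume(reports):
    """Count reports per volume; reports lacking a Volume go to 'unknown'."""
    return Counter(report.get("Volume", "unknown") for report in reports)


if __name__ == "__main__":
    reports = parse_advfs_errors(SAMPLE.splitlines())
    for volume, count in errors_per_volume(reports).most_common():
        print(volume, count)
```

In practice the input would be the lines of the system message log
rather than SAMPLE; a volume with a much higher count than its
neighbours in the domain is the first one to check with uerf.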

        ==BD