SUMMARY: AdvFs problems

From: Colin Bull (c.bull@videonetworks.com)
Date: Wed Aug 20 2003 - 04:35:28 EDT


Thanks to Hans Verhoef, Johan Brusche, Judith Reed, Dr. Thomas Blinn and
Piotr Grzybowski.

The suggestions were:
        check binary.errlog
        verify -f
        issue a "file" command on the "raw" or "rdisk" version of each device name
        reboot server

There was also the possibility of a new version of firmware with a fix for
disk errors not cleared by AdvFS. I have not pursued this.

I was puzzled as to why two filesystems would have corruption at the same
time. One is on an internal disk, and the other is on a SAN JBOD. I still do
not know why. The HSG80 controller does not see any errors.

The internal disk problem appears to have been resolved simply by
rebooting the second server.

The SAN problem was only partly solved by the reboot: I could access some of
the files, but others gave I/O errors.
binary.errlog reported -
        bs_osf_complete: metadata write failed
        AdvFS Domain Panic; Domain devphy Id 0x3c67caa2.0208459f
        An AdvFS domain panic has occurred due to either a metadata write error or
        an internal inconsistency. This domain is being rendered inaccessible.
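
If you have many such entries to sift through, a small script can pull the
domain name and ID out of the panic line. This is only a sketch: it assumes
the binary.errlog entries have already been rendered as text, and the
function name parse_domain_panic is made up for illustration. The pattern is
based solely on the one line quoted above.

```python
import re

# Pattern modelled on the single panic line quoted in this post;
# other AdvFS versions may word the message differently.
PANIC_RE = re.compile(
    r"AdvFS Domain Panic; Domain (\S+) Id (0x[0-9a-fA-F]+\.[0-9a-fA-F]+)"
)

def parse_domain_panic(line):
    """Return (domain_name, domain_id) for a panic line, or None."""
    match = PANIC_RE.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Lines that do not match simply return None, so the function can be applied
to every line of the formatted log.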

My first task was to copy all the files I could access to a different
filesystem. I then tried verify -f, physically removed and replaced the disk,
tried verify -f once more, and finally ran the salvage command.

These achieved no more than I had already recovered. So now I will
re-initialise the disk, restore from the last backup (a couple of weeks old),
replace newer files from my copy directory (ignoring zero-length files) and
start again.
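
The merge step of that plan (restored backup plus newer rescued files, minus
zero-length ones) could be sketched roughly as follows. This assumes Python
is available on whichever host does the merge. The function name and both
directory arguments are hypothetical, and "newer" is judged by modification
time only.

```python
import os
import shutil

def merge_rescued_files(copy_dir, restored_dir):
    """Copy non-empty files from copy_dir into restored_dir when they are
    missing there or newer than the restored version.
    Returns the sorted relative paths that were copied."""
    copied = []
    for root, _dirs, files in os.walk(copy_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, copy_dir)
            dst = os.path.join(restored_dir, rel)
            # Zero-length files are assumed to be failed copies; skip them.
            if os.path.getsize(src) == 0:
                continue
            # Keep the restored file if it is at least as recent.
            if os.path.exists(dst) and os.path.getmtime(dst) >= os.path.getmtime(src):
                continue
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)  # copy2 preserves the modification time
            copied.append(rel)
    return sorted(copied)
```

The returned list of copied paths is worth keeping, so that anything
overwritten from the rescue copy can be checked by hand afterwards.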

>
> Tru64 5.1A
>
> We have a 2 node cluster on a HSG80 SAN. On Saturday I noticed an overnight
> batch run on Serv1 had only done the first half of its job. When starting to
> check I found an AdvFS file system on domain devphy gave an I/O error.
>
> I rebooted and the domain disappeared from the list of domains on that
> server. The other server still shows it.
> The domain is devphy and fileset dba, which is system wide.
>
> serv1# /sbin/showfdmn devphy
> showfdmn: unable to get info for domain 'devphy'
>
> On hunting around, there is also a CDSL that gives an I/O error on serv2,
> yet is OK on serv1. But I cannot cd to the CDSL directory on either server.
> The parent directory does not exist. I have tried to run fixfdmn on serv2
> and get the response -
>
> Can't fix domain with mounted filesets
> # umount /usr/informix/export
> /usr/cluster/members/member2/informix/export: No such device
>
> Any suggestions for the next move?
>
> Incidentally, we are not in USA, and have not had any power cuts.
>
> Colin Bull
> c.bull@videonetworks.com
>


