SUMMARY: ufs large filesystem recovery

From: Dominic Hargreaves (dom+tum@astro.ox.ac.uk)
Date: Thu May 08 2003 - 09:10:48 EDT


The problem was that a large UFS filesystem (~450 GB) produced errors from
fsck. Also, it seemed that the disklabel had been overwritten. As always,
multiple red herrings colluded to make the problem more interesting.

Thanks to replies from:

tsh@mrc-lmb.cam.ac.uk
"Alan Rollow - Dr. File System's Home for Wayward Inodes." <alan@desdra.cxo.cpqcorp.net>
"Dr Thomas.Blinn@HP.com" <tpb@doctor.zk3.dec.com>
"Brusche, Johan" <johan.brusche@hp.com>

Thomas and Alan pointed out that this was a very large filesystem and that
fsck might have trouble allocating memory, since it needs 1 byte for every
fragment on the filesystem. Thomas also very helpfully pointed out that
"disklabel -p" prints the /prototype/ disklabel for the disk, not the actual
one. Using the correct "disklabel -r" form yielded familiar results.
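
To put the 1-byte-per-fragment figure in perspective, here is a rough
back-of-envelope sketch. The fragment sizes are assumptions (the actual
fragment size of this filesystem is not stated above), the ~450 GB is treated
as 450 GiB, and the helper name is made up for illustration. It counts only
the per-fragment table, not anything else fsck allocates.

```python
# Rough estimate of the memory fsck would need for a table holding
# one byte per filesystem fragment. All fragment sizes are assumed values.
GIB = 1024 ** 3


def fsck_fragment_map_bytes(fs_bytes, frag_bytes, bytes_per_frag=1):
    """Return the bytes needed to hold bytes_per_frag for every fragment."""
    fragments = fs_bytes // frag_bytes
    return fragments * bytes_per_frag


if __name__ == "__main__":
    fs_bytes = 450 * GIB  # ~450 GB from the post, treated as GiB here
    for frag in (512, 1024, 2048, 4096, 8192):  # hypothetical fragment sizes
        need = fsck_fragment_map_bytes(fs_bytes, frag)
        print(f"{frag:5d}-byte fragments: {need / GIB:6.2f} GiB")
```

With 1 KB fragments this comes to roughly 0.44 GiB for the table alone, which
would fit in 1GB of real memory but could plausibly exceed a per-process data
size limit, if one were in force.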

In fact, the machine had ample memory (1GB real, 2GB total) and I could see
no obvious limits in ulimit or /etc/sysconfigtab. However, at this point I
realised that I could mount the dirty filesystem read-only
(mount -r -d /dev/rz19a) and this allowed access to the data. Since the
system was fairly idle at the time of the crash, I believe this was a
reasonable strategy. I will now recreate several smaller filesystems on the
disk and test that I am able to fsck them before restoring the data.
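
For anyone wanting to double-check the per-process limits mentioned above
from inside a program rather than via ulimit, a minimal sketch follows. It
uses the standard POSIX getrlimit interface through Python's resource module
and assumes that the data segment limit (RLIMIT_DATA) is the relevant one for
fsck's allocation, which I have not confirmed. The helper names are made up.

```python
import resource


def describe_limit(value):
    """Format an rlimit value in MiB, or 'unlimited' for RLIM_INFINITY."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 1024 ** 2:.0f} MiB"


def data_segment_limits():
    """Return the (soft, hard) data segment limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_DATA)


if __name__ == "__main__":
    soft, hard = data_segment_limits()
    print("data segment soft limit:", describe_limit(soft))
    print("data segment hard limit:", describe_limit(hard))
```

A soft limit below the size of fsck's per-fragment table would explain an
allocation failure even on a machine with plenty of free memory.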

If someone could point me at some other possible cause of the process limit/
fsck error message, I would be interested, but the panic is now over :)

Regards,

-- 
Dominic Hargreaves || Astrophysics Deputy Systems Manager

