SUMMARY: cannot remove corrupt file

From: kdea@alpine-la.com
Date: Thu Apr 03 2003 - 14:58:54 EST


Hi Managers,

Let me first thank Alan Rollow and Michael Polnick for their help. They
both suggested using the scu utility to reassign the bad blocks and then
running fsck. I have posted both suggestions as they may help future
readers. I did, however, cop out on the problem: the data on the disk was
scratch space and not critical, and we did have a maintenance agreement,
so we simply swapped out the bad disk for a brand new one.

---begin Alan's comments---
You may have more than one bad sector. I would:

                 o Scan the disk to find all the blocks that can't be
                    read and note them. scu(8) has access to a SCSI
                    command that tells the disk to get its data, but
                    not transfer the data, allowing it to run at internal
                    disk transfer speeds. I don't recall whether the
                    interface to the command is verify or scan. One
                    just reads, but the other writes and then reads.

                    BE VERY CAREFUL...

                    Alternatively, use dd(1) to read the raw partition.

                 o Having noted the blocks that are bad, use icheck and
                    ncheck to see what files and file system data
                    structures they belong to. The failure of clri(8)
                    almost certainly meant that there was a bad block in
                    the relevant inode space. That's going to affect a
                    bunch of files.

                 o Once you assess the damage, line up your backups, so
                    you can restore anything that is about to disappear.
                    For sufficiently widespread damage, sometimes it is
                    easier just to restore the whole file system.

                 o Use scu(8) to reassign the bad blocks to better
                    blocks. This probably won't be able to recover a good
                    copy of the data, which guarantees the data is now
                    corrupt, but the blocks will be readable.

                 o Run fsck(8) and let it repair the damage.

                 o Restore and/or rename missing or lost files.

                 Whether the crash caused the corruption, the corruption
                 caused the crash, or both were caused by some third
                 problem is hard to tell.
---end Alan's comments---
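
A note for future readers on Alan's dd(1) alternative: the idea is to read
the raw partition one block at a time and write down every block that
returns an I/O error. Below is a minimal Python sketch of that loop. The
function name, the total_blocks argument, and the injectable reader are
assumptions made for illustration; none of them come from dd or scu, and
the sketch only reads, it never writes to the device.

```python
import os


def find_unreadable_blocks(path, total_blocks, block_size=512, reader=os.pread):
    """Return the indices of blocks that raise an I/O error when read.

    path         -- raw device or file to scan (hypothetical, e.g. /dev/rrz5c)
    total_blocks -- number of blocks to try; the caller supplies this
                    because a raw device may not report its own size
    reader       -- function(fd, nbytes, offset) -> bytes; os.pread by
                    default, replaceable for testing
    """
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for block in range(total_blocks):
            try:
                data = reader(fd, block_size, block * block_size)
            except OSError:
                # Note the block and keep going, so that every bad block
                # is found in a single pass.
                bad.append(block)
                continue
            if not data:
                # An empty read means the end of the device was reached.
                break
    finally:
        os.close(fd)
    return bad
```

The resulting block numbers are in units of block_size from the start of
whatever path was opened, so they only line up with the sector numbers
fsck prints if both use the same unit and the same starting point.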

---begin Michael's comments---
You can manually reassign defective blocks with scu.
The partition must be unmounted.

scu -f /dev/rrz5c
scu> verify media options dpo # if you would check the whole partition
                                  takes some time
scu> reassign lba <block number>

After that you should check the file system with fsck.
---end Michael's comments---
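
When fsck reports several unreadable sectors, as in the original message
below, typing one reassign command per sector gets tedious. The following
is a rough Python sketch that pulls the sector numbers out of saved fsck
output and prints the matching scu command lines in the form Michael
shows. The function names and the partition_offset parameter are
hypothetical. It assumes the sector list sits on the same line as the
"COULD NOT BE READ:" message, and that the sector numbers fsck prints are
the LBAs scu expects, which should only be relied on when the partition
starts at sector 0 of the disk.

```python
import re


def bad_sectors_from_fsck(output):
    """Collect the sector numbers fsck lists as unreadable."""
    marker = "COULD NOT BE READ:"
    sectors = []
    for line in output.splitlines():
        if marker in line:
            # Everything after the marker is a comma-separated list.
            tail = line.split(marker, 1)[1]
            sectors.extend(int(n) for n in re.findall(r"\d+", tail))
    return sectors


def scu_reassign_commands(sectors, partition_offset=0):
    """Build one 'reassign lba' line per distinct sector.

    partition_offset is an assumption: the first sector of the partition
    on the disk, 0 if the partition covers the whole disk.
    """
    return [f"reassign lba {s + partition_offset}" for s in sorted(set(sectors))]


if __name__ == "__main__":
    sample = (
        "CANNOT READ: BLK 65923744\n"
        "CONTINUE? [yn] y\n"
        "\n"
        "THE FOLLOWING DISK SECTORS COULD NOT BE READ: 65923746, 65923749,\n"
    )
    for command in scu_reassign_commands(bad_sectors_from_fsck(sample)):
        print(command)
```

The sketch only prints the commands; reviewing them before typing them at
the scu> prompt is the safer habit, given Alan's warning to be very
careful.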

---original message---

We had a crash recently while a user was editing a file, resulting in a
corrupt file that cannot be viewed, listed with "ls -la", or deleted. I've
already unmounted the file system and run fsck on it, and it shows some
unreadable blocks:

CANNOT READ: BLK 65923744
CONTINUE? [yn] y

THE FOLLOWING DISK SECTORS COULD NOT BE READ: 65923746, 65923749,

When I ran:

# icheck -b 65923746 /dev/rrz5c
/dev/rrz5c:
65923746 arg; frag 1 of 8, inode=7776704, class=inodes 7776704-7776768
files 303551 (r=302724,d=719,b=0,c=0,sl=108,sock=0,fifo=0)
used 22918478 (i=7362,ii=969,b=2795861,f=484942)
free 11855350 (b=1480423,f=11966)
missing 37

I tried to remove the offending inode using clri, but I get an i/o error.

# clri /dev/rrz5c 7776704
clearing 7776704
clri: /dev/rrz5c: I/O error

Is there anything I can do next to delete the file and mark the bad
sector?
The machine is a DS20 running 4.0f pk 7. The filesystem is ufs.
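
For readers trying to interpret the fsck and icheck numbers above, here is
a small worked calculation in Python. It rests on assumptions that the
post does not state: 512-byte sectors, 1024-byte fragments (icheck does
say there are 8 fragments per block), 128-byte on-disk inodes, and inode
7776704 being the first inode stored in the block that starts at sector
65923744. The function name is hypothetical.

```python
def locate_in_block(block_start, bad_sector, first_inode,
                    sector_size=512, frag_size=1024, inode_size=128):
    """Return (fragment index, first inode, last inode) for a bad sector.

    All three size defaults are assumptions, not values read from the
    file system's superblock.
    """
    byte_offset = (bad_sector - block_start) * sector_size
    frag = byte_offset // frag_size          # which fragment of the block
    per_frag = frag_size // inode_size       # inodes stored per fragment
    first = first_inode + frag * per_frag
    return frag, first, first + per_frag - 1


if __name__ == "__main__":
    for sector in (65923746, 65923749):
        print(sector, locate_in_block(65923744, sector, 7776704))
    # 65923746 (1, 7776712, 7776719)
    # 65923749 (2, 7776720, 7776727)
```

Under those assumptions sector 65923746 falls in fragment 1, which agrees
with the "frag 1 of 8" that icheck prints, and a full block of 8 fragments
holds 64 inodes, which agrees with the span of the 7776704-7776768 class
range. Each unreadable fragment would then cover eight inodes.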

--
Kevin Dea
UNIX System Administrator
Alpine Electronics Research of America

