SUMMARY: problem with vrestore

From: Cohen, Andy (Andy.Cohen@cognex.com)
Date: Wed Apr 30 2003 - 14:16:32 EDT


Thanks to everybody who helped me with this. It does seem to be a hardware
problem, as evidenced most clearly by:

uerf -R -o full|more:

----- EVENT INFORMATION -----

EVENT CLASS ERROR EVENT
OS EVENT TYPE 199. CAM SCSI
SEQUENCE NUMBER 2362.
OPERATING SYSTEM DEC OSF/1
OCCURRED/LOGGED ON Wed Apr 30 13:59:35 2003
OCCURRED ON SYSTEM loki
SYSTEM ID x0006001B
SYSTYPE x00000000

----- UNIT INFORMATION -----

CLASS x0001 TAPE
SUBSYSTEM x0000 DISK
BUS # x0000
                              x0030 LUN x0
                                        TARGET x6

----- CAM STRING -----

ROUTINE NAME ctape_ready

----- CAM STRING -----

ERROR TYPE Hard Error Detected

----- CAM STRING -----

DEVICE NAME DEC TLZ09 (C)DEC0167

----- CAM STRING -----

                                        Active CCB at time of error

----- CAM STRING -----

                                        CCB request completed with an error
ERROR - os_std, os_type = 11, std_type = 10

and 'tcopy /dev/rmt0h' gave

        read error, file 0, record 0: I/O error

Suggestions were:

=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
Are you up to date on your patch kits, such as they are? I do recall some
problems with vrestore, though I don't recall the exact details. The
message you gave does indicate some sort of error, usually attributed to
hardware. Have you run a cleaning tape through the drive? It's not impossible
that you have a read problem, though everything with a magnetic head does a
read after write as part of the error checking, so that is a low-probability
explanation.
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
what happens if you run a tar/cpio/dd operation to the tape drive? also do
you see anything in your binary errlog or syslog pertaining to tape drive
errors? low level driver errors are not always indicated on stderr but
should hopefully pop up in your system errlogs.

you may just need to run a tape cleaner cartridge through the drive - may
need to use a couple of fresh cleaners. we have had "good" tape cleaner
cartridges cop an attitude for some random, inexplicable reason.

you may also have a defective new DDS tape.
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
I have seen problems with a TLZ09 where I could write a tape using the
maximum blocking factor but not read it back correctly.

The /usr/field/tapex utility (part of the system exercisers optional
software) can do a pretty decent diagnostic test on a tape drive and
its media; it has a reference page.

You could try something like "tar" to write the tape and read it, but
the cam_status does suggest you have a hardware problem of some kind.
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

        If the SCSI driver got an error reading the original
        tapes, then it would have logged it in the binary event
        log. Depending on the system, it may be readable with
        uerf(8), which can format SCSI errors well enough. You
        will want to use the "-o full" option.

        Some tape drives have read-after-write built in. Some
        don't. The TLZ09 could be in the "don't" group, which
        means it can't notice content problems.

        A simple test would be to write some known data with a
        compatible tape, then read it back to see if it is the
        same or different. The tape exerciser in /usr/field
        has this as one of its tests.
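
        The write-then-read-back test described above could be sketched
        roughly as follows. This is only an illustration, not the tapex
        exerciser: the TAPE default, the block size and the file names are
        assumptions, and TAPE points at a scratch file unless you override
        it, so the script can be tried without a drive. It also assumes
        mktemp is available.

```sh
#!/bin/sh
# Write-then-read-back check, sketched with dd and cmp.
# ASSUMPTION: TAPE defaults to a scratch file so this runs without a drive;
# set TAPE=/dev/rmt0h (a rewind-on-close device) to exercise real hardware.
workdir=$(mktemp -d) || exit 1
TAPE=${TAPE:-$workdir/fake_tape}
BS=10240                      # hypothetical block size, in bytes

# Known data: 65536 ten-byte lines = 655360 bytes = exactly 64 blocks of $BS.
awk 'BEGIN { for (i = 0; i < 65536; i++) printf "%09d\n", i }' \
    > "$workdir/known.dat"

# Write the pattern; a rewinding tape device rewinds when dd closes it.
dd if="$workdir/known.dat" of="$TAPE" bs="$BS" 2>/dev/null

# Read it back from the start and compare byte for byte.
dd if="$TAPE" of="$workdir/readback.dat" bs="$BS" 2>/dev/null

if cmp -s "$workdir/known.dat" "$workdir/readback.dat"; then
    result=match
else
    result=MISMATCH
fi
echo "read-back check: $result"
rm -rf "$workdir"
```

        Against the scratch file this should report a match; a mismatch
        or a dd failure against the real device would point at the drive
        or the media rather than at vdump/vrestore.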

=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
Running tcopy is a good way to see if any data got onto the tape.

You should see blocking information and size for each tape file set.
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
This is the script we use on version 4.

mt -f /dev/rmt0h rewind
/sbin/vdump -0uf /dev/nrmt0h /
/sbin/vdump -0uf /dev/nrmt0h /usr
mt -f /dev/rmt0h rewoffl

When you list the tape you have to

mt -f /dev/rmt0h rewind
/sbin/vrestore -tvf /dev/nrmt0h <-- repeat this command for every save set on the tape.
mt -f /dev/rmt0h rewoffl <-- to rewind and unload the tape.
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
I don't see you using the no-rewind device. What I do, however, is add another
disk and pipe vdump into vrestore for /, /usr and /var. I then have a backout
plan, or a bootable disk, in case of upgrade disasters.
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
Do you have more than one tape device on the system/cluster? An mt stat may
show if it's your drive that is faulty.
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
Try just one file-set. BTW, is that "vrestore" from 5.1A? That might be
totally incompatible... Try TAR.
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
Probably a bug. Try two things:

1) Back up only ONE file system to the tape, and don't mess with the
no rewind thing.

2) Try sending the backup to a file on disk and see if you can read
that, or pipe the vdump output to vrestore -t directly; if you can do
the vdump to a disk file or a pipe and read it with vrestore, then it
is almost certainly a bug that you can't read the tape.
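
A minimal sketch of suggestion 2, assuming an AdvFS system with vdump and
vrestore on the PATH. The function name, its arguments and the return codes
are hypothetical, chosen only for illustration. The pipe variant would be
something like "vdump -0 -f - /usr | vrestore -t -f -".

```sh
#!/bin/sh
# Take the tape out of the picture: dump to a disk file, then list it.
# check_saveset_on_disk and its arguments are hypothetical names.
# Returns 0 if vrestore can list the save-set, 1 if a step fails,
# 2 if vdump/vrestore are not installed on this machine.
check_saveset_on_disk() {
    fs=$1         # file system to dump, e.g. /usr
    saveset=$2    # disk file that will hold the save-set

    if ! command -v vdump >/dev/null 2>&1 ||
       ! command -v vrestore >/dev/null 2>&1; then
        echo "vdump/vrestore not found; skipping" >&2
        return 2
    fi

    vdump -0 -f "$saveset" "$fs" || return 1          # level 0 dump to a file
    vrestore -t -f "$saveset" >/dev/null || return 1  # list the save-set
    return 0
}

# Example call (commented out so nothing is dumped by accident):
# check_saveset_on_disk /usr /var/tmp/usr.vdump && echo "save-set is readable"
```

If the disk save-set lists cleanly but the same dump on tape does not, the
tape path (drive, media, cabling) becomes the more likely suspect.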
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
You may need to rewind the tape to the correct position.
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
Can you try to tar a short file to the drive and then tar it back as a test?

Thanks to:
Eric Tang
Tom Blinn
Nikola Milutinovic
Rich Glazier
Pat O'Brien
Rich Copeland
Oisin McGuinness
Alan Rollow
Bluejay Adametz
Mark Deiss
Bryan Lavelle
Brian Staab

ORIGINAL QUESTION:
==================

I'm preparing to upgrade a 1000A to 5.1A (from 4.0E). I backed up some
filesets by issuing:

        mt rew
        vdump -N -v -0 /
        vdump -N -v -0 /usr
        vdump -N -v -0 /home
        mt rew

However, when I went to double-check its validity I issued:

        vrestore -t

and got:

        vrestore: unable to use save-set; invalid or corrupt format

        ************* PROGRAM ABORT **************

        vrestore: can't obtain fileset attributes

Is this user error or some sort of bug?

============================

Thanks again!
Andy


