HELP Urgent: I/O Error on SDLT under TruCluster 5

From: James (james@ib-maroc.com)
Date: Mon Jun 09 2003 - 13:17:09 EDT


Hi managers,
We are trying to add an MSL5052 library (four SDLT 110/220 drives) to our SAN for backups.
We are running Compaq Tru64 UNIX V5.1 (Rev. 732) with patch kit 3 on a two-node cluster.

The Legato server runs on Windows 2000 Server, and the two TruCluster nodes are configured as storage nodes.

The problems are:
1) On the TruCluster nodes, when I run vdump against a drive, I get an I/O error and a prompt to retry. When I answer yes, the backup continues.

# vdump -0f /dev/ntape/tape18_d1 /
path : /
dev/fset : cluster_root#root
type : advfs
advfs id : 0x3b3a2426.000d3d03.1
vdump: Date of last level 0 dump: the start of the epoch
vdump: Dumping directories
vdump: Dumping 121911479 bytes, 278 directories, 5386 files
vdump: Dumping regular files

vdump: unable to write to device </dev/ntape/tape18_d1>; [5] I/O error
vdump: do you want to retry? Y

vdump: unable to write to device </dev/ntape/tape18_d1>; [5] I/O error
vdump: do you want to retry? Y

vdump: Status at Wed Jun 4 17:10:49 2003
vdump: Dumped 122169710 of 121911479 bytes; 100.2% completed
vdump: Dumped 278 of 278 directories; 100.0% completed
vdump: Dumped 5386 of 5386 files; 100.0% completed
vdump: Dump completed at Wed Jun 4 17:10:49 2003

        When I back up files with the tar command, there is no I/O error.
       
        Does anyone have any ideas? I took a look at the hwmgr output and I don't see any problems:

# hwmgr -view dev
 HWID: Device Name Mfg Model Location
 --------------------------------------------------------------------------
   37: /dev/disk/dsk1c DEC HSG80 IDENTIFIER=100
   38: /dev/disk/dsk2c DEC HSG80 IDENTIFIER=1
   39: /dev/disk/dsk3c DEC HSG80 IDENTIFIER=2
   40: /dev/disk/dsk4c DEC HSG80 IDENTIFIER=10
  296: /dev/changer/mc0 MSL5000 Series bus-3-targ-2-lun-0
   41: /dev/disk/dsk5c DEC HSG80 IDENTIFIER=20
  297: /dev/ntape/tape16 COMPAQ SuperDLT1 bus-3-targ-2-lun-1
   42: /dev/cport/scp0 HSG80CCL bus-2-targ-1-lun-3
   43: /dev/ntape/tape0 DEC TZ89 (C) DEC bus-1-targ-3-lun-0
  304: /dev/ntape/tape17 COMPAQ SuperDLT1 bus-3-targ-2-lun-2
  305: /dev/ntape/tape18 COMPAQ SuperDLT1 bus-3-targ-2-lun-3
  306: /dev/ntape/tape19 COMPAQ SuperDLT1 bus-3-targ-2-lun-4
  307: /dev/cport/scp7 SWMODULAR ROUTER bus-3-targ-2-lun-5
   93: /dev/disk/dsk12c DEC HSG80 IDENTIFIER=6
  167: /dev/kevm
  225: /dev/disk/dsk27c DEC HSG80 IDENTIFIER=7
  226: /dev/disk/dsk28c DEC HSG80 IDENTIFIER=200
  229: /dev/disk/dsk31c DEC HSG80 IDENTIFIER=17
  230: /dev/disk/dsk32c DEC HSG80 IDENTIFIER=110
  232: /dev/disk/dsk34c DEC HSG80 IDENTIFIER=11
  233: /dev/disk/dsk35c DEC HSG80 IDENTIFIER=16
  234: /dev/disk/dsk36c DEC HSG80 IDENTIFIER=120
  235: /dev/disk/dsk37c DEC HSG80 IDENTIFIER=12
  236: /dev/disk/dsk38c DEC HSG80 IDENTIFIER=140
  237: /dev/disk/dsk39c DEC HSG80 IDENTIFIER=40
  241: /dev/disk/dsk41c DEC HSG80 IDENTIFIER=18
  242: /dev/disk/dsk42c DEC HSG80 IDENTIFIER=8
  245: /dev/disk/dsk45c DEC HSG80 IDENTIFIER=91
  246: /dev/disk/dsk46c DEC HSG80 IDENTIFIER=90
  249: /dev/disk/dsk49c DEC HSG80 IDENTIFIER=71
  250: /dev/disk/dsk50c DEC HSG80 IDENTIFIER=70

        On the Windows server there is no problem.
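
To make the tape entries in the hwmgr listing above easier to compare, a small filter can pull out only the /dev/ntape lines. This is a minimal sketch, assuming the output layout shown above (HWID first, device name second, location last); the function name list_tapes is hypothetical and not part of Tru64.

```shell
# list_tapes (hypothetical helper): read `hwmgr -view dev` output on stdin
# and print "HWID device location" for each /dev/ntape entry.
list_tapes() {
    awk '$2 ~ /^\/dev\/ntape\// {
        hwid = $1
        sub(/:$/, "", hwid)     # strip the trailing colon from the HWID
        print hwid, $2, $NF     # last field is the bus-target-LUN location
    }'
}
```

It would be used as `hwmgr -view dev | list_tapes`, which makes it quick to confirm that all four SuperDLT1 drives sit behind the same bus and target.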

2) When I run a backup through NetWorker, the backup stops with a message saying that the media is full, even though only 36 MB of the 101 GB capacity has been used:

juin 04 16:25:48 backup-srv: NetWorker Media: (info) loading volume GNA858 into rd=alpha:/dev/ntape/tape19_d0
juin 04 16:27:04 backup-srv: NetWorker media: (warning) rd=alpha:/dev/ntape/tape19_d0 writing: I/O error, at file 2 record 378
juin 04 16:27:04 backup-srv: NetWorker media: (notice) sdlt tape GNA858 used 36 MB of 101 GB capacity
juin 04 16:27:04 backup-srv: NetWorker media: (notice) sdlt tape GNA858 on rd=alpha:/dev/ntape/tape19_d0 is full
juin 04 16:27:23 backup-srv: NetWorker media: (info) verification of volume "GNA858", volid 3726436865 succeeded.
juin 04 16:27:33 backup-srv: NetWorker media: (info) Labeling a new writable volume for pool 'Prod'
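
To see whether the write errors always hit the same drive or the same position on tape, the warning lines can be extracted from a saved copy of the NetWorker messages. This is a minimal sketch, assuming the line format shown in the excerpt above; the function name io_errors is hypothetical.

```shell
# io_errors (hypothetical helper): read NetWorker log lines on stdin and
# print "device file=N record=M" for each write I/O error warning.
io_errors() {
    sed -n 's|.*(warning) \(rd=[^ ]*\) writing: I/O error, at file \([0-9]*\) record \([0-9]*\).*|\1 file=\2 record=\3|p'
}
```

Fed the excerpt above, it keeps only the warning for rd=alpha:/dev/ntape/tape19_d0 at file 2, record 378, and drops the info and notice lines.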

        Can anyone suggest further diagnostics I could run to establish what is going wrong?

Thanks in advance,


