Summary: Tape drive problems.

From: Tim Hespe (t.hespe@unsw.edu.au)
Date: Wed Aug 06 2003 - 21:16:27 EDT


Thanks to all who responded:
Darren Dunham
Sarang
Joe Fletcher
Glenn May

All attempts at physical intervention had no effect (I think, though I can't be sure of this).
The following entries appeared in /var/adm/messages exactly 5 hours after the mt command hung:
Aug 6 15:55:37 sfx scsi: [ID 365881 kern.info] /pci@8,700000/scsi@5 (glm1):
Aug 6 15:55:37 sfx Cmd (0x6bc8190) dump for Target 4 Lun 0:
Aug 6 15:55:37 sfx scsi: [ID 365881 kern.info] /pci@8,700000/scsi@5 (glm1):
Aug 6 15:55:37 sfx cdb=[ 0x11 0x1 0x0 0x0 0x1 0x0 ]
Aug 6 15:55:37 sfx scsi: [ID 365881 kern.info] /pci@8,700000/scsi@5 (glm1):
Aug 6 15:55:37 sfx pkt_flags=0x0 pkt_statistics=0x61 pkt_state=0x7
Aug 6 15:55:37 sfx scsi: [ID 365881 kern.info] /pci@8,700000/scsi@5 (glm1):
Aug 6 15:55:37 sfx pkt_scbp=0x0 cmd_flags=0xe1
Aug 6 15:55:37 sfx scsi: [ID 107833 kern.warning] WARNING: /pci@8,700000/scsi@5 (glm1):
Aug 6 15:55:37 sfx Disconnected command timeout for Target 4.0
Aug 6 15:55:37 sfx genunix: [ID 408822 kern.info] NOTICE: glm1: fault detected in device; service still available
Aug 6 15:55:37 sfx genunix: [ID 611667 kern.info] NOTICE: glm1: Disconnected command timeout for Target 4.0
Aug 6 15:55:37 sfx glm: [ID 160360 kern.warning] WARNING: ID[SUNWpd.glm.cmd_timeout.6016]
Aug 6 15:55:37 sfx scsi: [ID 107833 kern.warning] WARNING: /pci@8,700000/scsi@5/st@4,0 (st11):
Aug 6 15:55:37 sfx Failed to restore the last file position: In this state, Tape will be loaded at BOT during next open
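The cdb line in the dump above shows which command timed out. As a rough sketch (not something from the original thread), the bytes can be decoded in Python on the assumption that they form a standard 6-byte SCSI SPACE command: opcode 0x11, a code field in the low bits of byte 1, and a 24-bit signed count in bytes 2 to 4. The function names parse_cdb and decode_space6 are made up for illustration.

```python
# Sketch only: decode a 6-byte SCSI SPACE CDB as printed by the glm driver.
# Assumes the standard SPACE(6) layout; names here are illustrative.
SPACE_OPCODE = 0x11
SPACE_CODES = {
    0: "blocks",
    1: "filemarks",
    2: "sequential filemarks",
    3: "end-of-data",
}


def parse_cdb(text):
    """Turn 'cdb=[ 0x11 0x1 ... ]' into a list of integers."""
    inner = text[text.index("[") + 1 : text.index("]")]
    return [int(tok, 16) for tok in inner.split()]


def decode_space6(cdb):
    """Return (unit, count) for a SPACE(6) CDB; count is signed."""
    if len(cdb) != 6 or cdb[0] != SPACE_OPCODE:
        raise ValueError("not a SPACE(6) CDB")
    code = cdb[1] & 0x07                      # low three bits select the unit
    count = (cdb[2] << 16) | (cdb[3] << 8) | cdb[4]
    if count & 0x800000:                      # 24-bit two's complement
        count -= 1 << 24
    return SPACE_CODES.get(code, "other"), count


cdb = parse_cdb("cdb=[ 0x11 0x1 0x0 0x0 0x1 0x0 ]")
print(decode_space6(cdb))  # ('filemarks', 1)
```

Read this way, the command that timed out was a request to space forward over one filemark, which fits the st warning about failing to restore the last file position.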

Needless to say, having spent the preceding 4 1/2 hours checking /var/adm/messages for evidence of anything happening,
I had given up by then and missed these entries completely. I have no idea whether mt terminated when this occurred. A reboot
followed some hours later and fixed the problem (assuming it was still a problem at that point).
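For anyone in the same position, a small filter may be less tedious than re-reading /var/adm/messages by hand. The following is a minimal sketch, assuming the timeout shows up with the same "Disconnected command timeout for Target N.M" wording seen here; the function name find_timeouts is hypothetical, and other drivers or Solaris releases may word the message differently.

```python
import re

# Assumption: the message text matches what glm logged in this incident.
TIMEOUT_RE = re.compile(r"Disconnected command timeout for Target (\d+)\.(\d+)")


def find_timeouts(lines):
    """Return (timestamp, target, lun) for each matching syslog line."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            # Syslog lines start with 'Mon D HH:MM:SS'.
            timestamp = " ".join(line.split()[:3])
            hits.append((timestamp, int(match.group(1)), int(match.group(2))))
    return hits


sample = [
    "Aug 6 15:55:37 sfx pkt_scbp=0x0 cmd_flags=0xe1",
    "Aug 6 15:55:37 sfx Disconnected command timeout for Target 4.0",
]
print(find_timeouts(sample))  # [('Aug 6 15:55:37', 4, 0)]
```

To use it on a live system, pass an open file instead of the sample list, for example find_timeouts(open("/var/adm/messages")), and rerun it periodically.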

All in all an unilluminating and unsatisfying experience.

Original post.
>Hi,
> I have a DLT8000 that is performing badly. To verify that the drive was the problem
>I swapped in the drive from another machine, which confirmed it.
>I have since moved the original drive back. I issued an "mt status" command on the drive but the mt command
>won't complete. I can't kill the process and can't do anything with the drive. When swapping I did not
>take the precaution of unloading the st module, which I probably should have. I can't unload the st module now because
>it is in use. Is there any way to recover from this without rebooting?

Tim Hespe
System Admin.
University of New South Wales Library
t.hespe@unsw.edu.au
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


