Recovery from interrupted SAN connection

From: Koef (koef@notsupported.org)
Date: Wed Oct 04 2006 - 14:29:22 EDT


I have a Solaris 8 machine connected via a single QLogic fibre HBA to EMC
storage. When the SAN connection is lost, e.g. by pulling the fibre
connector, the following is logged at the console:

WARNING: md: d2: write error on
         /dev/dsk/c0t6006048000028775062953594D433430d0s4
WARNING: /scsi_vhci/ssd@g6006048000028775062953594d433431 (ssd1):
         ssdrestart transport failed (fffffffe)
WARNING: /scsi_vhci/ssd@g6006048000028775062953594d433432 (ssd0):
         transport rejected (-2)
WARNING: ufs log for /var changed state to Error
WARNING: Please umount (1M) /var and run fsck(1M)
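
A minimal sketch, assuming the console output has been captured as text, of
how the filesystems flagged by the "ufs log ... changed state to Error"
warning above could be picked out for a later umount and fsck. The function
name and the regular expression are illustrative assumptions, not part of
Solaris:

```python
import re

# Matches the UFS logging warning as it appears in the console output above.
# The pattern is an assumption based on that single sample line.
UFS_LOG_ERROR = re.compile(r"WARNING: ufs log for (\S+) changed state to Error")


def filesystems_needing_fsck(console_text):
    """Return mount points whose UFS log entered the Error state.

    Mount points are returned once each, in the order first seen.
    """
    seen = []
    for match in UFS_LOG_ERROR.finditer(console_text):
        mount_point = match.group(1)
        if mount_point not in seen:
            seen.append(mount_point)
    return seen


if __name__ == "__main__":
    sample = (
        "WARNING: ufs log for /var changed state to Error\n"
        "WARNING: Please umount (1M) /var and run fsck(1M)\n"
    )
    for mount_point in filesystems_needing_fsck(sample):
        print(mount_point)
```

This only reports which mount points were flagged; it does not unmount or
check anything itself.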

When I plug the fibre connector back in, the /var filesystem, which is on
the SAN, does not recover. I cannot even log in on the console:

console login: root
Password:

No utmpx entry. You must exec "login" from the lowest level "shell".
INIT: failed write of utmpx entry:"co"
INIT: failed write of utmpx entry:"co"

console login:

To regain access to the machine at this point, it must be sent a "break"
signal and then booted from the "ok" prompt. It then drops to single-user
mode, complaining about bad superblocks in the SAN-mounted filesystems.

Question: is there a more elegant way to recover from, or to prepare for,
single SAN fibre link interruptions? Thank you.

-- 
Koef.
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

