Sun Cluster scswitch problem

From: Erdei Tamás (Erdei.Tamas@lnx.hu)
Date: Wed Jan 19 2005 - 13:21:50 EST


Hi All,

We have a Sun Cluster running Solaris 8, SC 3.0 and DiskSuite 4.2.1 on two
Netra 20 nodes with two D2 disk arrays.
The problem is that we are unable to switch the primary of a disk group with
"scswitch -z -D dg-XXX -j nodeX". The scswitch command simply never finishes,
and after that we cannot even run scstat, because it hangs as well. This also
happens when all resource groups are stopped and all global filesystems are
unmounted. The only way we can recover the disk group is a complete reboot of
the cluster.
It seems that the scswitch command puts the system into some kind of deadlock.
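
To keep a hanging status command from tying up the terminal while collecting
evidence, the call can be wrapped in a timeout. The following is only a
minimal sketch, not something from our setup: it assumes a Python 3
interpreter is available on the node or an admin host, and the helper name
run_with_timeout is hypothetical. Note that a process stuck in an
uninterruptible kernel wait may not respond to the kill, in which case the
wrapper itself can block as well.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd (a list of arguments) and give up after timeout_s seconds.

    Returns (finished, exit_code, output). If the command did not finish in
    time, finished is False and exit_code is None.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the captured output
            text=True,
            timeout=timeout_s,         # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return (False, None, "")
    return (True, proc.returncode, proc.stdout)


# Hypothetical usage on a cluster node (assumes scstat is in PATH):
#   finished, code, out = run_with_timeout(["scstat", "-D"], 30)
#   if not finished:
#       print("scstat did not return within 30 seconds")
```

A result with finished set to False, logged together with a timestamp, would
at least show at which point after the scswitch attempt the status commands
stop responding.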

The strange thing is that the system did not have this problem earlier. One
of the OS patches installed in the meantime may have introduced the problem,
but we are not sure.

Has anyone faced a similar problem? What was the solution?

I talked to a Sun engineer about the problem, and now I am a bit confused. He
said that switching the disk group primary with the scswitch command is not
possible while resource groups are running and global filesystems are mounted
on the disk group. I thought that it should be possible to switch the disk
group primary at any time, because the cluster file system manages the disk
group access paths (direct SCSI or cross-connect) seamlessly within the DID
driver. Which of the two is true?

I would appreciate any comments or help on this subject.

Best regards,
Tamas