trucluster question

From: Steve Feehan (sfeehan@sbb.uvm.edu)
Date: Wed Jul 31 2002 - 15:05:40 EDT


I have just set up a two-node TruCluster (5.1A) on two DS10Ls with
a LAN interconnect.

To see what happens when the LAN interconnect is broken, I unplugged
the cable. Before disconnecting, I spread the file systems across the
two nodes (i.e., / and /var on member1, /usr on member2) just to make
things interesting.

I disconnected the cable and both systems appeared to hang, which is
expected.

After about two minutes, one node came back online, with cfsmgr
showing that it had taken over the other node's file systems.

The unexpected bit is that the other node had crashed. I switched over
to the console to find it at the >>> prompt.

So my two questions:

 1. Why did one of the members crash?

 2. Is there a way to reduce the timeout between the cluster members
    separating and one member taking over? If so, is this a good
    idea, or should I not mess with the defaults?

Thanks.

-- 
Steve Feehan
Unix Systems Administrator
Structural Biology and Bioinformatics Group
University of Vermont

