mcs_lock: time limit exceeded?

From: Charles Ballowe (hangman@steelballs.org)
Date: Sun Sep 29 2002 - 09:06:12 EDT


I have a GS80 in a 2-member cluster running Oracle8i. The cluster is for
HA, so the DB is only running on one node at a time. We have a window for
a cold database backup every weekend and at the start of the window a
caa_stop oraDB runs to shut down the database. The caa script does a
shutdown abort
startup restrict
shutdown
on the database. Both this weekend and two weekends ago, this process
crashed the server running the DB. It also leaves caa_stop hanging,
waiting for a return message from the other member.

I'm seeing the following in /var/adm/crash/crash-data for this particular
crash:

_cpu: 59
_system_string: 0xffffffffffbdd780 = "Compaq AlphaServer GS80 6/731"
_ncpus: 4
_avail_cpus: 4
_partial_dump: 1
_physmem(MBytes): 6143
_panic_string: 0xffffffff0082dec8 = "mcs_lock: time limit exceeded"
_paniccpu: 0
_panic_thread: 0xfffffc00e436e000

Between the 2 crash-data files, the _paniccpu changes, which leads me to
believe that this may be a software problem rather than a hardware problem.
We are on Tru64 5.1 PK3.
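
For reference, the comparison between the crash-data files can be done
with a small shell helper along these lines. This is only a sketch:
panic_fields is a made-up name, and the crash-data.0 / crash-data.1 file
names in the usage comment are an assumption about how the dumps are
numbered on this system.

```shell
# Hypothetical helper: print every panic-related field (_panic_string,
# _paniccpu, _panic_thread) from the crash-data files given as arguments.
# With more than one file, grep prefixes each line with the file name.
panic_fields() {
    grep '^_panic' "$@"
}

# Assumed usage (file names are a guess):
#   panic_fields /var/adm/crash/crash-data.0 /var/adm/crash/crash-data.1
```

If the _paniccpu value differs from one file to the next while the
_panic_string stays the same, that is the pattern described above.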

Any suggestions?
Thanks,
-Charles Ballowe
aka cballowe@usg.com



This archive was generated by hypermail 2.1.7 : Sat Apr 12 2008 - 10:48:54 EDT