CPU PANIC on TruCluster 5.1A

From: Sathiamoorthy Balasubramaniyan (ext_TCS) (Balasubramaniyan.Sathiamoorthy.ext_TCS@ts.siemens.de)
Date: Fri Mar 05 2004 - 14:26:00 EST


Oops, I missed the subject line.
Regards,
Bala

> -----Original Message-----
> From: Sathiamoorthy Balasubramaniyan (ext_TCS)
> Sent: Friday, March 05, 2004 7:33 PM
> To: Tru64 List (E-mail)
> Subject:
>
> Hello managers,
> We had a catastrophe on our 2-node TruCluster 5.1A yesterday.
> Both nodes crashed with a CPU panic (vrele: bad ref count).
> The nodes then rebooted automatically 3 times in a span of 30 minutes,
> and the logs for each reboot show the following:
>
> On Node1:
> ---------
> Mar 4 11:33:25 node1 vmunix: vrele: bad ref count: type VDIR, usecount 0
> Mar 4 11:33:26 node1 vmunix: tag VT_CFS, fsid ee4bcf0d,a
> Mar 4 11:33:26 node1 vmunix: panic (cpu 1): vrele: bad ref count
> Mar 4 11:33:26 node1 vmunix: syncing disks...
> Mar 4 11:33:26 node1 vmunix: Memory trolling not supported, cpu Major id 11, Minor id 9
> Mar 4 11:33:26 node1 vmunix: Alpha boot: available memory from 0x582e000 to 0xffff4000
> Mar 4 11:33:26 node1 vmunix: Compaq Tru64 UNIX V5.1A (Rev. 1885); Sat Aug 2 22:25:02 MEST 2003
>
> On node2:
> ----------
> Mar 4 11:33:19 node2 vmunix: vrele: bad ref count: type VDIR, usecount 0
> Mar 4 11:33:19 node2 vmunix: tag VT_CFS, refcnt 1 pvp fffffc00c6a54a00
> Mar 4 11:33:19 node2 vmunix: type VDIR, usecount 1
> Mar 4 11:33:19 node2 vmunix: panic (cpu 0): vrele: bad ref count
> Mar 4 11:33:19 node2 vmunix: syncing disks...
> Mar 4 11:33:20 node2 vmunix: Memory trolling not supported, cpu Major id 11, Minor id 14
> Mar 4 11:33:20 node2 vmunix: Alpha boot: available memory from 0x5824000 to 0xffff4000
> Mar 4 11:33:20 node2 vmunix: Compaq Tru64 UNIX V5.1A (Rev. 1885); Sun Aug 3 11:34:31 MEST 2003
>
> UERF on both systems:
> ---------------------
> ----- EVENT INFORMATION -----
>
> EVENT CLASS                ERROR EVENT
> OS EVENT TYPE              302. PANIC
> SEQUENCE NUMBER            14969.
> OPERATING SYSTEM           DEC OSF/1
> OCCURRED/LOGGED ON         Thu Mar 4 11:21:39 2004
> OCCURRED ON SYSTEM         node2
> SYSTEM ID                  x000B0022
> SYSTYPE                    x00000000
> PROCESSOR COUNT            2.
> PROCESSOR WHO LOGGED       x00000000
> MESSAGE                    panic (cpu 0): vrele: bad ref count
>
>
> System information:
> -------------------
> COMPAQ AlphaServer DS20E 666 MHz with 4GB RAM on both machines.
> Operating system: Tru64 5.1A, TruCluster 5.1A with patch kit 4.
>
>
> The system also saved the vmzcore files, but I have no idea how to extract
> relevant information from them.
>
> Please help me with some information on how to find the cause of this error.
>
>
> Thanks in advance,
> Bala


