CVM crash on 4 node sun cluster.

From: Claude Oliver (CJ) (olivercj@telkom.co.za)
Date: Thu Feb 23 2006 - 21:38:31 EST


Hi all, I was hoping someone could help me with the following problem.

I have a 4-node Sun Cluster running Solaris 9 (118558-21) on Sun
V440s, with Veritas VM 3.5 and FS 3.5. They are all connected to EMC
external storage through dual fiber channels. This afternoon I got
stuck with the job of rebooting them, and all went fine until I
brought up the last node; since then I have been panicking boxes left
and right. The nodes seem to join the Sun Cluster fine, but when CVM
starts on the fourth node it runs through to the end of step 3 and
then panics the server. See the part in RED. Is there anything that I
forgot to do, set by accident, or did that could have caused this
error? Please help. Feel free to ask me anything about the servers
that might help. Will summarize when finished.

These are the console messages from the server that crashes.
======================================================================
| evidence of criminal activity, system personnel may provide the |
| evidence of such monitoring to law enforcement officials. |
|-----------------------------------------------------------------|

tcen-ep-inf04 console login: Feb 23 20:02:10 tcen-ep-inf04 cl_runtime: NOTICE: clcomm: Path tcen-ep-inf04:ce5 - tcen-ep-inf02:ce5 being initiated
Feb 23 20:02:10 tcen-ep-inf04 cl_runtime: NOTICE: clcomm: Path tcen-ep-inf04:ce3 - tcen-ep-inf02:ce3 being initiated
Feb 23 20:02:10 tcen-ep-inf04 cl_runtime: NOTICE: CMM: Node tcen-ep-inf02 (nodeid: 2, incarnation #: 1140717727) has become reachable.
Feb 23 20:02:10 tcen-ep-inf04 cl_runtime: NOTICE: clcomm: Path tcen-ep-inf04:ce3 - tcen-ep-inf02:ce3 online
Feb 23 20:02:12 tcen-ep-inf04 cl_runtime: NOTICE: CMM: Node tcen-ep-inf02 (nodeid = 2) is up; new incarnation number = 1140717727.
Feb 23 20:02:12 tcen-ep-inf04 cl_runtime: NOTICE: CMM: Cluster members: tcen-ep-inf01 tcen-ep-inf02 tcen-ep-inf03 tcen-ep-inf04.
Feb 23 20:02:12 tcen-ep-inf04 cl_runtime: NOTICE: CMM: node reconfiguration #4 completed.
Feb 23 20:02:14 tcen-ep-inf04 cl_runtime: NOTICE: clcomm: Path tcen-ep-inf04:ce5 - tcen-ep-inf02:ce5 online
Feb 23 20:04:10 tcen-ep-inf04 ID[vxclust]: starting return time: 02/23 20:04:10.087: seq # 4
Feb 23 20:04:10 tcen-ep-inf04 ID[vxclust]: failure in caching vxconfigd records
Feb 23 20:04:10 tcen-ep-inf04 ID[vxclust]: ending step return time: 02/23 20:04:10.087:
Feb 23 20:04:24 tcen-ep-inf04 ID[vxclust]: starting step1 time: 02/23 20:04:24.180: seq # 5
Feb 23 20:04:24 tcen-ep-inf04 ID[vxclust]: members f joiners 2 leavers 0
Feb 23 20:04:24 tcen-ep-inf04 ID[vxclust]: ending step step1 time: 02/23 20:04:24.213:
Feb 23 20:04:24 tcen-ep-inf04 ID[vxclust]: starting step2 time: 02/23 20:04:24.436: seq # 5
Feb 23 20:04:24 tcen-ep-inf04 ID[vxclust]: calculating master time: 02/23 20:04:24.436: , nnodes = 4
Feb 23 20:04:24 tcen-ep-inf04 ID[vxclust]: port number is 5573
Feb 23 20:04:24 tcen-ep-inf04 ID[vxclust]: not smallest time: 02/23 20:04:24.437:
Feb 23 20:04:26 tcen-ep-inf04 ID[vxclust]: CVM:MASTER=0 SELF=3
Feb 23 20:04:32 tcen-ep-inf04 ID[vxclust]: ending step step2 time: 02/23 20:04:32.485:
Feb 23 20:04:34 tcen-ep-inf04 ID[vxclust]: starting step3 time: 02/23 20:04:34.053: seq # 5
Feb 23 20:04:34 tcen-ep-inf04 ID[vxclust]: ending step step3 time: 02/23 20:04:34.055:
Feb 23
panic[cpu0]/thread=30005072aa0: 20:04:35 tcen-ep-inf04 ID[vxclust]: starting step4 time: 02/23 20BAD TRAP: type=31 rp=2a1005c76d0 addr=aac8a428 mmu_fsr=0

vxconfigd: trap type = 0x31
addr=0xaac8a428
pid=21, pc=0x7843e684, sp=0x2a1005c6f71, tstate=0x800001602, context=0x9
g1-g7: 36, 0, 30000077508, aac8a350, 0, 0, 30005072aa0

000002a1005c73f0 unix:die+a4 (31, 2a1005c76d0, aac8a428, 0, 564f4ccc, 65720000)
  %l0-3: 0000000000000000 00000000aac8a350 000002a1005c76d0 000002a1005c75c0
  %l4-7: 0000000000000031 000000000015fb14 000000007efefeff 0000000000170594
000002a1005c74d0 unix:trap+8a4 (2a1005c76d0, 0, 10000, 10200, 0, 0)
  %l0-3: 0000000000000001 0000000000000000 00000300052a6a30 0000000000000031
  %l4-7: 0000000000000005 0000000000000001 0000000000000000 0000000000000000
000002a1005c7620 unix:ktl0+48 (78531250, 78537af8, 8, 78531328, 20, 50000)
  %l0-3: 0000000000000003 0000000000001400 0000000800001602 000000000102d9b0
  %l4-7: 0000000000000043 0000000000000052 0000000000000000 000002a1005c76d0
000002a1005c7770 vxio:volcvm_await_join+7c (78531250, 2a1005c78e0, 2a1005c78e0, ffbffadc, 0, 187940)
  %l0-3: 0000000000000001 0000000000000000 0000000000002400 0000000000050000
  %l4-7: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
000002a1005c7830 vxio:volsioctl_real+3c4 (0, 564f4ccc, ffbffadc, 100003, 3000038df28, 2a1005c7aec)
  %l0-3: 0000000000000001 0000030005072ce4 00000000564f4ccc 0000030005072aa0
  %l4-7: 0000000000000001 0000000000000000 0000000000000000 0000000000000000
000002a1005c78f0 vxspec:volsioctl+38 (11800000000, 564f4ccc, ffbffadc, 100003, 3000038df28, 2a1005c7aec)
  %l0-3: 0000000078508948 00000300051b4b00 0000000000000001 00000000ff060000
  %l4-7: 00000300052a6a30 0000000000000028 00000000ff1c27c0 0000000000170594
000002a1005c79a0 genunix:ioctl+1f8 (1, 564f4ccc, ffbffadc, 564f4ccc, 173e98, 74000000)
  %l0-3: 0000000001186990 00000000564f4ccc 0000000000000001 00000000ff060000
  %l4-7: 000003000084dcb0 0000000000000000 0000000000000000 0000000000000170594

syncing file systems... 6 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
done (not all i/o completed)
dumping to /dev/dsk/c1t0d0s1, offset 3355312128, content: kernel
100% done: 128214 pages dumped, compression ratio 6.91, dump succeeded
======================================================================
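
For anyone reading the console output above, the vxclust step timeline can be pulled out with a short script to see which reconfiguration step was in progress at the panic. This is only a minimal Python sketch: it assumes the console text has been saved to a string, and the function names `vxclust_steps` and `unfinished_steps` are made up for illustration and are not part of any Veritas or Sun tool.

```python
import re

# Matches the "starting <step>" and "ending step <step>" markers that
# vxclust writes to the console, e.g. "starting step1 time:".
STEP_RE = re.compile(
    r"ID\[vxclust\]:\s+(starting|ending step)\s+(\w+)\s+time:"
)

def vxclust_steps(log_text):
    """Return (phase, step) pairs in order, phase is 'starting' or 'ending'."""
    # Collapse all whitespace so messages wrapped across lines still match.
    flat = " ".join(log_text.split())
    return [(m.group(1).split()[0], m.group(2)) for m in STEP_RE.finditer(flat)]

def unfinished_steps(events):
    """Return steps that were started but never reported as ended."""
    open_steps = []
    for phase, step in events:
        if phase == "starting":
            open_steps.append(step)
        elif step in open_steps:
            open_steps.remove(step)
    return open_steps
```

Run against the messages above, this should report step4 as started but never ended, which matches the description of the node getting through step 3 before panicking.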

Regards,

Claude Oliver
IT Specialist
Infrastructure Support Services
Telkom SA
(Tel) 012 6803102
(Fax) 012 6803299
(Cell) 082 5783443
