kernel panics (BAD TRAP)

From: Marcelino Mata (mmata@multimatic.com)
Date: Mon May 31 2004 - 17:25:45 EDT


Running Solaris 8 with the April 04 patch cluster.

I upgraded a Blade 1000 from 1.5GB to 5GB several months ago and started
to get serious kernel panics on a regular basis (every 40-75 minutes). The
computer is fine when it is under any significant or heavy load, but once
it has been idle for a while, the kernel panics. The problem seems to occur
within the first 15 minutes after the user logs out, and then it keeps
recurring until the system no longer boots because of a corrupted
/etc/path_to_inst file. I am not 100% certain that the problem started with
the additional RAM. I began investigating the root cause so long ago that I
only vaguely remember that the computer had several vmcore files. At the
time, I did not have a serious problem and I was getting low on disk space,
so I deleted the original vmcore files (each vmcore is 400MB). At one point
the computer was fine for about one week, so I thought I had the problem
licked. I never figured out why it was fine for several days. SunVTS 4.6
reported no hardware problems.

I tried doing some core analysis and initially thought the problem was
related to fsflush operations. I based the analysis on the information on
this site:
http://www.princeton.edu/~psg/unix/solaris/troubleshoot/adbcore.html
However, checking other core files, I see references to other processes.
Either I am not doing the analysis correctly or the problem is not
reflected in the core file (is this possible?).

I ran across a posting for core analysis which says to do the following and
forward the information to the poster.

cd /var/crash/`uname -n`
echo '$<msgbuf' | adb -k unix.4 vmcore.4
echo '$c' | adb -k unix.4 vmcore.4
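
Since the crash directory holds several dumps, the same two commands can be
run against every saved unix.N/vmcore.N pair in one pass. The following is
only a minimal sketch: the function names list_dump_pairs and analyze_dumps
are made up for illustration, and it assumes a POSIX-style shell (for
example ksh) and that adb is on the PATH.

```shell
#!/bin/sh
# Print the numeric suffix N of every dump for which both
# unix.N and vmcore.N exist in the given directory.
list_dump_pairs() {
    dir=$1
    for core in "$dir"/vmcore.*; do
        # Skip the literal pattern when no vmcore file matches.
        [ -f "$core" ] || continue
        n=${core##*.}
        if [ -f "$dir/unix.$n" ]; then
            echo "$n"
        fi
    done
}

# Run the two adb commands from the post against every pair found.
analyze_dumps() {
    dir=$1
    for n in $(list_dump_pairs "$dir"); do
        echo "== dump $n: message buffer =="
        echo '$<msgbuf' | adb -k "$dir/unix.$n" "$dir/vmcore.$n"
        echo "== dump $n: stack trace =="
        echo '$c' | adb -k "$dir/unix.$n" "$dir/vmcore.$n"
    done
}

# Example (not run here): analyze_dumps /var/crash/`uname -n` > all-dumps.txt
```

Collecting the output of all dumps in one file makes it easier to see
whether the panics share a stack or differ from dump to dump.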

Can anyone explain what the output below is showing or give advice on what
to do next?

Marcelino

# echo '$<msgbuf' | adb -k unix.4 vmcore.4
physmem 9b34e
30001fdc223: /pci@8,700000/usb@5,3/keyboard@1 (hid0) online
3000206dee2: PCI-device: SUNW,XVR-500@1, ifb0
3000206dac3: ifb0 is /pci@8,700000/SUNW,XVR-500@1
3000206d800: cpu0: UltraSPARC-III+ (portid 0 impl 0x15 ver 0x22 clock 900 MHz)
3000206d540: se0 at ebus0: offset 1,400000
3000206d283: se0 is /pci@8,700000/ebus@5/serial@1,400000
3000206cfc2: PCI-device: firewire@5,2, hci13940
3000206cd03: hci13940 is /pci@8,700000/firewire@5,2
3000206ca42: PCI-device: network@5,1, eri0
3000206c783: eri0 is /pci@8,700000/network@5,1
3000206c203: dump on /dev/dsk/c1t1d0s1 size 2048 MB
30005a2bc42: pseudo-device: devinfo0
30005a2b983: devinfo0 is /pseudo/devinfo@0
30005a2b55f: SUNW,eri0 : 100 Mbps full duplex link up
30005a2b140: /pci@8,700000/scsi@6 (glm0): glm0 supports power management.
30005a2ae80: /pci@8,700000/scsi@6 (glm0): Rev. 7 Symbios 53c875 found.
30005a2abc2: PCI-device: scsi@6, glm0
30005a2a903: glm0 is /pci@8,700000/scsi@6
30005a2a640: /pci@8,700000/scsi@6,1 (glm1): glm1 supports power management.
30005a2a380: /pci@8,700000/scsi@6,1 (glm1): Rev. 7 Symbios 53c875 found.
30005a2a0c2: PCI-device: scsi@6,1, glm1
30005b95d83: glm1 is /pci@8,700000/scsi@6,1
30005b95ac0: sd6 at glm0: target 6 lun 0
30005b95803: sd6 is /pci@8,700000/scsi@6/sd@6,0
30005b95280: /pci@8,700000/scsi@6,1/st@5,0 (st12): <Vendor 'COMPAQ ' Product 'SDX-300C '>
30005b94fc0: st12 at glm1: target 5 lun 0
30005b94d03: st12 is /pci@8,700000/scsi@6,1/st@5,0
30005b94780: ecpp0 at ebus0: offset 1,300278
30005b944c3: ecpp0 is /pci@8,700000/ebus@5/parallel@1,300278
30005b94200: scmi2c0 at ebus0: offset 0,40
30005bd1f03: scmi2c0 is /pci@8,700000/ebus@5/i2c@1,30/card-reader@0,40
30005bd1c42: pseudo-device: winlock0
30005bd1983: winlock0 is /pseudo/winlock@0
30005bd16c2: pseudo-device: lockstat0
30005bd1403: lockstat0 is /pseudo/lockstat@0
30005bd1142: pseudo-device: vol0
30005bd0e83: vol0 is /pseudo/vol@0
30005bd0bc3: upa64s0 at root: SAFARI 0x8 0x480000
30005bd0902: pseudo-device: llc10
30005bd04e3: llc10 is /pseudo/llc1@0
30005bd00c0: audiocs0 at ebus0: offset 1,200000
30005bc3d83: audiocs0 is /pci@8,700000/ebus@5/audio@1,200000
30005bc3ac2: pseudo-device: pm0
30005bc3803: pm0 is /pseudo/pm@0
30005bc3542: pseudo-device: tod0
30005bc3283: tod0 is /pseudo/tod@0
30005bc2e62: pseudo-device: lofi0
30005bc2ba3: lofi0 is /pseudo/lofi@0
30005bc28e2: pseudo-device: fcp0
30005bc2623: fcp0 is /pseudo/fcp@0
30005bc2360: fcip attach for port instance (0x0) successful
30005bc20a2: PCI-device: pci108e,7063@2, sunpci2drv0
30005d0bda3: sunpci2drv0 is /pci@8,700000/pci108e,7063@2
30005d0bae2: pseudo-device: fssnap0
30005d0b823: fssnap0 is /pseudo/fssnap@0
30005d0b560: gpio_873170 at ebus0: offset 1,300600
30005d0b2a3: gpio_873170 is /pci@8,700000/ebus@5/gpio@1,300600
30005d0ad22: pseudo-device: devinfo0
30005bc2d03: devinfo0 is /pseudo/devinfo@0
30005d0b140: /pci@8,700000/scsi@6,1/st@5,0 (st12): <Vendor 'COMPAQ ' Product 'SDX-300C '>
30005bc33e0: st12 at glm1: target 5 lun 0
30005bc3963: st12 is /pci@8,700000/scsi@6,1/st@5,0
30005bc3c20: ecpp0 at ebus0: offset 1,300278
30005bd0223: ecpp0 is /pci@8,700000/ebus@5/parallel@1,300278
30005bc2a40: scmi2c0 at ebus0: offset 0,40
30005bd0a63: scmi2c0 is /pci@8,700000/ebus@5/i2c@1,30/card-reader@0,40
30005bc3ee2: pseudo-device: lockstat0
30005bd0fe3: lockstat0 is /pseudo/lockstat@0
30005bd0642: pseudo-device: llc10
30005bd1563: llc10 is /pseudo/llc1@0
30005bd0d20: audiocs0 at ebus0: offset 1,200000
30005bd1ae3: audiocs0 is /pci@8,700000/ebus@5/audio@1,200000
30005bd12a2: pseudo-device: tod0
30005b940a3: tod0 is /pseudo/tod@0
30005bd1822: pseudo-device: lofi0
30005b94623: lofi0 is /pseudo/lofi@0
30005bd1da0: fcip attach for port instance (0x0) successful
30005b948e2: PCI-device: pci108e,7063@2, sunpci2drv0
30005b95123: sunpci2drv0 is /pci@8,700000/pci108e,7063@2
30005b94362: pseudo-device: fssnap0
30005b95963: fssnap0 is /pseudo/fssnap@0
30005b94e60: gpio_873170 at ebus0: offset 1,300600
30005b95ee3: gpio_873170 is /pci@8,700000/ebus@5/gpio@1,300600
30005b953e2: pseudo-device: devinfo0
30005a2a4e3: devinfo0 is /pseudo/devinfo@0
30005bc36a0:
panic[cpu0]/thread=30006cf0000:

30005d0b980: BAD TRAP: type=34 rp=2a10093d0b0 addr=baddcafebaddcafe mmu_fsr=0
30005d0bc40:
30005a2b2a0: mibiisa:
30005d0b400: alignment error:
30005a2bae0: addr=0xbaddcafebaddcafe
30005a2b820: pid=420, pc=0x100f5288, sp=0x2a10093c951, tstate=0x4480001606, context=0x8b8
30005a2afe0: g1-g7: 1043ac00, 0, 30005b96fc8, 5, 1, 0, 30006cf0000
30005b94a40:
3000206d6a3: 000002a10093cde0 unix:die+a4 (34, 2a10093d0b0, baddcafebaddcafe, 0, 2a10093d0b0, 0)
3000206d3e3: %l0-3: 0000000000000000 0000030006c9b4d0 0000000000000005 0000030006cf970a
  %l4-7: 0000030006cf9688 0000030006c9b4c8 0000030006c9b448 0000030006cf9428
30005a2a223: 000002a10093cec0 unix:trap+5d0 (baddcafebaddcafe, 0, 80000d, 10000, 2a10093d0b0, 0)
30005a2b403: %l0-3: 000003000002cec0 0000030006a44240 0000030006a9eaa0 0000000000000000
  %l4-7: 0080000d00000034 0000030006ce3538 0000000000010000 0000000000000000
3000206ce63: 000002a10093d000 unix:prom_rtt+0 (baddcafebaddcafe, 0, b8, 3000002cb40, 1, 300000ed618)
30005bc24c3: %l0-3: 0000000000000007 0000000000001400 0000004480001606 000000001002bf0c
  %l4-7: 00000000ff095c68 0000000000000000 0000000000000000 000002a10093d0b0
3000007fda3: 000002a10093d150 genunix:build_sqlist+40 (30006cf8508, 30006c9b2c0, 0, 30006cf8508, b8, 0)
3000206cba3: %l0-3: 0000030006cf8650 0000000000020000 0000000000000000 0000000000000000
  %l4-7: 00000000000000b0 00000000104121a0 0000000000000000 0000000000000000
30005a2bda3: 000002a10093d200 genunix:removeq+184 (30006cf8508, 30006cf8508, 30006cf86c8, 0, 30006cf85e8, 30006c9b2c0)
30005bc2fc3: %l0-3: 0000030006cf8508 0000000000020000 0000000000000001 0000000000007fff
  %l4-7: 0000000000000000 0000000000000000 0000030006ce36b0 0000000000000000
30005d0afe3: 000002a10093d2b0 udp:udp_close+8 (30006cf8508, 30006ca0458, 30001243f28, f500, 4400, 30006cf86c8)
30005b956a3: %l0-3: 0000000000000400 0000000000000000 0000000000000100 0000030006cf8710
  %l4-7: 0000000000007fff 000000000000008f 0000030006ce2580 000002a100951af0
30005a2ad23: 000002a10093d360 genunix:qdetach+90 (4400, 30001243f28, 3, 30006cf85e8, 0, 30006cf8508)
3000007f983: %l0-3: 000000001025a4c4 0000000000000003 0000030006cf9688 0000000000000000
  %l4-7: 0000000000007fff 000000000000008f 00000300067af0e0 000002a100957af0
3000206c8e3: 000002a10093d410 genunix:strclose+3c8 (30006a44148, 0, 30001243f28, 3, 30006cf8e88, 200000)
30005a2a7a3: %l0-3: 000000001046ec58 0000030006c9b348 0000000000000005 0000030006cf8fea
  %l4-7: 0000030006cf8f68 0000030006c9b340 0000030006c9b2c0 0000030006cf85e8
30005d0ae83: 000002a10093d4e0 specfs:device_close+8c (30006a44248, 3, 300000051, 30001243f28, 0, 0)
30001fdc383: %l0-3: 0000000000000000 0000030006c9b4d0 0000000000000005 0000030006cf970a
  %l4-7: 0000030006cf9688 0000030006c9b4c8 0000030006c9b448 0000030006cf9428
30001fdc7a3: 000002a10093d590 specfs:spec_close+124 (fc00, 30001243f28, 300000051, 3, 100, 0)
30001fdcbc3: %l0-3: 0000030006a44248 0000030006a44240 0000030006a44228 0000030006a44140
  %l4-7: 0000030006cedea0 0000030006cedea0 0000000000002000 0000000000000000
30001fdcfe3: 000002a10093d640 genunix:closef+58 (1046f000, 30006bb2738, 0, 30006a44248, 1, 300000ed618)
30001fdd403: %l0-3: 000000001015be40 0000000010474ee0 0000000000000040 0000030006ce0de0
  %l4-7: 00000000ff095c68 0000000000000000 0000030006ce2210 0000000000000000
30001fdd823: 000002a10093d6f0 genunix:closeall+30 (2, 30006aa6070, 20, 30006a9f428, 2c7cf00, 0)
30001fddc43: %l0-3: 0000030000031dc0 000000001013ddd0 0000000000000000 0000000000000000
  %l4-7: 00000000000000b0 00000000104121a0 0000000000000000 0000000000000000
300002180a3: 000002a10093d7a0 genunix:proc_exit+2bc (30006cfbc58, 1041c2c8, 30006ce3538, 30006c743d8, a, 2)
300002184c3: %l0-3: 000000000000000d 0000030006cf0000 0000030006a9eaa0 0000000000000000
  %l4-7: 0000000000000000 0000000000000000 0000030006ce36b0 0000000000000000
300002188e3: 000002a10093d850 genunix:exit+8 (2, a, 48, a, 2, 0)
30000218d03: %l0-3: 0000000000000000 0000030006a9eaa0 0000000000000200 0000030006ce3538
  %l4-7: 000000000000000a 0000030006a9ec08 000000000000000a 0000000000000200
30000219123: 000002a10093d900 unix:trap_cleanup+1cc (2a10093daa0, 1, 30006ce3538, 30006a9eaa0, 2a10093dba0, ffffffffc0226008)
30000219543: %l0-3: 0000000000000004 000000000000003e 0000000000000000 0000000000000028
  %l4-7: 0000000000000011 0000000000060e00 0000000000044968 000000000005dfa0
30000219963: 000002a10093d9b0 unix:trap+16e8 (a9, 2a10093daf0, 800005, 10000, 2a10093dba0, 0)
30000219d83: %l0-3: 00000000ff1427f0 0000000000000000 0000030006a9eaa0 0000000000000005
  %l4-7: 0080000500010034 0000030006ce3538 0000000000010000 0000000000000000
30000238220:
30000238643: syncing file systems...
30000238a63: 3
30000238e83: 3
300002392a3: 1
300002396c3: done
30000239ae3: dumping to /dev/dsk/c1t1d0s1, offset 429588480
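
The physmem value on the first line of the adb output is a count of
physical memory pages in hexadecimal, so it can serve as a rough check that
the kernel saw all of the installed RAM. The sketch below converts it to
megabytes. The 8192-byte page size is an assumption (the usual base page
size on UltraSPARC systems), and the function name physmem_mb is made up
for illustration.

```shell
#!/bin/sh
# Convert a hex page count (as printed on adb's "physmem" line)
# to whole megabytes.
# Assumption: 8192-byte pages unless another size is given as $2.
physmem_mb() {
    pages_hex=$1
    page_size=${2:-8192}
    echo $(( 0x$pages_hex * $page_size / 1048576 ))
}

physmem_mb 9b34e    # page count from the unix.4/vmcore.4 output
```

For 9b34e this prints 4966, that is, a little under 5GB, which is in line
with the 5GB reported as installed after the upgrade.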

# echo '$c' | adb -k unix.0 vmcore.0
physmem 9b350
panicsys(10423d20,2a100ac5128,10053a98,78002000,30006c16018,0) + 44
vpanic(10053a98,2a100ac5128,1,1a,8,8) + cc
panic(10053a98,31,2a100ac5490,20393735000028,0,0) + 1c
die(31,2a100ac5490,20393735000028,0,2a100ac5490,d05ea028) + a4
trap(20393735000000,1,5,0,2a100ac5490,0) + 8b8
sfmmu_tsb_miss(10428dc8,0,3000004df88,0,3000004df88,19) + 66c
prom_rtt(20393735000000,0,8,300051a6b70,1000c19c,0)
rm_assize(af2c,1,20393735000000,1fff,1042fe90,30002024d08) + 188
prgetpsinfo(30002024d08,2a100ac57d0,30006c16018,2a100ac57d0,30006c16018,30002032d90) + 38c
pr_read_psinfo(30006c16018,2a100ac5a28,3000634a1d8,30007777390,30007158168,30007158000) + 3c
read(0,0,2001,30001fd4708,9,1a0) + 25c
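
When comparing several dumps, it can help to reduce each '$c' trace to just
the function names so that the call chains can be compared side by side.
This is only a sketch: the function name frame_names is made up for
illustration, and it assumes the trace lines have the "name(args) + offset"
shape shown above.

```shell
#!/bin/sh
# Read adb '$c' output on stdin and print one function name per frame.
# Lines that do not start with "identifier(" (for example wrapped
# argument lists or mail footers) are ignored.
frame_names() {
    sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)(.*$/\1/p'
}

# Example (not run here):
#   echo '$c' | adb -k unix.0 vmcore.0 | frame_names
```

Applied to the unix.0/vmcore.0 trace above, the list would start at panicsys
and end at read, which makes it quick to see that this panic came through
the /proc psinfo read path rather than the close path seen in the other
dump.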
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


