SunFire 280R crash

From: Stanley, Jon (Jon.Stanley@savvis.net)
Date: Fri Jun 18 2004 - 13:29:36 EDT


This crash looks like a hardware problem, but I can't tell which
component is failing. Here's the relevant output from scat:

SolarisCAT(vmcore.2)> panic
panic on cpu 0
panic string: [AFT1] errID 0x0000e448.4a2353a8 UE Error(s)
    See previous message(s) for details
==== panic user thread: 0x3000e9f9240 pid: 9114 on cpu: 0 ====
cmd: oracleCMLIVE (DESCRIPTION=(LOCAL=no)(ADDRESS=(PROTOCOL=BEQ)))

t_stk: 0x2a101a6baf0 sp: 0x10422c61 t_stkbase: 0x2a101a68000
t_pri: 58(TS) pctcpu: 0.000000 t_lwp: 0x300049794d0 machpcb: 0x2a101a6baf0
t_procp: 0x3000dda2040 p_as: 0x10423910(kas)
last cpuid: 0
idle: 1 ticks (0.01 seconds)
start: Wed Jun 16 22:14:15 2004
age: 39 seconds (39 seconds)
stime: 25092749 (0.01 seconds earlier)
syscall: sys#0 (0x1)
tstate: TS_ONPROC - thread is being run on a processor
tflg: T_DONTBLOCK - for lockfs
        T_PANIC - thread initiated a system panic
tpflg: TP_LWPEXIT - LWP has exited
tsched: TS_LOAD - thread is in memory
        TS_DONT_SWAP - thread/LWP should not be swapped
pflag: SLOAD - in core
        SULOAD - u-block in core
        SNOWAIT - children never become zombies

pc: 0x10044488 unix:panicsys+0x44: call unix:setjmp

unix:panicsys+0x44 (0x10423630, 0x2a101a6b070, 0x2a101a6ae28, 0x78002000, 0x10438218, 0xf)
unix:vpanic+0xcc (0x2a101a6ae28, 0x2a101a6b070, 0x3c, 0x104381e8, 0x0, 0x2a101a6ae53)
genunix:vcmn_err+0x18 (0x3, 0x2a101a6ae28, 0x2a101a6b070, 0x3, 0x81010100, 0xff00)
SUNW,UltraSPARC-III+:cpu_aflt_log+0x45c (0x2a101a6ae2e, 0x10148920, 0x101488f8, 0x0, 0x2a101a6afb8, 0x2a101a6ae7b)
SUNW,UltraSPARC-III+:cpu_deferred_error+0x5c4 (0x10464400, 0xc4000000000f, 0xee000000000c, 0x0, 0x1, 0x1)
unix:prom_rtt+0x0 (0x2000, 0x3000e0046c4, 0x0, 0x0, 0x0, 0xfe960000)
-- prom_rtt regs data rp: 0x2a101a6b510
pc: 0x100c47f8 genunix:segvn_lockop+0x658: ldx [%sp + 0x8df], %o0
npc: 0x100c47fc genunix:segvn_lockop+0x65c: subcc %l6, %o0, %g0 ( cmp %l6, %o0 )
  global: %g1 0x10052800
        %g2 0 %g3 0
        %g4 0x1 %g5 0x1892810
        %g6 0xfe910610 %g7 0x3000e9f9240
  out: %o0 0x2000 %o1 0x3000e0046c4
        %o2 0 %o3 0
        %o4 0 %o5 0xfe960000
        %sp 0x2a101a6adb1 %o7 0x100c41c4
  loc: %l0 0x30002d82e88 %l1 0xfe992000
        %l2 0 %l3 0x3
        %l4 0x30004b87708 %l5 0x7b006798
        %l6 0x3000e0046c2 %l7 0x31000806798
  in: %i0 0x3000d99fc08 %i1 0x19
        %i2 0x8 %i3 0x32000
        %i4 0x1c %i5 0
        %fp 0x2a101a6af41 %i7 0x100c106c
<trap>genunix:segvn_lockop+0x658 (0x3000d99fc08, 0x19, 0x8, 0x32000, 0x1c, 0x0)
genunix:segvn_unmap+0x114 (0xfe960000, 0xfe960000, 0x34000, 0x3000dda2040, 0x1, 0x0)
genunix:as_free+0xd8 (0x30003c4b890, 0x10423910, 0x30000060430, 0x30003c4b8c0, 0x30003c4b89a, 0x30003c4b890)
genunix:relvm - frame recycled
genunix:proc_exit+0x39c (0x3000da599d0, 0x1041b1a8, 0x104641d8, 0x3000ca83b58, 0x9, 0x2)
genunix:exit+0x8 (0x2, 0x9, 0x40, 0x9, 0x2, 0x0)
genunix:psig_shared - frame recycled
genunix:post_syscall+0x3ec (0x3000e9f9240, 0xd2, 0x1, 0xfe8ee000, 0x4, 0x0)
unix:_syscall_post32+0x0 (0xfe8fb248, 0xfe8ee000, 0x3, 0xfe8f5938, 0xfe8f5950, 0x20)
-- switch to user thread's user stack --

SolarisCAT(vmcore.2)> msgbuf
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
NOTICE: alloc: /var: file system full
WARNING: [AFT1] Uncorrectable system bus (UE) Event on CPU1 User Data Access at TL=0, errID 0x0000e446.5b7146d0
    AFSR 0x00000004<UE>.0000010e AFAR 0x00000000.5c877870
    Fault_PC 0x758f28 Esynd 0x010e J0100 J0202 J0304 J0406
[AFT1] errID 0x0000e446.5b7146d0 More than four Bits were in error
[AFT2] errID 0x0000e446.5b7146d0 PA=0x00000000.5c877840
    E$tag 0x00000001.72000910 E$state_1 Exclusive
[AFT2] E$Data (0x00) 0x00000000.00000000 0x00000000.00000000 ECC 0x000
[AFT2] E$Data (0x10) 0x00000000.00000000 0x00000100.1e9b7b5e ECC 0x0a4
[AFT2] E$Data (0x20) 0x00004800.00000001 0xc763f868.c763f868 ECC 0x025
[AFT2] E$Data (0x30) 0xc763f870.c763f870 0xc5d53128.94506000 ECC 0x1b2
*Bad* Esynd=0x10e
[AFT2] D$Tag 0x0005c877 D$state Valid D$utag 0x8f D$snp 0x0005c876
[AFT2] PAtag 0x000.5c877860 PAsnp 0x000.5c877860 VAutag 0x23f860
[AFT2] D$Data (0x20) 0x00004800.00000001 0xc763f868.c763f868
[AFT2] D$Data (0x30) 0xc763f870.c763f870 0xc5d53128.94506000
NOTICE: Scheduling clearing of error on page 0x00000000.5c876000
[AFT3] errID 0x0000e446.5b7146d0 Above Error is in User Mode
    and is fatal: will reboot
WARNING: [AFT1] initiating reboot due to above error in pid 8846 (oracle)
WARNING: [AFT1] Uncorrectable system bus (UE) Event on CPU1 User Data Access at TL=0, errID 0x0000e448.48458998
    AFSR 0x00000004<UE>.0000010e AFAR 0x00000000.5c877870
    Fault_PC 0x758f28 Esynd 0x010e J0100 J0202 J0304 J0406
[AFT1] errID 0x0000e448.48458998 More than four Bits were in error
[AFT2] errID 0x0000e448.48458998 PA=0x00000000.5c877840
    E$tag 0x00000001.72880910 E$state_1 Exclusive
[AFT2] E$Data (0x00) 0x00000000.00000000 0x00000000.00000000 ECC 0x000
[AFT2] E$Data (0x10) 0x00000000.00000000 0x00000100.1e9b7b5e ECC 0x0a4
[AFT2] E$Data (0x20) 0x00004800.00000001 0xc763f868.c763f868 ECC 0x025
[AFT2] E$Data (0x30) 0xc763f870.c763f870 0xc5d53128.94506000 ECC 0x1b2
*Bad* Esynd=0x10e
[AFT2] D$Tag 0x0005c877 D$state Valid D$utag 0x8f D$snp 0x0005c876
[AFT2] PAtag 0x000.5c877860 PAsnp 0x000.5c877860 VAutag 0x23f860
[AFT2] D$Data (0x20) 0x00004800.00000001 0xc763f868.c763f868
[AFT2] D$Data (0x30) 0xc763f870.c763f870 0xc5d53128.94506000
NOTICE: Scheduling clearing of error on page 0x00000000.5c876000
[AFT3] errID 0x0000e448.48458998 Above Error is in User Mode
    and is fatal: will reboot
WARNING: [AFT1] rebooting system due to above error in pid 9026 (oracle)
WARNING: [AFT1] Uncorrectable system bus (UE) Event on CPU0 Privileged Data Access at TL=0, errID 0x0000e448.4a2353a8
    AFSR 0x00100004<PRIV,UE>.0000005f AFAR 0x00000000.588746f0
    Fault_PC 0x100c47f8 Esynd 0x005f J0100 J0202 J0304 J0406
[AFT1] errID 0x0000e448.4a2353a8 Four Bits were in error
[AFT2] errID 0x0000e448.4a2353a8 PA=0x00000000.588746c0
    E$tag 0x00000001.62000480 E$state_3 Exclusive
[AFT2] E$Data (0x00) 0x0d000d00.2e320000 0x2f2f2f75.30302f61 ECC 0x105
[AFT2] E$Data (0x10) 0x70702f6f.7261636c 0x652f7072.6f647563 ECC 0x00c
[AFT2] E$Data (0x20) 0x742f382e.312e372f 0x7264626d.732f6d65 ECC 0x06b
[AFT2] E$Data (0x30) 0x73672f6f.72617573 0x00000300.0e005800 ECC 0x00f
*Bad* Esynd=0x05f
[AFT2] D$ data not available
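
The E$Data words in the CPU0 event above are mostly printable ASCII, so decoding them gives a hint about what the affected cache line was holding. The following is only a minimal sketch: the word list is copied by hand from the four [AFT2] E$Data lines for errID 0x0000e448.4a2353a8, the helper name decode_cache_line is made up for illustration, and since the line was flagged with an uncorrectable error, some of the bytes may not be what was originally written.

```python
# E$Data words copied from the four [AFT2] lines for errID 0x0000e448.4a2353a8.
# Each log row carries 16 bytes, shown as four 32-bit words.
E_DATA_WORDS = [
    "0d000d00", "2e320000", "2f2f2f75", "30302f61",  # offset 0x00
    "70702f6f", "7261636c", "652f7072", "6f647563",  # offset 0x10
    "742f382e", "312e372f", "7264626d", "732f6d65",  # offset 0x20
    "73672f6f", "72617573", "00000300", "0e005800",  # offset 0x30
]


def decode_cache_line(words):
    """Return the cache line as text, with non-printable bytes shown as '.'."""
    raw = bytes.fromhex("".join(words))
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    print(decode_cache_line(E_DATA_WORDS))
```

The printable part of the result reads like a fragment of an Oracle 8.1.7 message-file path (///u00/app/oracle/product/8.1.7/rdbms/mesg/oraus), which fits the panicking oracle process but says nothing by itself about which component is bad.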

panic[cpu0]/thread=3000e9f9240: [AFT1] errID 0x0000e448.4a2353a8 UE Error(s)
    See previous message(s) for details

000002a101a6ad70 SUNW,UltraSPARC-III+:cpu_aflt_log+45c (2a101a6ae2e, 10148920, 101488f8, 0, 2a101a6afb8, 2a101a6ae7b)
  %l0-3: 000000001029a828 000002a101a6b080 0000000000000003 0000000000000010
  %l4-7: 0000000000000000 0000000010459e00 000000000001e000 0000000000000000
000002a101a6afc0 SUNW,UltraSPARC-III+:cpu_deferred_error+5c4 (10464400, c4000000000f, ee000000000c, 0, 1, 1)
  %l0-3: 001000040000005f 0000000000000000 0000000000000000 0000000000000032
  %l4-7: 0000000000024208 0000012433921833 0000000000000000 0000000000000000
000002a101a6b460 unix:prom_rtt+0 (2000, 3000e0046c4, 0, 0, 0, fe960000)
  %l0-3: 0000000000000001 0000000000001400 0000004414001600 000000001013fe98
  %l4-7: 00000000fe950000 000003100264c320 0000000000000000 000002a101a6b510
000002a101a6b5b0 genunix:segvn_lockop+24 (3000d99fc08, 19, 8, 32000, 1c, 0)
  %l0-3: 0000030002d82e88 00000000fe992000 0000000000000000 0000000000000003
  %l4-7: 0000030004b87708 000000007b006798 000003000e0046c2 0000031000806798
000002a101a6b740 genunix:segvn_unmap+114 (fe960000, fe960000, 34000, 3000dda2040, 1, 0)
  %l0-3: 0000000010168464 0000000010132518 000003000d99fc08 0000030004b87708
  %l4-7: 0000000000034000 00000000104121b0 0000000000000000 000002a101a6b770
000002a101a6b810 genunix:as_free+d8 (30003c4b890, 10423910, 30000060430, 30003c4b8c0, 30003c4b89a, 30003c4b890)
  %l0-3: 00000000100c0f58 000003000de16488 0000000000000005 0000000000000000
  %l4-7: 00000000000000b0 00000000104130a0 0000030004979648 0000000000000000
000002a101a6b8c0 genunix:proc_exit+39c (3000da599d0, 1041b1a8, 104641d8, 3000ca83b58, 9, 2)
  %l0-3: 0000000000000005 000003000e9f9240 000003000dda2040 0000000000000000
  %l4-7: 000003000b237a60 0000000000000000 0000030004979648 0000000000400000
000002a101a6b970 genunix:exit+8 (2, 9, 40, 9, 2, 0)
  %l0-3: 0000000000000000 000003000dda2040 0000000000000100 00000300049794d0
  %l4-7: 0000000000000009 0000000010412d80 0000000000000009 0000000000000000
000002a101a6ba20 genunix:post_syscall+3ec (3000e9f9240, d2, 1, fe8ee000, 4, 0)
  %l0-3: 0000000000000000 000002a101a6bba0 00000300049794d0 0000000000000004
  %l4-7: 0000000000000001 000003000dda2040 0000000000000004 0000000000000000

syncing file systems...
panic[cpu0]/thread=3000e9f9240: panic sync timeout
dumping to /dev/dsk/c1t0d0s1, offset 65536

SolarisCAT(vmcore.2)>
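
To compare the three UE events, it helps to reduce each AFAR to the cache line and page it falls in. The sketch below assumes a 64-byte line (suggested by the four 16-byte E$Data rows per event) and an 8 KB base page; both sizes are assumptions read off the log rather than taken from a hardware manual, and the function names are made up for illustration.

```python
CACHE_LINE_BYTES = 64  # assumption: four 16-byte E$Data rows per logged line
PAGE_BYTES = 8192      # assumption: 8 KB base page size


def line_base(addr):
    """Round a physical address down to the start of its cache line."""
    return addr & ~(CACHE_LINE_BYTES - 1)


def page_base(addr):
    """Round a physical address down to the start of its page."""
    return addr & ~(PAGE_BYTES - 1)


# AFAR values copied from the three [AFT1] events in the msgbuf output.
AFARS = {
    "0x0000e446.5b7146d0": 0x5C877870,  # CPU1, user data access
    "0x0000e448.48458998": 0x5C877870,  # CPU1, user data access
    "0x0000e448.4a2353a8": 0x588746F0,  # CPU0, privileged data access
}

if __name__ == "__main__":
    for err_id, afar in AFARS.items():
        print(f"{err_id}: AFAR={afar:#x} "
              f"line={line_base(afar):#x} page={page_base(afar):#x}")
```

Under those assumptions the computed line addresses match the PA= values in the [AFT2] lines (0x5c877840 and 0x588746c0), and the page for the two CPU1 events matches the "Scheduling clearing of error on page 0x00000000.5c876000" notice. The two CPU1 events hit the same address, while the CPU0 event is in a different page.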
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


