ecache parity error?

From: Dan_Kelley@ssmhc.com
Date: Wed Apr 24 2002 - 13:34:46 EDT


Hello, all.

We have a machine that keeps crashing, and I think it is the ecache parity
error. I waited for it to happen again before sending an e-mail to this
list. Could anyone look at this and tell me whether they think it is the
ecache error? If not, any clues as to what it is? Thanks in advance! I
will summarize.

 - Dan

uname -a:
SunOS netdev 5.8 Generic_108528-14 sun4u sparc SUNW,Ultra-5_10

I have tracked two crashes. Here is the info for the first one (note that
the two are slightly different):

echo '$c' | adb -k unix.1 vmcore.1:

physmem 173a7
panicsys(104234b0,1040c198,10050068,78002000,57542400,c) + 44
vpanic(10050068,1040c198,16e76a3d8cac,10,30000689ea8,30000068438) + cc
panic(10050068,804,1,1041a798,fffd,20) + 1c
sync_handler(1041a980,10400000,0,0,0,2) + 150
prom_rtt(10000000,16,f0000000,16e7332a6da9,0,2)
client_handler(f0066d2c,2a10007d6e8,1,104283d8,1,1041a980) + 2c
prom_enter_mon(0,6,b,2a10004bd40,2a10007dd40,0) + 28
debug_enter(0,16e73315c8c5,16e73315c8c9,0,30000ddf1e8,0) + d0
kbdinput(1045a400,4d,30000689d68,300001b5000,0,1013dd4c) + 304
kbdrput(30000adabe8,30000f7e340,30000ad3a98,30000f7e340,30000689d68,30000ad3a20) + 13c
putnext(30000adae48,30000ad9a90,30000adb0a8,30000f7e340,0,0) + 1cc
async_softint(30000f7e340,1,ffff,20000,0,30000adae48) + 568
asysoftintr(3000017a008,30000b7e000,1,2a10007dd40,10180,1026fba8) + 70
intr_thread(2a10001fd40,1041b180,10423890,10423890,0,0) + a4
idle(1040f864,0,0,1041b180,3000005d6c8,0) + 54
thread_start(0,0,0,0,0,0) + 4

/var/adm/messages from this one:

Apr 12 17:59:18 netdev SUNW,UltraSPARC-IIi: [ID 932869 kern.warning]
WARNING: [AFT1] EDP event on CPU0 Data access at TL=0, errID
0x00015289.afcae2ba
Apr 12 17:59:18 netdev AFSR 0x00000000.80400080<PRIV,EDP> AFAR
0x00000000.3d41fa68
Apr 12 17:59:18 netdev AFSR.PSYND 0x0080(Score 95) AFSR.ETS 0x00
Fault_PC 0x10031cc8
Apr 12 17:59:18 netdev UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000
UDBL.ESYND 0x00
Apr 12 17:59:18 netdev SUNW,UltraSPARC-IIi: [ID 683009 kern.info] [AFT2]
errID 0x00015289.afcae2ba PA=0x00000000.3d41fa68
Apr 12 17:59:18 netdev E$tag 0x00000000.0003cf50 E$State: Modified
E$parity 0x03 Badlines found=6
Apr 12 17:59:18 netdev SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2]
E$Data (0x00): 0x00000000.10041eb0
Apr 12 17:59:18 netdev SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2]
E$Data (0x08): 0x00000000.10041eb4
Apr 12 17:59:18 netdev SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2]
E$Data (0x10): 0x00000000.0247e008
Apr 12 17:59:18 netdev SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2]
E$Data (0x18): 0x00000000.10423890
Apr 12 17:59:18 netdev SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2]
E$Data (0x20): 0x00000000.10041eb0
Apr 12 17:59:18 netdev SUNW,UltraSPARC-IIi: [ID 989652 kern.info] [AFT2]
E$Data (0x28): 0x80000000.00000000 *Bad* PSYND=0x0080
Apr 12 17:59:18 netdev SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2]
E$Data (0x30): 0x00000000.00000000
Apr 12 17:59:18 netdev SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2]
E$Data (0x38): 0x000002a1.000b7d20
Apr 12 17:59:18 netdev SUNW,UltraSPARC-IIi: [ID 601312 kern.info] [AFT2]
errID 0x00015289.afcae2ba AFAR was derived from E$Tag
Apr 12 17:59:18 netdev unix: [ID 836849 kern.notice]
Apr 12 17:59:18 netdev ^Mpanic[cpu0]/thread=2a10007dd20:
Apr 12 17:59:18 netdev unix: [ID 455523 kern.notice] [AFT1] errID
0x00015289.afcae2ba EDP Error(s)
Apr 12 17:59:18 netdev See previous message(s) for details
Apr 12 17:59:18 netdev unix: [ID 100000 kern.notice]
Apr 12 17:59:18 netdev genunix: [ID 723222 kern.notice] 000002a10007d200
SUNW,UltraSPARC-IIi:cpu_aflt_log+4e0 (2a10007d2be, 1, 101483a0,
2a10007d448, 2a10007d30b, 101483c8)
Apr 12 17:59:19 netdev genunix: [ID 179002 kern.notice] %l0-3:
0000000000000000 000002a10007d510 0000000000000003 0000000000000010
Apr 12 17:59:19 netdev %l4-7: 0000000000200000 0000000000400000
0000000000000000 000002a10001f9c0
Apr 12 17:59:19 netdev genunix: [ID 723222 kern.notice] 000002a10007d450
SUNW,UltraSPARC-IIi:cpu_async_error+868 (1, 2a10007d510, 80400080, 0,
640000080400080, 2a10007d6d0)
Apr 12 17:59:19 netdev genunix: [ID 179002 kern.notice] %l0-3:
0000000000000001 0000000000000032 0000000000000000 0000000000000000
Apr 12 17:59:19 netdev %l4-7: 0000000000000219 0000000000000000
000003000005d748 0000000000000000
Apr 12 17:59:19 netdev genunix: [ID 723222 kern.notice] 000002a10007d620
unix:prom_rtt+0 (300001b2000, 8000000000000000, a, a, 0, 0)
Apr 12 17:59:19 netdev genunix: [ID 179002 kern.notice] %l0-3:
0000000000000001 0000000000001400 0000000000001600 000000001013fb54
Apr 12 17:59:19 netdev %l4-7: 0000030000697ea0 0000000000000001
000000000000000a 000002a10007d6d0
Apr 12 17:59:19 netdev genunix: [ID 723222 kern.notice] 000002a10007d770
genunix:callout_schedule_1+4 (300001b2000, 10443508, 300001b5000,
10072cf4, 0, 101424b0)
Apr 12 17:59:20 netdev genunix: [ID 179002 kern.notice] %l0-3:
0000000000000008 0000000000000002 0000000000000001 000000001041b718
Apr 12 17:59:20 netdev %l4-7: 000000001041b338 0000000000000016
000000001041baf8 000002a10007d7b0
Apr 12 17:59:20 netdev genunix: [ID 723222 kern.notice] 000002a10007d820
genunix:callout_schedule+54 (104391fc, 1, 10439178, 8, 1, 300000683c8)
Apr 12 17:59:20 netdev genunix: [ID 179002 kern.notice] %l0-3:
00000000100d312c 0000030000cec000 0000030000d79602 0000030000cec000
Apr 12 17:59:20 netdev %l4-7: 000003000188f040 0000000000000000
000003000148af00 000002a10051dba0
Apr 12 17:59:20 netdev genunix: [ID 723222 kern.notice] 000002a10007d8d0
genunix:clock+474 (1045a800, 1041b338, 1042dc00, 94f476874837, 0, 0)
Apr 12 17:59:20 netdev genunix: [ID 179002 kern.notice] %l0-3:
0000000000000000 0000000000000001 000002a10007dd20 0000000000000000
Apr 12 17:59:20 netdev %l4-7: 000000001045a000 000000003b9aca00
000000001041baf8 00000000fed3a004
Apr 12 17:59:20 netdev genunix: [ID 723222 kern.notice] 000002a10007d9a0
genunix:cyclic_softint+a4 (1041b338, 30000057928, 1, 3, 30000068478,
10073f0c)
Apr 12 17:59:20 netdev genunix: [ID 179002 kern.notice] %l0-3:
0000030000057930 800000000237f894 0000000000000000 0000030000068478
Apr 12 17:59:20 netdev %l4-7: 00000300000578c8 000003000068dea8
0000000000000000 000003000068ded0
Apr 12 17:59:21 netdev genunix: [ID 723222 kern.notice] 000002a10007da60
unix:cbe_level10+8 (0, 803, 1041b338, 2a10007dd20, 10060, 1000b34c)
Apr 12 17:59:21 netdev genunix: [ID 179002 kern.notice] %l0-3:
00000000102e4934 0000000000000001 0000000000000001 0000030000070ed8
Apr 12 17:59:21 netdev %l4-7: 0000000000000000 0000000000000000
0000000000000000 0000000000000000
Apr 12 17:59:21 netdev unix: [ID 100000 kern.notice]
Apr 12 17:59:21 netdev genunix: [ID 672855 kern.notice] syncing file
systems...
Apr 12 17:59:21 netdev genunix: [ID 904073 kern.notice] done
Apr 12 17:59:22 netdev genunix: [ID 353387 kern.notice] dumping to
/dev/dsk/c0t0d0s1, offset 322174976
Apr 12 17:59:22 netdev uata: [ID 606412 kern.warning] WARNING: timeout:
reset bus chno = 0 targ = 0
Apr 12 17:59:38 netdev genunix: [ID 409368 kern.notice] ^M100% done: 8116
pages dumped, compression ratio 3.96,
Apr 12 17:59:38 netdev genunix: [ID 851671 kern.notice] dump succeeded
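
To pick out which 8-byte word of the ecache line was flagged in a dump like
the one above, a minimal Python sketch follows. It assumes only the
"E$Data (offset): value *Bad* PSYND=..." layout seen in this log; the
function name bad_ecache_words is hypothetical, and other kernel
revisions may format these lines differently.

```python
import re

# Matches one E$Data entry; the "*Bad* PSYND=..." suffix is optional.
# Layout is assumed from the log lines quoted in this post.
EDATA_RE = re.compile(
    r"E\$Data \((0x[0-9a-fA-F]+)\): (0x[0-9a-fA-F]{8}\.[0-9a-fA-F]{8})"
    r"(?: \*Bad\* PSYND=(0x[0-9a-fA-F]+))?"
)


def bad_ecache_words(text):
    """Return (offset, data, psynd) for each E$Data word marked *Bad*."""
    return [
        (int(offset, 16), data, int(psynd, 16))
        for offset, data, psynd in EDATA_RE.findall(text)
        if psynd  # empty string when the *Bad* suffix is absent
    ]
```

Run against the first crash's messages, this should report only the word
at offset 0x28, the single entry the kernel marked *Bad*.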

And now for the second crash:

echo '$c' | adb -k unix.0 vmcore.0:

physmem 173a7
panicsys(104234b0,1040c198,10050068,78002000,39ff00,c) + 44
vpanic(10050068,1040c198,faabfb648,10,30000689ea8,30000068438) + cc
panic(10050068,804,1,1041a798,fffd,20) + 1c
sync_handler(1041a980,10400000,0,0,0,2) + 150
prom_rtt(10000000,16,f0000000,f810ca9c6,0,2)
client_handler(f0066d2c,2a10007d6e8,1,104283d8,1,1041a980) + 2c
prom_enter_mon(0,6,b,2a10004bd40,2a10007dd40,0) + 28
debug_enter(0,f80db6987,f80db698a,0,30001092020,0) + d0
kbdinput(1045a400,4d,30000689d68,300001b5000,0,1013dd4c) + 304
kbdrput(30000adabe8,3000108f080,30000ad3a18,3000108f080,30000689d68,30000ad39a0) + 13c
putnext(30000adae48,30000ad9a90,30000adb0a8,3000108f080,0,0) + 1cc
async_softint(3000108f080,1,ffff,20000,0,30000adae48) + 568
asysoftintr(3000017a008,30000b7e000,1,2a10007dd40,10180,1026fba8) + 70
intr_thread(2a10001fd40,1041b180,10423890,10423890,0,0) + a4
idle(1040f864,0,0,1041b180,3000005d6c8,0) + 54
thread_start(0,0,0,0,0,0) + 4

/var/adm/messages leading up to the reboot:

Apr 24 12:20:07 netdev SUNW,UltraSPARC-IIi: [ID 370172 kern.warning]
WARNING: [AFT1] EDP event on CPU0 Instruction access at TL=0, errID
0x0001d01e.baad443a
Apr 24 12:20:07 netdev AFSR 0x00000000.004000f0<EDP> AFAR
0xffffffff.ffffffff
Apr 24 12:20:07 netdev AFSR.PSYND 0x00f0(Score 45) AFSR.ETS 0x00
Fault_PC 0x97560
Apr 24 12:20:07 netdev UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000
UDBL.ESYND 0x00
Apr 24 12:20:07 netdev SUNW,UltraSPARC-IIi: [ID 798591 kern.info] [AFT2]
errID 0x0001d01e.baad443a No error found in ecache (No fault PA available)
Apr 24 12:20:07 netdev unix: [ID 836849 kern.notice]
Apr 24 12:20:07 netdev ^Mpanic[cpu0]/thread=3000165a440:
Apr 24 12:20:07 netdev unix: [ID 424580 kern.notice] [AFT1] errID
0x0001d01e.baad443a EDP Error(s)
Apr 24 12:20:07 netdev See previous message(s) for details
Apr 24 12:20:08 netdev unix: [ID 100000 kern.notice]
Apr 24 12:20:08 netdev genunix: [ID 723222 kern.notice] 000002a1005dd6d0
SUNW,UltraSPARC-IIi:cpu_aflt_log+4e0 (2a1005dd78e, 1, 101483a0,
2a1005dd918, 2a1005dd7db, 101483c8)
Apr 24 12:20:08 netdev genunix: [ID 179002 kern.notice] %l0-3:
0000000000000000 000002a1005dd9e0 0000000000000003 0000000000000010
Apr 24 12:20:08 netdev %l4-7: 0000000000200000 0000000000400000
0000000000000001 0000000000000080
Apr 24 12:20:08 netdev genunix: [ID 723222 kern.notice] 000002a1005dd920
SUNW,UltraSPARC-IIi:cpu_async_error+868 (1, 2a1005dd9e0, 4000f0, 0,
1400000004000f0, 2a1005ddba0)
Apr 24 12:20:08 netdev genunix: [ID 179002 kern.notice] %l0-3:
0000000000000001 000000000000000a 0000000000000000 0000000000000000
Apr 24 12:20:08 netdev %l4-7: 0000000000004208 0000000000000000
00000000007fbdd0 0000000000000084
Apr 24 12:20:08 netdev unix: [ID 100000 kern.notice]
Apr 24 12:20:08 netdev genunix: [ID 672855 kern.notice] syncing file
systems...
Apr 24 12:20:09 netdev genunix: [ID 733762 kern.notice] 1
Apr 24 12:20:10 netdev genunix: [ID 904073 kern.notice] done
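
For comparing the two events side by side, here is a minimal Python sketch
that pulls the AFSR value out of a messages line and splits it into flags
and parity syndrome. The bit masks are an assumption inferred only from the
two AFSR values and their <PRIV,EDP> / <EDP> annotations in the logs above,
not taken from the processor manual; the low 16 bits match the AFSR.PSYND
values the kernel printed. The function name parse_afsr is hypothetical.

```python
import re

# Masks inferred from the two log entries in this post (assumption):
#   0x80400080 was annotated <PRIV,EDP>, 0x004000f0 was annotated <EDP>.
AFSR_PRIV = 0x80000000
AFSR_EDP = 0x00400000
AFSR_PSYND_MASK = 0xFFFF  # agrees with the AFSR.PSYND values in the log

AFSR_RE = re.compile(r"AFSR 0x([0-9a-fA-F]{8})\.([0-9a-fA-F]{8})")


def parse_afsr(line):
    """Return (value, flags, psynd) for a line with 'AFSR 0xHHHHHHHH.LLLLLLLL'."""
    match = AFSR_RE.search(line)
    if match is None:
        return None
    # The log prints the 64-bit register as two dot-separated 32-bit halves.
    value = int(match.group(1) + match.group(2), 16)
    flags = []
    if value & AFSR_PRIV:
        flags.append("PRIV")
    if value & AFSR_EDP:
        flags.append("EDP")
    return value, flags, value & AFSR_PSYND_MASK


if __name__ == "__main__":
    for entry in (
        "Apr 12 17:59:18 netdev AFSR 0x00000000.80400080<PRIV,EDP> AFAR",
        "Apr 24 12:20:07 netdev AFSR 0x00000000.004000f0<EDP> AFAR",
    ):
        print(parse_afsr(entry))
```

The "AFSR.PSYND ..." lines are deliberately not matched, since the pattern
requires a space after "AFSR".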