E450 400 MHz crashing repeatedly

From: Bruce Shaw (Bruce.Shaw@gov.ab.ca)
Date: Wed Jun 21 2006 - 13:27:53 EDT


This looks like the notorious e-cache problem, but I've already pulled one
CPU to no effect. The problem seems to travel from CPU to CPU.

Could this simply be bad RAM?

Output of /var/adm/messages follows:

Jun 21 01:05:34 summit unix: WARNING: uncorrectable error from pci2 (upa mid 4) during dvma read transaction
Jun 21 01:05:34 summit unix: Transaction was a block operation.
Jun 21 01:05:34 summit unix: AFSR=48000000.04800000 AFAR=00000000.b1ee17c0,
Jun 21 01:05:34 summit double word offset=0, Memory Module <170x> port id 4.
Jun 21 01:05:35 summit unix: secondary error from dvma read transaction
Jun 21 01:05:35 summit unix: WARNING: uncorrectable error from pci2 (upa mid 4) during unknown transaction
Jun 21 01:05:35 summit unix: Transaction was a block operation.
Jun 21 01:05:35 summit unix: AFSR=00000000.04800000 AFAR=00000000.b1ee17c0,
Jun 21 01:05:35 summit double word offset=0, Memory Module <170x> port id 4.
Jun 21 01:05:35 summit unix: panic[cpu2]/thread=30003bf4e80:
Jun 21 01:05:35 summit unix: Fatal PCI UE Error
Jun 21 01:05:35 summit unix:

Jun 21 10:50:01 summit unix: WARNING: [AFT1] Uncorrectable Memory Error on CPU3 Data access at TL=0, errID 0x00001fd8.b011f1e5
Jun 21 10:50:01 summit AFSR 0x00000001<ME>.807000ff<PRIV,EDP,UE,CE> AFAR 0x00000000.8c917fe0
Jun 21 10:50:01 summit AFSR.PSYND 0x00ff(Score 05) AFSR.ETS 0x00 Fault_PC 0x10094964
Jun 21 10:50:01 summit UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0336<UE,CE> UDBL.ESYND 0x36
Jun 21 10:50:01 summit UDBL Syndrome 0x36 Memory Module 170x
Jun 21 10:50:01 summit unix: [AFT2] errID 0x00001fd8.b011f1e5 PA=0x00000000.8c917fe0
Jun 21 10:50:01 summit E$tag 0x00000000.1cc01192 E$State: Exclusive E$parity 0x0e
Jun 21 10:50:01 summit unix: [AFT2] E$Data (0x00): 0xbaddcafe.baddcafe
Jun 21 10:50:01 summit unix: [AFT2] E$Data (0x08): 0x00000300.001a7d80
Jun 21 10:50:01 summit unix: [AFT2] E$Data (0x10): 0x00000300.070a6000
Jun 21 10:50:01 summit unix: [AFT2] E$Data (0x18): 0x00022300.04fedfc8 *Bad* PSYND=0x00ff
Jun 21 10:50:01 summit unix: [AFT2] E$Data (0x20): 0x00000300.069edfc8
Jun 21 10:50:01 summit unix: [AFT2] E$Data (0x28): 0x00022300.070a6db8 *Bad* PSYND=0x00ff
Jun 21 10:50:01 summit unix: [AFT2] E$Data (0x30): 0x00000300.070a7b78
Jun 21 10:50:01 summit unix: [AFT2] E$Data (0x38): 0x0000000c.00000012
Jun 21 10:50:01 summit unix: WARNING: [AFT1] EDP event on CPU3 Data access at TL=0, errID 0x00001fd8.b011f1e5
Jun 21 10:50:01 summit AFSR 0x00000001<ME>.807000ff<PRIV,EDP,UE,CE> AFAR 0xffffffff.ffffffff
Jun 21 10:50:01 summit AFSR.PSYND 0x00ff(Score 05) AFSR.ETS 0x00 Fault_PC 0x10094964
Jun 21 10:50:01 summit UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0336<UE,CE> UDBL.ESYND 0x36
Jun 21 10:50:01 summit unix: [AFT2] errID 0x00001fd8.b011f1e5 No error found in ecache (No fault PA available)
Jun 21 10:50:01 summit unix: panic[cpu3]/thread=30003a2bd40:
Jun 21 10:50:01 summit unix: [AFT1] errID 0x00001fd8.b011f1e5 UE EDP Error(s)
Jun 21 10:50:01 summit See previous message(s) for details
Jun 21 10:50:01 summit unix:
Jun 21 10:50:02 summit unix: syncing file systems...
Jun 21 10:50:06 summit unix: 42
Jun 21 10:50:30 summit unix: 20
Jun 21 10:50:52 summit unix: 4
Jun 21 10:51:01 summit unix: panic[cpu3]/thread=2a10006fd60:
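
For anyone reading the CPU AFSR values in the [AFT1] lines above, here is a
minimal Python sketch that splits the register into the fields the kernel
printed. The bit positions (ME=32, PRIV=31, EDP=22, UE=21, CE=20, ETS in bits
19:16, PSYND in bits 15:0) are inferred from the decoded output in this log,
not quoted from a processor manual, and decode_afsr is a hypothetical helper
name. It is not meant for the pci2 AFSR in the first excerpt, which comes from
the PCI bridge and presumably uses a different layout.

```python
# Bit positions inferred from the kernel's own decoding in the [AFT1]
# messages in this post; they are assumptions, not taken from a CPU manual.
AFSR_FLAGS = {32: "ME", 31: "PRIV", 22: "EDP", 21: "UE", 20: "CE"}


def decode_afsr(high, low):
    """Split a CPU AFSR, logged as high.low 32-bit halves, into fields."""
    value = (high << 32) | low
    # Report flags from the highest bit down, matching the log's ordering.
    flags = [name for bit, name in sorted(AFSR_FLAGS.items(), reverse=True)
             if (value >> bit) & 1]
    return {
        "flags": flags,
        "ets": (value >> 16) & 0xF,   # assumed: E$ tag syndrome, bits 19:16
        "psynd": value & 0xFFFF,      # assumed: parity syndrome, bits 15:0
    }


if __name__ == "__main__":
    # AFSR 0x00000001.807000ff from the Jun 21 10:50:01 panic
    print(decode_afsr(0x00000001, 0x807000FF))
```

With the value from the 10:50:01 panic, the flags, ETS and PSYND come out the
same as the kernel's own <ME>, <PRIV,EDP,UE,CE>, ETS 0x00 and PSYND 0x00ff.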

Jun 16 14:21:49 summit unix: cpr: System is being suspended.
Jun 16 14:21:50 summit unix: BAD TRAP: cpu=0 type=0x30 rp=0x2a1017552d0 addr=0x0 mmu_fsr=0x80100f
Jun 16 14:21:50 summit unix: sys-suspend:
Jun 16 14:21:50 summit unix: data access exception:
Jun 16 14:21:50 summit unix: MMU sfsr=80100f:
Jun 16 14:21:50 summit unix: Data or instruction address out of range
Jun 16 14:21:50 summit unix: on ASI 0x80 E 0 CID 0 PRIV 1 W 1 OW 1 FV 1
Jun 16 14:21:50 summit unix:
Jun 16 14:21:50 summit unix: pid=1417, pc=0x10035c04, sp=0x2a101754b71, tstate=0x4477001605, context=0x1276
Jun 16 14:21:50 summit unix: g1-g7: 10463c00, 17d6b3f3, 20, 3, 1004e, ffbee5c4, 30004c50a40
Jun 16 14:21:50 summit unix: Begin traceback... sp = 2a101754b71
Jun 16 14:21:50 summit unix: Called from 10074b24, fp=2a101754c21, args=2a100339b30 0 20 2a100339b30 2a100339b30 0
Jun 16 14:21:50 summit unix: Called from 102ed9ec, fp=2a101754cd1, args=30002009cda 0 1044a1f8 0 0 30002009cc8
Jun 16 14:21:50 summit unix: Called from 102e9ddc, fp=2a101754d91, args=102f03d8 0 1044a1f8 0 0 0
Jun 16 14:21:50 summit unix: Called from 102e9c04, fp=2a101754e41, args=78088400 78088400 410 0 300031d1b10 30000182500
Jun 16 14:21:50 summit unix: Called from 1004c54c, fp=2a101754fc1, args=0 0 104537f0 ffbef32c 0 20
Jun 16 14:21:50 summit unix: Called from 100f4fec, fp=2a101755071, args=0 0 104537f0 ffbef32c 0 20
Jun 16 14:21:50 summit unix: Called from 100f517c, fp=2a101755121, args=10410400 0 0 0 3 0
Jun 16 14:21:50 summit unix: Called from 10037558, fp=2a1017552f1, args=3 0 0 3 1 1
Jun 16 14:21:50 summit unix: Called from 14088, fp=ffbef430, args=3 0 0 200c 1 1
Jun 16 14:21:50 summit unix: misaligned saved fp = ffbefc2f
Jun 16 14:21:50 summit unix: End traceback...
Jun 16 14:21:51 summit unix: panic[cpu0]/thread=30004c50a40:
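
To check whether the fault really moves between CPUs, the excerpts above can
be tallied by CPU. The following is a rough Python sketch, not a diagnostic
tool: it rejoins records that mail wrapping split across lines, then counts
panic[cpuN] records and [AFT1] reports per CPU. The function names unwrap and
tally and the regular expressions are my own assumptions, based only on the
message formats visible in this post. Note that one errID can produce several
[AFT1] lines, so the second counter counts reports, not distinct faults.

```python
import re
from collections import Counter

# A line starts a new syslog record if it begins with a timestamp such as
# "Jun 21 01:05:34 "; anything else is treated as a wrapped continuation.
STAMP = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d ")


def unwrap(lines):
    """Rejoin records that were wrapped onto a following line."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank separators between excerpts
        if STAMP.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def tally(records):
    """Count panics per CPU and [AFT1] error reports per CPU."""
    panics, aft1_reports = Counter(), Counter()
    for rec in records:
        m = re.search(r"panic\[cpu(\d+)\]", rec)
        if m:
            panics[int(m.group(1))] += 1
        m = re.search(r"\[AFT1\] .* on CPU(\d+)", rec)
        if m:
            aft1_reports[int(m.group(1))] += 1
    return panics, aft1_reports


if __name__ == "__main__":
    # Path as used in this post; adjust if the log lives elsewhere.
    with open("/var/adm/messages") as fh:
        panics, aft1_reports = tally(unwrap(fh))
    print("panics per CPU:", dict(panics))
    print("[AFT1] reports per CPU:", dict(aft1_reports))
```

In the excerpts above the panics are reported on cpu2, cpu3 and cpu0, while
every excerpt that names a memory module names 170x.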

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers