E420 crash

From: Stanley, Jon (Jon.Stanley@savvis.net)
Date: Thu Apr 11 2002 - 00:43:27 EDT


I've got an E420R (4 CPUs, 4 GB RAM) that just crashed. The crash
messages were sort of interesting: a memory problem that's not a memory
problem? A quick Google search turned up something having to do with
Apache on an E6500, and a domain crash on an E10K that had something to
do with the ecache. This one seems to have to do with Java. Any
assistance would be greatly appreciated.

The system is running Solaris 8 with kernel patch 108528-12.

Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 798564 kern.warning]
WARNING: [AFT1] WP event on CPU0, errID 0x0018fc94.d0706bec
Apr 10 22:22:39 s87798srv04 AFSR 0x00000000.00800100<WP> AFAR
0x00000000.a7486bd0
Apr 10 22:22:39 s87798srv04 AFSR.PSYND 0x0100(Score 95) AFSR.ETS 0x00
Fault_PC 0xfb1d44a0
Apr 10 22:22:39 s87798srv04 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000
UDBL.ESYND 0x00
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 714112 kern.warning]
WARNING: [AFT1] Uncorrectable Memory Error on CPU1 Data access at TL=0,
errID 0x0018fc94.ea59713e
Apr 10 22:22:39 s87798srv04 AFSR 0x00000000.80200000<PRIV,UE> AFAR
0x00000000.0df4ac20
Apr 10 22:22:39 s87798srv04 AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00
Fault_PC 0x10023c1c
Apr 10 22:22:39 s87798srv04 UDBH 0x0203<UE> UDBH.ESYND 0x03 UDBL 0x00e9
UDBL.ESYND 0xe9
Apr 10 22:22:39 s87798srv04 UDBH Syndrome 0x3 Memory Module U1302 U0302
U1301 U0301
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 604670 kern.warning]
WARNING: [AFT1] errID 0x0018fc94.ea59713e Syndrome 0x3 indicates that this
may not be a memory module problem
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 988403 kern.info] [AFT2]
errID 0x0018fc94.ea59713e PA=0x00000000.0df4ac20
Apr 10 22:22:39 s87798srv04 E$tag 0x00000000.1ac001be E$State: Exclusive
E$parity 0x0d
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x00): 0x00b70009.003c000c
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x08): 0x00040022.00010095
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x10): 0x000b0064.00000000
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x18): 0x00000000.00000000
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2]
E$Data (0x20): 0x00000000.00000001 *Bad* PSYND=0xff00
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x28): 0x00000000.00000000
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x30): 0x00000000.00000000
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x38): 0x00000001.f8c25b88
Apr 10 22:22:39 s87798srv04 unix: [ID 321153 kern.notice] NOTICE: Scheduling
clearing of error on page 0x00000000.0df4a000
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 801579 kern.info] [AFT3]
errID 0x0018fc94.ea59713e Above Error detected by protected Kernel code
Apr 10 22:22:39 s87798srv04 that will try to clear error from system
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 551012 kern.warning]
WARNING: [AFT1] Uncorrectable Memory Error on CPU1 Data access at TL=0,
errID 0x0018fc94.ebc969d8
Apr 10 22:22:39 s87798srv04 AFSR 0x00000000.80200000<PRIV,UE> AFAR
0x00000000.0df4ac20
Apr 10 22:22:39 s87798srv04 AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00
Fault_PC 0x10023c1c
Apr 10 22:22:39 s87798srv04 UDBH 0x0203<UE> UDBH.ESYND 0x03 UDBL 0x00e9
UDBL.ESYND 0xe9
Apr 10 22:22:39 s87798srv04 UDBH Syndrome 0x3 Memory Module U1302 U0302
U1301 U0301
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 678544 kern.warning]
WARNING: [AFT1] errID 0x0018fc94.ebc969d8 Syndrome 0x3 indicates that this
may not be a memory module problem
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 784299 kern.info] [AFT2]
errID 0x0018fc94.ebc969d8 PA=0x00000000.0df4ac20
Apr 10 22:22:39 s87798srv04 E$tag 0x00000000.1ac001be E$State: Exclusive
E$parity 0x0d
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x00): 0x00b70009.003c000c
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x08): 0x00040022.00010095
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x10): 0x000b0064.00000000
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x18): 0x00000000.00000000
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2]
E$Data (0x20): 0x00000000.00000001 *Bad* PSYND=0xff00
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x28): 0x00000000.00000000
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x30): 0x00000000.00000000
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x38): 0x00000001.f8c25b88
Apr 10 22:22:39 s87798srv04 unix: [ID 321153 kern.notice] NOTICE: Scheduling
clearing of error on page 0x00000000.0df4a000
Apr 10 22:22:39 s87798srv04 SUNW,UltraSPARC-II: [ID 896442 kern.info] [AFT3]
errID 0x0018fc94.ebc969d8 Above Error detected by protected Kernel code
Apr 10 22:22:39 s87798srv04 that will try to clear error from system
Apr 10 22:50:38 s87798srv04 SUNW,UltraSPARC-II: [ID 219896 kern.warning]
WARNING: [AFT1] Uncorrectable Memory Error on CPU1 Data access at TL=0,
errID 0x0018fe1b.b6ebb1bf
Apr 10 22:50:38 s87798srv04 AFSR 0x00000000.00200000<UE> AFAR
0x00000000.0df4ac20
Apr 10 22:50:38 s87798srv04 AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00
Fault_PC 0xfb00b3dc
Apr 10 22:50:38 s87798srv04 UDBH 0x0203<UE> UDBH.ESYND 0x03 UDBL 0x00e9
UDBL.ESYND 0xe9
Apr 10 22:50:38 s87798srv04 UDBH Syndrome 0x3 Memory Module U1302 U0302
U1301 U0301
Apr 10 22:50:38 s87798srv04 SUNW,UltraSPARC-II: [ID 814361 kern.warning]
WARNING: [AFT1] errID 0x0018fe1b.b6ebb1bf Syndrome 0x3 indicates that this
may not be a memory module problem
Apr 10 22:50:38 s87798srv04 SUNW,UltraSPARC-II: [ID 745072 kern.info] [AFT2]
errID 0x0018fe1b.b6ebb1bf PA=0x00000000.0df4ac20
Apr 10 22:50:38 s87798srv04 E$tag 0x00000000.0bc001be E$State: Modified
E$parity 0x05
Apr 10 22:50:38 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x00): 0x00000022.003c000c
Apr 10 22:50:38 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x08): 0x00040022.00010095
Apr 10 22:50:38 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x10): 0x000b0064.00000000
Apr 10 22:50:38 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x18): 0x00000000.00000000
Apr 10 22:50:38 s87798srv04 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2]
E$Data (0x20): 0x00000000.00000000 *Bad* PSYND=0xf000
Apr 10 22:50:38 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x28): 0x00000000.00000000
Apr 10 22:50:38 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x30): 0x00000000.00000000
Apr 10 22:50:38 s87798srv04 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x38): 0x00000000.00000000
Apr 10 22:50:38 s87798srv04 unix: [ID 321153 kern.notice] NOTICE: Scheduling
clearing of error on page 0x00000000.0df4a000
Apr 10 22:50:38 s87798srv04 SUNW,UltraSPARC-II: [ID 310861 kern.info] [AFT3]
errID 0x0018fe1b.b6ebb1bf Above Error is in User Mode
Apr 10 22:50:38 s87798srv04 and is fatal: will reboot
Apr 10 22:50:38 s87798srv04 unix: [ID 855177 kern.warning] WARNING: [AFT1]
initiating reboot due to above error in pid 25229 (java)
Apr 10 22:50:42 s87798srv04 unix: [ID 221039 kern.notice] NOTICE: Previously
reported error on page 0x00000000.0df4a000 cleared
Apr 10 22:50:56 s87798srv04 syslogd: going down on signal 15
Apr 10 22:51:23 s87798srv04 genunix: [ID 672855 kern.notice] syncing file
systems...
Apr 10 22:51:23 s87798srv04 genunix: [ID 904073 kern.notice] done
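
As a reading aid for the AFSR lines in the log above, here is a minimal
Python sketch that splits an AFSR value into its flag names and the PSYND
field. The bit masks are assumptions inferred only from the three AFSR
values and the <...> flag lists printed in this log (for example,
0x00000000.80200000 is shown as <PRIV,UE> and 0x00000000.00200000 as <UE>);
they are not taken from the processor manual, and the function names are
hypothetical.

```python
# Bit masks inferred from the AFSR values and <...> flag lists printed in the
# log above. They are assumptions, not taken from the UltraSPARC-II manual.
AFSR_FLAGS = {
    "PRIV": 0x80000000,
    "WP": 0x00800000,
    "UE": 0x00200000,
}
# The log prints AFSR.PSYND 0x0100 for AFSR ...00800100, so PSYND is assumed
# to occupy the low 16 bits.
PSYND_MASK = 0x0000FFFF


def parse_afsr(text):
    """Convert the log's 'high.low' hex form into one 64-bit integer."""
    high, low = text.split(".")
    return (int(high, 16) << 32) | int(low, 16)


def decode_afsr(text):
    """Return (sorted flag names, PSYND value) for one AFSR string."""
    value = parse_afsr(text)
    flags = sorted(name for name, mask in AFSR_FLAGS.items() if value & mask)
    return flags, value & PSYND_MASK


if __name__ == "__main__":
    # The three distinct AFSR values that appear in the log.
    for afsr in ("0x00000000.00800100",
                 "0x00000000.80200000",
                 "0x00000000.00200000"):
        flags, psynd = decode_afsr(afsr)
        print(afsr, ",".join(flags), "PSYND=0x%04x" % psynd)
```

Run against the log's values, this separates the WP event on CPU0 from the
UE events on CPU1, and shows that the last UE (the one that triggered the
reboot) no longer carries the PRIV flag.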
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


