Spontaneous crash - Diagnostic HELP

From: Brian Lucas (BLucas@accela.com)
Date: Thu Jul 15 2004 - 13:31:23 EDT


Around 10:18 this morning, my E4500 crashed without warning. It came back up
after I fsck'd a logical volume, but I am wondering what caused it to crash
in the first place. I would appreciate any help or thoughts on the cause.
I see "Uncorrectable Memory Error." Is that indicative of
faulty or failing memory? This is what I see in /var/adm/messages:

Jul 15 10:18:00 ORASTAND SUNW,UltraSPARC-II: [ID 650949 kern.warning]
WARNING: [AFT1] WP event on CPU5, errID 0x0004a34a.52cd35bf
Jul 15 10:18:00 ORASTAND AFSR 0x00000000.00800004<WP> AFAR
0x00000000.38921000
Jul 15 10:18:00 ORASTAND AFSR.PSYND 0x0004(Score 95) AFSR.ETS 0x00
Fault_PC 0x10a3ecc
Jul 15 10:18:00 ORASTAND UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000
UDBL.ESYND 0x00
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 472329 kern.warning]
WARNING: [AFT1] Uncorrectable Memory Error on CPU0 Data
 access at TL=0, errID 0x0004a34c.82a4f1a2
Jul 15 10:18:09 ORASTAND AFSR 0x00000000.80200000<PRIV,UE> AFAR
0x00000000.f6444f28
Jul 15 10:18:09 ORASTAND AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00
Fault_PC 0x1026e4c
Jul 15 10:18:09 ORASTAND UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0203<UE>
UDBL.ESYND 0x03
Jul 15 10:18:09 ORASTAND UDBL Syndrome 0x3 Memory Module Board 0 J3100
J3200 J3300 J3400 J3500 J3600 J3700 J3800
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 619031 kern.warning]
WARNING: [AFT1] errID 0x0004a34c.82a4f1a2 Syndrome 0x3
indicates that this may not be a memory module problem
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 458761 kern.info] [AFT2]
errID 0x0004a34c.82a4f1a2 PA=0x00000000.f6444f28
Jul 15 10:18:09 ORASTAND E$tag 0x00000000.1cc01ec8 E$State: Exclusive
E$parity 0x0e
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x00): 0x00000000.00000000
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x08): 0x00000000.00000000
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x10): 0x00000000.00000000
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x18): 0x00000000.00000000
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x20): 0x00000000.00000000
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2]
E$Data (0x28): 0x03020000.00466f97 *Bad* PSYND=0x00ff
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x30): 0x00000700.0116e190
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x38): 0x00000001.00000000
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 133423 kern.info] [AFT3]
errID 0x0004a34c.82a4f1a2: cannot schedule clearing
 of error on page 0x00000000.f6444000; page not in VM system
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 898897 kern.info] [AFT3]
errID 0x0004a34c.82a4f1a2 Above Error detected by protected Kernel code
Jul 15 10:18:09 ORASTAND that will try to clear error from system
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 591848 kern.warning]
WARNING: [AFT1] Uncorrectable Memory Error on CPU0 Data
 access at TL=0, errID 0x0004a34c.84e0e065
Jul 15 10:18:09 ORASTAND AFSR 0x00000000.80200000<PRIV,UE> AFAR
0x00000000.f6444f28
Jul 15 10:18:09 ORASTAND AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00
Fault_PC 0x1026e4c
Jul 15 10:18:09 ORASTAND UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0203<UE>
UDBL.ESYND 0x03
Jul 15 10:18:09 ORASTAND UDBL Syndrome 0x3 Memory Module Board 0 J3100
J3200 J3300 J3400 J3500 J3600 J3700 J3800
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 863244 kern.warning]
WARNING: [AFT1] errID 0x0004a34c.84e0e065 Syndrome 0x3
indicates that this may not be a memory module problem
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 305619 kern.info] [AFT2]
errID 0x0004a34c.84e0e065 PA=0x00000000.f6444f28
Jul 15 10:18:09 ORASTAND E$tag 0x00000000.1cc01ec8 E$State: Exclusive
E$parity 0x0e
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x00): 0x00000000.00000000
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x08): 0x00000000.00000000
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x10): 0x00000000.00000000
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x18): 0x00000000.00000000
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x20): 0x00000000.00000000
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2]
E$Data (0x28): 0x03020000.00466f97 *Bad* PSYND=0x00ff
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x30): 0x00000700.0116e190
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x38): 0x00000001.00000000
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 999220 kern.info] [AFT3]
errID 0x0004a34c.84e0e065: cannot schedule clearing
 of error on page 0x00000000.f6444000; page not in VM system
Jul 15 10:18:09 ORASTAND SUNW,UltraSPARC-II: [ID 736008 kern.info] [AFT3]
errID 0x0004a34c.84e0e065 Above Error detected by protected Kernel code
Jul 15 10:18:09 ORASTAND that will try to clear error from system
Jul 15 10:18:25 ORASTAND SUNW,UltraSPARC-II: [ID 794595 kern.warning]
WARNING: [AFT1] Uncorrectable Memory Error on CPU5 Data
 access at TL=0, errID 0x0004a350.264d8fcb
Jul 15 10:18:25 ORASTAND AFSR 0x00000000.80200000<PRIV,UE> AFAR
0x00000000.f6444f28
Jul 15 10:18:25 ORASTAND AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00
Fault_PC 0x104a1f8
Jul 15 10:18:25 ORASTAND UDBH 0x009b UDBH.ESYND 0x9b UDBL 0x0203<UE>
UDBL.ESYND 0x03
Jul 15 10:18:25 ORASTAND UDBL Syndrome 0x3 Memory Module Board 0 J3100
J3200 J3300 J3400 J3500 J3600 J3700 J3800
Jul 15 10:18:25 ORASTAND SUNW,UltraSPARC-II: [ID 819070 kern.warning]
WARNING: [AFT1] errID 0x0004a350.264d8fcb Syndrome 0x3
indicates that this may not be a memory module problem
Jul 15 10:18:25 ORASTAND SUNW,UltraSPARC-II: [ID 901539 kern.info] [AFT2]
errID 0x0004a350.264d8fcb PA=0x00000000.f6444f28
Jul 15 10:18:25 ORASTAND E$tag 0x00000000.1cc01ec8 E$State: Exclusive
E$parity 0x0e
Jul 15 10:18:25 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x00): 0x00000000.00000000
Jul 15 10:18:25 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x08): 0x00000000.00000000
Jul 15 10:18:25 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x10): 0x00000000.00000000
Jul 15 10:18:25 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x18): 0x00000000.00000000
Jul 15 10:18:25 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x20): 0x00000000.00000000
Jul 15 10:18:25 ORASTAND SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2]
E$Data (0x28): 0x03020000.00466f97 *Bad* PSYND=0x00ff
Jul 15 10:18:25 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x30): 0x00000700.0116e190
Jul 15 10:18:25 ORASTAND SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x38): 0x00000001.00000000
Jul 15 10:18:25 ORASTAND unix: [ID 836849 kern.notice]
Jul 15 10:18:25 ORASTAND ^Mpanic[cpu5]/thread=300017f57c0:
Jul 15 10:18:25 ORASTAND unix: [ID 549369 kern.notice] [AFT1] errID
0x0004a350.264d8fcb UE Error(s)
Jul 15 10:18:25 ORASTAND See previous message(s) for details
Jul 15 10:18:25 ORASTAND unix: [ID 100000 kern.notice]
Jul 15 10:18:25 ORASTAND genunix: [ID 723222 kern.notice] 000002a100365480
SUNW,UltraSPARC-II:cpu_aflt_log+5ac (2a10036558b,
1, 2a1003657b0, 10, 1177478, 11774a0)
Jul 15 10:18:25 ORASTAND genunix: [ID 179002 kern.notice] %l0-3:
0000000000000000 000002a1003656c8 0000000000000010 0000000000000003
Jul 15 10:18:25 ORASTAND %l4-7: 000002a1003657b0 0000030000431168
0000000000000000 000002a10036553e
Jul 15 10:18:25 ORASTAND genunix: [ID 723222 kern.notice] 000002a1003656d0
SUNW,UltraSPARC-II:cpu_async_error+9cc (ca03, f6444f20, 80200000, 9b,
2a1003657b0, 1438c00)
Jul 15 10:18:25 ORASTAND genunix: [ID 179002 kern.notice] %l0-3:
0000000000000001 0000000000800000 00000000014910f0 0650193680200000
Jul 15 10:18:25 ORASTAND %l4-7: 0000000000000000 00000000f6444f00
0000000000000000 0000000003280c9b
Jul 15 10:18:25 ORASTAND genunix: [ID 723222 kern.notice] 000002a1003658b0
unix:ktl0+48 (70005444ec8, 1, 70002114c00, 70005444e50, 1, 70005444e50)
Jul 15 10:18:25 ORASTAND genunix: [ID 179002 kern.notice] %l0-3:
0000000000000001 0000000000001400 0000004400001600 000000000116e288
Jul 15 10:18:25 ORASTAND %l4-7: 0000000000000000 0000000000000000
0000000000000000 000002a100365960
Jul 15 10:18:25 ORASTAND genunix: [ID 723222 kern.notice] 000002a100365a00
genunix:fsflush+3fc (1450c, 142b3d0, 1496800, 144ec00, 1000, 1dd8)
Jul 15 10:18:25 ORASTAND genunix: [ID 179002 kern.notice] %l0-3:
0000030001282d70 0000070005444ec8 0000000000012ed2 000003000411ce88
Jul 15 10:18:25 ORASTAND %l4-7: 0000000000000bb8 0000070005444ec8
000000000144ee88 0000000000000000
Jul 15 10:18:25 ORASTAND unix: [ID 100000 kern.notice]
Jul 15 10:18:25 ORASTAND genunix: [ID 672855 kern.notice] syncing file
systems...
Jul 15 10:18:26 ORASTAND genunix: [ID 733762 kern.notice] 1
Jul 15 10:18:56 ORASTAND last message repeated 20 times
Jul 15 10:18:57 ORASTAND genunix: [ID 622722 kern.notice] done (not all i/o
completed)
Jul 15 10:18:58 ORASTAND genunix: [ID 111219 kern.notice] dumping to
/dev/dsk/c3t0d0s1, offset 215220224, content: kernel
Jul 15 10:20:05 ORASTAND genunix: [ID 409368 kern.notice] ^M100% done: 43155
pages dumped, compression ratio 2.66,
Jul 15 10:20:05 ORASTAND genunix: [ID 851671 kern.notice] dump succeeded
Jul 15 10:48:56 ORASTAND genunix: [ID 540533 kern.notice] ^MSunOS Release
5.9 Version Generic_112233-11 64-bit
Jul 15 10:48:56 ORASTAND genunix: [ID 943905 kern.notice] Copyright
1983-2003 Sun Microsystems, Inc. All rights reserved.
Jul 15 10:48:56 ORASTAND Use is subject to license terms.
Jul 15 10:48:56 ORASTAND genunix: [ID 678236 kern.info] Ethernet address =
8:0:20:b7:b6:d2
Jul 15 10:48:56 ORASTAND unix: [ID 597320 kern.info] NOTICE: DR Kernel Cage
is DISABLED
Jul 15 10:48:56 ORASTAND unix: [ID 389951 kern.info] mem = 4194304K
(0x100000000)
Jul 15 10:48:56 ORASTAND unix: [ID 930857 kern.info] avail mem = 4083736576
Jul 15 10:48:56 ORASTAND rootnex: [ID 466748 kern.info] root nexus = 8-slot
Sun Enterprise E4500/E5500
Jul 15 10:48:56 ORASTAND rootnex: [ID 349649 kern.info] sbus0 at root: UPA
0x2 0x0 ...
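For anyone reading the dump above, here is a rough sketch of how the AFSR and AFAR values break down. It is not taken from Sun documentation: the bit positions are only the ones the log itself labels next to each value (<WP>, <PRIV,UE>, PSYND, ETS), and the 8K page size and 64-byte E$ line size are assumptions that happen to agree with the "page 0x00000000.f6444000" and "E$Data (0x28) *Bad*" lines. The function names are made up for illustration.

```python
# Assumed AFSR bit positions, inferred from the values and labels in the log:
#   0x80200000 is printed as <PRIV,UE>, 0x00800004 as <WP> with PSYND 0x0004.
AFSR_FLAGS = {31: "PRIV", 23: "WP", 21: "UE"}


def parse_reg(text):
    """Convert a value such as '0x00000000.80200000' to an integer.

    The dot in the log output separates the high and low 32-bit words.
    """
    hi, lo = text.lower().replace("0x", "").split(".")
    return (int(hi, 16) << 32) | int(lo, 16)


def decode_afsr(text):
    """Return the flags, ETS field, and parity syndrome seen in an AFSR value."""
    value = parse_reg(text)
    flags = [
        name
        for bit, name in sorted(AFSR_FLAGS.items(), reverse=True)
        if (value >> bit) & 1
    ]
    return {
        "flags": flags,
        "ets": (value >> 16) & 0xF,   # assumed to be bits 19:16
        "psynd": value & 0xFFFF,      # assumed to be bits 15:0
    }


def locate(afar_text, page_size=0x2000, line_size=0x40):
    """Split a physical address into its page base and offset within an E$ line.

    page_size and line_size are assumptions (8K pages, 64-byte cache lines).
    """
    pa = parse_reg(afar_text)
    return {"page": pa & ~(page_size - 1), "line_offset": pa & (line_size - 1)}


if __name__ == "__main__":
    print(decode_afsr("0x00000000.80200000"))
    print(decode_afsr("0x00000000.00800004"))
    where = locate("0x00000000.f6444f28")
    print(hex(where["page"]), hex(where["line_offset"]))
```

Run against the values in the log, the AFAR 0x00000000.f6444f28 falls on page 0xf6444000 at line offset 0x28, which is the same E$Data word the kernel marks as *Bad* in all three UE reports.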

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


