memory error message

From: Sundaram Ramasamy (sun@percipia.com)
Date: Mon Oct 11 2004 - 10:49:31 EDT


Hi all,

I am getting the following memory error message every 4 hours, and I
would like to know how serious it is. This is a production application
server, so I cannot take it down for maintenance on weekdays. So far we
have not seen any performance problems.

Server configuration:

Sun Fire 480R, 4 CPUs and 16 GB memory.

Error message:

Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 356908 kern.info] NOTICE:
[AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID
0x0071c805.80db4948
Oct 11 09:15:43 apps AFSR 0x00000002<CE>.00000182 AFAR
0x000000a1.34015f70
Oct 11 09:15:43 apps Fault_PC 0x10025294 Esynd 0x0182 Slot A: J8000
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 908886 kern.info] [AFT0]
errID 0x0071c805.80db4948 Corrected Memory Error on Slot A: J8000 is
Intermittent
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 744254 kern.info] [AFT0]
errID 0x0071c805.80db4948 Data Bit 85 was in error and corrected
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 797734 kern.info] [AFT2]
errID 0x0071c805.80db4948 E$tag PA=0x000000b0.06415f40 does not match
AFAR=0x000000a1.34015f40
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 868927 kern.info] [AFT2]
errID 0x0071c805.80db4948 PA=0x000000b0.06415f40
Oct 11 09:15:43 apps E$tag 0x000002c0.19512000 E$state_5 Exclusive
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2]
E$Data (0x00) 0x00000000.bac720c0 0xbac720e0.00000000 ECC 0x0d5
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2]
E$Data (0x10) 0x00000000.bac71f88 0x00000001.00000000 ECC 0x145
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2]
E$Data (0x20) 0x00000000.00000000 0x0507dac1.c31660d1 ECC 0x1fd
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2]
E$Data (0x30) 0x00000008.00000000 0x00000001.00000001 ECC 0x085
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 797734 kern.info] [AFT2]
errID 0x0071c805.80db4948 E$tag PA=0x000000a1.06015f40 does not match
AFAR=0x000000a1.34015f40
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 868927 kern.info] [AFT2]
errID 0x0071c805.80db4948 PA=0x000000a1.06015f40
Oct 11 09:15:43 apps E$tag 0x00000284.18000249 E$state_5 Invalid
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2]
E$Data (0x00) 0x000000b1.e80d8658 0x00000000.fb300001 ECC 0x1d2
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2]
E$Data (0x10) 0x00000300.00169700 0x00000310.054d8658 ECC 0x0c4
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2]
E$Data (0x20) 0x00000310.072785f0 0x00010000.00000004 ECC 0x103
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2]
E$Data (0x30) 0x00080008.00000000 0x800000b0.6d4a84b0 ECC 0x02f
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 929717 kern.info] [AFT2] D$
data not available
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 335345 kern.info] [AFT2] I$
data not available
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 981584 kern.info] NOTICE:
[AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID
0x0071c805.81c943f0
Oct 11 09:15:43 apps AFSR 0x00000002<CE>.00000160 AFAR
0x000000a1.34016f40
Oct 11 09:15:43 apps Fault_PC <unknown> Esynd 0x0160 Slot A: J8000
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 207292 kern.info] [AFT0]
errID 0x0071c805.81c943f0 Corrected Memory Error on Slot A: J8000 is
Persistent
Oct 11 09:15:43 apps SUNW,UltraSPARC-III+: [ID 484950 kern.info] [AFT0]
errID 0x0071c805.81c943f0 Data Bit 86 was in error and corrected
Oct 11 09:15:43 apps unix: [ID 596940 kern.warning] WARNING: [AFT0] 20143
soft errors in less than 24:00 (hh:mm) detected from Memory Module Slot A:
J8000
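
To see how often each module is reporting, the messages above can be tallied with a short script. This is only a rough sketch: the function name summarize_ce_events and the three regular expressions are my own, written against the message wording quoted in this post, so other Solaris releases or CPU modules may need different patterns. It collapses line wraps first, so it works on text pasted from an email as well as on an unwrapped log file.

```python
import re
from collections import Counter

# Patterns are based only on the message text quoted above; other Solaris
# releases or CPU modules may word these messages differently.
CE_RE = re.compile(r"Corrected Memory Error on (Slot \w+: \w+) is (\w+)")
BIT_RE = re.compile(r"Data Bit (\d+) was in error and corrected")
SOFT_RE = re.compile(
    r"(\d+) soft errors in less than (\d+:\d+) \(hh:mm\) "
    r"detected from Memory Module (Slot \w+: \w+)"
)


def summarize_ce_events(log_text):
    """Summarize corrected-memory-error messages found in log_text."""
    # Collapse line wraps so a message split across lines still matches.
    text = " ".join(log_text.split())
    return {
        # (module, classification) -> number of occurrences
        "events": Counter(CE_RE.findall(text)),
        # data bit number -> number of occurrences
        "bits": Counter(int(b) for b in BIT_RE.findall(text)),
        # one (module, error count, time window) tuple per WARNING line
        "soft_error_warnings": [
            (module, int(count), window)
            for count, window, module in SOFT_RE.findall(text)
        ],
    }


# A few lines copied from the log above, with the mail wrapping left in.
SAMPLE = """\
errID 0x0071c805.80db4948 Corrected Memory Error on Slot A: J8000 is
Intermittent
errID 0x0071c805.80db4948 Data Bit 85 was in error and corrected
errID 0x0071c805.81c943f0 Corrected Memory Error on Slot A: J8000 is
Persistent
errID 0x0071c805.81c943f0 Data Bit 86 was in error and corrected
WARNING: [AFT0] 20143
soft errors in less than 24:00 (hh:mm) detected from Memory Module Slot A:
J8000
"""

if __name__ == "__main__":
    summary = summarize_ce_events(SAMPLE)
    for (module, kind), n in summary["events"].items():
        print(f"{module}: {n} {kind} corrected error(s)")
    print("data bits:", sorted(summary["bits"]))
    for module, count, window in summary["soft_error_warnings"]:
        print(f"{module}: {count} soft errors in under {window}")
```

Reading the whole messages file into a string and passing it to the function should give per-module counts, which makes it easier to tell whether all the errors point at one DIMM (here Slot A: J8000) or at several.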

Thanks
-SR
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


