v880 memory problem

From: Thomas Carter (TCarter@memc.com)
Date: Wed May 28 2003 - 11:48:23 EDT


I have a V880 halfway around the world (in Japan) that is reporting
corrected memory errors:

May 25 08:26:31 coeha02 SUNW,UltraSPARC-III: [ID 404677 kern.info] NOTICE:
[AFT0] Corrected system bus (CE) Event detected by CPU3 at TL=0, errID
0x00185cc9.e6b46ca0
May 25 08:26:31 coeha02 AFSR 0x00000002<CE>.00000037 AFAR
0x00000040.ca397780
May 25 08:26:31 coeha02 Fault_PC 0x1009d0db4 Esynd 0x0037 Slot B:
J3200
May 25 08:26:31 coeha02 SUNW,UltraSPARC-III: [ID 897010 kern.info] [AFT0]
errID 0x00185cc9.e6b46ca0 Corrected Memory Error on Slot B: J3200 is
Persistent
May 25 08:26:31 coeha02 SUNW,UltraSPARC-III: [ID 693738 kern.info] [AFT0]
errID 0x00185cc9.e6b46ca0 Data Bit 66 was in error and corrected
May 25 08:26:31 coeha02 unix: [ID 596940 kern.warning] WARNING: [AFT0] 14
soft errors in less than 24:00 (hh:mm) detected from Memory Module Slot B:
J3200
May 25 08:26:31 coeha02 SUNW,UltraSPARC-III: [ID 587212 kern.info] [AFT2]
errID 0x00185cc9.e6b46ca0 PA=0x00000040.ca397780
May 25 08:26:31 coeha02 E$tag 0x00000081.94480000 E$state_6 Exclusive
May 25 08:26:31 coeha02 SUNW,UltraSPARC-III: [ID 895151 kern.info] [AFT2]
E$Data (0x00) 0x2d302e30.32323334 0x014e2c01.0a094d4a ECC 0x03b
May 25 08:26:31 coeha02 SUNW,UltraSPARC-III: [ID 895151 kern.info] [AFT2]
E$Data (0x10) 0x4c505244.3330300b 0x33523244.44414130 ECC 0x13a
May 25 08:26:31 coeha02 SUNW,UltraSPARC-III: [ID 895151 kern.info] [AFT2]
E$Data (0x20) 0x34303205.53465044 0x50043536.303003c2 ECC 0x00e
May 25 08:26:31 coeha02 SUNW,UltraSPARC-III: [ID 895151 kern.info] [AFT2]
E$Data (0x30) 0x04080933.52324444 0x41413034.02c10301 ECC 0x13d
May 25 08:26:31 coeha02 SUNW,UltraSPARC-III: [ID 422670 kern.info] [AFT2]
D$Tag 0x040ca397 D$state Valid D$utag 0xad D$snp 0x040ca396
May 25 08:26:31 coeha02 SUNW,UltraSPARC-III: [ID 582021 kern.info] [AFT2]
PAtag 0x040.ca397780 PAsnp 0x040.ca397780 VAutag 0x2b7780
May 25 08:26:31 coeha02 SUNW,UltraSPARC-III: [ID 842398 kern.info] [AFT2]
D$Data (0x00) 0x2d302e30.32323334 0x014e2c01.0a094d4a
May 25 08:26:31 coeha02 SUNW,UltraSPARC-III: [ID 842398 kern.info] [AFT2]
D$Data (0x10) 0x4c505244.3330300b 0x33523244.44414130
May 25 08:26:31 coeha02 SUNW,UltraSPARC-III: [ID 335345 kern.info] [AFT2]
I$ data not available
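
To keep an eye on how fast the soft-error count climbs, the [AFT0] warnings can be pulled out of the messages file with a short script. This is only a rough sketch: `soft_error_counts` is a hypothetical helper (not a Solaris tool), and the regular expression assumes the warning wording is exactly as in the excerpt above.

```python
import re

# Matches the kern.warning text seen above, e.g.
# "14 soft errors in less than 24:00 (hh:mm) detected from
#  Memory Module Slot B: J3200"
SOFT_ERR = re.compile(
    r"(\d+)\s+soft errors in less than\s+(\d+:\d+)\s+\(hh:mm\)\s+"
    r"detected from Memory Module\s+(Slot\s+\w+:\s+J\d+)"
)

def soft_error_counts(log_text):
    """Return {module: highest soft-error count reported} from syslog text."""
    # Collapse all whitespace so warnings wrapped across lines still match.
    flat = " ".join(log_text.split())
    counts = {}
    for count, _window, module in SOFT_ERR.findall(flat):
        counts[module] = max(counts.get(module, 0), int(count))
    return counts
```

Run against the excerpt above, this reports a count of 14 for "Slot B: J3200".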

And the memory configuration (from prtdiag) is this:
           Logical  Logical  Logical
      MC   Bank     Bank     Bank         DIMM    Interleave  Interleaved
 Brd  ID   num      size     Status       Size    Factor      with
----  ---  ----     ------   -----------  ------  ----------  -----------
  A    0     0      512MB    no_status    256MB     8-way        0
  A    0     1      512MB    no_status    256MB     8-way        0
  A    0     2      512MB    no_status    256MB     8-way        0
  A    0     3      512MB    no_status    256MB     8-way        0
  B    1     0      512MB    no_status    256MB     8-way        1
  B    1     1      512MB    no_status    256MB     8-way        1
  B    1     2      512MB    no_status    256MB     8-way        1
  B    1     3      512MB    no_status    256MB     8-way        1
  A    2     0      512MB    no_status    256MB     8-way        0
  A    2     1      512MB    no_status    256MB     8-way        0
  A    2     2      512MB    no_status    256MB     8-way        0
  A    2     3      512MB    no_status    256MB     8-way        0
  B    3     0      512MB    no_status    256MB     8-way        1
  B    3     1      512MB    no_status    256MB     8-way        1
  B    3     2      512MB    no_status    256MB     8-way        1
  B    3     3      512MB    no_status    256MB     8-way        1
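
For reference, the logical bank sizes in this listing can be totalled per board with a few lines of Python. This is a sketch under assumptions: `bank_totals_mb` is a hypothetical helper, and it assumes every data row has the eight whitespace-separated fields shown above, with the bank size given in MB.

```python
def bank_totals_mb(prtdiag_rows):
    """Sum the logical bank sizes (in MB) per board from prtdiag memory rows."""
    totals = {}
    for row in prtdiag_rows.splitlines():
        fields = row.split()
        # Data rows: board, MC ID, bank num, bank size, status,
        # DIMM size, interleave factor, interleaved-with.
        # Header and separator lines fail this check and are skipped.
        if len(fields) != 8 or not fields[3].endswith("MB"):
            continue
        board = fields[0]
        size_mb = int(fields[3][:-2])
        totals[board] = totals.get(board, 0) + size_mb
    return totals
```

For the listing above this gives 4096 MB on each of boards A and B, so 8 GB in total.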

Is there a way to disable the affected memory module (Slot B: J3200) in
Solaris until we have time to shut down the machine and swap it?

Thanks,
Thomas Carter
MEMC Southwest
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


