Disabling failing memory on a 4800

From: Tim Kirby (trk@cray.com)
Date: Wed Jul 16 2003 - 13:35:19 EDT


I have a remote 4800 system that I'm babysitting while "dad" is
on vacation, so I do not have any of the documentation on site.

Said 4800 is clocking correctable errors on a DIMM and also lost a
CPU. We took the box down last night, and the remote hardware man
replaced the failed CPU and moved the DIMM. We now have all 8 CPUs
again... but the DIMM is still failing and filling up every log with
verbose messages. I used to work on the CS6400 line (precursor to the
E10K) and know something of the capabilities of DR as they used to be.

Digging in SunSolve has not been very helpful, so I'm going to try
the list.

Is there any way I can successfully "blacklist" some part of memory
on the fly while we wait for the replacement memory to arrive? We
really don't want to take another downtime. My understanding is this
box should be DR'able. I understand this would potentially make a
mess of the interleaving... but I'd like to know if this is possible.

Thanks

Tim

-- 
Tim Kirby                                       651-605-9074
trk@cray.com                   Cray Inc. Information Systems