[HPADM] SUMMARY: Memory checksum Errors in rp2470

From: Joe Brancaleone (joe@nitech.com)
Date: Tue Jan 13 2004 - 18:57:50 EST


SUMMARY:

I finally got somewhere with the memory vendor on this one. They took a
long time to get back to me about these errors, and reportedly even
worked with their HP contact on it. They say this must be an SPD
programming issue. SPD (Serial Presence Detect) is a small area of
information on each memory module that the system firmware reads to
recognize what kind of module it is, etc. The checksum error most
likely means the system is looking for certain bytes that are not
actually where it expects them to be. The memory vendor needs to
upgrade the memory modules' SPD to verify this. I will post a follow-up
if the upgrade is unsuccessful and the problem turns out to be the
memory carrier or somewhere else in the memory path.
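
For anyone who wants to see what an SPD checksum test looks like, here
is a minimal Python sketch. It assumes the modules follow the common
JEDEC SDRAM SPD layout, where byte 63 holds the low 8 bits of the sum
of bytes 0-62. I do not know that the rp2470 firmware checks exactly
this; HP may use its own fields. The function names and the sample
bytes are made up for illustration and were not read from a real
module.

```python
def spd_checksum_ok(spd: bytes) -> bool:
    """Check an SPD image against the assumed JEDEC SDRAM rule:
    byte 63 == (sum of bytes 0..62) modulo 256."""
    if len(spd) < 64:
        raise ValueError("need at least the first 64 SPD bytes")
    return (sum(spd[:63]) & 0xFF) == spd[63]


def make_demo_spd() -> bytes:
    """Build a hypothetical 64-byte SPD image with a valid checksum.
    The contents are placeholders, not data from a real DIMM."""
    body = bytes(range(63))
    return body + bytes([sum(body) & 0xFF])


if __name__ == "__main__":
    good = make_demo_spd()
    print("valid image passes:", spd_checksum_ok(good))

    # Flip one bit in a data byte without updating byte 63,
    # which is roughly what a misprogrammed SPD would look like.
    bad = bytearray(good)
    bad[2] ^= 0x01
    print("altered image passes:", spd_checksum_ok(bytes(bad)))
```

A module can store and return data perfectly well and still fail a
check like this, which would fit what I saw: the OS installed and
reported the right amount of memory while POST kept complaining.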

Here are two of the initial responses I got from the list:

Bill Hassell:

  This may be due to timing errors, possibly due to a problem in
  the memory carriers or controller.

  Memory is quite tricky in high-speed systems like the rp-series.
  Using HP-UX, even fully loaded, will not match the selftests in
  terms of corner cases. The selftests are telling you that memory
  is not reliable. The RAM itself may be fine--there may be some other
  problem in the memory path. But don't ignore the error. You'll
  soon have system crashes.

Eef Hartman:

  Could be they're non-parity or non-ECC memory modules, while the
  self-test is testing those "extra bits". That would mean that while
  operations are normal, any memory faults will NOT be detected
  (parity) and/or CORRECTED (ECC).

  I myself have never used anything other than the HP-recommended kind
  of memory in any HP server, especially not the "critical" ones.

Thanks again for everyone's input.

joe

ORIGINAL POST:

Joe Brancaleone wrote:

> Dear list,
>
> I'm getting memory checksum errors while booting up an rp2470 server.
> The messages are below, which pop up on the console during the memory
> testing phase of POST (particularly after the "memory config 7214"
> lines). I can't seem to find any documentation explaining the reasons
> why I'd get "checksum errors" with memory that apparently works ok.
> - Is this a critical error because of faulty memory modules? I tried
> switching around the modules but it did not change when the errors
> appear. (always comes after the 7214 lines)
> - Or is this really a "non-critical" error because they are
> third-party modules? Installing and booting the OS went fine. The OS
> reported the correct amount of memory, etc.
>
> Thanks in advance.
>
>
> ***** EARLY BOOT VFP : SYSTEM ALERT *****
>
> SYSTEM NAME: unknown
>
> DATE: 12/10/2003 TIME: 19:35:36
>
> ALERT LEVEL: 6 = Boot possible, pending failure - action required
>
> REASON FOR ALERT
>
> SOURCE: 7 = memory
>
> SOURCE DETAIL: 4 = SIMM or DIMM SOURCE ID: FF
>
> PROBLEM DETAIL: C = checksum error
>
>
> LEDs: RUN ATTENTION FAULT REMOTE POWER
>
> FLASH FLASH OFF OFF ON
>
> LED State: Running non-OS code. Non-critical error detected.
>
> Check Chassis and Console Logs for error messages.
>
>
>
> 0x2000006C74FF21C2 0000FF00 0001FF74 - type 4 = Physical Location
>
> 0x5800086C74FF21C2 0000670B 0A132324 - type 11 = Timestamp 12/10/2003
> 19:35:36
>
> A/a: ack read of this entry - Q/q: quit Virtual Front Panel Display
>
> Anything else redisplay the log entry
>
> ->Choice:a
>
> *****************************************
>
> memory config 7213
>
> memory config 7214
>
>
>
> ***** EARLY BOOT VFP : SYSTEM ALERT *****
>
> SYSTEM NAME: unknown
>
> DATE: 12/10/2003 TIME: 19:35:38
>
> ALERT LEVEL: 6 = Boot possible, pending failure - action required
>
>
> REASON FOR ALERT
>
> SOURCE: 7 = memory
>
> SOURCE DETAIL: 4 = SIMM or DIMM SOURCE ID: FF
>
> PROBLEM DETAIL: C = checksum error
>
>
> LEDs: RUN ATTENTION FAULT REMOTE POWER
>
> FLASH FLASH OFF OFF ON
>
> LED State: Running non-OS code. Non-critical error detected.
>
> Check Chassis and Console Logs for error messages.
>
>
>
> 0x2000006C74FF21C2 0000FF00 0002FF74 - type 4 = Physical Location
>
> 0x5800086C74FF21C2 0000670B 0A132326 - type 11 = Timestamp 12/10/2003
> 19:35:38
>
> A/a: ack read of this entry - Q/q: quit Virtual Front Panel Display
>
> Anything else redisplay the log entry
>
> ->Choice:a
>
> *****************************************
>
> memory config 7213
>
> memory config 7214
>
>
>
> ***** EARLY BOOT VFP : SYSTEM ALERT *****
>
> SYSTEM NAME: unknown
>
> DATE: 12/10/2003 TIME: 19:35:41
>
> ALERT LEVEL: 6 = Boot possible, pending failure - action required
>
>
> REASON FOR ALERT
>
> SOURCE: 7 = memory
>
> SOURCE DETAIL: 4 = SIMM or DIMM SOURCE ID: FF
>
> PROBLEM DETAIL: C = checksum error
>
>
> LEDs: RUN ATTENTION FAULT REMOTE POWER
>
> FLASH FLASH OFF OFF ON
>
> LED State: Running non-OS code. Non-critical error detected.
>
> Check Chassis and Console Logs for error messages.
>
>
>
> 0x2000006C74FF21C2 0000FF00 0003FF74 - type 4 = Physical Location
>
> 0x5800086C74FF21C2 0000670B 0A132329 - type 11 = Timestamp 12/10/2003
> 19:35:41
>
> A/a: ack read of this entry - Q/q: quit Virtual Front Panel Display
>
> Anything else redisplay the log entry
>
> ->Choice:a
>
> *****************************************
>
>
>
> *****************************************
>
>
>
> ************ EARLY BOOT VFP *************
>
> End of early boot detected
>
> *****************************************
>
>
>
> ************* SYSTEM ALERT **************
>
> SYSTEM NAME: unknown
>
> DATE: 12/10/2003 TIME: 19:35:43
>
> ALERT LEVEL: 6 = Boot possible, pending failure - action required
>
>
> REASON FOR ALERT
>
> SOURCE: 7 = memory
>
> SOURCE DETAIL: 4 = SIMM or DIMM SOURCE ID: FF
>
> PROBLEM DETAIL: C = checksum error
>
>
> LEDs: RUN ATTENTION FAULT REMOTE POWER
>
> FLASH FLASH OFF OFF ON
>
> LED State: Running non-OS code. Non-critical error detected.
>
> Check Chassis and Console Logs for error messages.
>
>
>
> 0x2000006C74FF21C2 0000FF00 0004FF74 - type 4 = Physical Location
>
> 0x5800086C74FF21C2 0000670B 0A13232B - type 11 = Timestamp 12/10/2003
> 19:35:43
>
> A: ack read of this entry - X: Disable all future alert messages
>
> Anything else skip redisplay the log entry
>
> ->Choice:a
>
>
> regards,
> joe
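
A side note on the raw chassis log words quoted above: the "type 11 =
Timestamp" entries appear to pack the date and time into the last six
bytes of the two trailing words. This layout is inferred only from the
four entries in this post (each one matches the DATE/TIME shown in its
alert), not from HP documentation, so treat it as an assumption. The
function name below is hypothetical.

```python
from datetime import datetime


def decode_timestamp(word_a: str, word_b: str) -> datetime:
    """Decode the two hex words that follow a type 11 log entry.

    Assumed byte layout (inferred from the log in this post):
      byte 2 = year - 1900, byte 3 = month - 1, byte 4 = day,
      byte 5 = hour, byte 6 = minute, byte 7 = second.
    Bytes 0 and 1 are ignored because their meaning is unknown.
    """
    raw = bytes.fromhex(word_a + word_b)
    if len(raw) != 8:
        raise ValueError("expected two 32-bit hex words")
    return datetime(1900 + raw[2], raw[3] + 1, raw[4],
                    raw[5], raw[6], raw[7])


if __name__ == "__main__":
    # Words copied from the first and last alerts in the post.
    print(decode_timestamp("0000670B", "0A132324"))
    print(decode_timestamp("0000670B", "0A13232B"))
```

Decoding the words this way is handy when only the hex lines end up in
a saved console log and the formatted DATE/TIME header is missing.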

-- 
Joe Brancaleone
Support Technician
_______________
Nitech
35 Musick
Irvine Ca. 92618
Phn: 1-888-264-8324
Fax: 949-586-5234
Email: joe@nitech.com
http://www.nitech.com