SUMMARY - E450 Worrying Message

From: Nick Pettefar (Nick@Pettefar.com)
Date: Wed Nov 17 2004 - 12:43:46 EST


Hi list,

Thanks for the really great info.

I took people's advice (listed below, duplications not shown) and verified
that I actually do have a CPU "missing in action" from my E450.

I now have the machine in my office (it's rather loud!) and I'm attempting to
fault-find the problem. If I can't sort it out I'll log a call with Sun.

Cheers! (But not to the rude people and the vacationers).

Nick

> Failed Field Replaceable Units (FRU) in System:
> ==============================================
> SUNW,UltraSPARC-II unavailable :
> PROM fault string: fail-Disabled by Command
The FRU has been deliberately failed by a PROM command. (Perhaps it was
flaky - re-enable it and see.)

> Failed Field Replaceable Unit is UltraSPARC module Module 3

That FRU being processor 3. (Assuming they count from 0, the last one in
the machine.)

That's a hardware issue (one of the CPU modules). If the machine otherwise
boots, running "prtdiag -v" from a shell will give you more information.
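
If the full "prtdiag -v" report is long, the failed-FRU part can be pulled
out on its own. A minimal sketch, assuming the report contains a "Failed
Field Replaceable Units" header like the one quoted above; the function
name failed_frus is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print everything from the failed-FRU header to the
# end of the text read from stdin. Prints nothing if no such header exists.
failed_frus() {
    sed -n '/Failed Field Replaceable Unit/,$p'
}

# On the E450 itself (run as root):
#   /usr/platform/sun4u/sbin/prtdiag -v | failed_frus
```

Empty output would suggest that prtdiag is not reporting any failed FRU.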

You might want to check the output of "psrinfo" to verify how many CPUs
are active/online in your system. It sounds like one CPU may be faulty or
offline. If it is failed hardware, shut down and reseat the CPU on the
mainboard to confirm, or play "musical chairs" and rearrange the CPUs to
rule out other hardware issues such as the CPU slot or the support hardware
upstream. For example, sample output from an E450 here (with 4x400MHz CPUs):

root@e450# psrinfo
0 on-line since 11/12/04 22:00:44
1 on-line since 11/12/04 22:00:47
2 on-line since 11/12/04 22:00:47
3 on-line since 11/12/04 22:00:47

and/or

root@e450# psrinfo -v

Status of processor 0 as of: 11/17/04 10:21:58
  Processor has been on-line since 11/12/04 22:00:44.
  The sparcv9 processor operates at 400 MHz,
  and has a sparcv9 floating point processor.
Status of processor 1 as of: 11/17/04 10:21:58
  Processor has been on-line since 11/12/04 22:00:47.
  The sparcv9 processor operates at 400 MHz,
  and has a sparcv9 floating point processor.
Status of processor 2 as of: 11/17/04 10:21:58
  Processor has been on-line since 11/12/04 22:00:47.
  The sparcv9 processor operates at 400 MHz,
  and has a sparcv9 floating point processor.
Status of processor 3 as of: 11/17/04 10:21:58
  Processor has been on-line since 11/12/04 22:00:47.
  The sparcv9 processor operates at 400 MHz,
  and has a sparcv9 floating point processor.
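
To spot a missing CPU without counting lines by eye, the plain "psrinfo"
output can be checked against the number of modules you expect. This is
only a sketch: the function name missing_cpus is made up, it assumes
processor IDs run from 0 upwards as in the sample output, and it needs a
POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk).

```shell
#!/bin/sh
# Hypothetical helper: read plain `psrinfo` output on stdin and print each
# processor ID in 0..(expected-1) that is not listed as on-line.
missing_cpus() {
    expected=$1
    awk -v expected="$expected" '
        $2 == "on-line" { online[$1] = 1 }   # remember every on-line ID
        END {
            for (id = 0; id < expected; id++)
                if (!(id in online)) printf "%d\n", id
        }'
}

# On a 4-CPU E450:
#   psrinfo | missing_cpus 4
```

A processor that the PROM has disabled may not be listed by psrinfo at all,
which is why the check works from the expected count rather than looking
for an "off-line" entry.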

Looks like you just lost the CPU in slot 3. What does prtdiag -v say?

This means you should probably reboot the system with the eeprom variables
diag-level=max and diag-switch?=true. Something has failed or is failing.
You can also run /usr/platform/sun4u/sbin/prtdiag -v from the root prompt.
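
The two diagnostic variables can be set from a running system with
eeprom(1M) before the reboot. A minimal sketch, to be run as root; the
function name enable_max_diag and the EEPROM_CMD override are made up for
illustration (the override only exists so the commands can be previewed
with echo):

```shell
#!/bin/sh
# Hypothetical helper: request full power-on diagnostics on the next boot.
enable_max_diag() {
    # EEPROM_CMD is an illustrative override; it defaults to the real eeprom.
    cmd=${EEPROM_CMD:-eeprom}
    # Quote the settings so the shell does not treat "?" as a glob character.
    "$cmd" 'diag-switch?=true' && "$cmd" 'diag-level=max'
}

# Preview only:   EEPROM_CMD=echo; enable_max_diag
# For real:       enable_max_diag   (then reboot and watch the console)
```

The same settings can be made at the ok prompt with "setenv diag-switch? true"
and "setenv diag-level max". Full diagnostics lengthen the boot noticeably,
so you may want to set diag-switch? back to false once the fault is found.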
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


