Bricked Sun Fire T2000 -- does not boot any Solaris 10 DVDs

From: Jeff Woolsey (jlw+sun@jlw.com)
Date: Thu Nov 08 2007 - 13:42:40 EST


My client has a Sun Fire T200[0] that evidently escaped a lab at Sun before the
product was released. We haven't determined whether this machine is on any Sun
support yet (I'd bet against it). It had been running Solaris 10 3/05 HW2 for
about 250 days. Then someone wanted patch 119826-02 applied, which is obsoleted
by 120011, which requires 118833, which requires a reconfiguration reboot before
installing any further patches. Having done that, now it fails booting s10hw2
from local disk with

not found: hsvc_register
not found: cpu_setup_common
do_relocations: /platform/sun4v/kernel/cpu/sparcv9/SUNW,UltraSPARC-T1
do_relocate failed
krtld: error during initial load/link phase
panic - boot: exitto64 returned from client program
Program terminated

It also fails booting the Solaris 10 6/06 DVD and fails netbooting s10u1,
s10u2, s10u3, and s11sb64, all with

root nexus = Sun Fire T200
pseudo0 at root
pseudo0 is /pseudo
scsi_vhci0 at root
scsi_vhci0 is /scsi_vhci
pseudo-device: dld0
dld0 is /pseudo/dld@0

SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
EVENT-TIME: 0x473122d8.0x20c81d5c (0x4de29d1504)
PLATFORM: SUNW,Sun-Fire-T200, CSN: -, HOSTNAME:
SOURCE: SunOS, REV: 5.10 Generic_118833-17
DESC: Errors have been detected that require a reboot to ensure system
integrity. See http://www.sun.com/msg/SUNOS-8000-0G for more information.
AUTO-RESPONSE: Solaris will attempt to save and diagnose the error telemetry
IMPACT: The system will sync files, save a crash dump if needed, and reboot
REC-ACTION: Save the error summary below in case telemetry cannot be saved

panic[cpu0]/thread=180e000: Fatal System Bus Error has occurred
[and a traceback ending near vfs_mountroot]
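
(An aside on the EVENT-TIME line above: it looks like hex seconds.nanoseconds since the Unix epoch, followed by a high-resolution tick count in parentheses. That reading is my assumption, not something I have confirmed against documentation. Under it, a minimal Python sketch to decode the field, with a made-up helper name, would be:)

```python
from datetime import datetime, timezone

def decode_event_time(field: str) -> datetime:
    """Decode '0x<sec>.0x<nsec> (0x<ticks>)' into a UTC datetime.

    Assumes the first token is hex seconds and hex nanoseconds since
    the Unix epoch; the parenthesized tick count is ignored.
    """
    tod = field.split()[0]                  # '0x473122d8.0x20c81d5c'
    sec_hex, nsec_hex = tod.split(".")
    seconds = int(sec_hex, 16)              # int() accepts the 0x prefix in base 16
    nanoseconds = int(nsec_hex, 16)
    stamp = datetime.fromtimestamp(seconds, tz=timezone.utc)
    # datetime only carries microseconds, so the nanoseconds are truncated
    return stamp.replace(microsecond=nanoseconds // 1000)

print(decode_event_time("0x473122d8.0x20c81d5c (0x4de29d1504)").isoformat())
# -> 2007-11-07T02:28:40.549985+00:00
```

(If the assumption holds, the panic was logged within a day or two of this post, so the system clock at least appears sane.)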

It fails to run max diagnostics (OBP never notices, apparently), the kind that
Enterprise systems would take almost an hour to do.

It is way downrev on everything, pre-release:
sc> showhost
Host flash versions:
   Reset V1.0.0.build_09d
   Sequencer V1.0.0.build_09d
   HV:1.0.0_15c:FIREBALL:firmware.unknown::200508041350:firmware_re
   OBP Ontario build_15c ***PROTOTYPE BUILD*** 2005/08/05 10:46
[firmware obp4.x-same-as-15 #0]
   MPT SAS FCode Version 1.00.33 (2005.01.20)>R
    ONTARIO Integrated POST 4.x.0.build_15c 2005/08/04 13:55
sc>

The flashupdate command in this early ALOM fails to load any new
firmware (complains of bad S-record on the what line and aborts the
FTP pretty early). The SC firmware is so old that it does not have a bootmode
command.

It even failed ALOM POST once, early on:

Boot progress:
  fpga_init...

machine check
Exception current instruction address: 0x000cc8d4
Machine Status Register: 0x00009030
Data Access Register: 0x03c0a109
Condition Register: 0x20000044
Data storage interrupt Register: 0x0000040b
Task: 0x1fffe00 "tRootTask"^G

ALOM - POST run incomplete previously, no POST this time

We pulled it out of the rack far enough to find a serial number (before
discovering that ALOM could do that) and did not see any obnoxious yellow
sticker saying that Solaris was preinstalled on this machine, but it must have
been. It's not clear that this box was ever capable of booting anything other
than HW2 from local disk. I wasn't here 250 days ago.

I've run out of magic; it will need attention from Sun before it is useful
again, as far as I can tell. (I have all of the console output in a file for
reference; a 24x80 serial CRT terminal doesn't cut it any more--even the ALOM
help is too long.)

Does anybody here have any magic that I haven't tried yet?

Jeff Woolsey
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


