Sun Blade 100 - strange behavior after firmware update.

From: Scott Mickey (mickey@denver.net)
Date: Fri Aug 18 2006 - 14:52:37 EDT


Sun Managers,
 
I updated the firmware on a Sun Blade 100, and now after
exactly 15 minutes with the system idle, it drops to the
ok prompt with these messages:
 
> RED State Exception
> ERROR: error-reset-cleanup: Externally Initiated Reset has occurred.
> ERROR: Last Trap: Externally Initiated Reset
 
If booted in single-user mode, or if the system is kept busy,
this never happens. The system stays up indefinitely.
 
Solaris 10 01/06 and Solaris 9 09/04 both install without
error (as the machine is kept busy). However, once the OS
installation is complete and the machine goes idle, the
'RED State Exception' happens 15 minutes later and it
drops to the ok prompt.
 
Background info:
This machine was very reliable and trouble-free with the
original OBP firmware, version 4.0.45. It ran Solaris 9,
headless (no USB keyboard or mouse, no monitor), with 2x
80GB IDE disks, primarily as a JumpStart and Samba server.
It sat idle nights and weekends, and was sometimes extremely
busy during work days. Never a crash, no errors, no problems.
A good little machine.
 
Upgraded to OBP firmware 4.17.1 using Sun patch 119235-01,
dated Apr/29/2005. Installed Solaris 10 from DVD without
error, but then 'RED State Exception' happened.
 
Downgraded OBP firmware back to 4.0.45 using patch 111179-01,
and reinstalled Solaris 9, but the 'RED State Exception'
problem remained, again only after 15 minutes of system
inactivity at run level 3 or run level 2.
 
Using parts from another Sun Blade 100, I swapped the memory,
then the CPU, then the IDPROM chip, and then the power supply.
The problem remained. I put the mainboard (Sun p/n 375-0096)
into another Sun Blade 100 chassis (this one had just one
10 GB IDE drive), and did a Solaris 9 install. The problem
remained. The problem is on the mainboard, but it is
NOT random. I can predict to within 30 seconds when the
'RED State Exception' will occur, by running this script
in an ssh window immediately after boot:
 
$ cat show_uptime
#!/bin/sh -
while :
do
  uptime
  sleep 60
done
 
Here is the output:
$ ./show_uptime
 4:18pm up 1 min(s), 1 user, load average: 0.35, 0.15, 0.06
 4:19pm up 2 min(s), 1 user, load average: 0.14, 0.13, 0.05
 4:20pm up 3 min(s), 1 user, load average: 0.05, 0.11, 0.05
 4:21pm up 4 min(s), 1 user, load average: 0.02, 0.09, 0.05
 4:22pm up 5 min(s), 1 user, load average: 0.01, 0.07, 0.05
 4:23pm up 6 min(s), 1 user, load average: 0.00, 0.06, 0.04
 4:24pm up 7 min(s), 1 user, load average: 0.00, 0.05, 0.04
 4:25pm up 8 min(s), 1 user, load average: 0.00, 0.04, 0.04
 4:26pm up 9 min(s), 1 user, load average: 0.00, 0.03, 0.04
 4:27pm up 10 min(s), 1 user, load average: 0.00, 0.03, 0.04
 4:28pm up 11 min(s), 1 user, load average: 0.00, 0.02, 0.03
 4:29pm up 12 min(s), 1 user, load average: 0.00, 0.02, 0.03
 4:30pm up 13 min(s), 1 user, load average: 0.00, 0.02, 0.03
 4:31pm up 14 min(s), 1 user, load average: 0.00, 0.01, 0.03
 4:32pm up 15 min(s), 1 user, load average: 0.00, 0.01, 0.03
(Then RED State Exception and drops to ok prompt).
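
Because the ssh session dies with the reset, the last reading
is easy to lose. A variation on the script above, offered only
as a sketch: it appends each timestamped reading to a file and
calls sync, so the final entry survives the drop to the ok
prompt. The function name log_uptime and the path
/var/tmp/uptime.log are assumptions, not part of the original
script. Note that the extra disk write each minute is itself
activity and could change the behavior being measured.

```shell
#!/bin/sh
# log_uptime LOGFILE COUNT INTERVAL
# Hypothetical helper (not part of the original script): appends COUNT
# timestamped uptime readings to LOGFILE, INTERVAL seconds apart, and
# calls sync after each one so the last reading is flushed to disk.
log_uptime() {
    logfile=$1
    count=$2
    interval=$3
    i=0
    while [ "$i" -lt "$count" ]
    do
        now=`date '+%Y-%m-%d %H:%M:%S'`
        up=`uptime`
        echo "$now $up" >> "$logfile"
        sync
        i=`expr "$i" + 1`
        # Skip the sleep after the final reading.
        if [ "$i" -lt "$count" ]
        then
            sleep "$interval"
        fi
    done
}

# Example (assumed path and values): 20 readings, one per minute.
# log_uptime /var/tmp/uptime.log 20 60
```

After the next boot, the last line of the log file gives the
time of the final reading before the reset.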
 
In single-user mode, the system runs fine:
# uptime
 6:34pm up 17:42, 0 users, load average: 0.00, 0.00, 0.01
 
Or if I open a second ssh window and run this script,
it runs fine:
$ cat find_usr
#!/bin/sh -
while :
do
    find /usr -print
    sleep 5
done
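
The loop above never lets the machine sit idle for more than
5 seconds. One possible follow-up experiment, sketched here
under assumptions, is to lengthen the gap between find runs
and see whether the 15-minute figure behaves like an idle
timer that activity resets. The function name busy_every and
the example values are hypothetical.

```shell
#!/bin/sh
# busy_every INTERVAL ROUNDS [DIR]
# Hypothetical helper: runs find over DIR (default /usr) ROUNDS times,
# sleeping INTERVAL seconds between runs, and prints one timestamped
# line per completed pass.
busy_every() {
    interval=$1
    rounds=$2
    dir=${3:-/usr}
    n=0
    while [ "$n" -lt "$rounds" ]
    do
        # Discard the listing; ignore find's exit status, since
        # unreadable directories make it return non-zero.
        find "$dir" -print > /dev/null 2>&1 || :
        n=`expr "$n" + 1`
        now=`date '+%H:%M:%S'`
        echo "$now find pass $n done"
        # Skip the sleep after the final pass.
        if [ "$n" -lt "$rounds" ]
        then
            sleep "$interval"
        fi
    done
}

# Example (assumed values): one pass every 10 minutes for 2 hours.
# busy_every 600 12
```

If the machine stays up with a 10-minute gap but resets with
a 20-minute gap, that would suggest an idle timeout rather
than a fixed time since boot.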
 
I need to be honest and admit that neither Sun Blade 100
has Sun-branded memory or Sun-branded hard disks.
However, this isn't an enterprise-class machine by any
stretch or measure, so that should not be a factor.
The memory is good memory, as are the disks.
I guess I could do another OBP firmware upgrade on
another Sun Blade 100 to see if this is a repeatable
error, but then I might have two useless Sun Blade 100s.
 
Doing an OBP firmware upgrade and OS reinstall is a very
common procedure. I'm sure someone out there must have
seen this problem also. I know this machine is a FRU,
but I would like to get it working again, rather than
throw it in the recycle bin. I look forward to your
emails, with accounts of successful and unsuccessful
Sun Blade 100 OBP firmware updates. Thanks!
 
Oh, and why did I do an OBP firmware update in the first
place? I wanted to try out the OBP 'wanboot' feature,
available only in OBP versions 4.17 and above.
 
Also, if someone at Sun Microsystems could please forward
this to the person or persons in charge of OBP firmware
for the Sun Blade 100/150 series, I would really appreciate
it.
 
Scott Mickey


