Ultra10 Problems - Persistent Clock "drift" ; MAC clobbered once ; etc.

From: Tim Chipman (chipman@ecopiabio.com)
Date: Tue Sep 10 2002 - 10:19:45 EDT


Hi Folks,

I've got a problem with an Ultra-10 which I suspect will be a "classic
typical problem" but alas I can't find anything in google, sunsolve, or
the list archives.

Basic problem: This is a machine in an academic environment that isn't
really being maintained (purchased by a small math dept). I volunteered
to look at it briefly while home during summer holiday, since they were
getting ready to throw it in the dumpster (or the like). It was just out
of warranty by a few months, not under service from Sun (of course).
Never patched, running a vanilla Solaris 8 install.

When I got to the machine, it managed to get up to the OK prompt,
displayed an apparently bogus MAC address, wasn't able to boot the disk
by itself (as it had originally been doing, of course) and was trying
to "boot net" (a fallback option in cases like this, I gather). I was
able to stop this ; "boot disk" ; get the machine up to single-user ;
fsck slices ; enabled logging on slices (no UPS so frequent power bumps
cause much filesystem grief) ; ran the most recent recommended patch
cluster ; reset the MAC address & hostID info @ the OK-prompt .. and it
was "apparently" "OK" after this. [yes, I did reboot as needed between
various steps described in rapid succession here :-) ]

A week later, connected remotely to do a bit of tidy-up / lockdown on
the thing, I noted the clock was desperately off (claiming it was the
year 2038, I think?). I reset this via "ntpdate", rebooted the machine
to ensure no odd lingering unpleasant side-effect from this rather
drastic time-warp, and it was back up "OK" ?

Next day, a remote connection showed that the time was drifting rapidly /
erratically. Sometimes, after doing an "ntpdate", it will be ~5 years off
in 5 minutes. Other times, it will drift much less rapidly (right now it
is off by approximately one week, which is almost, but not quite, how
long ago I last reset the time on the machine).
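
To put a number on the drift rather than eyeballing it, something like
the following could be used. This is only a minimal sketch: it assumes
the clock error is noted by hand at two points in time (for example from
the offset shown by "ntpdate -q", which queries without setting the
clock), and the helper name drift_rate is made up for illustration.

```shell
#!/bin/sh
# drift_rate OFFSET1 T1 OFFSET2 T2   (hypothetical helper, not a Solaris tool)
# OFFSET1/OFFSET2: clock error in seconds at each sample.
# T1/T2: reference time in seconds when each sample was taken.
# Prints the error gained, in seconds per hour of real time.
drift_rate() {
    awk -v o1="$1" -v t1="$2" -v o2="$3" -v t2="$4" 'BEGIN {
        # Two samples taken at the same instant give no rate.
        if (t2 == t1) { print "undefined"; exit 1 }
        printf "%.1f\n", (o2 - o1) * 3600 / (t2 - t1)
    }'
}

# Example with made-up numbers: correct at t=0, 60 s off one hour later.
drift_rate 0 0 60 3600    # prints 60.0
```

A rate that swings wildly between samples would match the erratic
behaviour described above; a steady rate would point more towards a
clock that is simply running slow or fast.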

Added icing on the cake, it doesn't reboot via a "shutdown -y -g0 -i5"
type command - it just doesn't shut down. Only "sync ; sync ; halt"
appears to do the trick (insane) at this point. And, with the most
recent reboot, it started the nonsense of failing to "boot disk"
automatically (although it hasn't lost its MAC address this time).

All this to ask:

-> Does this sound like hardware failure of an easily replaced
component? not-easily replaced component?

-> any comments, suggestions?

-> aside from these "little details", the machine stays up & running
"just fine" (unless power-bump issues bring it down, doh!). I've
recommended they buy a decent UPS to keep power issues at a minimum, but
they aren't too keen to spend money on what they suspect is a
terminally ill box. Especially since this "high power workstation" which
set them back a reasonable chunk of grant $$$ could easily be 4(+)-fold
outclassed now for 1/5 the price by a linux box as its successor. (they
bought this thing fully-loaded with Sun-Ram, a pair of `large gourmet
Sun-branded IDE drives', so despite the academic discount they paid a
premium for this rig).

Any comments or suggestions are certainly greatly appreciated...

Thanks,

--Tim Chipman