mysterious reboot

From: Seth Rothenberg (SROTHENB@montefiore.org)
Date: Tue Nov 30 2004 - 16:10:50 EST


Greetings,
We had a mysterious reboot last night.

This morning, I found that my development box (E4500) was not
responding.
I logged in to prod and went through the terminal server to the
console;
after a few minutes, it reported "the system is coming up"

System is Solaris 2.6

The system came up
-The last message in /var/adm/messages was at 23:11, which matches the
file's last-modified time.
-At 9:00 am, I ran uptime, and it reported "up, 9:45" -
That's weird...true, it was 9:45 since the last boot *started*, but it
took 9:42 to boot!
-prtdiag -v reported no hardware problems, no parts missing, 6 CPUs,
6GB memory
-df -k reports root partition has 300MB available
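
To sanity-check the uptime arithmetic above, here is a rough sketch in
Python. It assumes the "9:00 am" and "up, 9:45" figures are exact (they
are only approximate) and takes the calendar dates from this message's
date (Nov 30 2004, so "last night" is Nov 29). The function name
boot_start is just an illustrative helper.

```python
from datetime import datetime, timedelta

def boot_start(now: datetime, uptime: timedelta) -> datetime:
    """Return the moment the current boot began, given the uptime reading."""
    return now - uptime

# Assumed values: dates inferred from the message date, times as reported.
now = datetime(2004, 11, 30, 9, 0)           # when uptime was run
uptime = timedelta(hours=9, minutes=45)      # "up, 9:45"
last_msg = datetime(2004, 11, 29, 23, 11)    # last line in /var/adm/messages

started = boot_start(now, uptime)
gap = started - last_msg                     # silence between last log line and boot start

print(started)  # 2004-11-29 23:15:00
print(gap)      # 0:04:00
```

Under those assumptions, the boot started about four minutes after the
last logged message, which fits a reboot at roughly 23:11-23:15.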

-This was in single user mode.
I ran vxprint - and it hung.

After a reset (from ok>), it came up.

There is a crash dump, but its timestamp is 9:30, so I am sure
that it comes from the reset, which...well, I know why it
crashed then...
(Sun didn't seem to want that crash dump, but I could ask them to look
at it).

This system is in a 2-node VCS cluster. Checking the VCS log, I see
nothing unusual.

I called Sun, and they said to run VTS...then, on second thought, they
said it is not supported on 2.6.

I tried installing the old version of SunVTS from the Solaris 2.6 CD,
and it does not seem to work.
(It seems to try mounting a few random disk slices, then it hangs).

I'd appreciate any suggestions.
I managed to keep developers off this box today, but they really should
be on this server,
since the prod machine has a higher load.

Thanks
Seth
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


