420 locks up occasionally

From: Dave Lowenstein (dlowenst@mail.sdsu.edu)
Date: Mon Dec 15 2003 - 12:58:18 EST


I have a 420 with 4x450 MHz processors and 4 GB of RAM hooked up to a T3
storage array. This morning, for the fourth time in recent memory, it
completely locked up.

I was able to connect to it via ssh but after asking for my password it
never gave me a shell.

I had someone get on the console and dtlogin wouldn't respond to any
keystrokes.

Stop-A didn't even do anything. We had to hit the power switch.

This has happened just like this one other time recently.

There is a different version of this lockup wherein I can connect via ssh
but I can't kill any processes as root. In this other scenario even typing
init 6 doesn't restart the machine, but who -r shows the runlevel as 6.

I opened a ticket with Sun and they basically told me "you must be running
some poorly written application", avoiding any thought that something
might be hosed with the hardware. They told me that processes can get
into a state where even root can't kill them (I forget what it was
called...it wasn't a zombie process) and that root is basically at the
mercy of my application.

I have nothing of interest in any of the log files, nothing from the
console log, nothing from the T3's logs, and no files obviously modified
at or even near the time things locked up.

Does anyone have any suggestions as to what the problem might be?

Dave Lowenstein
Analyst/Programmer
Instructional Technology Services
San Diego State University
(619)594-0270
http://www-rohan.sdsu.edu/dept/its
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


