Flaky E10k domain

From: Johan Hartzenberg (jhartzen@csc.com)
Date: Fri Jul 26 2002 - 03:44:47 EDT


Hi,

One of my E10K domains has recently started acting up.

In the past few weeks I have needed to reboot the domain every couple of
days. I was still able to log in, but running "ps" would hang so badly
that I could not even exit or close the xterm session.

Today it happened again, but with different symptoms - the df command hangs
while the ps command works fine. To be more exact, the df command hangs as
soon as it hits the SAN-based mount points.

Changing into one of these mount points and running something like
"file *" also hangs.
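
A minimal sketch of how the hanging mount points could be identified
without tying up the login session. It assumes Python 3 is available on
the domain, which is not stated above. The function name probe and the
5-second timeout are illustrative choices. Each path is checked in a child
process, so a blocked filesystem call stalls the child instead of the
shell.

```python
import subprocess
import sys
import time


def probe(path, timeout=5.0, poll_interval=0.1):
    """Return "ok", "error", or "hung" for a single path.

    The check runs in a child process so that a blocked filesystem call
    stalls the child rather than this script.
    """
    # The child only calls os.stat() on the path (an illustrative check).
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = child.poll()
        if status is not None:
            return "ok" if status == 0 else "error"
        time.sleep(poll_interval)
    # Timed out. Signal the child but do not wait for it: a process that
    # is blocked inside the kernel may not exit until the I/O returns.
    child.kill()
    return "hung"


if __name__ == "__main__":
    # Mount points are given on the command line; none are assumed here.
    for mount_point in sys.argv[1:]:
        print(f"{mount_point}: {probe(mount_point)}")
```

Run with the suspect mount points as arguments. A path reported as "hung"
did not answer within the timeout; its child process may linger until the
underlying I/O completes.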

The messages file and console are clean.

init 6 returns, but the domain does not reboot.

Eventually I ran uadmin 1 1 just to get the domain back up, but obviously
I now have no dump or anything, and I know this will happen again.

Maybe I can get something from sar data recorded at the time?

Maybe it's a bad kernel parameter - the apps people are busy setting up new
Oracle instances.

Possibly I need a new patch?

  _Johan
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


