kernel panic V5.1A

From: Dirk Kleinhesselink (dkleinh@phy.ucsf.edu)
Date: Wed Jun 19 2002 - 14:53:58 EDT


I've got a two-member cluster running V5.1A -- two DS10s connected via
memory channel. The other night, one member crashed and rebooted with
this message in the log:

vmunix: panic (cpu 0): ics_unable_to_make_progress: input thread stalled
vmunix: syncing disks.

Now and then, the ps command on that member produces garbage in its
output:

# ps
   PID TTY S TIME CMD
1316826 console ???+ ?? /usr/sbin/getty console console vt100
1462001 pts/1 ???+ ?? -sh (sh)
1473837 pts/1 ???+ ?? ps
1451713 pts/4 ???+ ?? /usr/local/admin/robodump/robodump -0 -f keck1
1453136 pts/4 ???+ ?? /usr/local/admin/robodump/robodomo -m dumpmast
1460300 pts/4 ???+ ?? -sh (sh)
1475184 pts/4 ???+ ?? /sbin/vdump -0 -u -b 64 -f /dev/nrmt3h -U /kec
1475330 pts/4 ???+ ?? /usr/local/admin/robodump/podump -0 -e 0 -f ke

Why the question marks in the S and TIME columns? The other member
shows:

# ps
   PID TTY S TIME CMD
525908 console I + 0:00.01 /usr/sbin/getty console console vt100
598401 pts/0 S + 0:00.04 -sh (sh)
598866 pts/0 R + 0:00.02 ps
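
To catch this when it happens rather than by chance, the check could be
scripted. The following is only a rough sketch, assuming ps output in the
column layout shown above (PID, TTY, S, TIME, CMD); the function name
garbled_ps_rows is made up for illustration and is not part of any tool on
the system.

```python
def garbled_ps_rows(ps_output):
    """Return the PIDs of ps rows whose state (S) field contains '?'.

    Assumes the default layout shown above: PID, TTY, S, TIME, CMD.
    """
    bad = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Process rows start with a numeric PID; this skips the header
        # line and any wrapped continuation of a long command.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        # Only the third field (state) is inspected. The TTY field is
        # ignored on purpose, since a TTY of "??" can be legitimate.
        if "?" in fields[2]:
            bad.append(int(fields[0]))
    return bad


if __name__ == "__main__":
    sample = (
        "   PID TTY S TIME CMD\n"
        "1462001 pts/1 ???+ ?? -sh (sh)\n"
        "598401 pts/0 S + 0:00.04 -sh (sh)\n"
    )
    print(garbled_ps_rows(sample))
```

Run periodically against saved ps output, something like this would at
least show when the garbage starts appearing relative to the memory
growth.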

I notice that the member with the messed-up ps output has high memory
usage, and vmstat -P indicates that most of the memory is in malloc pages.

Anyone have any idea what the kernel panic means or why the ps output is
messed up? The system is basically running mail (sendmail), web
(apache) and samba services.

Dirk Kleinhesselink
System and Network Administrator
Keck Center for Integrative Neuroscience
UCSF


