SUMMARY: runaway ksh process

From: Eiler, James A. (James.Eiler@alcoa.com)
Date: Tue Feb 10 2004 - 08:10:52 EST


The definitive answer came first from John Lanier:

The ksh problem is fixed in Tru64 5.1B Patch Kit 2 (pk#2) and later.

Here is a description:
=======================

/usr/bin/ksh can use up to 100% CPU Time

A ksh process does not terminate if a user closes a telnet session abruptly (for example,
by using the "X" in the upper right corner of the window). The process continues to run
and can use up to 100% of the CPU on which it is running.

This problem occurs when a trap is set (see trap(1)) in either a startup script or a
script executed within the current shell process.
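If you are unsure whether a system meets that condition, a rough check is to scan the
startup scripts for trap commands. The sketch below is only an illustration: the
function name find_traps is made up for this note, and it assumes a grep that
understands POSIX character classes. It only catches lines whose first word is "trap",
so traps set inside functions or sourced files would need a closer look.

```shell
# find_traps (hypothetical helper): list "trap" commands in the given scripts.
# Prints matches as file:line:text. Unreadable or missing files are skipped.
find_traps() {
    for f in "$@"; do
        [ -r "$f" ] || continue
        # /dev/null as a second file makes grep always print the file name
        grep -n '^[[:space:]]*trap[[:space:]]' "$f" /dev/null
    done
}
```

For a ksh login shell, the files to pass would typically be /etc/profile,
$HOME/.profile and whatever file $ENV points to, for example:
find_traps /etc/profile "$HOME/.profile"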

Thanks to:

James Sainsbury
Rafael Visser
John Lanier
Bryan Mills
Johan Brusche
Martin Petder
Thomas Rohr Pedersen

Original question:

Dear gurus,

I've got an AlphaServer 4100 running Tru64 5.1B, Patch Kit 1.

Every few days we get a runaway ksh process. top reports the following:

load averages: 1.94, 2.89, 3.07 19:45:13
124 processes: 8 running, 48 sleeping, 67 idle, 1 zombie
CPU states: 18.3% user, 0.0% nice, 81.6% system, 0.0% idle
Memory: Real: 324M/486M act/tot Virtual: 1991M use/tot Free: 12M

   PID USERNAME PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
452737 root      51    0 2576K  368K run    17.9H 61.30% ksh
448455 root       3  -82   13M  614K sleep 213:33 10.20% dmct
  2138 root       4  -80   11M  466K sleep 605:07  6.00% ccdt
  2322 root      44    0   13M  647K run    45:07  5.70% dtrc
  1830 root       5  -78   13M  548K sleep 475:43  4.80% dprd
340897 root       6  -76   21M  688K sleep   3:00  3.10% dasc
  1589 root      29  -30   12M   10M sleep   9:54  2.10% cmds
 28015 root      10  -68   13M  557K sleep   5:13  1.20% dafc
 50460 root      44    0 4680K 1802K run     0:00  0.60% top
  1444 root      44    0   12M 5758K run     9:18  0.40% Xdec
  2349 root      23  -42 2864K  311K sleep  23:15  0.40% dfb2
  2343 root      23  -42 2864K  311K sleep  18:38  0.30% dfb1
316028 root       7  -74   14M 1351K sleep   6:47  0.10% hmists
448406 root      12  -64   13M  598K sleep   4:36  0.10% dcds
  1906 root      44    0   11M 1449K sleep   4:24  0.10% dtterm

If I try to find the parent, I find the following:

# ps -ef | grep 452737
root 51980 5429 0.0 20:07:49 pts/6 0:00.01 grep 452737
root 452737 1 73.3 Feb 08 pts/11 18:08:01 -ksh (ksh)

Can anyone suggest how I might find the origin of this runaway ksh process?

Thanks,

Jim
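A note on reading the ps output in the question: the third column of ps -ef is the
parent PID, so 452737 has already been reparented to init (PPID 1), and the leading
dash in "-ksh" marks it as a login shell that was attached to pts/11. That fits the
patch description of a session closed abruptly. A minimal sketch for spotting such
shells follows. The function name orphaned_ksh is invented for this note, and the
match on "ksh" is deliberately loose, so treat the result as a list of candidates
rather than a verdict.

```shell
# orphaned_ksh (hypothetical helper): read `ps -ef` style lines on stdin
# (UID PID PPID ...) and print the PID of each ksh whose parent is init.
# Typical use:  ps -ef | orphaned_ksh
orphaned_ksh() {
    awk '$3 == 1 && /ksh/ { print $2 }'
}
```

Any PID it prints can then be checked against top before deciding whether to kill it.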


