From: stan (stanb@panix.com)
Date: Tue May 10 2005 - 19:40:12 EDT
I've got a machine that's suddenly developed a fairly large memory leak (~=
1M/HR). The machine is running:
SunOS AW0550 5.8 Generic_108528-13 sun4u sparc SUNW,Sun-Blade-100
Now, in the past I've used this little script to catch the culprit process:
#!/bin/sh
OUTFILE=/tmp/siz.rpt
date >> $OUTFILE
ps -ef -o vsz -o pid -o comm | sort -n -r >> $OUTFILE
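A variant that may be easier to post-process, as a minimal sketch only: drop the ps header and stamp every row with the sample time, so each line stands on its own. The name stamp_rows and the flat one-row-per-process layout are hypothetical, and this is not the format the Perl script below parses.

```shell
#!/bin/sh
# Hypothetical helper: reads `ps -o vsz -o pid -o comm` output on stdin,
# skips the header row and blank lines, and prefixes each remaining row
# with the timestamp passed as the first argument.
stamp_rows() {
    stamp=$1
    while read vsz pid comm; do
        [ "$vsz" = "VSZ" ] && continue   # header row
        [ -n "$comm" ] || continue       # blank or short line
        echo "$stamp $vsz $pid $comm"
    done
}

# Intended use on the monitored host (assumed output path):
#   ps -e -o vsz -o pid -o comm | stamp_rows "`date '+%Y-%m-%dT%H:%M:%S'`" >> /tmp/siz.flat
```

Because every row carries its own timestamp, the file can be sorted or grepped per process without tracking pass boundaries.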
I process the resulting data file with this Perl script:
#!/bin/perl -w
use strict;

# main()
my $pass = 1;
my $records = 0;
my $line_no = 0;
my $a_record;
my @rval;
my @memused;
my @proccnt;
my %record;
my @results;
my @pass_date;
my $field_qty;
my $lkey;
my $mkey;
my $proc_plus_pid;
my $k;
my $v;
my $date;
my $l_size;
{
    open(FILE, "< siz.rpt") or die "Cannot open siz.rpt: $!\n";
    LINE: while (<FILE>)
    {
        $line_no++;
        chomp;                    # Kill the newline that came from the file
        next LINE if /^\s*$/;     # skip blank lines
        $a_record = $_;
        $a_record =~ s/^\s+//;
        # Parse the fields of each line into individual
        # records
        (@rval) = split(/\s+/, $a_record);
        $field_qty = scalar(@rval);
        if ($rval[0] eq "VSZ")
        {
            # ps header line: move on to the next pass
            $pass++;
            next LINE;
        }
        if ($field_qty != 3)
        {
            if ($field_qty == 6)
            {
                # Six fields: the timestamp written by date
                $date = $a_record;
            }
            next LINE;
        }
        else
        {
            # Good data, having passed all the tests
            # Put it into the hash (associative array)
            $memused[$pass] += $rval[0];
            $proccnt[$pass]++;
            $pass_date[$pass] = $date;
            $proc_plus_pid = $rval[2].':'.$rval[1];
            $record{$proc_plus_pid}{"SIZE"}[$pass] = $rval[0];
            $record{$proc_plus_pid}{"DATE"}[$pass] = $date;
            if (!$record{$proc_plus_pid}{LPASS})
            {
                # First time this process has been seen
                $record{$proc_plus_pid}{LPASS} = $pass;
                $record{$proc_plus_pid}{ISIZE} = $rval[0];
            }
            $record{$proc_plus_pid}{HPASS} = $pass;
            if ($rval[0] != $record{$proc_plus_pid}{ISIZE})
            {
                # Virtual size (VSZ) of this process has changed
                # flag it for review
                $record{$proc_plus_pid}{FOOBAR} = 1;
            }
        }
    }
    close(FILE);

    # print selected results
    foreach $lkey (sort keys(%record))
    {
        if ($record{$lkey}{FOOBAR}) {
            print "$lkey\n";
            # print "ISIZE $record{$lkey}{ISIZE}\n";
            my $zat = $record{$lkey}{LPASS};    # first pass that saw this process
            my $zto = $record{$lkey}{HPASS};    # last pass that saw it
            $l_size = 0;
            while ($zat <= $zto) {
                my $size = $record{$lkey}{SIZE}[$zat];
                # Print only the passes where the size differs from the last one shown
                if (defined($size) && $l_size != $size)
                {
                    print "$record{$lkey}{DATE}[$zat] size $size\n";
                    $l_size = $size;
                }
                $zat++;
            }
        }
    }

    print "\n\nMemory Usage:\n";
    print "Pass\tDate\t\t\t\tProcs\tMemory\n";
    my $zat = 1;
    while ($proccnt[$zat]) {
        print "$zat\t$pass_date[$zat]\t$proccnt[$zat]\t$memused[$zat]\n";
        $zat++;
    }
}
Which generates output like this:
AW0550# cd /tmp
AW0550# /t4.pl
/opt/fox/hstorian/bin/sampling_ctl:2005
Tue May 10 15:00:00 GMT 2005 size 10080
Tue May 10 17:00:00 GMT 2005 size 10104
/usr/local/bin/ntop:1045
Tue May 10 15:00:00 GMT 2005 size 29528
Tue May 10 17:00:00 GMT 2005 size 29536
/usr/sbin/syslogd:983
Tue May 10 15:00:00 GMT 2005 size 3384
Tue May 10 18:00:00 GMT 2005 size 3392
mibiisa:1838
Tue May 10 15:00:00 GMT 2005 size 2432
Tue May 10 16:00:00 GMT 2005 size 2440

Memory Usage:
Pass    Date                            Procs   Memory
1       Tue May 10 15:00:00 GMT 2005    137     672784
2       Tue May 10 16:00:00 GMT 2005    137     672792
3       Tue May 10 17:00:00 GMT 2005    137     672824
4       Tue May 10 18:00:00 GMT 2005    139     688048
5       Tue May 10 19:00:00 GMT 2005    137     672832
AW0550#
script done on Tue May 10 19:30:02 2005
The problem is, this time I don't see a culprit.
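To put rough numbers on that: ps reports vsz in kilobytes, so the Memory totals for passes 1 and 5 differ by only 48 KB over four hours, far short of the ~1M/HR the machine is losing. A minimal sketch of that arithmetic follows. The helper name vsz_growth is hypothetical; it expects the "Memory Usage" rows on stdin and assumes each row has nine whitespace-separated fields (pass, six date fields, process count, total).

```shell
#!/bin/sh
# Hypothetical helper: reads "Memory Usage" rows on stdin and prints the
# change in total VSZ between the first and the last pass.
vsz_growth() {
    awk 'NF == 9 && $1 ~ /^[0-9]+$/ {
             if (n == 0) first = $9    # total from the first pass seen
             last = $9                 # total from the most recent pass
             n++
         }
         END {
             if (n > 1)
                 printf "%d KB over %d intervals (%.1f KB per interval)\n",
                        last - first, n - 1, (last - first) / (n - 1)
         }'
}
```

Fed the five rows above, this prints "48 KB over 4 intervals (12.0 KB per interval)". The jump in pass 4 coincides with two extra processes and is gone again by pass 5.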
Is it possible that something in the kernel is allocating memory that ps
won't report? If so, how can I examine this?
Thanks.
-- 
U.S. Encouraged by Vietnam Vote - Officials Cite 83% Turnout Despite
Vietcong Terror - New York Times 9/3/1967
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers