SUMMARY: Missing disk space

From: Adametz, Bluejay (bluejay@fujigreenwood.com)
Date: Wed Jul 23 2003 - 15:42:49 EDT


I got lots of responses to this one (too many to enumerate; if you
responded, thank you!).

Most of the responses involved some use of fuser, du, or lsof. I probably
didn't do as good a job as I might have of listing what we've already tried,
since we had tried these already.

fuser -d /usr returns nothing.

All the files returned by lsof /usr do appear in directories (I wrote a
little script to awk the output and test that each file exists). lsof does
report a lot of instances of /usr itself, whose significance I don't know.
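
For reference, a check of this kind could be sketched as follows. This is
not the script used above, only a minimal sketch. It assumes a POSIX-style
shell (ksh or similar), that the path is the last whitespace-separated field
of each lsof line, and that no path contains spaces. The function name
missing_paths is made up for illustration.

```shell
#!/bin/sh
# Read lsof-style output on stdin and print every NAME entry that does
# not exist on disk (a candidate for an unlinked-but-still-open file).
# Assumption: the path is the last field; paths with spaces will break this.
missing_paths() {
    # Skip the header line, keep the last field, drop duplicates.
    awk 'NR > 1 { print $NF }' | sort -u | while read -r path; do
        [ -e "$path" ] || echo "$path"
    done
}

# Typical use (as root):
#   lsof /usr | missing_paths
```

No output means every open file that lsof reported still has a directory
entry, which matches the result described above. Where the installed lsof
build supports it, lsof +L1 lists open files whose link count is below 1
directly.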

du is not helpful, since it reports only the ~3gb that we can see is used,
not the missing 2gb+.
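
The size of the gap can be estimated from the figures in the original post
quoted below: take the domain size and free space from showfdmn and subtract
what du can see. The sketch below does only that arithmetic. The variable
and function names are made up for illustration, and the numbers are copied
from the quoted output (all in 1K blocks).

```shell
#!/bin/sh
# Figures in 1K blocks, copied from the quoted showfdmn and du output.
domain_total=6526976    # showfdmn: 1K-Blks
domain_free=798720      # showfdmn: Free
du_visible=3800152      # du -ks /usr

# usage: unaccounted_kb TOTAL FREE VISIBLE
# Prints the space in use in the domain that du cannot account for.
unaccounted_kb() {
    expr "$1" - "$2" - "$3"
}

unaccounted_kb "$domain_total" "$domain_free" "$du_visible"
# prints 1928104
```

That is roughly 1.8 GB in use in the domain that no directory walk can find
at the time those commands were run.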

There are no clones.
 
The application is a mix of some home-grown code (which we have the sources
for) and some middleware products (which we don't have the sources for).

I'll take any further ideas. Thanks!

                                                - Bluejay Adametz

A good listener is not only popular everywhere,
but after a while, he knows something. -Wilson Mizner

> We have this application running on a V4.0G PK3 PS cluster that consists of
> a couple hundred processes. Over the course of time, the /usr file system
> runs out of space, but we are unable to locate where it's going. du reports
> only ~3gb (out of ~6gb) used, but df and showfdmn show the (advfs) file
> system filling up, and eventually writes fail because of lack of space.
>
> # du -ks /usr
> 3800152 /usr
> # df -k
> Filesystem        1024-blocks      Used  Available  Capacity  Mounted on
> root_domain#root       262144     93312     161592       37%  /
> /proc                       0         0          0      100%  /proc
> usr_domain#usr        6526976   3455289     798720       82%  /usr
> home_domain#home     17778192  14696676    2940192       84%  /home
> # showfdmn -k usr_domain
> Id                 Date Created              LogPgs  Domain Name
> 387ac2f0.00090660  Tue Jan 11 00:43:12 2000     512  usr_domain
>
> Vol   1K-Blks    Free  % Used  Cmode  Rblks  Wblks  Vol Name
> 1L    6526976  798720     88%     on    128    128  /dev/re0g
> K5MESAP3#
>
> If we stop the application, all the space comes back.
>
> At the advice of HP, we tried enabling quotas on this file system and
> periodically running quotacheck, but that just results in the inconsistent
> numbers shown by df above. We tried using trace to track down the offending
> file(s), but that led nowhere.
>
> I've suggested killing the application one process at a time to narrow down
> the culprit, but the application administrator doesn't want to do that.
>
> Any ideas on how we can track this down?
>


