[SUMMARY] Oracle and odd filesystem problems

From: Gedaliah Wolosh (gwolosh@njit.edu)
Date: Thu Feb 09 2006 - 10:58:00 EST


The problem:

We started to experience a problem with the filesystem which houses the
Oracle binaries (/orabin). This filesystem is mirrored with Solaris
Volume Manager. We are running Solaris 9 and Oracle 9.2.0.6.0.

This filesystem appears to grow although nothing is being written to it
at the time.

Yesterday we experienced a disk error which we thought might be the cause
of this. We replaced the disk and ran newfs on the metadevice. Afterwards,
"df -k" showed the partition at ~50% capacity.

Today -

Filesystem          size   used  avail  capacity  Mounted on
/dev/md/dsk/d70     6.2G   5.3G   898M     86%    /orabin

prophet# du -hs /orabin
4.3G /orabin

The solution:

This answer from frisco was the one that clicked. I had a cron job
removing trace files to conserve disk space. These trace files were
being deleted while still being written to.
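One common way to avoid this with a cleanup job is to truncate a file that may still be open rather than remove it. The sketch below is only an illustration: the scratch directory, the file name example_ora_1234.trc and the use of descriptor 4 as the "writer" are all made up, and it assumes the writer opens the file in append mode. Whether Oracle opens its trace files that way is not something this summary establishes; a writer that does not append keeps its old offset, so the file would reappear as a sparse file of the old length.

```shell
#!/bin/sh
# Hypothetical scratch file standing in for a trace file (not a real Oracle path).
scratch=$(mktemp -d)
trace="$scratch/example_ora_1234.trc"
dd if=/dev/zero of="$trace" bs=1024 count=512 2>/dev/null

# Stand-in for the process still writing the trace: descriptor 4, append mode.
exec 4>>"$trace"

# Truncate in place instead of "rm": the name and inode stay, the blocks are freed.
: > "$trace"
size_after=$(wc -c < "$trace" | tr -d ' ')

# The writer carries on; in append mode its next write lands at the new end of file.
echo "more trace output" >&4
size_appended=$(wc -c < "$trace" | tr -d ' ')

echo "size after truncate: $size_after bytes"
echo "size after next write: $size_appended bytes"

# Clean up the demonstration.
exec 4>&-
rm -r "$scratch"
```

Because the file is never unlinked, du and df keep agreeing about it.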

Thanks frisco and the others that answered.

A difference between what du and df report for a single filesystem is often
caused by a running process keeping a file handle open on a deleted file. For
example, if a running Oracle process has a logfile open on /orabin and some
backup or archiving script deletes that file, the space stays in use on the
disk until the process closes the file or exits, releasing the file handle.

du walks through the filesystem counting the space used, and doesn't see the
file since it has been technically deleted. df sees that the space is still
being reserved by the system, since the system knows there is still a file
handle open and using an amount of space on that filesystem.
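The effect described above can be reproduced in a scratch directory. This is a minimal sketch, assuming a POSIX-like shell with mktemp available; the directory, the file name trace.trc and the choice of descriptor 3 are arbitrary and not taken from the original system.

```shell
#!/bin/sh
# Create a 2 MB file in a throwaway directory.
workdir=$(mktemp -d)
dd if=/dev/zero of="$workdir/trace.trc" bs=1024 count=2048 2>/dev/null
echo "du before delete: $(du -sk "$workdir" | awk '{print $1}') KB"

# Hold the file open on descriptor 3, then delete its name.
exec 3<"$workdir/trace.trc"
rm "$workdir/trace.trc"

# du walks the directory tree, so it no longer finds the file.
echo "du after delete:  $(du -sk "$workdir" | awk '{print $1}') KB"

# The data is still there: it can be read back through the open descriptor,
# which is why df keeps counting those blocks as used.
held_bytes=$(wc -c <&3 | tr -d ' ')
echo "bytes still held open: $held_bytes"

# Closing the descriptor is what finally lets the filesystem free the space.
exec 3<&-
rmdir "$workdir"
```

The exact du figures depend on the filesystem, so none are shown here; the point is that the byte count read through descriptor 3 is unchanged after the rm.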

lsof can be useful for diagnosing such problems.
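As a sketch of how lsof might be used here: its +L1 option selects open files whose link count is below 1 (that is, unlinked files), and the +aL1 form restricts that to one filesystem. The wrapper function name below is made up for illustration, and lsof is assumed to be installed, which is not a given on Solaris 9.

```shell
#!/bin/sh
# list_deleted_open is a hypothetical helper name, not an existing command.
# It lists files that are still open but already unlinked on one filesystem.
list_deleted_open() {
    fs=${1:-/orabin}          # mount point to inspect; defaults to /orabin
    if command -v lsof >/dev/null 2>&1; then
        # +aL1 <file_system>: open files with link count < 1 on that filesystem
        lsof +aL1 "$fs"
    else
        echo "lsof not found in PATH" >&2
        return 127
    fi
}
```

Running "list_deleted_open /orabin" as root would then show which processes hold deleted files open; those are the ones that need to close the file or be restarted before the space comes back.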
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


