how to identify why running out of semaphores?

From: sun@bagdon.com
Date: Mon Sep 30 2002 - 20:00:57 EDT


(Please be patient, and presume that we HAVE done what you think you're
going to suggest...)

We have an NES 3.62 SP 12 instance running under Solaris 2.5.1 (yes, we
know, don't bother telling us both are EOL) that appears to be running
out of semaphores. We've trussed the crap out of it, coming up with about
500 megs of data, and believe we have a pattern. The very last thing the
process does before going into 'eternal sleep' ('(sleeping...)') is an
lwp_mutex_lock. It appears that all of the LWPs are taken and, as
documented, a lock waits for an unlock, which isn't happening, so the
process goes to sleep. lwp_sema_p and lwp_sema_v calls keep going on right
up until the process does a lock for an unavailable lwp, and then it just
goes to sleep.
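
A minimal sketch of how that "last call before sleeping" pattern could be
pulled out of the truss output mechanically, assuming the trace was taken
with `truss -l` so that each line starts with `pid/lwpid:` (or just
`/lwpid:`). The regular expression, the function names, and the sample
lines are illustrative assumptions, not actual output from our box:

```python
import re

# Assumed shape of a `truss -l` line: optional pid, "/", LWP id, ":", then the call.
LINE = re.compile(r"^\s*(?:\d+)?/(\d+):\s+(\w+)\((.*?)\)\s*(.*)$")

def last_call_per_lwp(lines):
    """Return {lwp_id: (syscall, args, trailer)} for the final line seen per LWP."""
    last = {}
    for line in lines:
        m = LINE.match(line)
        if m:
            lwp, call, args, trailer = m.groups()
            last[int(lwp)] = (call, args, trailer.strip())
    return last

def sleeping_on_mutex(lines):
    """Map mutex address -> sorted LWP ids whose last line is a sleeping lwp_mutex_lock."""
    waiters = {}
    for lwp, (call, args, trailer) in last_call_per_lwp(lines).items():
        if call == "lwp_mutex_lock" and "sleeping" in trailer:
            waiters.setdefault(args, []).append(lwp)
    return {addr: sorted(ids) for addr, ids in waiters.items()}

# Made-up sample lines in the assumed format (addresses are hypothetical).
SAMPLE = """\
1234/3:\tlwp_sema_p(0xEF6F1E78)\t\t\t= 0
1234/4:\tlwp_mutex_lock(0xEF7A2C10)\t(sleeping...)
1234/3:\tlwp_sema_v(0xEF6F1E78)\t\t\t= 0
1234/5:\tlwp_mutex_lock(0xEF7A2C10)\t(sleeping...)
""".splitlines()

if __name__ == "__main__":
    for addr, lwps in sleeping_on_mutex(SAMPLE).items():
        print(addr, lwps)
```

Both functions accept any iterable of lines, so an open file object can be
passed in and the 500 megs never has to be loaded into memory at once. If
many LWPs end up parked on the same address, that address is the mutex
worth chasing.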

What we believe is happening is that some process is taking lwp locks
but isn't releasing them (lwp_mutex_unlock), which puts the instance to
sleep. This would coincide (we believe) with the customer stating that
the instance starts slowing down: as the number of available locks
declines and the number of hits increases, it gets into a
declining-performance issue, with each new HTTP session getting slower.

So - the big question. From the truss output (all 500 megs of it), how do
we identify WHAT process is taking up these locks? We don't want to throw
more system resources at the process (i.e., just give it more semaphores),
as no matter what we give them, they'll take. So we'd rather figure out
what is really happening, and deal with that.
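
One way this could be approached, sketched under assumptions: tally
completed lwp_mutex_lock calls against lwp_mutex_unlock calls per LWP and
per mutex address, and report whatever is left over. This again assumes
`truss -l` style lines with a `pid/lwpid:` prefix and a `= 0` return on
success; the pattern and the sample data are hypothetical. Note that truss
only sees calls that actually reach the kernel, so if the threads library
handles some lock operations in user space the tallies are a hint, not
proof:

```python
import re
from collections import Counter

# Only completed calls ("= 0") are counted; "(sleeping...)" lines do not match.
LINE = re.compile(
    r"^\s*(?:\d+)?/(\d+):\s+(lwp_mutex_lock|lwp_mutex_unlock)\((\w+)\)\s*=\s*0\b"
)

def unmatched_locks(lines):
    """Return {(lwp_id, mutex_addr): count} where locks outnumber unlocks."""
    balance = Counter()
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue  # other syscalls, failed calls, and still-sleeping calls
        lwp, call, addr = m.groups()
        balance[(int(lwp), addr)] += 1 if call == "lwp_mutex_lock" else -1
    return {key: n for key, n in balance.items() if n > 0}

# Made-up sample lines in the assumed format (addresses are hypothetical).
SAMPLE = """\
1234/4:\tlwp_mutex_lock(0xEF7A2C10)\t\t= 0
1234/6:\tlwp_mutex_lock(0xEF7A3000)\t\t= 0
1234/6:\tlwp_mutex_unlock(0xEF7A3000)\t\t= 0
1234/5:\tlwp_mutex_lock(0xEF7A2C10)\t(sleeping...)
""".splitlines()

if __name__ == "__main__":
    for (lwp, addr), n in sorted(unmatched_locks(SAMPLE).items()):
        print(f"LWP {lwp} holds {addr} ({n} lock(s) without a matching unlock)")
```

An LWP that shows up here holding the same address that other LWPs are
sleeping on would be the first candidate for the offender.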

Any input would be welcome. This gets painfully deep into truss and core
(and the actual C code of the kernel), but we can't believe that we can't
use some Solaris system tool to identify the offending lock-stealer.

Thanks!

Steve B.
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


