Re: accessing a system with high load

From: Green, Simon (Simon.Green@EU.ALTRIA.COM)
Date: Thu Feb 27 2003 - 09:41:13 EST


Once things get bad, I don't think that there's much you can do.
Sometimes, an existing telnet session will survive and can be used to kill
off high memory users.

I think that the best way to deal with this sort of situation is through
monitoring, with an alarm sent to a pager or something if paging space
utilisation reaches, say, 80%.

That's what we do, using Patrol. We've still been caught out a couple of
times, though, on one of our QA systems. (The application has a memory leak
which hasn't been sorted out yet. Every now and then it goes berserk and
eats all the memory.)
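
A minimal sketch of the kind of threshold check described above, for anyone
without a monitoring product to hand. It is not how Patrol does it. It
assumes that `lsps -s` prints a header line followed by a line such as
"512MB  1%"; the function names and the 80% constant are illustrative only.

```python
import re
import subprocess

THRESHOLD_PERCENT = 80  # the "say, 80%" figure from the text above


def parse_percent_used(lsps_output: str) -> int:
    """Return the 'Percent Used' figure from `lsps -s`-style output.

    Assumption: the output contains exactly one figure followed by '%',
    e.g. a header line and then '      512MB               1%'.
    """
    match = re.search(r"(\d+)\s*%", lsps_output)
    if match is None:
        raise ValueError("no percentage found in lsps output")
    return int(match.group(1))


def needs_alarm(lsps_output: str, threshold: int = THRESHOLD_PERCENT) -> bool:
    """True when paging space utilisation has reached the threshold."""
    return parse_percent_used(lsps_output) >= threshold


def read_lsps_summary() -> str:
    """Run `lsps -s` and return its output (only works where lsps exists)."""
    result = subprocess.run(
        ["lsps", "-s"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Run from cron every few minutes, something like
`needs_alarm(read_lsps_summary())` could trigger whatever paging or mail
mechanism is already in place.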

You could create a small paging space somewhere, not normally swapped on,
and activate it when things get bad. That would give you a breathing space
to take care of the problem without the risk of your paging spaces creeping
up in size. Since it would just be for emergencies you could allocate it on
a disk which already had a paging space, if necessary. Obviously you still
need some sort of monitoring and alarm system.
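
As a rough sketch of the emergency paging space idea, the helper below only
builds the two command lines involved and does not run them. It assumes the
space is created with `mkps` without the -n (activate now) and -a (activate
at restart) flags so that it stays inactive, and is later switched on with
`swapon`. The volume group, size and device name (/dev/paging01) are
hypothetical examples, not values from any real system.

```python
from typing import List, Tuple


def emergency_paging_commands(
    volume_group: str, size_in_lps: int, paging_device: str
) -> Tuple[List[str], List[str]]:
    """Return (create_cmd, activate_cmd) as argv lists.

    create_cmd:   run once, in advance, to create the dormant paging space.
    activate_cmd: run when paging space is nearly exhausted.
    """
    # No -n and no -a: the paging space is created but left inactive.
    create_cmd = ["mkps", "-s", str(size_in_lps), volume_group]
    # Activating the dormant space gives some breathing room.
    activate_cmd = ["swapon", paging_device]
    return create_cmd, activate_cmd


# Hypothetical values for illustration only.
create, activate = emergency_paging_commands("rootvg", 4, "/dev/paging01")
print(" ".join(create))
print(" ".join(activate))
```

The actual device name is assigned when the paging space is created, so
check it with `lsps -a` beforehand rather than relying on the name used here.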

Simon Green
Altria ITSC Europe s.a.r.l.

AIX-L Archive at http://marc.theaimsgroup.com/?l=aix-l&r=1&w=2
AIX FAQ at http://www.faqs.org/faqs/aix-faq/

N.B. Unsolicited email from vendors will seldom be appreciated.

-----Original Message-----
From: Holger.VanKoll@SWISSCOM.COM [mailto:Holger.VanKoll@SWISSCOM.COM]
Sent: 27 February 2003 14:15
To: aix-l@Princeton.EDU
Subject: accessing a system with high load

Hello,

I am thinking about what to do to ensure access to a system where some
application uses so much paging space that new connections (telnet/ssh/getty)
can no longer be made (fork fails).

AIX 5.1 has the ability (shconf) to take certain actions if processes at a
certain priority no longer get any CPU.
Also, one could start a high-priority ssh daemon at boot.

That's fine, but it only solves the problem when applications consume too
much CPU. It doesn't help if they consume too much paging space.

As far as I can see, even ulimit/WLM offers no way to solve this problem.

I could try to start sshd with plock(), but that would only get sshd up and
running; any command started from there would still fail (fork: not enough
memory available).

So far, I see no other possibility than to increase paging space and set
high values for npswarn and npskill (vmtune).
The only disadvantage I currently see is more disk usage for paging space.

What do you think / what do you do to ensure access to a high-paging system?


