Summary: Further questions concerning load average

From: Brewer, Edward (BREWERE@OD.NIH.GOV)
Date: Tue May 14 2002 - 12:34:36 EDT


Admins,

        Thanks for all of the responses. I will post them here for all to
read.

From: alan@nabeth.cxo.cpqcorp.net

        A load average of 1-3 doesn't seem particularly high
        if the system is actually doing work much of the
        time. I don't know whether the run queue length
        that is the basis of the values is just processes
        waiting to run (outside those running) or if it
        includes those running.

        What are you using to monitor the idle time? vmstat
        only shows user, idle and system. It doesn't show
        the split of idle into true idle and I/O wait time.
        It also doesn't show per-CPU stats. Collect should
        and I know Monitor does. The example program cpustat
        will also show per-CPU stats.

        Check memory usage and paging. Page-ins aren't necessarily
        a good sign of paging activity, since program start-up
        also has page-ins counted for it. Page-outs are a good
        place to look for paging activity. Lots of page-ins and
        other fast page faults are often a sign of programs
        quickly creating new processes, which also exit quickly.
        Fork and exec are fast, but not so fast that they can be
        assumed to take no time. Such loads will also have high
        system call rates. Use Monitor or vmstat to look at fork
        rates. For vmstat you have to take successive samples
        to get a rate.
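        One way to turn that into a number (a sketch, assuming a
        BSD-style "vmstat -f" cumulative fork report is available;
        if your vmstat lacks it, Monitor shows the rate directly):

            # two cumulative fork counts, 60 seconds apart
            before=`vmstat -f | awk '/forks/ {print $1; exit}'`
            sleep 60
            after=`vmstat -f | awk '/forks/ {print $1; exit}'`
            echo "forks per minute: `expr $after - $before`"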
        
I wrote back:

        I don't see anything abnormal in my values.... page-ins are fine, with
no page-outs, swap is not being allocated, nor do I see I/O waits in
collect; the average service time for the disk I/O is less than 50 ms and I
find that the average wait time is even lower. I see that the disk I/O can
reach as high as 25-40 MB/sec.... I just can't find out why, when I start an
index generation, the old production system's throughput is 1.2 MB/sec
while on the new system it is 200-300 KB/sec with the TPS about the same.
Both use AdvFS, and both are set to the defaults for the volumes.... the old
system is running version 3 and the new version 4.

Alan responded:
        
        The first thing I learned when I started working at
        Digital customer support center was "when something
        stops working correctly, look for what changed". One
        significant difference in the new and old system is
        the AdvFS domain version. Is that difference also
        reflected in the operating system version; V4 for the
        old system and V5 for the new? Or V5.1 for one and
        V5.1A for the other? Somewhere among all the things that
        are different between the two systems is where the
        problem lies.

        Why is another matter and may simply be the manifestation
        of a bug in the supporting kernel code. One odd thing about
        what you described is that the data rate is significantly
        different, but the request rate is about the same. That
        just says that the requests are smaller. Is the *work*
        being done in the same amount of time? Taking longer?
        Taking less time?

I responded to everyone:

        I am attempting to do the same action in Oracle, indexing the same
table, on two very similar machines. The old machine performs it 2x to 3x
faster. The differences between the systems are the operating systems (the
old system is running 4.0F, the new 5.1 with the new AdvFS) and the disk
layout (the old system is configured on an HSZ70 and the new on an HSG80
SAN; also, the disk layout on the old system was planned by the contractors,
while the new system was configured by Compaq). There may be differences
elsewhere, but I haven't found them. We have also run select statements on
our new GS-160 (8 1-GHz processors) and compared it to the old ES40 (4
667-MHz processors), and found that the old system can come back with a
response in 3 seconds where the new system takes 24 seconds. I know that
most people are going to jump at the Oracle parameters, but according to
the DBA they are running the exact same parameters. We even have Oracle
representatives working here.

From: Gavin Kreuiter [gavin@dhsolutions.co.za]

My knowledge is also a bit rusty, but as far as I remember, a load average
<= n (for a system with n processors) means that, on average, the CPU run
queue is NOT a bottleneck. But remember that this is an average, so it is
possible that in a one minute interval, 60 processes/threads were ready to
run in the first second, and none for the remaining 59 seconds, still giving
an average run queue length of 1.0. I also believe that the run queue
comprises processes "ready to run" (NOT blocked on I/O, semaphores, etc),
while, at the same time, a process MUST be placed ON the queue (hence
contributing to the statistics) before it can actually run.

Do not try to use this value as an absolute. It is an average, and is only
meant to supply an indication of whether a system has a potential CPU
bottleneck; reading too much into the actual values is really not advisable.

A better measure of the CPU bottleneck is the idle time (vmstat et al.) If
%idle is close to zero over an extended period, it is a sure indicator of a
CPU bottleneck. top and iostat (amongst others) provide a useful %WIO
value, which counts processes that *would* be on the run queue but are
waiting for completion of I/O. But even this can be misleading. Observation during
a standalone backup will typically show %WIO > 90%, because of the relative
speeds of disk reads v. tape writes.
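
A sketch of watching both figures over an interval (both commands take an
interval in seconds and a sample count):

    vmstat 5 120   # user/system/idle every 5 seconds for 10 minutes
    iostat 5 120   # per-device transfer rates over the same window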

From: Nemholt, Jesper Frank [JesperFrank.Nemholt@hp.com]

The run queue/load average in Tru64 is the same whether you get it from
uptime or from collect. The AVG5, AVG30 & AVG60 fields in collect are
averaged values of the RUNQ field.

The explanation from Alan is right on. If you have a 4-CPU machine with a
run queue between 1 and 3, then you usually have fewer jobs in the run queue
than the machine is capable of processing concurrently (4).
If you have a run queue of 8 on the same machine, then you have a problem.
I have several machines with run queues nearing 100 and only 8-16 CPUs, so I
have a bigger problem ;-)

The optimum is a run queue close to (or less than) the number of CPUs
in the machine, but as Alan describes, there are lots of other factors
that come into play.

To find the source of the performance problem, you now know that it isn't
the run queue. The next thing is to see whether the CPUs that are busy at
any given time are 100% (or close to it) busy or only lightly loaded, to
tell whether the bottleneck is a few (< 4) very CPU-hungry jobs or whether
the problem is elsewhere; see the sketch below.
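
For the per-CPU view, something like this (the -sc CPU subsystem flag is an
assumption, by analogy with the -sm memory flag used below; cpustat is the
example program mentioned earlier in the thread):

    collect -sc -i 5    # per-CPU utilization, sampled every 5 seconds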

If the CPUs aren't utilized fully, you'll have to start looking at swap (is
the machine swapping? If it is in eager swap mode, you may want to go to
lazy swap to prevent preallocation). Look at context switching and the
parameter that controls it, round-robin-switch-rate (see man
sys_attrs_proc); it is usually only a problem if you have a high run queue,
so the fight for CPU cycles is more frequent. Look at system time and wait
time (is the machine spending a lot of time waiting for devices, i.e.
storage?).
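
A few quick checks along those lines (the /sbin/swapdefault convention is
an assumption from the standard docs: if the file exists the system
preallocates swap in eager/immediate mode; verify before changing anything):

    swapon -s                # swap partitions and how much is in use
    ls -l /sbin/swapdefault  # present = eager mode, absent = lazy mode
    /sbin/sysconfig -q proc round-robin-switch-rate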
If you run your storage without write-back cache, it can severely slow down
performance (but remember to only use write-back if you have battery-backed
cache and a UPS).
You can also tune the storage by setting the cache flush timer a bit higher
than the default, and by setting the maximum transfer size higher (e.g.
1024; the default is 256 as far as I remember).
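
On an HSG80 that tuning happens at the controller CLI, not in the shell.
Something along these lines, with the caveat that the parameter names are
from memory of the ACS CLI and D101 is a placeholder unit; check your
controller documentation first:

    SHOW THIS_CONTROLLER
    SET THIS_CONTROLLER CACHE_FLUSH_TIMER=10
    SET D101 MAXIMUM_CACHED_TRANSFER=1024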
Look at the UBC cache settings (/sbin/sysconfig -q vm, at the top of the
output). A server running Oracle is better off with a reduced UBC and an
enlarged SGA (the Oracle cache). By default, ubc-maxpercent is 100; try
reducing it to 60-80%.
You can see how the memory is assigned with collect -sm. Check for
page-in/out activity.
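
For example (sysconfig -r changes the value at runtime; put the final value
in /etc/sysconfigtab to make it persistent across reboots):

    /sbin/sysconfig -q vm ubc-maxpercent     # current ceiling
    /sbin/sysconfig -r vm ubc-maxpercent=70  # sketch: try 60-80
    collect -sm                              # watch the memory split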
Check that db_block_size is 8192, or a multiple of it, in init.ora. If
it's smaller, it may be a problem (which will often show up as high
system time and bad Oracle performance).
Ask the DBA for Oracle performance statistics on access times to the
datafiles, so you can see whether the problem in Oracle is related to slow
disk performance.
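
Both checks can be made straight from sqlplus against the stock dynamic
views (the "/ as sysdba" login is a sketch; older installs may connect
differently):

sqlplus -s "/ as sysdba" <<'EOF'
SELECT value FROM v$parameter WHERE name = 'db_block_size';
SELECT d.name, f.phyrds, f.phywrts, f.readtim, f.writetim
  FROM v$filestat f, v$datafile d
 WHERE f.file# = d.file#;
EOF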

You run Tru64 V5.1. Were the filesystems created under V5.1, or is it a box
migrated from V4.x?
If migrated, have you converted the filesystems to the new version of AdvFS?
(The old version performs poorly under V5.)
Is Oracle using Direct I/O?
Is it a cluster, or just one machine?
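
The AdvFS question can be answered from the shell; showfdmn reports the
on-disk domain version (3 is the old format, 4 the V5-native one), and
your_domain below is a placeholder:

    ls /etc/fdmns             # one subdirectory per AdvFS domain
    showfdmn your_domain      # look at the Version field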

From: Kevin Raubenolt [raubenol@ohio.edu]

Over the last few weeks we had Oracle performance issues that were linked
to I/O. Oracle 8.1.6 (I believe) running on TruCluster 5.1A had Direct I/O
issues that were resolved by an Oracle patch and by disabling Direct I/O on
Oracle's end. I don't believe Oracle has the same issues when running on
5.1, but I thought I'd point it out to you. Worth looking into.

From: Bob Vickers [bobv@cs.rhul.ac.uk]

One minor correction: I'm pretty sure (from observation) that the load
average includes running jobs as well as queued jobs. So if your system
contains one CPU-bound job plus a bunch of I/O bound daemons etc your
load average will be just over 1.

If you divide the load average by the number of CPUs you get a measure of
the demand per CPU.

Here is a real-life example from 'top' on a 3-CPU system:

load averages: 1.35, 1.24, 1.19 11:02:48
404 processes: 2 running, 70 sleeping, 319 idle, 7 stopped, 6 zombie
CPU states: 34.0% user, 0.0% nice, 8.0% system, 57.9% idle
Memory: Real: 812M/1501M act/tot Virtual: 206M/7050M use/tot Free: 263M

  PID USERNAME PRI NICE SIZE RES STATE TIME CPU COMMAND
20739 hugh 44 0 93M 59M run 582:28 100.00% matlab
...
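
A quick way to compute that per-CPU figure from the shell (a sketch;
psrinfo prints one line per processor on Tru64):

    load=`uptime | sed 's/.*load average[s]*: *//' | cut -d, -f1`
    ncpu=`psrinfo | wc -l`
    percpu=`echo "$load / $ncpu" | bc -l`
    echo "demand per CPU: $percpu"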

To all others, thanks...

Lee Brewer


