AW: System thrashing?

From: Sippel, Christian (Christian.Sippel@IZB.DE)
Date: Wed Apr 07 2004 - 10:12:27 EDT


Hi Bill,
here is the output. The system is connected to both an ESS and a FAStT.
hdisk0 and hdisk1 are local SCSI disks where the paging space resides, while the
other disks are actually paths to physical disks inside the ESS (hdisks
82, 83 and 84 are on the FAStT).

Thanks,
Christian

root@o00tsmoe1:> iostat 3 3|grep -v 0.0

tty: tin tout avg-cpu: % user % sys % idle % iowait

Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk0 25.2 234.3 50.8 213710777 190345288
hdisk1 11.8 122.1 29.6 20230462 190345288
hdisk88 1.0 25.2 3.8 33338807 10175990
hdisk89 1.0 24.8 3.5 33440042 9319915
hdisk90 0.9 25.0 3.5 33474972 9698960
hdisk91 1.0 25.2 3.8 33324832 10157183
hdisk92 1.0 24.8 3.5 33395810 9338181
hdisk93 0.9 25.0 3.5 33446956 9680556
hdisk94 1.0 25.3 3.8 33374202 10175999
hdisk95 1.0 24.8 3.5 33406766 9321041
hdisk96 0.9 25.0 3.5 33420664 9695520
hdisk97 1.0 25.2 3.8 33352517 10143644
hdisk98 1.0 24.8 3.5 33492370 9340231
hdisk100 0.9 25.3 3.8 33507581 10161538
hdisk101 1.0 24.7 3.5 33366625 9294456
hdisk102 1.0 25.0 3.5 33368232 9666411
hdisk103 0.9 25.3 3.8 33529710 10136950
hdisk104 1.0 24.7 3.5 33332714 9302321
hdisk105 1.0 25.0 3.5 33378992 9690272
hdisk106 0.9 25.3 3.8 33472950 10184587
hdisk107 1.0 24.7 3.5 33355460 9308425
hdisk108 1.0 25.0 3.5 33388196 9697411
hdisk109 0.9 25.3 3.8 33473738 10164647
hdisk110 1.0 24.7 3.5 33374408 9295716
hdisk111 1.0 25.0 3.5 33349102 9689849
hdisk112 0.9 25.3 3.8 33419543 10159626
hdisk113 1.0 24.7 3.5 33323744 9342108
hdisk114 1.0 25.0 3.5 33348171 9687638
hdisk115 0.9 25.3 3.8 33472455 10164420
hdisk116 1.0 24.7 3.5 33326354 9331462
hdisk117 1.0 24.9 3.5 33304304 9663781
hdisk118 0.9 25.3 3.8 33489063 10172664
hdisk119 1.0 24.7 3.5 33290885 9321929
hdisk120 1.0 25.0 3.5 33352919 9692592
hdisk121 0.9 25.3 3.8 33475297 10159592
hdisk122 1.0 24.8 3.5 33358303 9342305
hdisk123 1.0 25.0 3.5 33371585 9698566
hdisk124 0.9 25.3 3.8 33401412 10158392
hdisk125 0.9 24.8 3.5 33492782 9299330
hdisk126 0.9 25.0 3.5 33455440 9683888
hdisk127 0.9 25.3 3.8 33428529 10156701
hdisk128 0.9 24.8 3.5 33473364 9313009
hdisk129 0.9 25.0 3.5 33452794 9690324
hdisk130 0.9 25.3 3.8 33423479 10168230
hdisk131 0.9 24.8 3.5 33498866 9289546
hdisk132 0.9 25.0 3.5 33462207 9676797
hdisk133 0.9 25.3 3.8 33427422 10167885
hdisk134 0.9 24.8 3.5 33498704 9317467
hdisk135 0.9 25.0 3.5 33433278 9677542
hdisk82 21.4 458.1 32.5 46426038 743685272
hdisk83 19.2 719.4 33.4 311639923 929159032
hdisk84 20.5 1430.1 31.3 1889949659 576646168

tty: tin tout avg-cpu: % user % sys % idle % iowait

Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk0 76.7 510.4 127.9 1180 364
hdisk1 13.6 126.9 31.4 20 364
hdisk89 0.3 6.6 0.7 20 0
hdisk90 0.3 31.7 2.6 96 0
hdisk91 0.3 5.3 0.7 16 0
hdisk92 0.7 10.6 1.7 32 0
hdisk93 0.3 25.1 3.6 76 0
hdisk94 1.0 25.1 2.0 76 0
hdisk95 1.3 10.6 2.3 32 0
hdisk96 0.3 13.2 2.0 40 0
hdisk99 0.7 9.3 1.7 28 0
hdisk102 0.3 18.5 1.7 56 0
hdisk103 0.7 18.5 1.7 56 0
hdisk104 0.7 18.5 1.7 56 0
hdisk105 0.7 29.1 3.0 88 0
hdisk107 1.0 7.9 1.3 24 0
hdisk108 1.0 13.2 1.7 40 0
hdisk111 0.7 27.8 2.6 84 0
hdisk112 0.3 13.2 1.3 40 0
hdisk113 1.0 13.2 1.7 40 0
hdisk114 0.7 27.8 2.3 84 0
hdisk116 0.3 22.5 2.3 68 0
hdisk118 0.3 30.4 2.6 92 0
hdisk119 0.3 31.7 3.0 96 0
hdisk121 0.3 35.7 3.0 108 0
hdisk122 0.3 15.9 2.0 48 0
hdisk123 0.7 14.5 2.6 44 0
hdisk125 0.7 23.8 3.0 72 0
hdisk126 0.3 18.5 2.3 56 0
hdisk129 0.3 6.6 1.7 20 0
hdisk132 0.3 10.6 1.0 32 0
hdisk133 0.3 18.5 1.7 56 0
hdisk134 0.7 26.4 2.3 80 0
hdisk135 0.3 7.9 1.3 24 0
hdisk82 0.7 253.9 1.0 0 768
hdisk83 0.7 253.9 1.0 0 768

tty: tin tout avg-cpu: % user % sys % idle % iowait

Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk0 61.5 470.7 117.0 856 568
hdisk1 16.5 189.1 46.9 4 568
hdisk88 1.0 10.6 1.7 32 0
hdisk91 1.0 18.5 2.6 56 0
hdisk93 0.3 9.3 2.0 28 0
hdisk94 0.7 17.2 2.3 52 0
hdisk95 2.6 23.8 3.3 72 0
hdisk96 1.0 14.5 2.3 44 0
hdisk97 0.3 13.2 2.3 40 0
hdisk99 0.7 13.2 1.0 40 0
hdisk100 0.3 6.6 1.7 20 0
hdisk102 0.3 18.5 3.3 56 0
hdisk103 1.0 11.9 2.0 36 0
hdisk104 1.0 25.1 3.0 76 0
hdisk106 0.7 26.4 3.0 80 0
hdisk107 1.3 15.9 2.3 48 0
hdisk108 1.3 18.5 2.3 56 0
hdisk110 0.7 15.9 1.7 48 0
hdisk112 1.0 15.9 2.3 48 0
hdisk113 0.3 14.5 2.3 44 0
hdisk114 1.7 14.5 2.6 44 0
hdisk115 0.7 14.5 1.3 44 0
hdisk116 0.7 11.9 2.3 36 0
hdisk117 0.7 15.9 2.0 48 0
hdisk118 0.7 11.9 1.3 36 0
hdisk120 1.3 14.5 2.3 44 0
hdisk121 0.3 19.8 1.3 60 0
hdisk122 0.7 10.6 1.7 32 0
hdisk123 1.0 15.9 1.7 48 0
hdisk124 0.3 6.6 1.0 20 0
hdisk125 0.7 23.8 2.3 72 0
hdisk126 0.7 35.7 3.3 108 0
hdisk127 0.7 19.8 3.3 60 0
hdisk128 1.0 11.9 1.7 36 0
hdisk130 0.3 13.2 2.3 40 0
hdisk131 0.3 17.2 3.0 52 0
hdisk132 0.7 10.6 1.3 32 0
hdisk133 0.3 30.4 2.0 92 0
hdisk134 1.0 6.6 1.7 20 0
hdisk135 0.3 4.0 1.0 12 0
hdisk82 0.7 253.9 1.0 0 768
hdisk83 0.3 169.3 0.7 0 512
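
To make the per-disk picture easier to read, the rows can be ranked by % tm_act. The following is only a minimal sketch: the helper names parse_iostat_disks and busiest are made up for illustration, and the sample rows are copied from the second interval above. In both interval reports hdisk0, one of the two disks holding the paging space, is by far the busiest.

```python
# Sample rows copied from the second iostat interval above.
SAMPLE = """\
hdisk0 76.7 510.4 127.9 1180 364
hdisk1 13.6 126.9 31.4 20 364
hdisk89 0.3 6.6 0.7 20 0
hdisk82 0.7 253.9 1.0 0 768
"""


def parse_iostat_disks(text):
    """Return one dict per line that looks like an iostat disk row.

    Hypothetical helper: it assumes six whitespace-separated fields
    (disk, % tm_act, Kbps, tps, Kb_read, Kb_wrtn) and skips anything else.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 6 or not parts[0].startswith("hdisk"):
            continue
        rows.append({
            "disk": parts[0],
            "tm_act": float(parts[1]),
            "kbps": float(parts[2]),
            "tps": float(parts[3]),
            "kb_read": int(parts[4]),
            "kb_wrtn": int(parts[5]),
        })
    return rows


def busiest(rows, n=2):
    """Return the n rows with the highest % tm_act, busiest first."""
    return sorted(rows, key=lambda r: r["tm_act"], reverse=True)[:n]


if __name__ == "__main__":
    for row in busiest(parse_iostat_disks(SAMPLE)):
        print(row["disk"], row["tm_act"], row["tps"])
```

Note that the first report of iostat shows totals since boot, so only the second and third reports reflect the current load.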

-----Original Message-----
From: Bill Verzal [mailto:BVerzal@KOMATSUNA.COM]
Sent: Wednesday, April 7, 2004 15:48
To: aix-l@Princeton.EDU
Subject: Re: AW: System thrashing?

Oops - that was supposed to be "iostat 3 3|grep -v 0.0"

My bad..

BV
--------------------------------------------------------

"If everything is coming your way, then you are in the wrong lane"

Bill Verzal
AIX Administrator, Komatsu America
(847) 970-3726 - direct
(847) 970-4184 - fax

From: "Sippel, Christian" <Christian.Sippel@IZB.DE>
Sent by: IBM AIX Discussion List <aix-l@Princeton.EDU>
Date: 04/07/2004 08:31 AM
Please respond to: IBM AIX Discussion List <aix-l@Princeton.EDU>
To: aix-l@Princeton.EDU
cc:
Subject: AW: System thrashing?

Hi Bill,
thanks for the answer, here is the output of vmstat 3 3|grep -v 0.0

kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
 r b avm fre re pi po fr sr cy in sy cs us sy id wa
 2 1 473321 326 0 25 25 271 985 0 1499 12363 13990 5 14 58 24
 1 1 473329 287 0 32 96 251 383 0 976 3903 2333 5 2 51 43
 0 1 473329 366 0 41 98 251 358 0 958 3759 2109 4 2 48 45

What is the purpose of "grep -v 0.0"?
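
As a rough illustration of what the filter does: grep treats "0.0" as a regular expression in which the dot matches any single character, and -v inverts the match, so every line containing a 0, any character, and another 0 is dropped. Presumably the intent is to hide idle rows. The sketch below mimics that behaviour; grep_v is a made-up helper name, and the second and third sample rows are hypothetical, not taken from the real output.

```python
import re

# Same pattern as in the grep command: the dot is NOT escaped,
# so it matches any single character, not only a literal ".".
PATTERN = re.compile(r"0.0")


def grep_v(lines, pattern=PATTERN):
    """Keep only the lines that do not match, like grep -v."""
    return [line for line in lines if not pattern.search(line)]


sample = [
    "hdisk0 76.7 510.4 127.9 1180 364",  # busy disk, kept
    "hdisk97 0.0 0.0 0.0 0 0",           # hypothetical idle disk, dropped
    "hdisk99 5.0 100.0 12.5 400 0",      # hypothetical busy disk, also dropped
]

if __name__ == "__main__":
    for line in grep_v(sample):
        print(line)
```

The third row shows the side effect: a busy disk is filtered out too, simply because "100.0" happens to contain "0.0". The same applies to any header or summary line that contains such a sequence.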

TIA
Christian
-----Original Message-----
From: Bill Verzal [mailto:BVerzal@KOMATSUNA.COM]
Sent: Wednesday, April 7, 2004 14:53
To: aix-l@Princeton.EDU
Subject: Re: System thrashing?

It depends on what the I/O waits are from.

Post output of 'vmstat 3 3|grep -v 0.0'
--------------------------------------------------------

"If everything is coming your way, then you are in the wrong lane"

Bill Verzal
AIX Administrator, Komatsu America
(847) 970-3726 - direct
(847) 970-4184 - fax

From: "Sippel, Christian" <Christian.Sippel@IZB.DE>
Sent by: IBM AIX Discussion List <aix-l@Princeton.EDU>
Date: 04/07/2004 03:33 AM
Please respond to: IBM AIX Discussion List <aix-l@Princeton.EDU>
To: aix-l@Princeton.EDU
cc:
Subject: System thrashing?

Dear List,

I have a p630 with 2 GB of RAM and 2 GB of paging space mirrored on hdisk0
and hdisk1. When looking at nmon I can see that the system has constantly high
I/O waits (30 to 60 % of the CPU cycles), even when there is very little
activity on the system. As a result the system is performing rather badly. The
usage of paging space is about 1.5 GB. The machine is used as a Tivoli Storage
Management Server, and among the dsmserv processes there is one that has a
size of 1140340 KB in memory.

root@o00tsmoe1:> ps aux | more
USER PID %CPU %MEM SZ RSS TTY STAT STIME TIME COMMAND
root 647368 12.1 39.0 1140340 823928 - A Mar 18 6861:59 dsmserv -o /tsm

These are the values shown by vmstat:

root@o00tsmoe1:> vmstat 5 20
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
 r b avm fre re pi po fr sr cy in sy cs us sy id wa
 2 1 470691 398 0 25 25 271 989 0 1501 12449 14101 5 14 58 24
 3 1 470697 816 0 54 0 301 518 0 1103 5947 4897 10 9 42 40
 0 1 470697 332 0 64 0 150 316 0 1130 6254 5291 11 10 42 38
 1 1 470697 219 0 69 0 302 744 0 1179 7027 5593 14 10 41 35
 3 1 470697 358 0 57 31 302 5552 0 1178 6564 5311 13 10 41 36
 3 1 470697 467 0 84 2 301 2601 0 1170 6663 5531 11 10 41 39
 4 1 470697 699 0 37 0 376 1727 0 1151 7167 5667 16 9 39 35
 3 1 470697 257 0 35 2 151 1372 0 1139 6526 5367 11 9 43 37
 1 1 470697 602 0 55 0 301 1519 0 1091 6029 4869 10 9 41 39
 0 1 470697 703 0 30 9 226 1691 0 1068 5879 4647 11 9 42 38
 0 1 470697 473 0 38 2 150 1589 0 1055 5568 4475 10 7 42 40
 1 1 470697 237 0 24 17 150 1529 0 1060 5712 4509 8 8 45 39
 3 1 470697 399 0 20 24 301 5430 0 1084 6067 4727 12 8 43 37
 1 1 470697 254 0 18 6 226 4073 0 1233 7163 5919 13 10 42 36
 1 1 470697 140 0 25 12 376 2204 0 1266 7556 5729 16 9 43 32
 2 1 470697 695 0 47 13 376 2337 0 1209 6684 5423 13 10 40 37
 1 1 470697 397 0 25 21 226 566 0 1282 7232 5839 13 10 44 34
 2 1 470697 370 0 27 5 226 727 0 1198 6820 5567 13 9 40 39
 1 1 470697 484 0 45 6 302 1215 0 1238 7103 5847 12 13 42 33
 0 1 470697 478 0 24 11 302 1090 0 1293 7729 6258 15 11 39 35

Is the system currently thrashing? Aren't the values of "po" too low for
thrashing? Would it help to put another 2 GB of RAM into the box to lower
the I/O waits and improve system performance?
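
For reference, the arithmetic behind that question can be sketched as follows. This assumes that vmstat reports avm and fre in 4 KB pages, which is the usual unit on AIX; the helper names are made up, and the rows are the first five lines of the vmstat 5 20 output above.

```python
PAGE_KB = 4  # assumption: avm and fre are counted in 4 KB pages

# (avm, fre, pi, po, wa) taken from the first five rows of "vmstat 5 20" above
ROWS = [
    (470691, 398, 25, 25, 24),
    (470697, 816, 54, 0, 40),
    (470697, 332, 64, 0, 38),
    (470697, 219, 69, 0, 35),
    (470697, 358, 57, 31, 36),
]


def pages_to_mb(pages):
    """Convert a page count to megabytes, assuming PAGE_KB per page."""
    return pages * PAGE_KB / 1024


def mean(values):
    """Plain arithmetic mean of a non-empty sequence."""
    values = list(values)
    return sum(values) / len(values)


if __name__ == "__main__":
    avm, fre = ROWS[1][0], ROWS[1][1]
    print("active virtual memory: %.1f MB" % pages_to_mb(avm))
    print("free list:             %.1f MB" % pages_to_mb(fre))
    print("mean pi: %.1f pages/s" % mean(r[2] for r in ROWS))
    print("mean po: %.1f pages/s" % mean(r[3] for r in ROWS))
    print("mean wa: %.1f %%" % mean(r[4] for r in ROWS))
```

Under that assumption avm corresponds to roughly 1.8 GB of active virtual memory on a box with 2 GB of RAM, and the free list holds only a few MB. Page-ins from paging space (pi) are steady in every sample, while page-outs (po) are lower and intermittent.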

Thanks in advance,
Christian


