Summary: IO-Wait of 99% - how to diagnose

From: extern.Tobias.Kronwitter@AUDI.DE
Date: Thu Dec 23 2004 - 11:08:39 EST


Hello all,

thank you for the overwhelming help:

Kanellopoulos Angelos
Jeremy Ahl
Peter (ITServ GmbH)
Imrick Michael
Beth Dodge
Alvin Gunkel
Clive McAdam
Terry Franklin
Murugesan K
Mossey Fahey
Victor Engle
Rebstock, Roland

The broad consensus is that we most likely have a "display problem".
The I/O is not really high, nor is the server slow, which is what would
indicate a genuinely high load.

After a reboot, however, the iowait figure returned to normal values. Up to
now (I held back this summary to see) we have had no more high iowait readings.
In case we experience the problem again, we will install the bug fix:

        --------------------
        Tobias,

        Sun introduced a bug in kernel patch 108528-28. I thought the bug was
        fixed in -29 but from your stats it appears not to have been. You may
        try installing 117000-05 which seems to be the latest kernel patch.

        Here is a link to the bug description on sunsolve and a link to the
        new kernel patch,

        http://sunsolve.sun.com/search/document.do?assetkey=urn:cds:docid:1-1-4978228-1
        http://sunsolve.sun.com/pub-cgi/pdownload.pl?target=117000-05&method=h

        Vic
        ------------------

If so, I will post a second summary.
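
For anyone who wants to check a machine against the revisions Vic mentions, here is a minimal sketch. It only parses a version string of the form shown in the uname output of the original question; the function names `kernel_patch` and `patch_status` are made up for illustration, and the sketch deliberately says nothing about revisions that the message does not name.

```shell
#!/bin/sh
# kernel_patch and patch_status are hypothetical helper names.

# Split a version string such as "Generic_108528-29" into "ID REVISION".
kernel_patch() {
    echo "$1" | sed -n 's/^Generic_\([0-9]*\)-\([0-9]*\)$/\1 \2/p'
}

# Revisions 28 and 29 of 108528 are the only ones named in Vic's message;
# nothing is assumed about any other revision.
patch_status() {
    case "$(kernel_patch "$1")" in
        "108528 28"|"108528 29") echo "suspect" ;;
        "")                      echo "unrecognized" ;;
        *)                       echo "not named in the report" ;;
    esac
}

# Version string taken from the uname output in the original question.
patch_status "Generic_108528-29"
```

On a live system the argument would presumably be `"$(uname -v)"`, and `showrev -p` can be used to see whether 117000-05 is already installed.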

Thank you
Season's Greetings to all of you

Regards, Tobias

Original question:

Hello all,

On a Solaris 8 / Sun Fire V440 (SunOS iuaw740 5.8 Generic_108528-29 sun4u
sparc SUNW,Sun-Fire-V440) we are experiencing a very high I/O wait problem.
This server is configured with Veritas VxVM 4.0 MP1 and has SAN disks
connected via an Emulex 9002 FCA.

top reports the following:

load averages: 0.02, 0.01, 0.02                                    11:02:12
82 processes: 81 sleeping, 1 on cpu
CPU states: 0.0% idle, 0.0% user, 0.5% kernel, 99.5% iowait, 0.0% swap
Memory: 8192M real, 6375M free, 469M swap in use, 22G swap free

   PID USERNAME THR PRI NICE  SIZE   RES STATE   TIME   CPU COMMAND
  5818 root       1  58    0 2344K 1440K cpu/0   0:00 0.05% top
   813 root       6  39    0 5240K 4440K sleep   4:56 0.04% picld
 29405 root       6  58    0 4728K 2816K sleep   0:05 0.01% elxdiscoveryd
 28964 root       1  48    0 2544K 2008K sleep   0:00 0.01% bash
  5813 root       1  38    0 6248K 2728K sleep   0:00 0.01% sshd
  1444 root      12  58    0 5368K 5080K sleep   0:31 0.00% mibiisa
 17990 dctm_run   3  58    0   40M   12M sleep   0:07 0.00% documentum
  5816 dctm_run   1  38    0 1392K 1144K sleep   0:00 0.00% sar
  5817 dctm_run   1  48    0 1456K 1128K sleep   0:00 0.00% sadc
  1471 root       1  58    0    0K    0K sleep   0:59 0.00% se.sparcv9.5.8
   980 root       5  58    0 4200K 2440K sleep   0:17 0.00% automountd
 10808 dctm_run   5  58    0   39M   22M sleep   0:09 0.00% documentum
    17 root       1  58    0   12M   10M sleep   0:07 0.00% vxconfigd
  3111 dctm_run   4  58    0 5672K 3728K sleep   0:07 0.00% dmdocbroker
 28215 dctm_run   1   2    0 1896K 1440K sleep   0:06 0.00% ksh

iostat doesn't indicate high disk I/O:

bash-2.03# iostat 5 15
    tty         sd0           sd1           sd2           sd3           cpu
 tin tout  kps tps serv  kps tps serv  kps tps serv  kps tps serv   us sy  wt id
   8   35    0   0    0  100   4    9  100   4   10    4   1    9    1  1  25 73
 125  451    0   0    0   14   4    7   14   4    6    0   0    0    0  1  99  0
   0   16    0   0    0    0   0    5    0   0    6    0   0    0    0  1  99  0
   0   16    0   0    0    9  18    3    9  18    3    0   1    4    0  1  99  0
   0   16    0   0    0   67  26   23   61  26   26    2   1    6    0  0 100  0
   0   16    0   0    0    0   0    0    0   0    0    0   0    0    0  0 100  0
   0   16    0   0    0    0   0    0    0   0    0    0   0    0    0  0 100  0
   0   16    0   0    0    0   0    0    0   0    0    0   0    0    0  0 100  0
   0   16    0   0    0    0   0    0    0   0    0    0   0    0    0  1  99  0
   0   16    0   0    0    4   8    3    4   8    3    0   0    4    0  1  98  0
  10   37    0   0    0    2   1    5    2   1    4    0   1    5    0  0 100  0
   0   16    0   0    0    4   3   23   21   5   19    0   0    0    0  0 100  0
  35  137    0   0    0    5   2    7    5   2    5    0   0    0    0  0 100  0
 126  421    0   0    0   14   4    5   14   4    6    0   0    0    0  1  99  0
   0   16    0   0    0    0   0    0    0   0    0    0   0    0    0  1  99  0
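
The mismatch in this output (wt near 100% while the disks move almost nothing) can be checked mechanically. The following is only a sketch: it assumes the four-disk `iostat` layout shown here, with 18 numeric fields per sample once the `id` column is on the same line. The function name and both thresholds are arbitrary choices for illustration.

```shell
#!/bin/sh
# flag_phantom_iowait is a hypothetical helper name.
# $1: minimum wt percentage to count as "high" (default 90, arbitrary)
# $2: maximum summed tps to count as "quiet disks" (default 100, arbitrary)
flag_phantom_iowait() {
    awk -v wt_min="${1:-90}" -v tps_max="${2:-100}" '
        NF == 18 && $1 ~ /^[0-9]+$/ {
            tps = $4 + $7 + $10 + $13   # tps columns of sd0..sd3
            wt  = $17                   # cpu wt column
            n++
            if (wt >= wt_min && tps <= tps_max) hits++
        }
        END {
            printf "%d of %d samples: wt >= %d%% with <= %d tps\n",
                   hits, n, wt_min, tps_max
        }
    '
}

# First five samples from the iostat run in this post.
flag_phantom_iowait <<'EOF'
8 35 0 0 0 100 4 9 100 4 10 4 1 9 1 1 25 73
125 451 0 0 0 14 4 7 14 4 6 0 0 0 0 1 99 0
0 16 0 0 0 0 0 5 0 0 6 0 0 0 0 1 99 0
0 16 0 0 0 9 18 3 9 18 3 0 1 4 0 1 99 0
0 16 0 0 0 67 26 23 61 26 26 2 1 6 0 0 100 0
EOF
```

Note that the first row `iostat` prints is an average since boot, which is why it looks different from the interval samples that follow.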

The SAN disks are not under load either:

iostat -xnp
                    extended device statistics
    r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c0t0d0
    1.5 2.7 50.7 50.5 0.0 0.0 0.0 8.8 0 2 c1t0d0
    0.2 0.0 0.1 0.0 0.0 0.0 0.0 0.1 0 0 c1t0d0s0
    0.0 0.1 0.0 0.4 0.0 0.0 0.0 6.2 0 0 c1t0d0s1
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0 0 c1t0d0s2
    1.2 2.7 50.5 50.0 0.0 0.0 0.0 9.7 0 2 c1t0d0s3
    0.2 0.0 0.2 0.0 0.0 0.0 0.0 0.9 0 0 c1t0d0s4
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c1t0d0s5
    1.5 2.7 51.4 50.0 0.0 0.0 0.0 10.1 0 2 c1t1d0
    0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c1t1d0s0
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c1t1d0s1
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0 0 c1t1d0s2
    1.3 2.7 51.3 50.0 0.0 0.0 0.0 10.6 0 2 c1t1d0s3
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 3.2 0 0 c1t1d0s4
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c1t1d0s5
    0.6 0.5 2.5 1.9 0.0 0.0 0.0 9.1 0 0 c1t2d0
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0 0 c1t2d0s2
    0.4 0.5 2.3 1.9 0.0 0.0 0.0 12.1 0 0 c1t2d0s3
    0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.2 0 0 c1t2d0s4
    0.4 0.5 1.5 1.9 0.0 0.0 0.0 9.9 0 0 c1t3d0
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c1t3d0s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 2.2 0 0 c1t3d0s3
    0.2 0.5 1.4 1.9 0.0 0.0 0.0 12.8 0 0 c1t3d0s4
    0.0 0.0 2.0 0.0 0.0 0.0 0.0 0.9 0 0 c3t30d0
    0.0 0.0 2.0 0.0 0.0 0.0 0.0 0.9 0 0 c3t30d0s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t30d0s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.7 0 0 c3t30d1
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.7 0 0 c3t30d1s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t30d1s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.6 0 0 c3t30d2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.6 0 0 c3t30d2s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t30d2s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.5 0 0 c3t30d3
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.5 0 0 c3t30d3s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t30d3s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0 0 c3t30d4
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0 0 c3t30d4s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t30d4s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0 0 c3t30d5
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0 0 c3t30d5s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t30d5s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.3 0 0 c3t30d6
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.3 0 0 c3t30d6s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t30d6s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.3 0 0 c3t30d7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.3 0 0 c3t30d7s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t30d7s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0 0 c3t30d8
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0 0 c3t30d8s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t30d8s7
    0.0 0.0 2.0 0.0 0.0 0.0 0.0 0.7 0 0 c3t70d0
    0.0 0.0 2.0 0.0 0.0 0.0 0.0 0.7 0 0 c3t70d0s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t70d0s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.6 0 0 c3t70d1
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.6 0 0 c3t70d1s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t70d1s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.5 0 0 c3t70d2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.5 0 0 c3t70d2s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t70d2s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.7 0 0 c3t70d3
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.7 0 0 c3t70d3s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t70d3s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.5 0 0 c3t70d4
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.5 0 0 c3t70d4s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t70d4s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0 0 c3t70d5
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0 0 c3t70d5s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t70d5s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0 0 c3t70d6
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0 0 c3t70d6s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t70d6s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0 0 c3t70d7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0 0 c3t70d7s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t70d7s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0 0 c3t70d8
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0 0 c3t70d8s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t70d8s7
    0.0 0.0 2.0 0.0 0.0 0.0 0.0 0.7 0 0 c4t31d0
    0.0 0.0 2.0 0.0 0.0 0.0 0.0 0.7 0 0 c4t31d0s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d0s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d1
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d1s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d1s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d2s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d2s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d3
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d3s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d3s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d4
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d4s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d4s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d5
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d5s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d5s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d6
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d6s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d6s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d7s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d7s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d8
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d8s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t31d8s7
    0.0 0.0 2.0 0.0 0.0 0.0 0.0 0.7 0 0 c4t71d0
    0.0 0.0 2.0 0.0 0.0 0.0 0.0 0.7 0 0 c4t71d0s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d0s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d1
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d1s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d1s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d2s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d2s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d3
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d3s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d3s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d4
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d4s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d4s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d5
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d5s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d5s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d6
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d6s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d6s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d7s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d7s7
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d8
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d8s2
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t71d8s7

        --> c3t30d0, c3t70d0 are the same LUN seen via two HBAs
            => one plex (mirror) of a volume
            c4t31d0, c4t71d0 are the same LUN seen via two HBAs
            => the other plex of the same volume

It looks like the disks aren't the problem.
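
With this many LUN paths it is easy to miss a busy device by eye, so a small filter can help. The sketch below assumes the 11-column `iostat -xn` layout shown above and looks only at whole-disk rows, since the slice rows repeat the same I/O; the function name is hypothetical.

```shell
#!/bin/sh
# busiest_disk is a hypothetical helper name.
# Prints "<device> <%b>" for the whole disk with the highest %b value.
busiest_disk() {
    awk '
        # Whole-disk rows only (cXtYdZ); slice rows end in sN and are skipped.
        NF == 11 && $11 ~ /^c[0-9]+t[0-9]+d[0-9]+$/ {
            if (!seen || $10 + 0 > max) { max = $10 + 0; dev = $11; seen = 1 }
        }
        END { if (seen) printf "%s %d\n", dev, max }
    '
}

# A few rows copied from the iostat -xnp output in this post.
busiest_disk <<'EOF'
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c0t0d0
1.5 2.7 50.7 50.5 0.0 0.0 0.0 8.8 0 2 c1t0d0
1.2 2.7 50.5 50.0 0.0 0.0 0.0 9.7 0 2 c1t0d0s3
0.6 0.5 2.5 1.9 0.0 0.0 0.0 9.1 0 0 c1t2d0
0.0 0.0 2.0 0.0 0.0 0.0 0.0 0.9 0 0 c3t30d0
EOF
```

One caveat: when `iostat` is run without an interval argument, as here, the figures are averages since boot, so a rerun with an interval (for example `iostat -xn 5`) would be needed to rule out a short burst.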

The network looks OK as well:

RAWIP
        rawipInDatagrams = 0 rawipInErrors = 0
        rawipInCksumErrs = 0 rawipOutDatagrams = 0
        rawipOutErrors = 0

UDP
        udpInDatagrams = 33591 udpInErrors = 0
        udpOutDatagrams = 33596 udpOutErrors = 0

TCP tcpRtoAlgorithm = 4 tcpRtoMin = 400
        tcpRtoMax = 60000 tcpMaxConn = -1
        tcpActiveOpens = 13806 tcpPassiveOpens = 14845
        tcpAttemptFails = 9 tcpEstabResets = 749
        tcpCurrEstab = 15 tcpOutSegs =13312404
        tcpOutDataSegs =11237898 tcpOutDataBytes =63660618
        tcpRetransSegs = 422 tcpRetransBytes =336018
        tcpOutAck =2067205 tcpOutAckDelayed =1853406
        tcpOutUrg = 0 tcpOutWinUpdate = 15
        tcpOutWinProbe = 13 tcpOutControl = 58063
        tcpOutRsts = 1520 tcpOutFastRetrans = 85
        tcpInSegs =11835531
        tcpInAckSegs =10236185 tcpInAckBytes =63671992
        tcpInDupAck = 39885 tcpInAckUnsent = 0
        tcpInInorderSegs =9700449 tcpInInorderBytes =1566751826
        tcpInUnorderSegs = 1 tcpInUnorderBytes = 551
        tcpInDupSegs = 64 tcpInDupBytes = 4171
        tcpInPartDupSegs = 0 tcpInPartDupBytes = 0
        tcpInPastWinSegs = 0 tcpInPastWinBytes = 0
        tcpInWinProbe = 0 tcpInWinUpdate = 3
        tcpInClosed = 184 tcpRttNoUpdate = 347
        tcpRttUpdate =10222115 tcpTimRetrans = 1649
        tcpTimRetransDrop = 5 tcpTimKeepalive = 181
        tcpTimKeepaliveProbe= 16 tcpTimKeepaliveDrop = 1
        tcpListenDrop = 0 tcpListenDropQ0 = 0
        tcpHalfOpenDrop = 0 tcpOutSackRetrans = 0

IPv4 ipForwarding = 2 ipDefaultTTL = 255
        ipInReceives =11593385 ipInHdrErrors = 0
        ipInAddrErrors = 0 ipInCksumErrs = 0
        ipForwDatagrams = 0 ipForwProhibits = 0
        ipInUnknownProtos = 0 ipInDiscards = 0
        ipInDelivers =11849241 ipOutRequests =13132669
        ipOutDiscards = 0 ipOutNoRoutes = 3
        ipReasmTimeout = 60 ipReasmReqds = 0
        ipReasmOKs = 0 ipReasmFails = 0
        ipReasmDuplicates = 0 ipReasmPartDups = 0
        ipFragOKs = 0 ipFragFails = 0
        ipFragCreates = 0 ipRoutingDiscards = 0
        tcpInErrs = 0 udpNoPorts = 4188
        udpInCksumErrs = 0 udpInOverflows = 0
        rawipInOverflows = 0 ipsecInSucceeded = 0
        ipsecInFailed = 0 ipInIPv6 = 0
        ipOutIPv6 = 0 ipOutSwitchIPv6 = 169
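
One way to put a number on "looks OK" is the share of retransmitted TCP segments. The sketch below is an approximation, not an official metric: it assumes counters in the `name = value` layout shown above (which matches `netstat -s`) and divides `tcpRetransSegs` by `tcpOutDataSegs`. The function name is hypothetical.

```shell
#!/bin/sh
# tcp_retrans_pct is a hypothetical helper name.
# Prints retransmitted segments as a percentage of data segments sent.
tcp_retrans_pct() {
    awk '
        {
            gsub(/=/, " = ")    # normalise "name =value" and "name= value"
            for (i = 1; i + 2 <= NF; i++)
                if ($(i + 1) == "=") v[$i] = $(i + 2)
        }
        END {
            if (v["tcpOutDataSegs"] > 0)
                printf "%.4f\n", 100 * v["tcpRetransSegs"] / v["tcpOutDataSegs"]
        }
    '
}

# The two relevant lines copied from the counters in this post.
tcp_retrans_pct <<'EOF'
        tcpOutDataSegs =11237898 tcpOutDataBytes =63660618
        tcpRetransSegs = 422 tcpRetransBytes =336018
EOF
```

For the counters in this post that comes to well under a hundredth of a percent, which is consistent with the network not being the bottleneck.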

What else could be the reason?
==============================

How could we diagnose this problem?
===================================

Thank you for your help.
Tobias


