High I/O wait times but little I/O

From: Mermell, Todd (Todd.Mermell@nordstrom.com)
Date: Thu Jan 15 2004 - 01:42:11 EST


Hello,

We have a 4500 with 12 CPUs and 12G of memory, running Solaris 8 (108528-15).
There is plenty of free memory (5G) and not much disk I/O, yet mpstat has
shown 0% idle on every CPU since I first noticed this earlier today.

See the mpstat output below (last sample from mpstat 1 10):

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    0   0   35   245  145   39    0    1    4    0   123    0   0 100   0
  1    0   0    0    18   18   20    0    2    2    0   106    0   0 100   0
  4    0   0  707     0    0   64    0    1    6    0   143    0   0 100   0
  5    0   0    0    10    0   12    9    0    0    0   217  100   0   0   0
  8  165   0    0     0    0   42    0   12    2    0   172    0   3  97   0
  9  173   0    0     1    0   39    0    7    7    0  1031    1   6  93   0
 10    0   0   43    92   92  105    0    3   20    0   259    0   0 100   0
 11    0   0    0     0    0   95    0    4    6    0    90    0   0 100   0
 12    0   0    0    89   89   78    0    3   17    0    88    0   0 100   0
 13    0   0    2     5    0  224    4    3    3    0   841   32   2  66   0
 14    0   0    0     4    0  212    3    2    3    0   824   35   0  65   0
 15    0   0    0     8    0    9    8    0    0    0   226  100   0   0   0

It does not matter how idle or busy the system is in terms of user- or
system-mode CPU usage: whatever is left over is reported as wait, so idle
shows 0 every time.
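
To make that pattern concrete, here is a minimal Python sketch that parses the
mpstat sample above and checks, per CPU, that usr + sys + wt accounts for the
full 100% with nothing left for idl. The function names (parse_mpstat,
summarize) are made up for illustration; the data is copied from the sample,
and the script only restates the numbers, it does not diagnose anything.

```python
# Sample copied from the mpstat output in this post (last interval of `mpstat 1 10`).
MPSTAT_SAMPLE = """\
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 0 0 35 245 145 39 0 1 4 0 123 0 0 100 0
1 0 0 0 18 18 20 0 2 2 0 106 0 0 100 0
4 0 0 707 0 0 64 0 1 6 0 143 0 0 100 0
5 0 0 0 10 0 12 9 0 0 0 217 100 0 0 0
8 165 0 0 0 0 42 0 12 2 0 172 0 3 97 0
9 173 0 0 1 0 39 0 7 7 0 1031 1 6 93 0
10 0 0 43 92 92 105 0 3 20 0 259 0 0 100 0
11 0 0 0 0 0 95 0 4 6 0 90 0 0 100 0
12 0 0 0 89 89 78 0 3 17 0 88 0 0 100 0
13 0 0 2 5 0 224 4 3 3 0 841 32 2 66 0
14 0 0 0 4 0 212 3 2 3 0 824 35 0 65 0
15 0 0 0 8 0 9 8 0 0 0 226 100 0 0 0
"""

PCT_COLUMNS = ("usr", "sys", "wt", "idl")


def parse_mpstat(text):
    """Return one dict per CPU row, keyed by the header column names."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, map(int, row))) for row in rows]


def summarize(rows):
    """Average the usr/sys/wt/idl percentages across all CPUs."""
    n = len(rows)
    return {col: sum(r[col] for r in rows) / n for col in PCT_COLUMNS}


rows = parse_mpstat(MPSTAT_SAMPLE)
summary = summarize(rows)

# CPUs whose idle column is 0 in this sample.
never_idle = [r["CPU"] for r in rows if r["idl"] == 0]

# Every row should account for exactly 100% across the four columns.
fully_accounted = all(sum(r[c] for c in PCT_COLUMNS) == 100 for r in rows)

print("average percentages:", summary)
print("CPUs reporting 0% idle:", never_idle)
print("usr+sys+wt+idl == 100 on every CPU:", fully_accounted)
```

Running the same parser over several consecutive samples would show whether
wt always absorbs exactly the share that would otherwise be idle.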

Here is the iostat output (iostat -xnz 1 10) for disks with activity:

                 extended device statistics
  r/s  w/s    kr/s  kw/s wait actv wsvc_t asvc_t %w  %b device
  8.0  0.0  4090.3   0.0  0.0  0.3    0.0   34.5  0  27 c3t23d10
 17.0  0.0  8691.8   0.0  0.0  0.7    0.0   42.0  0  60 c3t23d11
  5.0  0.0  2556.4   0.0  0.0  0.2    0.0   41.5  0  16 c3t23d16
  4.0  0.0  2045.1   0.0  0.0  0.1    0.0   34.6  0  14 c3t23d21
  4.0  0.0  2045.1   0.0  0.0  0.2    0.0   44.8  0  18 c3t23d24
  4.0  0.0  2045.0   0.0  0.0  0.2    0.0   55.8  0  19 c3t23d33
  4.0  0.0  2045.0   0.0  0.0  0.2    0.0   47.5  0  19 c3t23d35
  2.0  0.0  1022.5   0.0  0.0  0.1    0.0   37.8  0   8 c3t23d36
  4.0  0.0  2045.0   0.0  0.0  0.2    0.0   39.9  0  16 c3t23d55
  9.0  0.0  4601.2   0.0  0.0  0.5    0.0   60.7  0  45 c4t8d10
 15.0  0.0  7668.6   0.0  0.0  1.0    0.0   68.2  0  70 c4t8d11
  6.0  0.0  3067.4   0.0  0.0  0.4    0.0   62.9  0  29 c4t8d16
  4.0  1.0  2044.9   8.0  0.0  0.3    0.0   55.9  0  22 c4t8d21
  2.0  1.0  1022.4   8.0  0.0  0.3    0.0   94.8  0  17 c4t8d24
  3.0  0.0  1533.6   0.0  0.0  0.3    0.0  102.1  0  21 c4t8d33
  3.0  0.0  1533.6   0.0  0.0  0.2    0.0   73.5  0  17 c4t8d35
  2.0  0.0   854.7   0.0  0.0  0.1    0.0   40.6  0   8 c4t8d36
  4.0  0.0  2044.8   0.0  0.0  0.2    0.0   49.7  0  17 c4t8d55

This is on a SAN with plenty of bandwidth, and the IOPS are low compared with
what we have seen in the past.
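
As a rough cross-check of "not a lot of IOPS", the iostat sample above can be
totalled with a short Python sketch. The helper name parse_iostat is
hypothetical and the figures come from a single one-second sample, so the
result is only indicative of the load at that moment.

```python
# Sample copied from the iostat output in this post (`iostat -xnz 1 10`).
IOSTAT_SAMPLE = """\
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
8.0 0.0 4090.3 0.0 0.0 0.3 0.0 34.5 0 27 c3t23d10
17.0 0.0 8691.8 0.0 0.0 0.7 0.0 42.0 0 60 c3t23d11
5.0 0.0 2556.4 0.0 0.0 0.2 0.0 41.5 0 16 c3t23d16
4.0 0.0 2045.1 0.0 0.0 0.1 0.0 34.6 0 14 c3t23d21
4.0 0.0 2045.1 0.0 0.0 0.2 0.0 44.8 0 18 c3t23d24
4.0 0.0 2045.0 0.0 0.0 0.2 0.0 55.8 0 19 c3t23d33
4.0 0.0 2045.0 0.0 0.0 0.2 0.0 47.5 0 19 c3t23d35
2.0 0.0 1022.5 0.0 0.0 0.1 0.0 37.8 0 8 c3t23d36
4.0 0.0 2045.0 0.0 0.0 0.2 0.0 39.9 0 16 c3t23d55
9.0 0.0 4601.2 0.0 0.0 0.5 0.0 60.7 0 45 c4t8d10
15.0 0.0 7668.6 0.0 0.0 1.0 0.0 68.2 0 70 c4t8d11
6.0 0.0 3067.4 0.0 0.0 0.4 0.0 62.9 0 29 c4t8d16
4.0 1.0 2044.9 8.0 0.0 0.3 0.0 55.9 0 22 c4t8d21
2.0 1.0 1022.4 8.0 0.0 0.3 0.0 94.8 0 17 c4t8d24
3.0 0.0 1533.6 0.0 0.0 0.3 0.0 102.1 0 21 c4t8d33
3.0 0.0 1533.6 0.0 0.0 0.2 0.0 73.5 0 17 c4t8d35
2.0 0.0 854.7 0.0 0.0 0.1 0.0 40.6 0 8 c4t8d36
4.0 0.0 2044.8 0.0 0.0 0.2 0.0 49.7 0 17 c4t8d55
"""


def parse_iostat(text):
    """Return one dict per device; numeric columns as floats, device as str."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    parsed = []
    for row in rows:
        rec = dict(zip(header, row))
        for col in header[:-1]:  # every column except the trailing device name
            rec[col] = float(rec[col])
        parsed.append(rec)
    return parsed


devices = parse_iostat(IOSTAT_SAMPLE)

total_reads = sum(d["r/s"] for d in devices)    # read operations per second
total_writes = sum(d["w/s"] for d in devices)   # write operations per second
total_kr = sum(d["kr/s"] for d in devices)      # kilobytes read per second
avg_read_kb = total_kr / total_reads            # average size of one read
busiest = max(devices, key=lambda d: d["%b"])   # device with the highest %b

print(f"reads/s: {total_reads:.1f}  writes/s: {total_writes:.1f}")
print(f"read throughput: {total_kr / 1024:.1f} MB/s")
print(f"average read size: {avg_read_kb:.0f} KB")
print(f"busiest device: {busiest['device']} at {busiest['%b']:.0f}% busy")
```

The totals give the aggregate operation rate and the average read size for
this sample, which is easier to compare against earlier baselines than the
per-device rows.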

Apparently the same thing happened a month ago and the system was rebooted. It
would be nice to isolate the problem and get a better idea of what is causing
this behavior. Any ideas?

Thank you in advance,

Todd Mermell
Nordstrom IT Services
(206) 233-5416
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


