From: mrbean@mira.net
Date: Tue May 03 2005 - 18:54:56 EDT
Greetings,

Thanks for the responses to my previous email regarding
some "mysterious" processes hogging a couple of CPUs on a V880.

mpstat output at the time:
CPU minf mjf xcal intr ithr csw icsw migr   smtx srw syscl usr sys wt idl
  0    0   0    1   12   10  10    0    0      0   0    17   0   0  0 100
  1    0   0    8    4    1   9    0    0      0   0     3   0   0  0 100
  2    0   0    1    4    1  77    0    0      0   0     6   0   0  0 100
  3    5   0   15    7    4  25    0    0      1   0    16   0   0  0 100
  4    0   0    1    4    1   0    0    0 130290   0     0   0 100  0   0
  5    0   0    1    4    1   2    0    0 130290   0     0   0 100  0   0
  6    0   0    1    4    1  10    0    0      0   0    15   0   0  0 100
  7    0   0    2  222  120   5    0    0      0   0     4   0   0  0 100
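
The pattern in the mpstat output above (CPUs 4 and 5 pinned at 100% sys with an smtx count of 130290) could be picked out automatically with something like the following minimal Python sketch. The function name `flag_spinning_cpus` and both thresholds are assumptions chosen for illustration, not anything defined by mpstat; the sample rows are a subset of the output above.

```python
# Sketch: flag CPUs that look stuck spinning on mutexes in mpstat output.
# The thresholds below are illustrative assumptions, not mpstat defaults.

SAMPLE = """\
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 0 0 1 12 10 10 0 0 0 0 17 0 0 0 100
3 5 0 15 7 4 25 0 0 1 0 16 0 0 0 100
4 0 0 1 4 1 0 0 0 130290 0 0 0 100 0 0
5 0 0 1 4 1 2 0 0 130290 0 0 0 100 0 0
7 0 0 2 222 120 5 0 0 0 0 4 0 0 0 100
"""


def flag_spinning_cpus(mpstat_text, smtx_min=1000, sys_min=90):
    """Return the ids of CPUs whose smtx and sys values both reach the thresholds."""
    rows = [line.split() for line in mpstat_text.splitlines() if line.strip()]
    header, data = rows[0], rows[1:]
    col = {name: idx for idx, name in enumerate(header)}
    flagged = []
    for row in data:
        # Skip repeated header lines (interval mode) and malformed rows.
        if len(row) != len(header) or not row[0].isdigit():
            continue
        if int(row[col["smtx"]]) >= smtx_min and int(row[col["sys"]]) >= sys_min:
            flagged.append(int(row[col["CPU"]]))
    return flagged


if __name__ == "__main__":
    print(flag_spinning_cpus(SAMPLE))
```

Looking columns up by header name rather than by fixed position keeps the sketch usable if a different mpstat version adds or reorders columns, provided `CPU`, `smtx` and `sys` are still present.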
Unfortunately I had bounced the server before I received the helpful
suggestions to run lockstat to try to identify the kernel process that may
be causing the problem. Note that the server is in a very secure environment
and was built from a corporate image, so the possibility of hacked binaries
can reasonably be discounted.

A week later the problem has returned... yay... I think :-)

Running 'lockstat -kgIW sleep 5' produced...
Profiling interrupt: 3879 events in 4.998 seconds (776 events/sec)
Count genr cuml rcnt nsec Hottest CPU+PIL Caller
-------------------------------------------------------------------------------
 4244 109% ---- 1.00  819 cpu[2]+11       disp_getwork
 3879 100% ---- 1.00  745 cpu[4]          lockstat_intr
 3879 100% ---- 1.00  745 cpu[4]          cyclic_fire
 3879 100% ---- 1.00  745 cpu[4]          cbe_level14
 3879 100% ---- 1.00  745 cpu[4]          current_thread
 3340  86% ---- 1.00  816 cpu[2]+11       idle
 1042  27% ---- 1.00  499 cpu[5]          ce_start
  971  25% ---- 1.00  529 cpu[4]          runservice
  968  25% ---- 1.00  529 cpu[4]          taskq_d_thread
  968  25% ---- 1.00  529 cpu[4]          stream_service
  948  24% ---- 1.00  526 cpu[4]          ce_wsrv
  387  10% ---- 1.00  606 cpu[4]          mutex_vector_enter
...
and running 'lockstat sleep 5' produced...
Adaptive mutex spin: 1214655 events in 4.995 seconds (243197 events/sec)
  Count indv cuml rcnt  spin Lock                  Caller
-------------------------------------------------------------------------------
1214655 100% 100% 1.00    47 0x30000467818         ce_start+0x294
-------------------------------------------------------------------------------
Adaptive mutex block: 4 events in 4.995 seconds (1 events/sec)
  Count indv cuml rcnt  nsec Lock                  Caller
-------------------------------------------------------------------------------
      4 100% 100% 1.00 14825 0x30000467818         ce_start+0x294
-------------------------------------------------------------------------------
Spin lock spin: 239 events in 4.995 seconds (48 events/sec)
  Count indv cuml rcnt  spin Lock                  Caller
-------------------------------------------------------------------------------
    137  57%  57% 1.00    19 cpu[2]+0x90           disp+0xa4
     61  26%  83% 1.00    43 cpu[3]+0x90           disp+0xa4
     11   5%  87% 1.00  1146 cpu[3]+0x90           disp_getbest+0x4
      9   4%  91% 1.00    19 cpu[0]+0x90           disp+0xa4
      8   3%  95% 1.00  1661 cpu[6]+0x90           disp_getbest+0x4
      4   2%  96% 1.00    85 cpu[5]+0x90           disp+0xa4
      4   2%  98% 1.00   752 cpu[6]+0x90           disp+0xa4
      3   1%  99% 1.00   924 cpu[2]+0x90           disp_getbest+0x4
      1   0% 100% 1.00    72 turnstile_table+0xc28 turnstile_lookup+0x4c
      1   0% 100% 1.00     1 cpu[7]+0x90           disp+0xa4
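
To make the "which lock and caller dominates" reading of the lockstat output above less of an eyeball exercise, here is a rough Python sketch that splits the default lockstat report into its sections and picks the single busiest row. The names `parse_lockstat` and `hottest` are hypothetical helpers written for this example, and the parsing assumes the seven-column row layout shown above (Count, indv, cuml, rcnt, spin or nsec, Lock, Caller); other lockstat options produce different layouts that this does not handle. The sample is an abbreviated copy of the output above.

```python
import re

# Matches section headers such as
# "Adaptive mutex spin: 1214655 events in 4.995 seconds (243197 events/sec)"
SECTION = re.compile(
    r"^(?P<kind>[^:]+): (?P<events>\d+) events in (?P<secs>[\d.]+) seconds"
)

SAMPLE = """\
Adaptive mutex spin: 1214655 events in 4.995 seconds (243197 events/sec)
Count indv cuml rcnt spin Lock Caller
-------------------------------------------------------------------------------
1214655 100% 100% 1.00 47 0x30000467818 ce_start+0x294
-------------------------------------------------------------------------------
Adaptive mutex block: 4 events in 4.995 seconds (1 events/sec)
Count indv cuml rcnt nsec Lock Caller
-------------------------------------------------------------------------------
4 100% 100% 1.00 14825 0x30000467818 ce_start+0x294
-------------------------------------------------------------------------------
Spin lock spin: 239 events in 4.995 seconds (48 events/sec)
Count indv cuml rcnt spin Lock Caller
-------------------------------------------------------------------------------
137 57% 57% 1.00 19 cpu[2]+0x90 disp+0xa4
61 26% 83% 1.00 43 cpu[3]+0x90 disp+0xa4
"""


def parse_lockstat(text):
    """Map each section name to a list of (count, lock, caller) tuples."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = SECTION.match(line.strip())
        if match:
            current = sections.setdefault(match.group("kind"), [])
            continue
        fields = line.split()
        # Data rows have seven fields and start with the event count;
        # column headers and separator lines fail this check.
        if current is not None and len(fields) == 7 and fields[0].isdigit():
            current.append((int(fields[0]), fields[5], fields[6]))
    return sections


def hottest(sections):
    """Return (kind, count, lock, caller) for the row with the most events."""
    best = None
    for kind, rows in sections.items():
        for count, lock, caller in rows:
            if best is None or count > best[1]:
                best = (kind, count, lock, caller)
    return best


if __name__ == "__main__":
    print(hottest(parse_lockstat(SAMPLE)))
```

Comparing rows across sections only uses the event count, which is the one column with the same meaning in every section; the spin and nsec columns are deliberately left out of the comparison because their units differ.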
All these references to ce_start suggest(?) that this is related to the ce
network interface, which is currently in a FAILED state in a multipathed
configuration. Every adaptive mutex spin and block event above is on the same
lock, taken at ce_start+0x294.

I note the installed ce driver dates from around Feb 2004 and there have been
a number of revisions since then, so it may be prudent to look at updates in
this area.

Any other thoughts on this would be much appreciated.
Cheers
Neill Griffin