OT : Basic Critical O.S. Values that Trigger Problem Alert

Date: Mon Jul 29 2002 - 02:45:53 EDT


We are trying to put together a general document to send to customers, which
should let them know when their systems are performing far below normal.

At the operating system level, we are trying to identify practical critical
values which, once they cross their respective threshold limits, would raise
an alert about a potential problem.

We are looking for these in the areas of:

1) Network throughput
2) Memory utilization
3) Swap utilization
4) I/O utilization

We would appreciate the actual commands used (preferably ones that are
generic across different operating systems) and the respective critical
threshold values for the above.

EXAMPLE: For network throughput between the application server machine and
the database server machine, what, by experience, are the parameters and
their respective minimum threshold values which would tell us that there is
a severe problem there?

NOTE - We have generally been measuring this by manually ftping a big
file, about 100 MB, between the APP and DB server machines, noting the
throughput displayed (in kbytes/s) on completion, and converting this value
to megabits per second (Mbps). If this value is less than 40 Mbps on a
100 Mbps link, we know there is a problem with network bandwidth.
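
The conversion described in the note can be sketched in a few lines of
Python. This is only an illustration: the function names are made up for this
example, and the size of a "kbyte" as reported by a given ftp client is an
assumption (1024 bytes by default here, adjustable through a parameter). The
40 Mbps limit is the one quoted above for a 100 Mbps link.

```python
def kbytes_per_s_to_mbps(kbytes_per_s, bytes_per_kbyte=1024):
    """Convert an ftp-style kbytes/s figure to megabits per second.

    bytes_per_kbyte is an assumption: some clients mean 1000, others 1024.
    """
    bits_per_second = kbytes_per_s * bytes_per_kbyte * 8
    return bits_per_second / 1_000_000


def link_looks_slow(kbytes_per_s, threshold_mbps=40.0, bytes_per_kbyte=1024):
    """Return True when the measured throughput is below the threshold."""
    return kbytes_per_s_to_mbps(kbytes_per_s, bytes_per_kbyte) < threshold_mbps


if __name__ == "__main__":
    measured = 4000.0  # hypothetical kbytes/s figure read off the ftp client
    print(round(kbytes_per_s_to_mbps(measured), 2), link_looks_slow(measured))
```

A single ftp transfer also exercises the disks and CPUs at both ends, so a low
figure from this check is a hint to look closer rather than proof of a
network fault.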

Miscellaneous - Some threshold limits known to us:

Command - vmstat 5 3
Virtual Memory Statistics: (pagesize = 8192)
  procs       memory                  pages                  intr      cpu
  r  w   u  act free wire fault  cow zero react  pin pout  in  sy  cs us sy
  3 1K  34 266K  84K  32K  811M 132M 339M   635 193M    0 188 28K  1K 16  7
  3 1K  33 267K  84K  32K   410   71  151     0  131    0 494  2K  4K  4  2
  3 1K  36 269K  82K  32K  5459 1720  807     0 3016    0 471  3K  4K 37  5
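
For anyone wanting to feed such output into a script, here is a minimal
Python sketch that splits one data row of the sample above into named
fields. The column list is taken from the header as shown; the names
intr_sy and cpu_sy are invented here to tell the two "sy" columns apart,
and treating the K and M suffixes as 1,000 and 1,000,000 is an assumption
about how this vmstat abbreviates large counts.

```python
# Column names follow the sample header; intr_sy / cpu_sy are made-up labels
# used only to distinguish the two "sy" columns.
COLUMNS = [
    "r", "w", "u", "act", "free", "wire",
    "fault", "cow", "zero", "react", "pin", "pout",
    "in", "intr_sy", "cs", "us", "cpu_sy",
]

# Assumed meaning of the abbreviations (not confirmed by the source).
SUFFIXES = {"K": 1_000, "M": 1_000_000}


def parse_count(text):
    """Turn '82K' into 82000, '811M' into 811000000, '410' into 410."""
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def parse_vmstat_row(line):
    """Parse one whitespace-separated data row into a dict keyed by column."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    return dict(zip(COLUMNS, (parse_count(f) for f in fields)))


if __name__ == "__main__":
    row = parse_vmstat_row(
        "3 1K 36 269K 82K 32K 5459 1720 807 0 3016 0 471 3K 4K 37 5"
    )
    print(row["us"], row["cpu_sy"], row["r"])
```

On most vmstat implementations the first row reports figures accumulated
since boot, so interval-based checks are usually better applied to the later
rows.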

1) CPU utilization due to operating system (internal) operations (%sy)
exceeding CPU utilization due to user applications (%us)

2) Average CPU wait for I/O to complete (%wio) greater than 30%
[from the sar command]

3) CPU utilization due to operating system (internal) operations (%sy) > 30%

4) CPU utilization - total CPU utilization consistently leaving close to
0% idle, on its own or further coupled with any of the following:
   a) Abnormally high wait for I/O (> 30%) [from the sar command]
   b) Abnormally high operating system CPU utilization (> 30%)
   c) Abnormally high run queue ["r" > (3 * number of CPUs)]
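
The four rules above can be written down as a small check. The following
Python sketch is only one possible reading of them: the function name, the
rule labels it returns, and the 5% cutoff used for "close to 0% idle" are
assumptions made for this example, while the 30% limits and the
3 * number-of-CPUs run queue limit are the ones listed above. The inputs are
expected to be averages over several samples (for instance from sar and
vmstat), not a single reading.

```python
def cpu_alerts(us, sy, wio, idle, run_queue, ncpus,
               high_pct=30.0, idle_floor=5.0):
    """Return the labels of the rules (1-4, 4a-4c) that fire.

    us, sy, wio, idle are CPU percentages; run_queue is the "r" column.
    idle_floor (what counts as "close to 0% idle") is an assumed value.
    """
    fired = []
    if sy > us:                      # rule 1: system time exceeds user time
        fired.append("1")
    if wio > high_pct:               # rule 2: I/O wait above 30%
        fired.append("2")
    if sy > high_pct:                # rule 3: system time above 30%
        fired.append("3")
    if idle <= idle_floor:           # rule 4: CPU consistently saturated
        fired.append("4")
        if wio > high_pct:           # 4a: saturation plus high I/O wait
            fired.append("4a")
        if sy > high_pct:            # 4b: saturation plus high system time
            fired.append("4b")
        if run_queue > 3 * ncpus:    # 4c: saturation plus long run queue
            fired.append("4c")
    return fired


if __name__ == "__main__":
    # Hypothetical averaged readings for a 2-CPU machine.
    print(cpu_alerts(us=10, sy=40, wio=35, idle=2, run_queue=9, ncpus=2))
```

Returning labels rather than a single yes/no keeps it visible which
combination of conditions was hit, which matters for rule 4 where the
sub-conditions change how serious the situation is.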

sunmanagers mailing list
