OT : Basic Critical O.S. Values that Trigger Problem Alert

From: VIVEK_SHARMA (VIVEK_SHARMA@infosys.com)
Date: Mon Jul 29 2002 - 02:45:53 EDT


Hi

We are putting together a general document to be forwarded to customers,
which should let them know when their systems are performing far below normal.

At the operating system level, we are trying to identify practical
critical values: thresholds which, when crossed, would raise an alert
about a potential problem.

We are looking for these in the areas of:

1) Network throughput
2) Memory utilization
3) Swap utilization
4) I/O utilization

We would appreciate the actual commands used (preferably ones generic across
different operating systems) and the respective critical threshold values
for the above.

EXAMPLE: For network throughput between the application server machine and
the database server machine, what, by experience, are the parameters and
their respective minimum threshold values that would tell us there is a
severe problem?

NOTE: We have generally been measuring this by manually ftping a big
file (about 100 MB) between the APP and DB server machines, noting the
throughput displayed (in kbytes/s) on completion, and converting this value
to megabits per second (Mbps). If this value is less than 40 Mbps on a
100 Mbps link, we know there is a problem with network bandwidth.
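
The conversion described in the note can be sketched as follows. This is only an illustration: the function names are made up, and it assumes the ftp client reports 1 kbyte as 1024 bytes (some clients use 1000, so the factor is a parameter). The 40% cut-off is the rule of thumb from the note, not a standard.

```python
def ftp_rate_to_mbps(kbytes_per_sec, bytes_per_kbyte=1024):
    """Convert a transfer rate in kbytes/s to megabits per second (Mbps).

    bytes_per_kbyte=1024 is an assumption about how the ftp client
    reports its rate; pass 1000 if the client uses decimal kilobytes.
    """
    bits_per_sec = kbytes_per_sec * bytes_per_kbyte * 8
    return bits_per_sec / 1_000_000


def bandwidth_problem(kbytes_per_sec, link_mbps=100.0, min_fraction=0.40):
    """Return True if the measured rate is below min_fraction of the link speed.

    With the defaults this is the "less than 40 Mbps on a 100 Mbps link" rule.
    """
    return ftp_rate_to_mbps(kbytes_per_sec) < link_mbps * min_fraction


if __name__ == "__main__":
    # Hypothetical ftp readings in kbytes/s, for illustration only.
    for rate in (4000.0, 9500.0):
        mbps = ftp_rate_to_mbps(rate)
        status = "PROBLEM" if bandwidth_problem(rate) else "ok"
        print(f"{rate:.0f} kbytes/s = {mbps:.1f} Mbps -> {status}")
```

A single ftp transfer also exercises disk reads and writes on both machines, so a low figure from this check is a hint to look closer rather than proof of a network fault.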

Miscellaneous: some threshold limits known to us:

Command: vmstat 5 3
Virtual Memory Statistics: (pagesize = 8192)
  procs   memory         pages                           intr       cpu
  r  w  u  act free wire fault  cow zero react  pin pout  in  sy cs us sy id
  3 1K 34 266K  84K  32K  811M 132M 339M   635 193M    0 188 28K 1K 16  7 77
  3 1K 33 267K  84K  32K   410   71  151     0  131    0 494  2K 4K  4  2 93
  3 1K 36 269K  82K  32K  5459 1720  807     0 3016    0 471  3K 4K 37  5 58

1) CPU utilization due to operating system (internal) operations (%sy)
exceeding utilization due to user applications (%us)

2) Average CPU wait for I/O to complete (%wio) greater than 30%
[from the sar command]

3) CPU utilization due to operating system (internal) operations (%sy) > 30%

4) CPU utilization: total CPU utilization consistently near 0% idle,
on its own or further coupled with any of the following:
   a) Abnormally high wait for I/O (> 30%) [from the sar command]
   b) Abnormally high operating system CPU utilization (> 30%)
   c) Abnormally high run queue ["r" > (3 * number of CPUs)]
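
The CPU rules above could be checked mechanically along these lines. This is a rough sketch: the function name and argument names are invented for illustration, the values are assumed to have been read already from vmstat and sar output, and "near 0% idle" is taken to mean 5% or less, which is my assumption rather than a known limit. It looks at one sample only, so the "consistently" part (several samples in a row) is left to the caller.

```python
def cpu_alerts(us, sy, idle, wio, run_queue, ncpus, idle_floor=5):
    """Return a list of alert strings for one sample of CPU statistics.

    us, sy, idle, wio are percentages (wio comes from sar, not vmstat).
    idle_floor=5 is an assumed reading of "near 0% idle".
    """
    alerts = []
    if sy > us:                      # rule 1
        alerts.append("%sy exceeds %us")
    if wio > 30:                     # rule 2 and 4a
        alerts.append("%wio above 30%")
    if sy > 30:                      # rule 3 and 4b
        alerts.append("%sy above 30%")
    if run_queue > 3 * ncpus:        # rule 4c
        alerts.append("run queue above 3 x number of CPUs")
    if idle <= idle_floor:           # rule 4
        alerts.append("CPU near 0% idle")
    return alerts


if __name__ == "__main__":
    # Third vmstat sample from above (us=37, sy=5, id=58, r=3);
    # wio=0 and ncpus=1 are placeholders, since vmstat does not show them.
    print(cpu_alerts(us=37, sy=5, idle=58, wio=0, run_queue=3, ncpus=1))
```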

THANKS
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


