Diagnosis fun

From: Jason.Shatzkamer@cexp.com
Date: Tue Feb 24 2004 - 17:33:58 EST


Hey Guys,

Going to pull an all-nighter tonight....that's for sure....

Got a sick Sun Fire 4800, 8 CPUs, 8 GB RAM, single domain system....root is a
D240 tray with two 18 GB SCSI disks, mirrored internally....database volume is a
partner pair of T3+ arrays, 8 x 72 GB RAID 5 plus 1 hot spare in each, mirrored
across controllers using Veritas....datavol filesystem is VxFS, and is spread
across all 8 heads....running Solaris 8, patched to kernel level....not
fully patched as of late, but acceptably current....

Application is:

1. Database app written in PICK BASIC, jBASE to be specific (obviously, this
is a major CAUSE of the problem, just got to figure out where, exactly).
Not multithreaded, LOTS of forks. I'm assuming the application code is the
culprit, but in order to rewrite the app, I need to identify which Solaris
subsystem is hurting.
2. ~500 users connecting via telnet

Symptoms are:

1. Severe application slowdowns as users switch between screens
2. Any internal application related commands hang, unresponsive to ctrl-c
3. Solaris continues to hum, command line is snappy as ever
4. sar -A output for past two days: http://e-literate.net/sar/sarout.txt
5. Current /usr/ucb/ps -aux output: http://e-literate.net/sar/psaux.txt

My Observations are:

1. High address translation faults
2. Extremely high system vs. user time
3. Lots of context switching and mutex contention
4. Disk seems fine, for now
5. Zero network errors
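
For anyone who wants to put a number on observation 2, here is a rough Python
sketch that pulls the sys-to-user ratio out of `sar -u` style rows. It assumes
the usual Solaris column layout (time, %usr, %sys, %wio, %idle). The function
name `sys_to_usr`, the 2x threshold, and the sample rows are made up for
illustration and are NOT taken from sarout.txt.

```python
def sys_to_usr(sar_u_text):
    """Return (timestamp, %usr, %sys, sys/usr ratio) for each sample row."""
    rows = []
    for line in sar_u_text.splitlines():
        fields = line.split()
        # Sample rows are assumed to look like: HH:MM:SS %usr %sys %wio %idle
        if len(fields) != 5 or fields[0].count(":") != 2:
            continue
        try:
            usr, sys_ = float(fields[1]), float(fields[2])
        except ValueError:
            continue  # column header row, not a sample
        ratio = sys_ / usr if usr else float("inf")
        rows.append((fields[0], usr, sys_, ratio))
    return rows


# Hypothetical sample rows, invented for this sketch.
SAMPLE = """\
00:00:00    %usr    %sys    %wio   %idle
09:00:00      10      20       5      65
09:20:00      15      60       5      20
"""

for stamp, usr, sys_, ratio in sys_to_usr(SAMPLE):
    flag = "  <-- sys dominates" if ratio > 2 else ""
    print(f"{stamp}  usr={usr:.0f}  sys={sys_:.0f}  sys/usr={ratio:.1f}{flag}")
```

Intervals where the ratio stays well above 1 during the slow periods would
back up the "system vs. user time" observation; the 2x cutoff is arbitrary.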

Looking for:

1. Anyone see anything glaring as far as isolating problematic Solaris
subsystem (i.e. memory, cpu, etc.)
2. Can something like this be caused by a bad RAM chip? Bad CPU? (No errors
in system logs)
3. Any advice on narrowing down the definitive cause: troubleshooting
checklists, tools, general approach to finding the needle in the haystack?

As always, thanks to all....any and all additional information is available
upon request....

J.~

Jason Shatzkamer, MCSE, SSA
Corporate Express Imaging
1096 E Newport Center Drive
Suite # 300
Deerfield Beach, FL 33442
(800) 828-9949 x5415
(954) 379-5415
Jason.Shatzkamer@cexp.com
http://imaging.cexp.com

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


