[HPADM] RE: Performance SIG Events at HP World events 2003 <announcement>

From: Jeff Kubler (jrkubler@proaxis.com)
Date: Fri Aug 22 2003 - 15:32:19 EDT


Porteus and all,
Here are the notes that Alex took during the Performance Panel. This year's
Panel was excellent, and there was a real spirit of volunteerism, as several
members volunteered quite late to be involved. My thanks to all of them!
I will also see if these can be posted to our Interex website.
Thanks,
Jeff Kubler
Performance SIG organizer/President
Performance Panel Chair
------------------
HP-UX Performance Panel Thu. 8/14/03 4:00pm

Moderator:
Jeff Kubler (JK)
Panelists:
Joseph Coha (JC)
Bill Hassell (BH)
David Olker (DO)
Robert Sauers (RS)
David Totsch (DT)
Don Winterholter (DW)
Chris Wong (CW)
Note-taker:
         Alex Ostapenko

DISCLAIMER (from the note-taker): Questions and answers are not direct
quotes, but have been distilled from “shorthand” written notes (questioners
and panelists provided information quite quickly), and are also at times
paraphrased to increase clarity. However, there are no guarantees of 100%
accuracy.

QUESTION: What is the preferred method for buffer cache, static or dynamic?
BH “It depends!” (RS Sorry, that’s copyrighted.)
If you have a constantly changing (unstable) memory configuration, then you
probably want to use a static buffer cache. That way, when you add memory,
your desired buffer cache size won't change from your fixed amount. If you
have a stable memory size, then the dynamic buffer cache parameters are okay.
With regard to the actual memory dedicated to the buffer cache, you are
targeting an absolute amount of 200-500MB, possibly 1GB, not a percentage of
physical memory. If the application is executing a lot of sequential
read-ahead, then caches as large as 2-3GB could benefit the application. But
if the I/O is jumping all over the place, then a large buffer cache could hurt.
HP-UX 11i is more efficient with its buffer cache searching algorithm (the
code was rewritten). HP-UX 11.00 does a serial search, so larger buffer
caches can introduce performance penalties.
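A minimal sketch of how the two approaches map onto the usual HP-UX 11.x
kernel tunables (bufpages/nbuf for a fixed-size cache, dbc_min_pct/dbc_max_pct
for the dynamic cache); verify the tunable names and the kmtune/mk_kernel
workflow on your release before changing anything:

    # List the current buffer-cache-related kernel tunables
    kmtune | egrep 'bufpages|nbuf|dbc_min_pct|dbc_max_pct'

    # Static cache: set bufpages to a fixed number of 4KB pages
    # (e.g. roughly 400MB = 102400 pages) and leave the dbc_* limits alone.
    # Dynamic cache: leave bufpages/nbuf at 0 and bound the cache with
    # dbc_min_pct/dbc_max_pct (percentages of physical memory).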

JK Check your read/write hit ratios with "sar -b". You can bump the buffer
cache up or down based on what you see here. A read hit ratio of 90-95% is good.
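For example:

    # Buffer cache activity, 12 samples at 5-second intervals
    sar -b 5 12
    # Watch %rcache (read hit ratio) and %wcache (write hit ratio); a
    # %rcache consistently in the 90-95% range or better is a good sign.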

QUESTION: Followup to the last question… with an Oracle DB that's filesystem
based, should you lower the buffer cache, since Oracle has its own SGA
(System Global Area) for buffering?
BH You can lower it.
The Oracle SGA does a good job of buffering. Furthermore, Oracle knows what
it needs and reads ahead to satisfy those needs, whereas the OS buffer cache
(for filesystem reads, not raw device access) does read-ahead for everything.
Online-JFS allows turning off the OS buffer cache selectively on
filesystems. However, turn it off ONLY for Oracle objects (tables,
indexes); do NOT turn it off for archive logs and redo logs.
Everything else being equal for an Oracle server, if you have to choose
usage of physical memory, it’s probably better to favor Oracle block
buffers over the OS buffer cache.
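One way to express that preference, assuming a dynamic buffer cache: cap it
with dbc_max_pct and let the reclaimed memory go to the SGA instead (a sketch
only; the right percentage depends on the system):

    # Cap the dynamic buffer cache at a small share of physical memory,
    # e.g. a few percent on a large-memory Oracle server, leaving the
    # bulk of RAM for the Oracle SGA.
    kmtune -s dbc_max_pct=5    # then rebuild the kernel/reboot as your release requires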

QUESTION: The HP benchmarking center has used “ramdisks”. What about putting
archive logs and redo logs in a ramdisk?

DT The HP-UX ramdisk is a “vestigial organ”. It exists, but it is not
documented and it is very limited in size.

BH HP-UX ramdisks are very limited in size. Is the potential performance
gain worth the inconvenience of implementing an undocumented and
unsupported feature?

DT One can buy external ramdisks. They’re very fast but very expensive,
and you always have to worry about what happens if the system
crashes. Many do have battery backup, but that needs to be verified. But
there are other ways to try to achieve speed with Oracle.
Reiterated the use of selectively bypassing the OS buffer cache via
Online-JFS' “mincache=direct,convosync=direct”. Also answer the question of
how you're doing Oracle backups: if the backups go through Oracle, then
bypassing the OS buffer cache is okay, but if you're doing filesystem
backups, they need the OS buffer cache for speed. In general, bypass the OS
buffer cache for data and index objects, but use the OS buffer cache for
everything else. The EXCEPTION is some near corner cases of OLTP and/or full
tablespace scans: in those cases, bypassing the OS buffer cache could hurt
performance. You might have to experiment.
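As an illustration of the selective bypass (the volume and mount point names
are hypothetical; mincache/convosync require the Online-JFS license):

    # Oracle data and index filesystems: bypass the OS buffer cache
    mount -F vxfs -o delaylog,mincache=direct,convosync=direct \
        /dev/vg01/lvol_oradata /u02/oradata

    # Archive log and redo log filesystems: keep normal buffered I/O
    mount -F vxfs -o delaylog /dev/vg01/lvol_arch /u03/arch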
Another feature is “nodatainlog”. JFS normally puts small writes into the
intent log (DATAINLOG). The intent log is meant to capture filesystem
metadata in case the server crashes, but in addition it also captures small
writes. If your application generates a lot of small writes, they will be
captured in the intent log, filling it up more often than otherwise and
causing additional system overhead as the intent log data is committed more
frequently. By specifying “nodatainlog”, only the FS metadata is captured,
and this may improve performance. The general rule of thumb is that
“nodatainlog” will never hurt and can only help.
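For example (same hypothetical volume naming as above):

    # Keep only filesystem metadata in the VxFS intent log
    mount -F vxfs -o delaylog,nodatainlog /dev/vg01/lvol_app /u04/app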

QUESTION: How can you take performance numbers and mold them into SLAs
(service level agreements) that managers can understand?
RS Start with the SLA that’s important to the business and pick
performance numbers to meet that SLA.

DT Pick an initial SLA/metric that's approachable, and then tighten it up as
you get a feel for the types of applications and transactions.
Build a metric that's governed by more than one point; that is, blend
metrics into the SLA to moderate outlying points.
You do not want a non-business metric governing the SLA.

BH Make sure the SLA covers what you're responsible for, e.g., you can cover
disk I/O and LAN cards, but you can't guarantee performance for SQL, which
you can't control.

JC Java and webservers are much more complicated. There is tremendous
variability in metrics and performance. It's incumbent on people deploying
applications to work with developers, e.g., use a load driver in testing.
Study and characterize application behavior under high load. OpenView
products can really help (e.g., Glance, MeasureWare).

DT Get inside the application. MeasureWare has a feature with which you can
easily define transactions to track and monitor. Developers can instrument
their applications to feed into MeasureWare.

JK Observation with regard to the questioner's situation: he is proactively
defining and creating an SLA to be able to report to management, in contrast
to the usual situation where management imposes an SLA that the responsible
people may not have as much control over.

Mike Pagan His SLAs were based on transaction metrics (what the user
sees), not CPU or disk I/O. With the SLA, there is an agreed upon
workload. If new stuff is being put on the server, then all bets are off…
there are no guarantees with the SLA and it has to be renegotiated.
Interesting situation: installing a patch can blow an SLA. For this case,
have exact duplicates of server configurations so patches can be load
tested in advance.
In addition, it’s better to have an SLA metric that’s aggregate, that is,
not 1 transaction in half a second, but completing 1200 transactions in 10
minutes.

RS Another idea is to have a confidence factor, i.e., X transactions are
completed within Y minutes 90% of the time.
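A small sketch of checking either style of metric from a log of
per-transaction completion times (one time, in seconds, per line; the file
name and the 2-second threshold are made up for illustration):

    # What fraction of transactions completed within 2 seconds?
    awk '{ n++; if ($1 <= 2.0) ok++ }
         END { if (n) printf "%.1f%% of %d transactions within 2s\n", 100*ok/n, n }' txn_times.log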

JC The newest versions of Java are less intrusive on system resources and
take up less overall space. Objects (memory usage) are segmented into two
parts: the garbage-collected space and the tenured space.
         10-12GB heaps take a lot of garbage-collection time. There are two
new mechanisms to reduce garbage-collection time. The parallel garbage
collector parallelizes collection across threads and collects garbage as
quickly as possible. The concurrent garbage collector executes in parallel
with application execution so as to eliminate long pauses.
         These features are also available with Java for Windows and Linux
on HP. HP Jtune can watch, analyze, and tune garbage collection.
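On 1.4-era HotSpot-based JVMs these roughly correspond to the following
startup options (flag names assumed from the generic HotSpot set, and MyApp
is a hypothetical application class; check the release notes for your HP JDK):

    # Parallel (throughput) collector for the young generation
    java -Xmx1024m -XX:+UseParallelGC -verbose:gc MyApp

    # Concurrent (low-pause) collector for the tenured generation
    java -Xmx1024m -XX:+UseConcMarkSweepGC -verbose:gc MyApp

    # HPjtune analyzes GC log output; see HP's Java documentation for the
    # exact logging flag it expects.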

QUESTION (from panel): What Java application servers are being used?
Audience BEA WebLogic, Oracle?

JC One can use Workload Manager to manage Java performance. Watch out for
eating up memory with objects, which slows down performance. Session size
can dramatically affect performance.

QUESTION (from panel): Who’s using high-availability (HA) clusters?
RS psets can really help performance. However, broadcast storms can cause
false failovers. Software.hp.com contains tools for interrupt migration.

JC The same problem can occur with heavy webserver traffic that causes
network load and saturation.

RS You must analyze your environment. There is no magic bullet.

JK See RS’s book for the four rules of tuning.

QUESTION (from panel): Who is running Oracle 32-bit applications on 64-bit
platforms? Why?
Audience Developer refuses to convert.

QUESTION (from panel): How many of you running vPars would like to use WLM?
Audience Prefer to do it manually.

QUESTION: Is there any performance data/impact with VMware and vPars? Are
there any rules of thumb for how many unbound processors we can use in a
vPar server?
DT Bound processors have to handle all the hardware interrupts. Unbound
processors don’t.

RS There is no performance data with regard to how low you can go with the
number of bound processors in a vPar.
The vPar code runs first, then boots the OS. There is no good way to
instrument this to get performance data. Instrumenting interrupts takes 10
times as long to capture the performance data as it does for the interrupt
to do its work. Most of the time of an interrupt is spent saving and
restoring registers to memory (memory is slow with respect to interrupt
handling), and that is only a couple of hundred instructions; this doesn't
count the ISR (interrupt service routine)… it's just getting started.
Performance-capturing code would be orders of magnitude slower than this.

QUESTION: If you don’t restart or reorg a DB periodically, what is the
effect on performance?
DT SAP is a monolithic turnkey database which creates tables on the fly
and handles its own housekeeping very well; therefore a lack of periodic
restarts or reorgs would not have a negative impact on performance. Other
DBs (e.g., Informix) do not do this as well.

POST-SESSION FOLLOWUP DISCUSSION: Regarding vPars, bound/unbound
processors and measuring resource usage of interrupt handling. (Mike
Pagan, Alex Ostapenko)
There are two questions: (1) How few bound processors can you run a vPar
with? (2) How can you measure the performance impact of the hardware
interrupt handling?

DISCUSSION #1: How few bound processors can you have in a vPar?
The responsibility of the bound processors is to handle the hardware
interrupts of a vPar in addition to the load that the OS scheduler imposes
upon them. Unbound processors cannot handle any of the hardware
interrupts. The “vmstat” INT field shows all interrupts, both hardware and
software, so it cannot be used directly to measure the number of hardware
interrupts being handled.
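For example:

    # The interrupt column ("in", under "faults") counts interrupts per
    # second, with hardware and software interrupts lumped together.
    vmstat 5 5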

The best way would be to take a known benchmark and run it successively on
fewer and fewer bound processors, gauging its performance each time. One
could also do this with a representative application. Using Mike Pagan's
capacity forecasting formulas, one could get a good idea of the proper
bounding/unbounding of vPar processors for a given application.
Unfortunately, nobody is willing to invest in the week or two of testing
needed to get the performance data. Also, the results would be application
specific.
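A skeleton of such a test pass (the vPar CPU reconfiguration step is left as
a placeholder, since the exact vparmodify usage for bound vs. unbound CPUs
depends on the vPars release, and run_benchmark is a hypothetical driver for
the workload):

    for bound in 4 3 2 1
    do
        # Reconfigure the vPar to $bound bound CPUs here (vparmodify ...;
        # syntax omitted because it varies by vPars version).
        echo "=== $bound bound CPU(s) ===" >> results.txt
        sar -u 5 120 >> results.txt &     # sample CPU utilization during the run
        ./run_benchmark                   # hypothetical benchmark driver
        wait                              # let sar finish before the next pass
    done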

DISCUSSION #2: Measuring resource usage of hardware interrupt handling.
Mike proposes a low-level empirical approach in which one “guesstimates” the
typical size (number of instructions) of a typical hardware interrupt and
predicts resource usage (primarily the CPU needed to handle the interrupt)
for a given workload, i.e., each disk I/O, network I/O, etc. translates into
a hardware interrupt. Then, as the known benchmark is rerun on fewer and
fewer bound processors, observe the appropriate metrics and refine this
low-level empirical formula.

Alex proposes a high-level approach in which we assume that unbound
processors will have no resource usage due to hardware interrupt handling,
and that bound processors will have a mixture of resource usage: some due to
the application, some due to hardware interrupt handling, and some due to
other OS housekeeping tasks. In this benchmark approach (for a known
benchmark), we first take a data point with all processors bound, using,
say, %sys(CPU) as the observed metric. Then take a data point with one
processor unbound. That processor's %sys due to hardware interrupt handling
will be zero, but it will take on some other work, so the prediction is that
its %sys will go down by some amount; whereas the remaining bound
processors' %sys would go up by some amount, partially due to the unbound
processor's hardware interrupt handling being split (n-1) ways, but
partially offset because some of their non-interrupt %sys may have been
offloaded to the unbound processor. Then unbind two processors and observe
the metric on all processors again, and keep collecting this data until
we're down to only one bound processor. Alex's belief is that this data,
plotted, would potentially yield a line or curve (or set of curves) that is
representative in some fashion of the resource impact of hardware interrupt
handling. Obviously, more data and more analysis would be required. But
this would be an indirect and non-intrusive way to “observe” the behavior.
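A first-pass tabulation of such data, assuming the per-CPU %sys samples have
already been exported (e.g. from MeasureWare/Glance) into a file with three
columns per line: number of unbound CPUs, CPU id, %sys (the file name and
layout are assumptions for illustration):

    # Average %sys across CPUs for each "number unbound" configuration
    awk '{ sum[$1] += $3; cnt[$1]++ }
         END { for (u in sum) printf "unbound=%s  avg %%sys=%.1f\n", u, sum[u]/cnt[u] }' sys_by_cpu.dat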

At 09:49 AM 8/13/2003 -0300, Vibert, Porteus wrote:
>Is there going to be a 'white paper' of the lectures?
>
>If so, how do I get Alex Ostapenko info?
>
>Thanks.
>
>Porteus
>
>-----Original Message-----
>From: Jeff Kubler [mailto:jrkubler@proaxis.com]
>Sent: Tuesday, August 12, 2003 11:32 AM
>To: hpux-admin@DutchWorks.nl
>Subject: [HPADM] Performance SIG Events at HP World events 2003
><announcement>
>
>
>To all;
>To those of you in attendance at HP World 2003 I would like to extend an
>invitation to the following events;
>The Performance SIG (Special Interest Group) is meeting at 12:10 AM in
>room
>C208. Jeff Kubler of Kubler Consulting will introduce the purpose and
>reasons for the organization, Bill Hassell of BLH Consulting will talk
>about performance challenges, Alex Ostapenko of PPL will talk about what
>he
>does to manage 200+ systems, and Michael Pagan of HP will discuss his
>recent work in capacity planning.
>
>The Performance Panel convenes at 4:00 PM in room B204. Bring your
>questions or submit them to me at jeff@kublerconsulting.com. Panel
>members
>are Jeff Kubler of Kubler Consulting, Bill Hassell of BLH Consulting,
>Chris
>Wong of Cerius Technologies, Bob Sauer of HP (and author of the great
>book
>HP-UX Tuning and Performance - ask him to autograph your book), David
>Totsch of HP, Don Winterholter of Aptitune. Several others have an open
>invitation. It should be a good time to hear responses to your wildest
>performance questions and even your most basic.
>Thanks,
>Jeff Kubler
>Performance SIG Organizer
>
>Jeff Kubler
>Kubler Consulting, Inc.
>541-745-7457
>jeff@kublerconsulting.com
>www.kublerconsulting.com
>
>
>

--
             ---> Please post QUESTIONS and SUMMARIES only!! <---
        To subscribe/unsubscribe to this list, contact majordomo@dutchworks.nl
       Name: hpux-admin@dutchworks.nl     Owner: owner-hpux-admin@dutchworks.nl
 
 Archives:  ftp.dutchworks.nl:/pub/digests/hpux-admin       (FTP, browse only)
            http://www.dutchworks.nl/htbin/hpsysadmin   (Web, browse & search)

