From: Bill Verzal (Bill_Verzal@BCBSIL.COM)
Date: Wed Aug 21 2002 - 12:01:11 EDT
I/O drawers are not hot-pluggable. PCI cards that are not "hot-swappable"
cannot be hot-swapped, even though the Regatta supports hot-swapping.
Microcode updates require a complete outage. CPU and memory of course do as well.
Hot-swappable cards can be changed out on the fly, but each one requires
25 minutes of CE time due to the number of "mounting screws" on the carrier.
Also, here are some more issues I raised related to these questions, along
with the answers I received. I had some questions on statements that were
made in the sales manual.
BV
First:
A minimum of 4 GB of system memory is recommended per LPAR.
Can you clarify this statement? Why is it "recommended"?
This is only a recommendation; depending on the environment, 1 or 2 GB
of memory minimum per LPAR might do the job. This recommendation
will allow for optimal performance with a minimum configured LPAR
for the average environment. This will allow for applications to
utilize 2 to 3 GB of memory while leaving enough memory for the
Hypervisor and AIX. On average, most applications will size 2 - 3
GB of memory per CPU. This is just an average; some applications
may require more or less than the 2 - 3 GB average. For example,
some web-based or routing applications require only 1 - 1 1/2 GB of
memory per CPU; in this case you would size your total memory for 2
- 3 GB of memory per CPU. Another example: some databases can
utilize 2 - 4 GB of memory per CPU; in this case you would size
your memory for 3 - 5 GB of memory per CPU. If you are running
multiple applications in the same LPAR or same OS image, then your
memory requirements will be greater. It is recommended to allocate
1/2 GB to 1 GB of memory for AIX and the Hypervisor. The memory is
configured in whole numbers, so as a rule of thumb you size the
memory at least a gigabyte higher than what the application
requires.
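
The rule of thumb above can be written down as a few lines of Python.
This is only a sketch of the arithmetic, not an IBM sizing tool. The
function name lpar_memory_gb and its parameters are made up for
illustration, and it assumes the AIX/Hypervisor allowance (1/2 to 1 GB)
is added once per LPAR and the total is rounded up to a whole gigabyte.

```python
import math


def lpar_memory_gb(app_gb_per_cpu, cpus, overhead_gb=1.0):
    """Rough whole-GB memory size for one LPAR (illustrative only).

    app_gb_per_cpu: what the application needs per CPU, in GB
    cpus:           number of CPUs in the LPAR
    overhead_gb:    allowance for AIX and the Hypervisor (0.5 to 1 GB);
                    treating it as a per-LPAR figure is an assumption
    """
    if cpus < 1 or app_gb_per_cpu < 0 or overhead_gb < 0:
        raise ValueError("cpus must be >= 1 and sizes must not be negative")
    # Memory is configured in whole numbers, so round the total up.
    return math.ceil(app_gb_per_cpu * cpus + overhead_gb)


# A one-CPU LPAR whose application wants 2.5 GB ends up at 4 GB.
print(lpar_memory_gb(2.5, 1))
# A four-CPU database LPAR at 3 GB per CPU with the smaller 0.5 GB allowance.
print(lpar_memory_gb(3, 4, overhead_gb=0.5))
```

If several applications share the same LPAR, add their requirements
together before applying the allowance.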
Second:
Minimum of two internal SCSI hard disks are required per p690 server.
It is recommended that these disks be utilized as mirrored boot
devices. These disks should be mounted in the first 7040-61D I/O
drawer. This configuration provides service personnel the maximum
amount of diagnostic information if the system encounters errors in
the boot sequence.
What does this mean?
This configuration will allow you to minimize your system downtime
due to a disk drive failure or service work on an I/O drawer.
Boot support is also available from local SCSI, SSA, and Fibre Channel
adapters, or from networks via ENET or token-ring adapters. The
pSeries 690 does not support booting from FDDI adapters #2741 or #2742
located in 7040-61D I/O drawers.
No questions there...
Consideration should also be given to the placement of AIX rootvg
volume group in the first I/O drawer. This allows AIX to boot any time
other I/O drawers are found offline during boot.
Why would an I/O drawer be offline, and in what scenarios might this
affect us?
The key reason for an I/O drawer to be offline is for maintenance
or repair.
If a boot source other than internal disk is configured, the
supporting adapter should also be in the first I/O drawer.
What does this mean?
The p690 will provide a very highly available environment without
HACMP, but depending on how a p690 is configured, the availability
of a system may drop. To minimize downtime due to a single failure,
you should spread your dependencies across the I/O subsystem as
much as possible. Dependencies are things like rootvg disk drives,
network adapters, and adapters used to access disk drives (Fibre
Channel, SSA, and SCSI). The first I/O drawer is the first drawer
to come online or have power applied. There is a remote chance that
a problem in one I/O drawer can affect I/O drawers downstream. One
such problem would be disconnecting downstream cables for
maintenance or service. Let's say you configured all of the I/O
resources (rootvg disk drives, Ethernet adapters, Fibre Channel
adapters, etc.) for LPAR #1 in the first half of drawer #3. Your
single point of failure in this case would be any failure that
affects the first planar board in drawer #3; it would cause LPAR #1
to go down. If, on the other hand, the I/O resources were split in
half across two different drawers (drawers 1 and 3), then you can
pull things like RIO or power cables and LPAR #1 will not go down.
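
The drawer-placement advice above can be checked mechanically. Below is
a minimal sketch, assuming you keep your own list of (role, drawer)
pairs for each LPAR. The function name single_drawer_dependencies, the
role labels, and the drawer numbers are all invented for illustration;
nothing here queries a real p690 or HMC. It also assumes that resources
sharing a role are redundant with each other (mirrored rootvg, a second
adapter, and so on).

```python
from collections import defaultdict


def single_drawer_dependencies(resources):
    """Return {role: drawer} for each role that lives in only one drawer.

    resources: iterable of (role, drawer) pairs for a single LPAR,
               e.g. ("rootvg", 3). Any role reported here is lost if
               that one drawer goes offline.
    """
    drawers_by_role = defaultdict(set)
    for role, drawer in resources:
        drawers_by_role[role].add(drawer)
    return {
        role: next(iter(drawers))
        for role, drawers in drawers_by_role.items()
        if len(drawers) == 1
    }


# Everything for LPAR #1 in drawer 3: every role is a single point of failure.
all_in_drawer_3 = [("rootvg", 3), ("rootvg", 3), ("ethernet", 3), ("fibre", 3)]
print(single_drawer_dependencies(all_in_drawer_3))

# Half of each role in drawer 1 and half in drawer 3: nothing is flagged.
spread = [("rootvg", 1), ("rootvg", 3), ("ethernet", 1), ("ethernet", 3),
          ("fibre", 1), ("fibre", 3)]
print(single_drawer_dependencies(spread))
```

An empty result only says that no role depends on a single drawer. It
does not cover failures inside a drawer, such as one planar board.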
-----------------------------------------------------------------------------------------------------------
Bill Verzal
Technical Consultant
Forbes Technical Consulting
(312) 653-3684
bill_verzal@bcbsil.com
MailStop: 27.201C
From: Holger.VanKoll@SWISSCOM.COM
Sent by: IBM AIX Discussion List <aix-l@Princeton.EDU>
Date: 08/21/2002 10:50 AM
Please respond to: IBM AIX Discussion List
To: aix-l@Princeton.EDU
cc:
Subject: p670 availability
Hello,
I have to find out
- which parts cause a scheduled downtime when they fail (not
hot-pluggable, but redundant)
- which parts cause an unscheduled downtime when they fail (not redundant)
- which parts cause no downtime when they fail (hot-pluggable).
I know I can find this out by reading a few books, but maybe someone has
done this before?
I am also interested in incomplete answers.
Thank you and regards,
Holger