SUMMARY: patch policy

From: kabinet@inf.u-szeged.hu
Date: Thu Mar 10 2005 - 02:42:05 EST


My question was how to avoid serious accidents on a production system
while patching it. How should I apply recommended patches without
risking the usability of my system? What is your patch routine?

Mike Salehi writes:
===================
We do not apply the latest patch as soon as it becomes available; we do
some research before the list is created.

Josh writes:
============
Software management at our location is done by replicating the
production environment as completely as possible in what we call "the
lab" :-). Things such as clients, servers, software and configuration
are mimicked as well as possible. We use it for testing patches, upgrades,
new releases or even software that isn't currently in our production
environment. We employ standardized Solaris builds and utilize flash
archiving to rebuild our test systems in a very short period of time.
This also helps to minimize differences between what's in production and
what's in our lab. We also have cold spares for our biggest production
servers which are always running a known good configuration.
Anyway, our steps are as follows:

1. Apply [patch | cluster | software | update] to the lab system, mock up
a load, and generally work on and monitor the system. The process we use
to test the patch is completely dictated by the server, the workload,
the clients, the patch, the software, etc. We do have scripts that we
run to confirm the server's functionality for our most commonly tested
configurations.

2. Upon successful testing, we next schedule downtime on the production
server and apply the software/patch.

3. Hopefully nothing severe happens. If it does, as I mentioned, we have
a cold production spare that we can fail over to in an emergency. We
then try to figure out what went wrong on the production server.
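
As a rough illustration of the kind of functionality-check script
mentioned in step 1, here is a minimal Python sketch. It is not Josh's
actual script; the function names port_open and run_checks are
hypothetical, and a real check list would depend entirely on what the
server does.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def run_checks(checks):
    """Run each (name, callable) pair and return the names that failed.

    A check fails if it returns a falsy value or raises an exception.
    """
    failed = []
    for name, check in checks:
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return failed
```

For a file server, the check list might contain an entry such as
("nfs", lambda: port_open("labserver", 2049)), where "labserver" is a
placeholder host name. Run it once before patching and once after; an
empty result from run_checks both times is a reasonable minimum bar
before scheduling the production downtime.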

Also, we are mostly a file-serving-type environment utilizing a SAN
which makes failing over to a cold spare trivial. I'd imagine that using
Sun's Live Upgrade facility could mimic our SAN scenario, which
basically just guarantees a stable OS image.

So we have a few layers of fallback, which has helped tremendously.
We've caught a few issues that would have done some damage had they
not been tested. The system we employ also guarantees as little downtime
for our clients as possible. This is obviously extremely important. I
like to say it's an XP (eXtreme Programming) approach to systems testing
and management: By utilizing lots of testing and working in small teams
we have developed a pretty streamlined process.

Troy writes:
============
At a minimum you should run your patches for some sufficient period of
time on a test system before rolling them into production. It is also
good to have some way to compare patches on test to production. In a
perfect world it is nice to have a "production lookalike" test level
that carries the same patches as production and that you patch after
the fact.

In other words you might have these test levels:

1. Unit Test
2. System Test
3. Integration
4. UAT
5. Production Lookalike

The first 4 levels are patched to whatever the new production load
requires, and the 5th level is patched after the load to match production.
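
One possible way to compare patches on test to production, as suggested
above, is to save the output of "showrev -p" on each host and diff the
patch IDs. The Python sketch below assumes each installed patch appears
on a line beginning "Patch: NNNNNN-RR"; the function names are
hypothetical and this is only an illustration, not a tool anyone in
this thread described.

```python
import re

# Matches the patch ID at the start of a `showrev -p` line,
# e.g. "Patch: 109147-39 Obsoletes: ... Packages: ...".
PATCH_LINE = re.compile(r"^Patch:\s+(\d{6})-(\d{2})\b")


def parse_patches(showrev_output):
    """Return {base_id: highest_revision} parsed from `showrev -p` text."""
    patches = {}
    for line in showrev_output.splitlines():
        match = PATCH_LINE.match(line.strip())
        if match:
            base, rev = match.group(1), int(match.group(2))
            # Keep only the highest revision seen for each base ID.
            patches[base] = max(rev, patches.get(base, -1))
    return patches


def compare_patches(test, prod):
    """Compare two {base_id: revision} maps and report the differences."""
    common = set(test) & set(prod)
    return {
        "only_on_test": sorted(set(test) - set(prod)),
        "only_on_prod": sorted(set(prod) - set(test)),
        # base_id -> (revision on test, revision on production)
        "revision_differs": {
            base: (test[base], prod[base])
            for base in sorted(common)
            if test[base] != prod[base]
        },
    }
```

Feeding it the saved output from a test host and from production gives
three lists: patches present only on test, patches present only on
production, and patches whose revisions differ. For the "production
lookalike" level, all three should be empty.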

=========
Thanks for your replies!

Krisztian
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


