SUMMARY: Failed upgrade woes

From: lawries@btinternet.com
Date: Wed Feb 12 2003 - 06:46:49 EST


Many thanks to Tom Webster and Dr Tom Blinn,

Replies as follows:

> Following a disastrous weekend trying to do a rolling upgrade of a V5.0 cluster and
> rolling it all back again, I have been left with a LOT of files with tags like:
>
> .mrg...<name>
> .mrg..<name>
> .new..<name>
> .proto..<name>
>
> Is it alright to remove these files? If so how to tell which ones?

These are normal on systems that have been patched (or where a patch
was attempted). The updadmin utility will deal with the files that
are OK to remove.
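
Before deciding anything, it can help to get a read-only inventory of
these files. The following is a minimal sketch, assuming a POSIX shell
and find. The function name list_update_files and the example paths are
illustrative only, not part of any system tool, and nothing is deleted.

```shell
# List files whose names carry the update prefixes mentioned above
# (.mrg.., .new.., .proto..). Read-only: it only prints paths.
# list_update_files is a hypothetical helper name, not a system utility.
list_update_files() {
    dir=${1:-/}
    find "$dir" -type f \
        \( -name '.mrg..*' -o -name '.new..*' -o -name '.proto..*' \) \
        -print
}

# Example (paths are assumptions): inventory /etc and keep the list
# for later review.
# list_update_files /etc > /var/tmp/update_files.txt
```

The saved list can then be compared against whatever updadmin offers
to clean up, rather than removing anything by hand.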

> Also I have <name>.PreUPD, <name>.PreMRG and so on.
> Can these be removed?

As above, but these are copies of your configuration files before the
upgrade and/or merge process has modified them. It's usually a good
idea to keep these around for a couple of days to make sure that some
site-specific hand edit to these files wasn't lost.
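
One way to check for lost hand edits is to diff each saved copy against
the live file. This is a minimal sketch, assuming a POSIX shell (ksh,
for example) and that the saved copies sit beside the live files as
<name>.PreUPD and <name>.PreMRG. The name compare_saved_copies is a
hypothetical helper, not a system utility.

```shell
# For every <name>.PreUPD or <name>.PreMRG under a directory, report how
# the saved copy differs from the current <name>. Read-only.
# compare_saved_copies is a hypothetical helper name.
compare_saved_copies() {
    dir=${1:-/etc}
    find "$dir" -type f \( -name '*.PreUPD' -o -name '*.PreMRG' \) -print |
    while IFS= read -r saved; do
        live=${saved%.Pre???}      # strip the .PreUPD / .PreMRG suffix
        if [ ! -f "$live" ]; then
            echo "MISSING $live (saved copy: $saved)"
        elif cmp -s "$saved" "$live"; then
            echo "SAME    $live"
        else
            echo "DIFFERS $live"
            # Lines marked '<' come from the saved copy, '>' from the
            # live file. diff exits non-zero when files differ.
            diff "$saved" "$live" || true
        fi
    done
}
```

Files reported as SAME are the ones where the saved copy adds nothing;
the DIFFERS entries are the ones worth reading before anything is
cleaned up.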

> Finally, I am unclear on the possibility of doing a static (not rolling) upgrade. If I
> take all the cluster members down and then bring one up at a time, upgrade, take it down
> again, and so on for all the servers, is this considered acceptable?

OK, let me start by saying that you may want to engage HP support at
some point. I'm going to make some comments and perhaps offer some
suggestions, but they are just that -- make sure you have thought
through anything you are going to do. Remember, they don't pay admins
because we can type fast -- they pay us to think before hitting enter.

Make sure that the cluster thinks it is in a clean and stable state
before doing anything drastic. Use the "clu_upgrade -v status"
command to verify that all members show they are rolled and none
are running on tagged files. Any discrepancy will need to be resolved
before anything else is done.

As far as the next steps go, there are always options -- this is
UNIX after all.

Non-rolling upgrade: I don't know about 5.0, but around PK4 in 5.1,
Compaq introduced the option of doing a non-rolling upgrade. It is
something like what you are suggesting. Check the install guide for
your patch kit for details. As a data point, we tried this going
from PK2 to PK4 on 5.1 and hosed our production cluster (to the point
where we needed to boot from the build disk and restore the OS
partitions from tape). HP tells us it is most likely an issue with
needing the updates in PK3 before being able to do a non-rolling upgrade.

Rolling again: Try it again? This should be the safest method.
If you're having problems, it generally indicates problems in
the cluster beyond the patch kit. I know it's a slow, tedious
process, but....

The big ugly stick: I'm not recommending this, but if things are
getting weird, it is a possibility. Dump all members of the cluster
except a lead node. Patch and re-add the cluster members. I would
not do this without working with HP support first. If you choose to
go this way, let me know -- we had to do a node remove and re-add
to resolve some issues and can provide some pointers on what not to
forget to back up.

Tom (Tom Webster)

The ".mrg..", ".new..", and ".proto.." files are all critical to the
success of a rolling upgrade or an "installupdate" (in a stand-alone
system). If you remove them, you WILL regret it later. The other
.PreUPD and .PreMRG files are presumably left-overs from a failed
update installation, perhaps from your attempted rolling upgrade.
I believe that there are log files that tell you what created the
files and what you should do with them; in general, they probably
can be safely removed without impacting a future upgrade, but when
you do the upgrade, they will probably get re-created. (This is all
why you need a LOT of disk space to manage a system and upgrade it.)

Tom (Dr. Thomas P. Blinn)
 
Thanks again

Lawrie.


