restore a crashed test cluster

From: lars.rieneck@tellabs.com
Date: Fri Apr 02 2004 - 05:11:14 EST


Hello all

I have a 5.1B cluster with dupatch 3.
To test our crash restore procedure, I took a full backup
with vdump to tape. I then zeroed all disks, including
the quorum disk, booted from the UNIX CD-ROM, and restored the following
file domains/filesets:

        1) cluster_root#root
        2) cluster_usr#usr
        3) cluster_var#var
        4) root1_domain#root

I then return to the console prompt and try to boot the first
member with the following command:

>>> boot -fl ai dkb100.1.0.15.0

I then boot with the following kernel parameters, to
be able to boot without a quorum disk:

Enter: <kernel_name> [option_1 ... option_n]
  or: ls [name]['help'] or: 'quit' to return to console
Press Return to boot 'vmunix'
# vmunix clubase:cluster_expected_votes=1 clubase:cluster_qdisk_votes=0
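
For reference, here is a rough sketch of how I understand these two
parameters to affect quorum. It assumes the quorum formula given in the
TruCluster Cluster Administration manual, round_down((expected votes + 2) / 2),
and that this member contributes one node vote. The function names are
only illustrative and are not part of any TruCluster tool.

```python
def quorum_votes(expected_votes: int) -> int:
    """Votes required for quorum, assuming the documented formula
    round_down((cluster_expected_votes + 2) / 2)."""
    return (expected_votes + 2) // 2


def has_quorum(current_votes: int, expected_votes: int) -> bool:
    """True when the votes currently present reach the quorum threshold."""
    return current_votes >= quorum_votes(expected_votes)


if __name__ == "__main__":
    # Settings passed at the boot prompt: cluster_expected_votes=1 and
    # cluster_qdisk_votes=0. Assumption: this member has one node vote.
    node_votes = 1
    qdisk_votes = 0
    expected = 1
    print("quorum votes needed:", quorum_votes(expected))
    print("has quorum:", has_quorum(node_votes + qdisk_votes, expected))
```

If that reading is right, a single member with one vote should reach
quorum on its own, which fits the "quorum (re)gained" message in the
full boot log.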

The boot starts, but hangs after the following messages:

Waiting for cluster mount to complete
clsm: checking for peer configurations
clsm: initialized
CNX QDISK: Successfully claimed quorum disk, adding 0 vote.

Did I miss restoring something, or does anyone have a good idea
how to get the boot process to continue? Pressing <Ctrl>-C does not
get it to continue.

Below is the complete boot sequence:
>>>boot -flags ai dkb100.1.0.15.0
(boot dkb100.1.0.15.0 -flags ai)
block 0 of dkb100.1.0.15.0 is a valid boot block
reading 19 blocks from dkb100.1.0.15.0
bootstrap code read in
base = 200000, image_start = 0, image_bytes = 2600(9728)
initializing HWRPB at 2000
initializing page table at 3ff2e000
initializing machine state
setting affinity to the primary CPU
jumping to bootstrap code

UNIX boot - Wednesday October 16, 2002

Enter: <kernel_name> [option_1 ... option_n]
  or: ls [name]['help'] or: 'quit' to return to console
Press Return to boot 'vmunix'
# vmunix clubase:cluster_expected_votes=1 clubase:cluster_qdisk_votes=0
Loading vmunix ...
Loading at 0xfffffc0000230000

Sizes:
text = 8225600
data = 1820992
bss = 3934416
Starting at 0xfffffc00002431c0

Loading vmunix symbol table ... [2098984 bytes]
Kernel argument clubase:cluster_expected_votes=1
Kernel argument clubase:cluster_qdisk_votes=0
Memory trolling not supported, cpu Major id 11, Minor id 6
Alpha boot: available memory from 0x2e44000 to 0x3ff2c000
Compaq Tru64 UNIX V5.1B (Rev. 2650); Tue Jan 20 15:26:53 MET 2004
physical memory = 1024.00 megabytes.
available memory = 976.90 megabytes.
using 3860 buffers containing 30.15 megabytes of memory
Firmware revision: 6.5-15
PALcode: UNIX version 1.92-73
AlphaServer DS10 617 MHz
pci0 (primary bus:0) at nexus
isa0 at pci0
gpc0 at isa0
gpc1 not probed
ace0 at isa0
ace1 at isa0
lp0 at isa0
fdi0 at isa0
fd0 at fdi0 unit 0
tu0: DECchip 21143: Revision: 4.1
tu0: auto negotiation capable device
tu0 at pci0 slot 9
tu0: DEC TULIP (10/100) Ethernet Interface, hardware address: 00-10-64-30-CE-E9
tu0: auto negotiation off: selecting 100BaseTX (UTP) port: full duplex
tu1: DECchip 21143: Revision: 4.1
tu1: auto negotiation capable device
tu1 at pci0 slot 11
tu1: DEC TULIP (10/100) Ethernet Interface, hardware address: 00-10-64-30-CE-E8
tu1: auto negotiation off: selecting 100BaseTX (UTP) port: full duplex
ata0 at pci0 slot 13
ata0: ACER M1543C
scsi0 at ata0 slot 0 rad 0
scsi1 at ata0 slot 1 rad 0
itpsa0 at pci0 slot 14
IntraServer ROM Version V2.0 (c)1998
scsi2 at itpsa0 slot 0 rad 0
isp0 at pci0 slot 15
isp0: QLOGIC ISP1040B/V2 - Differential Mode
isp0: Firmware revision 5.57 (loaded by console)
isp0: Fast RAM timing enabled.
scsi3 at isp0 slot 0 rad 0
alt0 at pci0 slot 17
alt0: DEGPA (1000BaseSX) Gigabit Ethernet Interface, hardware address: 00-60-CF-20-C2-CA
alt0: Driver Rev = V2.0.16 NUMA, Chip Rev = 6, Firmware Rev = 12.4.12
Created FRU table binary error log packet
kernel console: ace0
dli: configured
NetRAIN configured.
Random number generator configured.
rm_configure_callback: No valid MC adapter in the system
TruCluster Server V5.1B (Rev. 1029); 09/29/03 03:10
clubase: configured
Configuring RDG to use TCP
ics_hl: Configuring TCP as transport.
ics_ll_tcp: cluster network interface started: rendezvous port is 900
ics_tcp_init: Declaring this_node up 1
icsnet: configured
drd configured 0
kch: configured
dlm: configured
Starting CFS daemons
Registering CFS Services
Initializing CFSREC ICS Service
Registering CFSMSFS remote syscall interface
Registering CMS Services
TNC kproc_creator_daemon: Initialized and Ready
i2c: Server Management Hardware Present
alt0: 1000 Mbps full duplex Link Up via autonegotiation
CNX MGR: Cluster fornax incarnation 0x787fc has been formed
CNX MGR: Founding node id is 1 csid is 0x10001
CNX MGR: membership configuration index: 1 (1 additions, 0 removals)
CNX MGR: quorum (re)gained, (re)starting cluster operations.
CNX MGR: Node lupus 1 incarn 0x7842b csid 0x10001 has been added to the cluster
Joining versw kch set.
dlm: resuming lock activity
kch: resuming activity
scsi2: SCSI Bus was reset
cam_logger: SCSI event packet
cam_logger: bus 2
itpsa SCSI HBA
SCSI Bus was reset

scsi2: SCSI Bus was reset
cam_logger: SCSI event packet
cam_logger: bus 2
itpsa SCSI HBA
SCSI Bus was reset

cam_logger: SCSI event packet
cam_logger: bus 3
isp_reinit
Beginning Adapter/Chip reinitialization (0x1)
cam_logger: SCSI event packet
cam_logger: bus 3
isp_cam_bus_reset_tmo
SCSI Bus Reset performed
WARNING: Magic number on ADVFS portion of CNX partition on boot disk is not valid

Waiting for cluster mount to complete
clsm: checking for peer configurations
clsm: initialized
CNX QDISK: Successfully claimed quorum disk, adding 0 vote.



