E4500 cluster reboot & A5200

From: Julian Grunnell (julian.grunnell@pipex.net)
Date: Wed Mar 30 2005 - 07:16:29 EST


Hi - I'm hoping someone can shed some light on a problem encountered this
morning. The platform is:

2 x E4500s / 4 CPUs / 4 GB RAM
2 x A5200 Photons / 22 x 18 GB disks

Each server has a fibre connection to the Photons. The servers are running
Solaris 2.6 and Sun Cluster 2.2, and the disks are under Veritas Volume
Manager 2.6 control. Yep, I know the software is very old ;-)

It would appear from the messages file that a panic on CPU5 caused the
initial reboot; I've seen these before, and the CPU may need to be
replaced. The cluster actioned a failover, and the one remaining node now
has control of all the disks, disk groups, and services.

BUT when the other node came back up, and before I went to start the
cluster software, I noticed that both "vxdisk list" and "format" show all
the disks as errored, or in some other errored state. The "good" node does
not, and can see ALL the disks from the O/S (format), via Veritas, and
when using "luxadm dis".

Output is:

"vxdisk list"
c0t1d0s2 sliced - - error
c0t2d0s2 sliced - - error
c0t3d0s2 sliced - - error
c0t4d0s2 sliced - - error
c0t5d0s2 sliced - - error
c0t6d0s2 sliced - - error
c0t7d0s2 sliced - - error
c0t8d0s2 sliced - - error
c0t9d0s2 sliced - - error
Etc...
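For what it's worth, the errored state can be confirmed in bulk rather than by eye. This is just a sketch run against a captured snippet of the "vxdisk list" output above (the /tmp path is an arbitrary example; on the affected node you would pipe the live command output instead):

```shell
# Captured snippet of the "vxdisk list" output shown above; on the
# live node you would use the real command output instead.
cat <<'EOF' > /tmp/vxdisk.out
c0t1d0s2 sliced - - error
c0t2d0s2 sliced - - error
c0t3d0s2 sliced - - error
EOF

# Column 5 is the vxdisk status field; count the disks in "error".
awk '$5 == "error" { n++ } END { print n " disk(s) in error state" }' /tmp/vxdisk.out
```

For the snippet this prints "3 disk(s) in error state"; on the bad node every disk shows up, while the same pipeline on the "good" node reports none.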

"format"
         2. c0t2d0 <drive not available: reserved>
          /sbus@2,0/SUNW,socal@d,10000/sf@1,0/ssd@w210000203788df8a,0
       3. c0t3d0 <drive not available: reserved>
          /sbus@2,0/SUNW,socal@d,10000/sf@1,0/ssd@w210000203788fd77,0
       4. c0t4d0 <drive not available: reserved>
          /sbus@2,0/SUNW,socal@d,10000/sf@1,0/ssd@w210000203788ec03,0
       5. c0t5d0 <drive not available: reserved>
          /sbus@2,0/SUNW,socal@d,10000/sf@1,0/ssd@w21000020375a27a2,0
       6. c0t6d0 <drive not available: reserved>
          /sbus@2,0/SUNW,socal@d,10000/sf@1,0/ssd@w210000203788eecf,0
       7. c0t7d0 <drive not available: reserved>
Etc...

"luxadm dis ..."
                                   SENA
                                 DISK STATUS
SLOT FRONT DISKS (Node WWN) REAR DISKS (Node
WWN)
0 On (O.K.) 20000020375a25c7 On (Rsrv cnflt:A)
20000020375a247f
1 On (Rsrv cnflt:A) 200000203788e9de On (Rsrv cnflt:A)
2000002037bd0ca3
2 On (Rsrv cnflt:A) 200000203788df8a On (Rsrv cnflt:A)
2000002037e49c8c
3 On (Rsrv cnflt:A) 200000203788fd77 On (Rsrv cnflt:A)
20000020375a26f6
4 On (Rsrv cnflt:A) 200000203788ec03 On (Rsrv cnflt:A)
2000002037bab8a2
5 On (Rsrv cnflt:A) 20000020375a27a2 On (Rsrv cnflt:A)
200000203788ed6d
6 On (Rsrv cnflt:A) 200000203788eecf On (Rsrv cnflt:A)
20000020375a277e
7 On (Rsrv cnflt:A) 200000203788e7a7 On (Rsrv cnflt:A)
2000002037bac3b5
8 On (Rsrv cnflt:A) 2000002037ba2c75 On (Rsrv cnflt:A)
2000002037ba7e67
9 On (Rsrv cnflt:A) 2000002037966e78 On (Rsrv cnflt:A)
200000203788f5de
10 On (Rsrv cnflt:A) 20000020375a2322 On (Rsrv cnflt:A)
20000020375a2362
Etc...
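Similarly, the slots and WWNs holding a reservation conflict can be pulled out of the luxadm output mechanically. A sketch, assuming each logical row (slot, front status and WWN, rear status and WWN) has been re-joined onto one line as below; the /tmp path is again just an example:

```shell
# Two re-joined rows from the luxadm disk-status listing above.
cat <<'EOF' > /tmp/luxadm.out
0 On (O.K.) 20000020375a25c7 On (Rsrv cnflt:A) 20000020375a247f
1 On (Rsrv cnflt:A) 200000203788e9de On (Rsrv cnflt:A) 2000002037bd0ca3
EOF

# Collapse the two-word "Rsrv cnflt:A" status into one token so the
# fields line up, then report every disk whose status shows a conflict.
awk '{
    gsub(/Rsrv cnflt/, "Rsrv_cnflt")
    if ($3 ~ /cnflt/) print "slot " $1 " front " $4
    if ($6 ~ /cnflt/) print "slot " $1 " rear " $7
}' /tmp/luxadm.out
```

For the two sample rows this lists the rear disk of slot 0 and both disks of slot 1, i.e. everything except the one disk luxadm still reports as O.K.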

As stated previously, ALL the above commands work perfectly fine on the
"good" node.

I have physically checked over the server, Photons, cables, etc., and they
all look fine.

Any help, tips or solutions would be greatly appreciated.

Thanks - Julian.
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers



This archive was generated by hypermail 2.1.7 : Wed Apr 09 2008 - 23:30:26 EDT