V880 and CPU offline

From: Rich Bonfoey (Rich.Bonfoey@thenewstribune.com)
Date: Thu Feb 24 2005 - 15:23:41 EST


Hello

On the recommendation of a software vendor, to handle their database
application (Progress), the values seminfo_semmns and seminfo_semmnu
were increased from 3000 to 20000 each. On the reboot of the 880, which
runs Solaris 9, one of the CPUs went offline.

Question: did the change in the semaphore values cause this, or is there
indeed a problem with the CPU? Extract from messages:

Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 485337 kern.info] NOTICE: [AFT0] EDC Event detected by CPU1 at TL=0, errID 0x0000009a.ad45c7dc
Feb 24 09:38:16 tnt-pbs AFSR 0x00000010<EDC>.0000002c AFAR 0x000000a0.c8ab3810
Feb 24 09:38:16 tnt-pbs Fault_PC 0xfef7ffb4 Esynd 0x002c
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 514443 kern.info] [AFT0] errID 0x0000009a.ad45c7dc Data Bit 7 was in error and corrected
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 535269 kern.info] [AFT2] errID 0x0000009a.ad45c7dc PA=0x000000a0.c8ab3800
Feb 24 09:38:16 tnt-pbs E$tag 0x00000283.22924924 E$state_0 Modified
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x00) 0x00000000.00000032 0x0000003f.00000006 ECC 0x043
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x10) 0x00000006.001339c8 0x00000001.00000000 ECC 0x1d8
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x20) 0x00000000.00000040 0x0000004d.00000006 ECC 0x14e
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x30) 0x00000006.001339f8 0x00000001.00000000 ECC 0x0bc
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 929717 kern.info] [AFT2] D$ data not available
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 335345 kern.info] [AFT2] I$ data not available
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 151764 kern.info] NOTICE: [AFT0] WDC Event detected by CPU1 at TL=0, errID 0x0000009a.ad45c7dc
Feb 24 09:38:16 tnt-pbs AFSR 0x00000040<WDC>.0000002c AFAR 0x000000a0.c8ab3810
Feb 24 09:38:16 tnt-pbs Fault_PC 0xfef7ffb4 Esynd 0x002c
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 514443 kern.info] [AFT0] errID 0x0000009a.ad45c7dc Data Bit 7 was in error and corrected
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 685509 kern.info] NOTICE: [AFT0] UCC Event detected by CPU1 in User mode at TL=0, errID 0x0000009a.ad76cbe8
Feb 24 09:38:16 tnt-pbs AFSR 0x00000400<UCC>.0000002c AFAR 0x000000a0.c8ab3810
Feb 24 09:38:16 tnt-pbs Fault_PC 0xfef80de0 Esynd 0x002c
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 590688 kern.info] [AFT0] errID 0x0000009a.ad76cbe8 Data Bit 7 was in error and corrected
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 513633 kern.info] [AFT2] errID 0x0000009a.ad76cbe8 PA=0x000000a0.c8ab3800
Feb 24 09:38:16 tnt-pbs E$tag 0x00000283.22124922 E$state_0 Exclusive
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x00) 0x00000000.00000032 0x0000003f.00000006 ECC 0x043
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x10) 0x00000006.001339c8 0x00000001.00000000 ECC 0x1d8
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x20) 0x00000000.00000040 0x0000004d.00000006 ECC 0x14e
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x30) 0x00000006.001339f8 0x00000001.00000000 ECC 0x0bc
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 929717 kern.info] [AFT2] D$ data not available
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 335345 kern.info] [AFT2] I$ data not available
Feb 24 09:38:17 tnt-pbs SUNW,UltraSPARC-III+: [ID 865859 kern.info] NOTICE: [AFT0] UCC Event detected by CPU1 in User mode at TL=0, errID 0x0000009a.b33cfd18
Feb 24 09:38:17 tnt-pbs AFSR 0x00000400<UCC>.0000002c AFAR 0x000000a0.c8ab3810
Feb 24 09:38:17 tnt-pbs Fault_PC 0xff2d4198 Esynd 0x002c
Feb 24 09:38:17 tnt-pbs SUNW,UltraSPARC-III+: [ID 217014 kern.info] [AFT0] errID 0x0000009a.b33cfd18 Data Bit 7 was in error and corrected
Feb 24 09:38:17 tnt-pbs SUNW,UltraSPARC-III+: [ID 113563 kern.info] [AFT2] errID 0x0000009a.b33cfd18 PA=0x000000a0.c8ab3800
Feb 24 09:38:17 tnt-pbs E$tag 0x00000283.22122924 E$state_0 Modified
Feb 24 09:38:17 tnt-pbs SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x00) 0x00000000.00000032 0x0000003f.00000006 ECC 0x043
Feb 24 09:38:17 tnt-pbs SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x10) 0x00000006.001339c8 0x00000001.00000000 ECC 0x1d8
Feb 24 09:38:17 tnt-pbs SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x20) 0x00000000.00000040 0x0000004d.00000006 ECC 0x14e
Feb 24 09:38:17 tnt-pbs SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x30) 0x00000006.001339f8 0x00000001.00000000 ECC 0x0bc
Feb 24 09:38:17 tnt-pbs SUNW,UltraSPARC-III+: [ID 929717 kern.info] [AFT2] D$ data not available
Feb 24 09:38:17 tnt-pbs SUNW,UltraSPARC-III+: [ID 335345 kern.info] [AFT2] I$ data not available
Feb 24 09:38:17 tnt-pbs SUNW,UltraSPARC-III+: [ID 217284 kern.info] NOTICE: [AFT0] WDC Event detected by CPU1 at TL=0, errID 0x0000009a.b33cfd18
Feb 24 09:38:17 tnt-pbs AFSR 0x00000040<WDC>.0000002c AFAR 0x000000a0.c8ab3810
Feb 24 09:38:17 tnt-pbs Fault_PC 0xff2d4198 Esynd 0x002c
Feb 24 09:38:17 tnt-pbs SUNW,UltraSPARC-III+: [ID 217014 kern.info] [AFT0] errID 0x0000009a.b33cfd18 Data Bit 7 was in error and corrected
Feb 24 09:38:17 tnt-pbs SUNW,UltraSPARC-III+: [ID 765675 kern.notice] NOTICE: [AFT1] CPU1 offlined due to more than 2 xxC Events in 24:00:00 (hh:mm:ss)
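
For anyone reading an extract like the one above, a small Python sketch may help tally what was reported. It counts the [AFT0] events by type and CPU and counts the distinct AFAR values. The names SAMPLE and summarize are made up for illustration, and SAMPLE holds only the event and AFSR/AFAR lines copied from the extract. It only summarizes the messages and does not by itself say whether the CPU needs replacing.

```python
import re
from collections import Counter

# Event and AFSR/AFAR lines copied from the extract above (other lines omitted).
# A real run would read the text from the messages file instead.
SAMPLE = """\
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 485337 kern.info] NOTICE: [AFT0] EDC Event detected by CPU1 at TL=0, errID 0x0000009a.ad45c7dc
Feb 24 09:38:16 tnt-pbs AFSR 0x00000010<EDC>.0000002c AFAR 0x000000a0.c8ab3810
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 151764 kern.info] NOTICE: [AFT0] WDC Event detected by CPU1 at TL=0, errID 0x0000009a.ad45c7dc
Feb 24 09:38:16 tnt-pbs AFSR 0x00000040<WDC>.0000002c AFAR 0x000000a0.c8ab3810
Feb 24 09:38:16 tnt-pbs SUNW,UltraSPARC-III+: [ID 685509 kern.info] NOTICE: [AFT0] UCC Event detected by CPU1 in User mode at TL=0, errID 0x0000009a.ad76cbe8
Feb 24 09:38:16 tnt-pbs AFSR 0x00000400<UCC>.0000002c AFAR 0x000000a0.c8ab3810
Feb 24 09:38:17 tnt-pbs SUNW,UltraSPARC-III+: [ID 865859 kern.info] NOTICE: [AFT0] UCC Event detected by CPU1 in User mode at TL=0, errID 0x0000009a.b33cfd18
Feb 24 09:38:17 tnt-pbs AFSR 0x00000400<UCC>.0000002c AFAR 0x000000a0.c8ab3810
Feb 24 09:38:17 tnt-pbs SUNW,UltraSPARC-III+: [ID 217284 kern.info] NOTICE: [AFT0] WDC Event detected by CPU1 at TL=0, errID 0x0000009a.b33cfd18
Feb 24 09:38:17 tnt-pbs AFSR 0x00000040<WDC>.0000002c AFAR 0x000000a0.c8ab3810
"""

# \s+ lets the patterns match even when mail wrapping split a line at a space.
EVENT_RE = re.compile(r"\[AFT0\]\s+(\w+)\s+Event\s+detected\s+by\s+(CPU\d+)")
AFAR_RE = re.compile(r"AFAR\s+(0x[0-9a-f]+\.[0-9a-f]+)")


def summarize(text):
    """Return (events, afars): counts per (event type, CPU) and per AFAR value."""
    events = Counter(EVENT_RE.findall(text))
    afars = Counter(AFAR_RE.findall(text))
    return events, afars


if __name__ == "__main__":
    events, afars = summarize(SAMPLE)
    for (kind, cpu), count in sorted(events.items()):
        print(f"{cpu} {kind}: {count}")
    for addr, count in sorted(afars.items()):
        print(f"AFAR {addr}: {count}")
```

Run on the lines above, every event is attributed to CPU1 and carries the same AFAR, which matches the "Data Bit 7" and PA values repeated throughout the extract.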

Thanks in advance for your help

Richard Bonfoey
The News Tribune
Information Systems
Successfully Meeting the Business Needs of
The News Tribune through Information Technology
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


