crash

From: paf1@email.cz
Date: Thu Jul 29 2004 - 07:05:16 EDT


Hi,

Can anybody explain what exactly is wrong, please?

It looks like a QBB0 backplane issue, but who knows ...

WSEA - analyzing
******************************************************************************

---------- Problem Found: Duplicate tag parity error detected in the DTAG of SoftQbb0 (HardQbb0) at Mon 26 Jul 2004 14:09:19 GMT+01:00 ----------

Problem Report Times:
    Event Time: Thu 15 Jul 2004 19:40:25 GMT+03:00
    Report Time: Mon 26 Jul 2004 14:09:19 GMT+01:00
    Expiration Time: Thu 15 Jul 2004 19:40:25 GMT+03:00

Managed Entity:
System Name : ux23
System Type : AlphaServer GS80 6/731
System Serial : AY12302979
OS Type : Tru64 UNIX/Compaq Tru64 UNIX V5.1B (Rev. 2650)

Service Obligation Data:

   Service Obligation: Valid
   Service Obligation Number: AY85151089
   System Serial Number: AY85151089
   Service Provider Company Name: Hewlett-Packard Company

Brief Description:
Duplicate tag parity error detected in the DTAG of SoftQbb0 (HardQbb0)

Callout ID:
Theory Code : 0x03C003000007AF05
HQBB.Ent.Flt : 0.24.3

Severity:
1

Reporting Node:
zacerp

Full Description:
The Dtag control register (DTAG_CONTROL) indicated a parity or double bit ECC
error in the duplicate tag store. Before the Dtag writes information in its tag
rams it generates parity (dtag1) or ECC (dtag2). Therefore the fact that we did
get this error on a read means the problem is caused by the tag rams or by the
Dtag control logic. Dtag rams and the control logic are part of the QBB
backplane. The error was detected in Duplicate Tag 0 block 3. This error causes
a system fault condition.

FRU List:

Probability : High
Fru Manufacturer : -
Fru Model : -
Fru PartNumber : 54-30354-03.B01
Fru SerialNumber : AY10547388
Fru FirmwareRev : -
Fru SiteLocation : -
Fru CabinetId : 600mm 4P System Cabinet
Fru Position : 4P System Cabinet, Full Depth, box, located 4 from the bottom
Fru Chassis : System Drawer
Fru Assembly : -
Fru Subassembly : -
Fru Slot : -

Evidence:
Time of Event : 15 Jul 2004 15:31:28 GMT+03:00 (Thu)
Unique ID : 9572.0 (cdl)
Analysis Revision : GS320_UCE_RULE V6.5 (06may2004)

SEA Version:
System Event Analyzer for Tru64 UNIX V4.3.3 (Build 40)

****************************************************************************************

WSEA - full output event

Event: 1586
Description: Console Data Log Event at Thu 15 Jul 2004 19:40:25 GMT+03:00 from ux23
File: ./binary.errlog
================================================================================

COMMON EVENT HEADER (CEH) V2.0
Event_Leader xFFFF FFFE
Header_Length 260
Event_Length 680
Header_Rev_Major 2
Header_Rev_Minor 0
OS_Type 1 -- Tru64 UNIX
Hardware_Arch 4 -- Alpha
CEH_Vendor_ID 3,564 -- Hewlett-Packard Company
Hdwr_Sys_Type 35 -- GS40/80/160/320 Series
Logging_CPU 0 -- CPU Logging this Event
CPUs_In_Active_Set 1
Major_Class 113
Minor_Class 0
Entry_Type 113 -- Console Data Log Event
DSR_Msg_Num 1,967 -- AlphaServer GS80
Chip_Type 11 -- EV67 - 21264A
CEH_Device 255
CEH_Device_ID_0 x0000 03FF
CEH_Device_ID_1 x0000 0007
CEH_Device_ID_2 x0000 0007
Unique_ID_Count 0
Unique_ID_Prefix 9,572
Num_Strings 5

TLV Section of CEH
TLV_DSR_String AlphaServer GS80 6/731
TLV_OS_Version Compaq Tru64 UNIX V5.1B (Rev. 2650)
TLV_Sys_Serial_Num AY12302979
TLV_Time_as_Local Thu 15 Jul 2004 19:40:25 GMT+03:00
TLV_Computer_Name ux23
Entry_Type 113

Console_Data_log

START OF SUBPACKETS IN THIS EVENT

Halt Frame Header Subpacket - V1.0
Time_Stamp x0000 3407 0F0D 1F1C Time Stamp
   Seconds[7:0] 28 Seconds
   Minutes[15:8] 31 Minutes
   Hours[23:16] 13 Hours Unix = GMT Ovms = Local
   Day[31:24] 15 Day
   Month[39:32] 7 July
   Year[47:40] 52 2004
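
As a reading aid, the Time_Stamp quadword above can be unpacked by hand. This is a minimal Python sketch, assuming the one-byte-per-field layout printed in the field labels (Seconds[7:0] up to Year[47:40]). The year base of 1952 is not documented anywhere in this log; it is only inferred from the decoder showing the raw value 52 as 2004. The function name is made up for illustration.

```python
RAW_TIME_STAMP = 0x000034070F0D1F1C  # Time_Stamp value copied from the log above

# Inferred from "Year[47:40] 52 2004"; an assumption, not a documented constant.
ASSUMED_YEAR_BASE = 1952

def decode_time_stamp(raw):
    """Split the quadword into one-byte fields, lowest byte first."""
    names = ["seconds", "minutes", "hours", "day", "month", "year"]
    fields = {name: (raw >> (8 * i)) & 0xFF for i, name in enumerate(names)}
    fields["year"] += ASSUMED_YEAR_BASE
    return fields

stamp = decode_time_stamp(RAW_TIME_STAMP)
print("{year:04d}-{month:02d}-{day:02d} "
      "{hours:02d}:{minutes:02d}:{seconds:02d}".format(**stamp))
```

The decoded fields agree with the values the WSEA output lists under Time_Stamp (15 July 2004, 13:31:28).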

System Machine Check Error Frame Subpacket - Version 1
whami 0 CPU Reporting Error
frame_size x0000 00E8
frame_flags x0000 0000
processor_offset x0000 0018
system_offset x0000 00A0
mchk_code x0000 0200
   ev6_mchk_code[31:0] x200 660 - System Fault
frame_revision x0000 0001 GS80-160-320 BitToText Revision=2106.2002.01
i_stat x0000 0000 0000 0000 IBox Status Register
dc_stat x0000 0000 0000 0000 Dcache Status Register
c_addr x0000 0000 0000 0000 Cbox read register field
   error_address[42:6] x0 Error Address of last reported ECC or Parity error
c_syndrome_1 x0000 0000 0000 0000 CBox Syndrome 1
   upper_qw_syndrome[7:0] x0 Syndrome for Upper Quadword
c_syndrome_0 x0000 0000 0000 0000 Cbox Syndrome 0
   lower_qw_syndrome[7:0] x0 Syndrome for Lower Quadword
c_stat x0000 0000 0000 0000 CBox Read C_STAT
c_sts x0000 0000 0000 0000 CBox Read Register C_STS
   block_status[3:0] x0 Shared
mm_stat x0000 0000 0000 0000 Memory Management Status Register
   opcode[9:4] x0 Opcode of the Instruction that Caused the Error
exc_addr x0000 0000 0000 0000 Exception Address Register
   pc[63:2] x0 Exception Address
ier_cm x0000 0000 0000 0000 Interrupt Enable and Current Processor Mode Register
   cm[4:3] x0 Kernel
   asten[13] x0 AST Interrupt Enable
   sien[28:14] x0 Software Interrupt Enables
   pcen[30:29] x0 Performance Counter Interrupt Enables
   eien[38:33] x0 External Interrupt Enable
isum x0000 0000 0000 0000 Interrupt Summary Register
   astk[3] x0
   aste[4] x0
   asts[9] x0
   astu[10] x0
   si[28:14] x0
   pc[30:29] x0
   cr[31] x0
   sl[32] x0
   ei[38:33] x0
pal_base x0000 0000 0000 0000 Pal Base Register
   pal_base[43:15] x0 Base Physical Address for PALcode
i_ctl x0000 0000 0000 0000 Ibox Control Register
   ic_en[2:1] x0
   spe[5:3] x0
   sde[7:6] x0
   sbe[9:8] x0
   bp_mode[11:10] x0
   hwe[12] x0
   sl_xmit[13] x0
   sl_rcv[14] x0
   va_48[15] x0
   va_form_32[16] x0
   single_issue_h[17] x0
   pct0_en[18] x0
   pct1_en[19] x0
   call_pal_r23[20] x0
   mchk_en[21] x0
   tb_mb_en[22] x0
   bist_fail[23] x0
   chip_id[29:24] x0 ChipId = EV6 PASS 1
   vptp[47:30] x0
   sext[63:48] x0
process_context x0000 0000 0000 0000 Process Context Register
   ppce[1] x0 Process Performance Counting Enable
   fpe[2] x0 Floating Point Enable
   aster[8:5] x0 AST Enable
   astrr[12:9] x0 AST Request
   asn[46:39] x0 Address Space Number
uncorr_cpu_error_sum x0000 0000 0000 0001 Uncorrectable Error or Fault Summary
   QBB0[0] x1 QBB0 uncorrectable Error or Fault
QBB0_csrs_to_be_logged x0000 0000 0100 0000 Registers logged for QBB0:
   dtag0[24] x1 DTAG0
QBB1_csrs_to_be_logged x0000 0000 0000 0000
QBB2_csrs_to_be_logged x0000 0000 0000 0000
QBB3_csrs_to_be_logged x0000 0000 0000 0000
QBB4_csrs_to_be_logged x0000 0000 0000 0000
QBB5_csrs_to_be_logged x0000 0000 0000 0000
QBB6_csrs_to_be_logged x0000 0000 0000 0000
QBB7_csrs_to_be_logged x0000 0000 0000 0000

System Error Frame Header Subpacket - V1.0

DTag Error Frame Subpacket - Version 2
base_physical_address x0000 0FFF FFE0 0000 Base physical addess
   entity[22:18] x18 Duplicate Tag 0 (DTAG0)
   qbb_id[41:36] x3F QBB0
DTAG_CONTROL x0000 0000 0000 0011 DTAG Control Register
   ena_fault[0] x1 Enable DTAG Fault
   pe_sum[5:2] x4 Tag RAM Parity Error or ECC DBE Summary
DTAG_ERR_SUM x0000 0000 0000 0040 DTAG Error Summary Register
   bist_err_sum[3:0] x0 BIST ok
   nxm_err[6] x1 Non-existent memory error (ignore)
DTAG_ERR_CID x0000 0000 0000 0003 DTAG Error commander ID Register
   cid[5:0] x3 Commander ID
DTAG_ERR_CMD x0000 0000 0000 001B DTAG Error Command Register
   cmd[6:0] x1B Command
DTAG_ERR_ADDR_0 x0000 0000 0000 0004 DTAG Error Address 0 Register
DTAG_ERR_ADDR_1 x0000 0000 0000 00AF DTAG Error Address 1 Register
DTAG_ERR_ADDR_2 x0000 0000 0000 008E DTAG Error Address 2 Register
DTAG_ERR_ADDR_3 x0000 0000 0000 00D8 DTAG Error Address 3 Register
DTAG_ECC_CONTROL x0000 0000 0000 008E DTagII ECC Control Register
   SBE_Err_Sum[3:0] xE DTag SRAM sub-block 1, 2 and 3 detected a Single Bit Error (SBE)
   Ena_SBE_Interrupt[5] x0 Disable DTag ECC SBE interrupts
   Ena_ECC[6] x0 Disable DTag ECC on DTAG RAMs
   Force_SBE[7] x1 Force an DTag ECC SBE
DTAG_ECC_SYNDROME x0000 0000 0000 008E DTagII ECC Syndrome Register
   ECC_Syndrome[5:0] xE ECC syndrome value
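
For anyone who wants to cross-check the decoded fields against the raw register values, here is a minimal Python sketch. It assumes only the bit ranges printed next to each field name in the output above; the helper `field` and the constant names are hypothetical and not part of WSEA.

```python
def field(value, hi, lo):
    """Return bits [hi:lo] (inclusive) of value as an integer."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)

# Raw register values copied from the subpackets above.
QBB0_CSRS_TO_BE_LOGGED = 0x0000000001000000
BASE_PHYSICAL_ADDRESS = 0x00000FFFFFE00000
DTAG_CONTROL = 0x0000000000000011
DTAG_ECC_CONTROL = 0x000000000000008E

decoded = {
    "dtag0[24]": field(QBB0_CSRS_TO_BE_LOGGED, 24, 24),
    "entity[22:18]": field(BASE_PHYSICAL_ADDRESS, 22, 18),
    "qbb_id[41:36]": field(BASE_PHYSICAL_ADDRESS, 41, 36),
    "ena_fault[0]": field(DTAG_CONTROL, 0, 0),
    "pe_sum[5:2]": field(DTAG_CONTROL, 5, 2),
    "SBE_Err_Sum[3:0]": field(DTAG_ECC_CONTROL, 3, 0),
    "Ena_SBE_Interrupt[5]": field(DTAG_ECC_CONTROL, 5, 5),
    "Ena_ECC[6]": field(DTAG_ECC_CONTROL, 6, 6),
    "Force_SBE[7]": field(DTAG_ECC_CONTROL, 7, 7),
}

for name, value in decoded.items():
    print(f"{name:22s} x{value:X}")
```

The extracted values match what the decoder printed (entity x18, qbb_id x3F, pe_sum x4, SBE_Err_Sum xE). The sketch only reproduces the bit slicing; it says nothing about which FRU is actually at fault.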

Thanks for any info,
Jiri




This archive was generated by hypermail 2.1.7 : Sat Apr 12 2008 - 10:50:05 EDT