2ND SUMMARY: UPDATE/Alpha particles and cosmic rays - Bcache Tag Parity Error

From: David.Knight@clubcorp.com
Date: Tue Sep 14 2004 - 10:02:20 EDT


Managers,
Here is what I have found. Thanks to all who replied; I really appreciate
it!
It seems that several IT shops out there have been given this same story
by HP, and even by Sun Microsystems.
However, HP has not blamed only cosmic causes. Some respondents report
that HP also pointed the finger at RFI/RFC soundwaves.
As one response recommended, I searched through some blogs/archives on
solar activity and found (listed below under Reports) that there were
indeed reports of solar activity on or near the dates of our BCACHE
errors.
I'm still not sure I am a believer; however, several IT personnel out
there are.
Below is some interesting reference documentation that I was referred to,
along with a listing of the sites that record solar events.

Thanks again to all that helped with this cosmic issue.

-David

Here is a note from a respected HP engineer who makes a good argument:

It's basic physics. Both alpha particles and cosmic rays are what is
called "ionizing radiation" -- stuff that when it interacts with other
matter can induce ionization or in other words random electrical charge
perturbations. When this happens in the context of, say, your radio
or TV, it's heard or seen as "static". When it happens in your CPU
or memory, it's apt to change a bit somewhere from a one to zero or
vice versa. But usually it induces a single bit error. Depending
on the part involved, such an error may be undetected, detected and
corrected, or detected but not correctable. Parts that have parity
checking (some data paths in the CPU but not all, most cache memory)
but not EDC/ECC (error detection and correction) are susceptible to
this "static" just as much as parts that either don't detect errors
(often inside CPU chip registers, for example) or detect and correct
errors; all it takes is one sufficiently energetic interaction in
the wrong place and you've got an error. With "parity only" parts
like the Bcache, such an error (if reported to the OS) is usually
treated as fatal. Of course, if no OS is running, you can get the
error and it will have no effect. And while sometimes the error
that is detected is actually irrelevant, as a general rule (since
the system software has to assume that the contents of memory and
persistent storage matter and that data integrity is paramount)
the only safe thing to do in the face of such an error is to halt
the system (i.e., "panic").

Parity errors can, of course, have other causes as well, including
defective parts (either defective in design, or defective as a side
effect of aging, usually due to heat stress). In most computer
system applications, there is enough shielding against electrical
and electronic "emissions" that alpha particles and cosmic rays are
an unlikely cause of parity errors in the kinds of components that
are provided with only error detection. The more likely cause in
most cases where a part that had been reliable starts failing is
heat induced failure.
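
To illustrate the distinction the note draws between "parity only" parts
and ECC-protected parts, here is a minimal Python sketch. It is a toy
model under stated assumptions, not the actual Bcache logic: the function
names are made up for illustration, the parity scheme is plain even
parity over one word, and a Hamming(7,4) code stands in for ECC (real
hardware generally uses wider codes). It shows why a single flipped bit
can be detected but not repaired with parity alone, while an ECC code
can locate and correct it.

```python
# Toy model only: function names and word sizes are illustrative
# assumptions, not the real Bcache or memory-controller design.

def parity_bit(word):
    """Even-parity bit: 1 if the word has an odd number of 1-bits."""
    return bin(word).count("1") % 2


def parity_ok(word, stored_parity):
    """True if the word still matches the parity stored alongside it."""
    return parity_bit(word) == stored_parity


def hamming74_encode(nibble):
    """Encode 4 data bits as a 7-bit Hamming codeword (positions 1..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # covers positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # covers positions 4, 5, 6, 7
    return [p1, p2, d[0], p4, d[1], d[2], d[3]]


def hamming74_decode(code):
    """Return (nibble, syndrome); a nonzero syndrome is the 1-based
    position of the single bit that was flipped and then corrected."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the faulty bit back
    nibble = c[2] | (c[4] << 1) | (c[5] << 2) | (c[6] << 3)
    return nibble, syndrome


if __name__ == "__main__":
    word = 0b10110010
    stored = parity_bit(word)
    upset = word ^ (1 << 3)  # simulate one bit flipped by a particle strike
    # Parity notices the mismatch but cannot say which bit changed.
    print("parity still ok after one flip:", parity_ok(upset, stored))

    code = hamming74_encode(0b1011)
    code[5] ^= 1  # same kind of single-bit upset
    print("ECC decode (value, faulty position):", hamming74_decode(code))
```

With parity, all the hardware learns is that the word is bad, which is
why the OS has little choice but to treat the error as fatal. Note also
that two flips in the same word cancel out and pass the parity check
undetected.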

_____________________________________________________________________________

------A book on the topic has been published by Cypress Semiconductor
Corp:

http://www.eeproductcenter.com/showArticle.jhtml?articleID=46200051

-----Research document published by the Nuclear Physics Laboratory,
University of CO:

http://www.taek.gov.tr/taek/tudnaem/yayinlar/yayinlar_pdf/fundamental/Fundamental-42.PDF

----A scientific explanation of cosmic rays:

http://zebu.uoregon.edu/~js/glossary/cosmic_rays.html

Cosmic Reports from two different Sites:

--------- 1st:
http://data.gns.cri.nz/hazardwatch/2003_02_01_solararch.html

30.7.04
  A moderate geomagnetic storm occurred on 23-24 July, and on 25-26 July
  another, more severe, storm produced auroras in North America. A third
  geomagnetic storm reached extreme levels on 27-28 July and spectacular
  auroras were seen from Dunedin. All of these storms were caused by solar
  coronal mass-ejections associated with powerful flares.

19.12.03
  Low level geomagnetic storms caused by the wind stream from a solar
  coronal hole continued until 15 December. Activity is presently at low
  levels, but gusts from another coronal hole may strike the Earth's
  magnetosphere on 21 or 22 December, causing more geomagnetic
  disturbances.
 
 
21.2.03
  The Earth has been inside the high-speed wind stream from a solar
  coronal hole for the six days, but there have been only minor
  disturbances to Earth's magnetic field.
 
 
 
14.2.03
  Solar conditions have been quiet since the aurora of 1-3 February.
  However, disturbances to Earth's magnetic field may increase as the
  solar wind stream from a hole in the sun's corona impacts the Earth on
  Saturday or Sunday.

--------- 2nd:

http://www.bbso.njit.edu/cgi-bin/ActivityReport

BBSO Solar Activity Report 28-JUL-2004 17:29:42 UT
Sunny with light winds and fair seeing.
Solar activity has been at a slightly lower level with only C-class events
from NOAA 0652. Region continues to decay and is expected to produce
C-class and M-class events.

NOAA 0652, N07W72. Decaying beta-gamma region. Region continues to decay
both in sunspot area and magnetic complexity. Region has only produced
C-class events since yesterday. Expect C-class and M-class events to
continue.

NOAA 0653, S14W75. Decaying region.

NOAA 0654, N07E15. Simple beta region. Little change.

Positions are for July 28,2004 at 17:00 UT.

RF
 
~~~~
 
 
 Partly cloudy with high clouds.
 Solar activity has been low with multiple C-class events from NOAA 0525.
The largest flare was a C8.6 at 0931 UT today. Solar activity should
remain about the same with C-class events from NOAA 0525.
 
 NOAA 0520, S11W32. Slowly decaying beta region.
 
 NOAA 0521, S11W30. Decaying beta region.
 
 NOAA 0523, S15E48. Single stable sunspot.
 
 NOAA 0524, S08E44. Small beta region.
 
 NOAA 0525, N09E46. Beta-gamma region. Region remains mostly unchanged
from yesterday. Region produced a C8.6@0931 UT today. Expect C-class
events with a very slight chance for a low level M-class event.
 
 NOAA 0526, N12W64. Small beta region.
 
 NOAA 0527, S15W51. Small beta region.
 
 NOAA 052-, N08E75. Middle size single sunspot.
 
 Positions are for December 18,2003 at 17:00 UT.
 
 RF
 
~~~~~
 
 
 High thick clouds today.
 Solar activity has been very low; there is only one spotted region on the
disk.
 
 NOAA 0285, S12W36. Decaying plage.
 
 NOAA 0288, N11E50. Simple beta region. Region remains mostly unchanged
from yesterday. C-class event possible.
 
 Positions are for 16:00 UT.
 
 Solar activity is expected to remain very low. Region NOAA 0288 may
produce a low level C-class event.
 
 RF



This archive was generated by hypermail 2.1.7 : Sat Apr 12 2008 - 10:50:08 EDT