Bug 579958 - EDAC false positive UC errors with unbuffered ECC on 3200 controller
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 12
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-04-07 04:37 UTC by arth
Modified: 2010-12-03 16:14 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-12-03 16:14:38 UTC
Type: ---
Embargoed:


Description arth 2010-04-07 04:37:15 UTC
Description of problem:

After upgrading the kernel from 2.6.31.12-174.2.19 to 2.6.32.10-90, rsyslogd started spitting out multiple errors per second to syslog and to all console users, without pause.

 kernel:EDAC MC0: UE page 0x0, offset 0x0, grain 1073741824, row 3, labels ":": i3200 UE
 kernel:EDAC MC0: UE page 0x0, offset 0x0, grain 1073741824, row 7, labels ":": i3200 UE
 kernel:EDAC MC0: UE page 0x0, offset 0x0, grain 1073741824, row 2, labels ":": i3200 UE
 kernel:EDAC MC0: UE page 0x0, offset 0x0, grain 1073741824, row 6, labels ":": i3200 UE
(and so on -- the row changes between these four values and none other, the rest stays the same)
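
To check that only these four rows ever appear, the messages can be tallied per row. The following is a minimal Python sketch, assuming the log lines keep the format quoted above; the regular expression and the `tally_rows` helper are made up for illustration and are not part of any EDAC tool.

```python
import re
from collections import Counter

# Pulls the controller and row numbers out of UE messages in the format
# quoted in this report (an assumption; other drivers may log differently).
UE_LINE = re.compile(r"EDAC MC(\d+): UE page .*? row (\d+),")

def tally_rows(lines):
    """Count UE messages per (controller, row) pair."""
    counts = Counter()
    for line in lines:
        match = UE_LINE.search(line)
        if match:
            counts[(int(match.group(1)), int(match.group(2)))] += 1
    return counts

# The four sample lines from the report.
sample = [
    'kernel:EDAC MC0: UE page 0x0, offset 0x0, grain 1073741824, row 3, labels ":": i3200 UE',
    'kernel:EDAC MC0: UE page 0x0, offset 0x0, grain 1073741824, row 7, labels ":": i3200 UE',
    'kernel:EDAC MC0: UE page 0x0, offset 0x0, grain 1073741824, row 2, labels ":": i3200 UE',
    'kernel:EDAC MC0: UE page 0x0, offset 0x0, grain 1073741824, row 6, labels ":": i3200 UE',
]

for (mc, row), n in sorted(tally_rows(sample).items()):
    print(f"mc{mc} row {row}: {n} UE message(s)")
```

Feeding it the full syslog instead of `sample` would show whether any row other than 2, 3, 6 and 7 is ever reported.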

The memory tests OK with memtest86 running multiple iterations.
No problems are reported with kernel 2.6.31.12-174.2.19.

With 2.6.32, edac-util reports (after a couple of minutes runtime):
mc0: csrow2: ch0|ch1: 36 Uncorrected Errors
mc0: csrow3: ch0|ch1: 85 Uncorrected Errors
mc0: csrow6: ch0|ch1: 103 Uncorrected Errors
mc0: csrow7: ch0|ch1: 55 Uncorrected Errors
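
The same per-csrow counters can be read directly from sysfs, which is useful for watching how fast they grow. This is a rough Python sketch, assuming the csrow-based EDAC sysfs layout (`/sys/devices/system/edac/mc/mc0/csrowN/ue_count`) used by kernels of this period; the `read_ue_counts` helper is hypothetical.

```python
from pathlib import Path

def read_ue_counts(mc_dir="/sys/devices/system/edac/mc/mc0"):
    """Return {csrow name: uncorrected error count} for one memory controller.

    Assumes each csrow directory holds a 'ue_count' file containing a single
    integer. Returns an empty dict if the directory or files do not exist.
    """
    counts = {}
    for counter in sorted(Path(mc_dir).glob("csrow*/ue_count")):
        counts[counter.parent.name] = int(counter.read_text().strip())
    return counts

if __name__ == "__main__":
    for csrow, n in read_ue_counts().items():
        print(f"mc0: {csrow}: {n} Uncorrected Errors")
```

Running it twice a few seconds apart and comparing the results gives the error rate per csrow.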

How reproducible:

Install unbuffered ECC RAM on an Intel board with a 3200 controller.
In this case Kingston KVR800D2E5K2/4G on an Asus TS100-E5/PI4 server.

Install one of the newer kernel patch levels from Fedora or Red Hat.

Additional info:

Also see https://bugzilla.redhat.com/show_bug.cgi?id=564274
Worth noting that the user in that bug has the same error with the exact same grain value, with a different Red Hat patched kernel.

Comment 1 arth 2010-04-07 05:32:33 UTC
A "blacklist i3200_edac" in /etc/modprobe.d/local.conf quiets the errors, but is obviously not a fix.

Also worth noting:  The grain value reported in both bug reports, 1073741824, equals 0x40000000.
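
The conversion is easy to verify. A short Python check, using only the decimal value that appears in the log messages:

```python
grain = 1073741824  # decimal grain value from the EDAC messages in this report

print(hex(grain))        # hexadecimal form of the grain
print(grain == 1 << 30)  # True if the grain is exactly 2**30 bytes (1 GiB)
```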

Comment 2 Matthias Prager 2010-04-27 18:09:10 UTC
This bug seems to be related to, or a dup of, bug 564274.

Comment 3 Stefan Neufeind 2010-05-16 20:34:52 UTC
Yes, I also think this is a dup. I experienced the same problems reported here.

Comment 4 arth 2010-05-18 00:40:17 UTC
Thus the "Also see https://bugzilla.redhat.com/show_bug.cgi?id=564274" in the original bug report.  The main difference is that the original bug was opened for RHEL with 2.6.18, and this one reproduces it for Fedora with 2.6.32+ (but not earlier Fedora versions).

Comment 5 Bug Zapper 2010-11-03 17:42:28 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 6 Bug Zapper 2010-12-03 16:14:38 UTC
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

