Red Hat Bugzilla – Bug 463872
[LTC 6.0 FEAT] 201264:EDAC Support
Last modified: 2010-10-18 15:17:09 EDT
Emily J. Ratliff <firstname.lastname@example.org> - 2008-09-24 13:53 EDT
1. Feature Overview:
Feature Id: 
a. Name of Feature: EDAC Support
b. Feature Description
RAS enhancements require EDAC support for all chipsets in IBM systems (including proprietary ones).
We would also prefer the ability to disable EDAC drivers during install on platforms that don't
require them. The majority of IBM Intel and AMD based servers do memory and CPU predictive
failure analysis (PFA) in the BIOS with the help of the BMC. When EDAC drivers are loaded they will
poll for memory errors once a second. The EDAC drivers may read the same hardware status registers
that the BIOS is using for PFA. This can easily lead to interference between EDAC and the BIOS if
EDAC reads and clears the registers before the BIOS gets a chance to do so, and it can potentially
render the BIOS's PFA mechanism dysfunctional. Therefore, a mechanism (black/white listing or
perhaps something more robust) for disabling EDAC on platforms that already do PFA in firmware would
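
The interference described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the register class, its read-to-clear behavior, and the ordering (EDAC polls first, the BIOS handler runs afterwards) are hypothetical stand-ins and do not represent any particular chipset or BIOS.

```python
class ErrorStatusRegister:
    """Toy model of a read-to-clear hardware error status register."""

    def __init__(self):
        self._pending = 0  # number of latched corrected errors

    def latch_error(self):
        """The hardware records one corrected memory error."""
        self._pending += 1

    def read_and_clear(self):
        """Return the latched count and reset it, as a clearing read would."""
        count, self._pending = self._pending, 0
        return count


def simulate(edac_loaded):
    """Return (errors seen by EDAC, errors seen by the BIOS) for one interval."""
    reg = ErrorStatusRegister()
    reg.latch_error()
    reg.latch_error()
    # Assumption of this model: the EDAC poll reaches the register first.
    seen_by_edac = reg.read_and_clear() if edac_loaded else 0
    # The BIOS PFA handler runs afterwards and only sees what is left.
    seen_by_bios = reg.read_and_clear()
    return seen_by_edac, seen_by_bios


if __name__ == "__main__":
    print(simulate(edac_loaded=True))   # (2, 0): the BIOS never sees the errors
    print(simulate(edac_loaded=False))  # (0, 2): the BIOS sees both errors
```

With EDAC loaded in this model, the BIOS count stays at zero, so a PFA threshold based on that count would never trip.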
2. Feature Details:
Arch Specificity: Both
Affects Installer: Yes
Affects Kernel Modules: Yes
Delivery Mechanism: Request Red Hat development assistance
Request Type: Kernel - Enhancement from Upstream
d. Upstream Acceptance: In Progress
Sponsor Priority: 1
f. Severity: High
IBM Confidential: no
Code Contribution: IBM code
g. Component Version Target: The EDAC code for some IBM chipsets has been started. The target at
this point is probably 2.6.27. Additionally, there are components to this feature request that
require code from the distros (ability to disable/enable EDAC based on the platform type).
3. Business Case
Need to provide a mechanism for customers to be able to detect and report platform errors from Linux.
4. Primary contact at Red Hat:
5. Primary contacts at Partner:
Project Management Contact:
Monte Knutson, email@example.com, 877-894-1495
Kevin Stansell, firstname.lastname@example.org
Chris McDermott, email@example.com
Deneen T. Dock, firstname.lastname@example.org
If you want to do disabling of the EDAC code based on whether or not the firmware in the platform supports other methods, this should really be done via DMI matching in the kernel code itself.
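
In the kernel this kind of check is normally written as a `struct dmi_system_id` table passed to `dmi_check_system()`. The matching logic itself can be sketched in userspace Python against the DMI fields that sysfs exposes under /sys/class/dmi/id. This is an illustration only: the table entry below is a hypothetical placeholder rather than a real platform list, and the substring comparison is an assumption chosen to resemble the kernel's non-exact DMI match.

```python
from pathlib import Path

DMI_DIR = Path("/sys/class/dmi/id")

# Hypothetical table of platforms whose firmware already does PFA.
# Each entry maps a DMI field name to a substring that must be present.
FIRMWARE_PFA_PLATFORMS = [
    {"sys_vendor": "IBM", "product_name": "Example Server 1234"},
]


def read_dmi(fields=("sys_vendor", "product_name"), base=DMI_DIR):
    """Read the given DMI fields from sysfs; unreadable fields become ''."""
    values = {}
    for name in fields:
        try:
            values[name] = (base / name).read_text().strip()
        except OSError:
            values[name] = ""
    return values


def firmware_handles_pfa(dmi, table=FIRMWARE_PFA_PLATFORMS):
    """True if every substring of at least one table entry occurs in dmi."""
    return any(
        all(needle in dmi.get(field, "") for field, needle in entry.items())
        for entry in table
    )


if __name__ == "__main__":
    dmi = read_dmi()
    print("skip EDAC" if firmware_handles_pfa(dmi) else "load EDAC")
```

A static table like this only answers whether the platform is capable of firmware PFA, not whether the firmware is doing it at the moment.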
(In reply to comment #4)
> ------- Comment From email@example.com 2008-10-03 13:29:02 EDT-------
> If you want to do disabling of the EDAC code based on whether or not the
> firmware in the platform supports other methods, this should really be done via
> DMI matching in the kernel code itself.
Yes, agreed. However, this is slightly more complicated than just DMI matching. Since the BIOS setup can provide an option for disabling PFA, there needs to be a way to dynamically determine whether or not the BIOS is _currently_ handling PFA (through SMIs, typically). Race conditions can potentially occur if both the BIOS and Linux are handling errors simultaneously.
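
To make the point concrete, here is a minimal sketch of the decision once a runtime state is added to a static platform match. How that state would actually be obtained is the open question in this bug, so `PfaState` and both parameters are hypothetical, and treating an unknown state as "do not load" is only one possible conservative policy.

```python
from enum import Enum


class PfaState(Enum):
    """Hypothetical runtime state of firmware PFA handling."""
    ACTIVE = "active"      # BIOS is currently handling errors
    DISABLED = "disabled"  # PFA has been switched off in BIOS setup
    UNKNOWN = "unknown"    # no way to tell from the OS


def should_load_edac(platform_listed, pfa_state):
    """Decide whether to load EDAC drivers.

    platform_listed: True if a static (e.g. DMI) match says the platform
                     is capable of firmware PFA.
    pfa_state:       PfaState describing what the firmware is doing now.
    """
    if not platform_listed:
        return True   # no firmware PFA to interfere with
    if pfa_state is PfaState.DISABLED:
        return True   # firmware is not handling errors, so EDAC can
    # ACTIVE or UNKNOWN: stay out of the way to avoid racing the BIOS.
    return False


if __name__ == "__main__":
    for listed in (False, True):
        for state in PfaState:
            print(listed, state.value, should_load_edac(listed, state))
```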
It's unclear to me where this code should be - it's not as if userspace would have any better idea what BIOS option has been set. Can this be read from SMBIOS or similar?
Chris, can you please help with an answer to Bill's question in comment 3?
Max has been looking at this issue. I'll have him respond.
Is this bug still active? I just had some partners in APAC ask about it.
Yes, this BZ is still active.
Assigning this to Peter Bogdanovic at IBM.
IBM System x has ceased further EDAC driver development.
------- Comment From firstname.lastname@example.org 2009-12-08 20:34 EDT-------
This was quit in devtrack so doing the same here.