Bug 463872 - [LTC 6.0 FEAT] 201264:EDAC Support
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: alpha
Target Release: 6.0
Assigned To: James Takahashi
QA Contact: Martin Jenner
Keywords: FutureFeature
Depends On:
Blocks: 356741
 
Reported: 2008-09-24 23:00 EDT by IBM Bug Proxy
Modified: 2010-10-18 15:17 EDT (History)
CC List: 7 users

Doc Type: Enhancement
Last Closed: 2009-09-17 17:06:00 EDT
External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
IBM Linux Technology Center | 48473 | None | None | None | Never

Description IBM Bug Proxy 2008-09-24 23:00:22 EDT
=Comment: #0=================================================
Emily J. Ratliff <emilyr@us.ibm.com> - 2008-09-24 13:53 EDT
1. Feature Overview:
Feature Id:	[201264]
a. Name of Feature:	EDAC Support
b. Feature Description
RAS Enhancements: require EDAC support for all chipsets in IBM systems (including proprietary).
We would also prefer the ability to disable EDAC drivers during install on platforms that don't
require them.

The majority of IBM Intel- and AMD-based servers do memory and CPU predictive failure analysis
(PFA) in the BIOS with the help of the BMC. When EDAC drivers are loaded, they poll for memory
errors once a second. The EDAC drivers may read the same hardware status registers that the BIOS
uses for PFA. This can easily lead to interference between EDAC and the BIOS: if EDAC reads and
clears the registers before the BIOS gets a chance to do so, it can render the BIOS's PFA
mechanism dysfunctional. Therefore, a mechanism (blacklisting/whitelisting, or perhaps something
more robust) for disabling EDAC on platforms that already do PFA in firmware would be ideal.
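
The interference described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the register is modelled as a simple read-and-clear counter, the names `StatusRegister` and `simulate` are hypothetical, and the BIOS is modelled as a periodic reader for simplicity, although real firmware typically reacts through SMIs. The polling periods are illustrative, not measured values.

```python
class StatusRegister:
    """Toy model of a read-and-clear hardware error-status register."""

    def __init__(self):
        self._pending = 0

    def record_error(self):
        # A corrected memory error is latched in the register.
        self._pending += 1

    def read_and_clear(self):
        # Whoever reads first consumes the pending errors.
        count, self._pending = self._pending, 0
        return count


def simulate(ticks, error_ticks, edac_period, bios_period):
    """Return (errors seen by EDAC, errors seen by BIOS) after `ticks` steps.

    edac_period == 0 means the EDAC driver is not loaded.
    On a tick where both readers poll, EDAC is assumed to win the race.
    """
    reg = StatusRegister()
    edac_seen = bios_seen = 0
    for t in range(1, ticks + 1):
        if t in error_ticks:
            reg.record_error()
        if edac_period and t % edac_period == 0:
            edac_seen += reg.read_and_clear()
        if t % bios_period == 0:
            bios_seen += reg.read_and_clear()
    return edac_seen, bios_seen


if __name__ == "__main__":
    errors = {1, 2, 3, 4, 6, 7, 8, 9}
    # EDAC polling every tick drains the register before the BIOS looks.
    print(simulate(10, errors, edac_period=1, bios_period=5))
    # With EDAC disabled, the BIOS sees every error.
    print(simulate(10, errors, edac_period=0, bios_period=5))
```

In this model, a fast EDAC poller leaves the firmware's PFA counter at zero even though errors keep occurring, which is the failure mode the feature request wants to avoid.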

2. Feature Details:
Sponsor:	xSeries
Architectures:
x86
x86_64

Arch Specificity: Both
Affects Installer: Yes
Affects Kernel Modules: Yes
Delivery Mechanism: Request Red Hat development assistance
Category:	Kernel
Request Type:	Kernel - Enhancement from Upstream
d. Upstream Acceptance:	In Progress
Sponsor Priority	1
f. Severity: High
IBM Confidential:	no
Code Contribution:	IBM code
g. Component Version Target:	The EDAC code for some IBM chipsets has been started. The target at
this point is probably 2.6.27.   Additionally, there are components to this feature request that
require code from the distros (ability to disable/enable EDAC based on the platform type).

3. Business Case
Need to provide a mechanism for customers to be able to detect and report platform errors from Linux.

4. Primary contact at Red Hat: 
John Jarvis
jjarvis@redhat.com

5. Primary contacts at Partner:
Project Management Contact:
Monte Knutson, mknutson@us.ibm.com, 877-894-1495

Technical contact(s):
Kevin Stansell, kstansel@us.ibm.com
Chris McDermott, mcdermoc@us.ibm.com

IBM Manager:
Deneen T. Dock, deneen@us.ibm.com
Comment 1 Bill Nottingham 2008-10-03 13:29:02 EDT
If you want to do disabling of the EDAC code based on whether or not the firmware in the platform supports other methods, this should really be done via DMI matching in the kernel code itself.
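For illustration, the matching logic could look roughly like the sketch below. It is a userspace Python analogue, not kernel code: in the kernel this would normally be a `struct dmi_system_id` table checked with `dmi_check_system()`. The table entries and the names `FIRMWARE_PFA_PLATFORMS`, `read_dmi`, and `firmware_does_pfa` are hypothetical placeholders; only the `/sys/class/dmi/id` location is a real Linux sysfs path.

```python
from pathlib import Path

# Standard Linux sysfs location that exposes DMI identification strings.
DMI_DIR = Path("/sys/class/dmi/id")

# Hypothetical list of (vendor substring, product-name substring) pairs.
# These are placeholders, not a real list of affected systems.
FIRMWARE_PFA_PLATFORMS = [
    ("IBM", "ExampleServer 1000"),
    ("IBM", "ExampleServer 2000"),
]


def read_dmi(dmi_dir=DMI_DIR):
    """Read the vendor and product strings; missing files yield ''."""
    fields = {}
    for name in ("sys_vendor", "product_name"):
        try:
            fields[name] = (Path(dmi_dir) / name).read_text().strip()
        except OSError:
            fields[name] = ""
    return fields


def firmware_does_pfa(dmi, table=FIRMWARE_PFA_PLATFORMS):
    """Return True if the DMI strings match any listed platform.

    Substring matching is used here, similar in spirit to the kernel's
    DMI_MATCH semantics.
    """
    vendor = dmi.get("sys_vendor", "")
    product = dmi.get("product_name", "")
    return any(v in vendor and p in product for v, p in table)


if __name__ == "__main__":
    dmi = read_dmi()
    print(dmi, "-> skip EDAC" if firmware_does_pfa(dmi) else "-> EDAC allowed")
```

A static table like this only captures what the platform is capable of, not how the firmware is currently configured.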
Comment 2 IBM Bug Proxy 2008-10-08 15:40:47 EDT
(In reply to comment #4)
> ------- Comment From notting@redhat.com 2008-10-03 13:29:02 EDT-------
> If you want to do disabling of the EDAC code based on whether or not the
> firmware in the platform supports other methods, this should really be done via
> DMI matching in the kernel code itself.
>

Yes, agreed. However, this is slightly more complicated than just DMI matching. Since the BIOS setup can provide an option for disabling PFAs, there needs to be a way to dynamically determine whether or not the BIOS is _currently_ handling PFA (through SMIs, typically). There are potentially race conditions that can occur if both BIOS and Linux are handling errors simultaneously.
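
As a rough sketch of the decision being described: a static platform match would have to be combined with the firmware's current PFA state. The thread does not identify any interface for querying that state, so `PfaState` and `should_enable_edac` below are purely hypothetical, and treating an unknown state as "firmware is handling PFA" is an assumed conservative default.

```python
from enum import Enum


class PfaState(Enum):
    """Hypothetical runtime state of firmware PFA handling."""
    ACTIVE = "active"      # BIOS is currently handling PFA
    DISABLED = "disabled"  # PFA was turned off in BIOS setup
    UNKNOWN = "unknown"    # no way to tell


def should_enable_edac(platform_listed, pfa_state):
    """Decide whether the OS should take over error polling.

    platform_listed: True if the platform is known to support firmware PFA.
    pfa_state: the firmware's current PFA state, if it can be determined.
    """
    if not platform_listed:
        # Firmware never does PFA here, so EDAC cannot interfere with it.
        return True
    # On listed platforms, only load EDAC when firmware PFA is known to be
    # off; ACTIVE and UNKNOWN both keep EDAC disabled to avoid the race.
    return pfa_state is PfaState.DISABLED
```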
Comment 3 Bill Nottingham 2008-10-09 10:57:55 EDT
It's unclear to me where this code should be - it's not as if userspace would have any better idea what BIOS option has been set. Can this be read from SMBIOS or similar?
Comment 4 John Jarvis 2009-02-04 12:00:56 EST
Chris, can you please help with an answer to Bill's question in comment 3?
Comment 5 Chris McDermott 2009-02-04 13:56:53 EST
Max has been looking at this issue. I'll have him respond.
Comment 6 Michael Waite 2009-04-27 14:08:36 EDT
Is this bug still active? I just had some partners in APAC ask about it.
Comment 7 John Jarvis 2009-04-27 14:53:46 EDT
Yes, this BZ is still active.
Comment 8 Jesse Larrew 2009-06-29 18:43:17 EDT
Assigning this to Peter Bogdanovic at IBM.
Comment 9 Peter Bogdanovic 2009-09-17 17:06:00 EDT
IBM System x has ceased further EDAC driver development.
Comment 10 IBM Bug Proxy 2009-12-08 20:40:42 EST
------- Comment From sglass@us.ibm.com 2009-12-08 20:34 EDT-------
This was quit in devtrack so doing the same here.
