Bug 463872

Summary: [LTC 6.0 FEAT] 201264:EDAC Support
Product: Red Hat Enterprise Linux 6
Reporter: IBM Bug Proxy <bugproxy>
Component: kernel
Assignee: James Takahashi (IBM) <nobody+PNT0273897>
Status: CLOSED CURRENTRELEASE
QA Contact: Martin Jenner <mjenner>
Severity: high
Docs Contact:
Priority: high
Version: 6.0
CC: ejratl, jjarvis, jlarrew, lcm, mwaite, notting, peterm
Target Milestone: alpha
Keywords: FutureFeature
Target Release: 6.0   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-09-17 21:06:00 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 356741    

Description IBM Bug Proxy 2008-09-25 03:00:22 UTC
=Comment: #0=================================================
Emily J. Ratliff <emilyr.com> - 2008-09-24 13:53 EDT
1. Feature Overview:
Feature Id:	[201264]
a. Name of Feature:	EDAC Support
b. Feature Description
RAS Enhancements: Require EDAC support for all chipsets in IBM systems (including proprietary
ones). We would also prefer the ability to disable EDAC drivers during install on platforms that
don't require them.

The majority of IBM Intel- and AMD-based servers do memory and CPU predictive failure analysis
(PFA) in the BIOS with the help of the BMC. When EDAC drivers are loaded, they poll for memory
errors once a second. The EDAC drivers may read the same hardware status registers that the BIOS
uses for PFA. If EDAC reads and clears those registers before the BIOS gets a chance to do so, the
two interfere with each other, and the BIOS's PFA mechanism can be rendered dysfunctional.
Therefore, a mechanism (black/white listing or perhaps something more robust) for disabling EDAC
on platforms that already do PFA in firmware would be ideal.
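
For illustration, the interference described above can be modeled with a toy simulation. This is only a sketch under stated assumptions: the register is assumed to be a simple latched counter that is cleared by whoever reads it, and every name here (ErrorStatusRegister, errors_seen_by_bios, the event strings) is hypothetical and does not correspond to any real hardware or kernel interface.

```python
# Toy model only: all names are hypothetical, not real hardware or kernel APIs.
class ErrorStatusRegister:
    """A latched error counter that is cleared by whoever reads it."""

    def __init__(self):
        self.count = 0

    def record_error(self):
        self.count += 1

    def read_and_clear(self):
        seen, self.count = self.count, 0
        return seen


def errors_seen_by_bios(events, edac_loaded):
    """Replay a timeline and return how many errors the BIOS PFA logic observes.

    events is a sequence of "error", "edac_poll" and "bios_poll" strings.
    """
    reg = ErrorStatusRegister()
    bios_total = 0
    for event in events:
        if event == "error":
            reg.record_error()
        elif event == "edac_poll" and edac_loaded:
            reg.read_and_clear()  # EDAC consumes the pending errors first
        elif event == "bios_poll":
            bios_total += reg.read_and_clear()
    return bios_total


timeline = ["error", "edac_poll", "error", "edac_poll", "error", "bios_poll"]
print(errors_seen_by_bios(timeline, edac_loaded=False))  # 3
print(errors_seen_by_bios(timeline, edac_loaded=True))   # 1
```

In this model the BIOS sees all three errors when EDAC is not loaded, but only one when EDAC polls in between, so a firmware threshold based on the error count might never be reached.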

2. Feature Details:
Sponsor:	xSeries
Architectures:
x86
x86_64

Arch Specificity: Both
Affects Installer: Yes
Affects Kernel Modules: Yes
Delivery Mechanism: Request Red Hat development assistance
Category:	Kernel
Request Type:	Kernel - Enhancement from Upstream
d. Upstream Acceptance:	In Progress
Sponsor Priority	1
f. Severity: High
IBM Confidential:	no
Code Contribution:	IBM code
g. Component Version Target:	The EDAC code for some IBM chipsets has been started. The target at
this point is probably 2.6.27. Additionally, there are components to this feature request that
require code from the distros (ability to disable/enable EDAC based on the platform type).

3. Business Case
Need to provide a mechanism for customers to be able to detect and report platform errors from Linux.

4. Primary contact at Red Hat: 
John Jarvis
jjarvis

5. Primary contacts at Partner:
Project Management Contact:
Monte Knutson, mknutson.com, 877-894-1495

Technical contact(s):
Kevin Stansell, kstansel.com
Chris McDermott, mcdermoc.com

IBM Manager:
Deneen T. Dock, deneen.com

Comment 1 Bill Nottingham 2008-10-03 17:29:02 UTC
If you want to do disabling of the EDAC code based on whether or not the firmware in the platform supports other methods, this should really be done via DMI matching in the kernel code itself.
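
As a rough illustration of the matching idea, here is a userspace sketch in Python. It is not the in-kernel approach suggested above (the kernel has its own DMI matching helpers, such as dmi_check_system()); it only mirrors the logic by reading the DMI strings Linux exposes under /sys/class/dmi/id. The table contents and the function names are hypothetical, and a static match like this cannot tell whether PFA is currently enabled in BIOS setup.

```python
from pathlib import Path

DMI_DIR = Path("/sys/class/dmi/id")  # Linux sysfs location for DMI strings

# Hypothetical entries for illustration; not a real list of affected platforms.
FIRMWARE_PFA_PLATFORMS = [
    {"sys_vendor": "IBM", "product_name": "Example Server"},
]


def read_dmi(field, dmi_dir=DMI_DIR):
    """Return one DMI string, or "" if it is missing or unreadable."""
    try:
        return (Path(dmi_dir) / field).read_text().strip()
    except OSError:
        return ""


def matches(entry, dmi):
    """True if every field in entry is a substring of the system's value."""
    return all(wanted in dmi.get(field, "") for field, wanted in entry.items())


def should_skip_edac(dmi, table=FIRMWARE_PFA_PLATFORMS):
    """True if the system matches any entry in the firmware-PFA table."""
    return any(matches(entry, dmi) for entry in table)


if __name__ == "__main__":
    dmi = {f: read_dmi(f) for f in ("sys_vendor", "product_name")}
    print("skip EDAC" if should_skip_edac(dmi) else "load EDAC")
```

Substring matching is used here so that one entry can cover several model variants; an exact comparison would be the stricter alternative.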

Comment 2 IBM Bug Proxy 2008-10-08 19:40:47 UTC
(In reply to comment #4)
> ------- Comment From notting 2008-10-03 13:29:02 EDT-------
> If you want to do disabling of the EDAC code based on whether or not the
> firmware in the platform supports other methods, this should really be done via
> DMI matching in the kernel code itself.
>

Yes, agreed. However, this is slightly more complicated than just DMI matching. Since the BIOS setup can provide an option for disabling PFAs, there needs to be a way to dynamically determine whether or not the BIOS is _currently_ handling PFA (through SMIs, typically). There are potentially race conditions that can occur if both BIOS and Linux are handling errors simultaneously.

Comment 3 Bill Nottingham 2008-10-09 14:57:55 UTC
It's unclear to me where this code should be - it's not as if userspace would have any better idea what BIOS option has been set. Can this be read from SMBIOS or similar?

Comment 4 John Jarvis 2009-02-04 17:00:56 UTC
Chris, can you please help with an answer to Bill's question in comment 3?

Comment 5 Chris McDermott 2009-02-04 18:56:53 UTC
Max has been looking at this issue. I'll have him respond.

Comment 6 Michael Waite 2009-04-27 18:08:36 UTC
Is this bug still active? I just had some partners in APAC ask about it.

Comment 7 John Jarvis 2009-04-27 18:53:46 UTC
Yes, this BZ is still active.

Comment 8 Jesse Larrew 2009-06-29 22:43:17 UTC
Assigning this to Peter Bogdanovic at IBM.

Comment 9 Peter Bogdanovic 2009-09-17 21:06:00 UTC
IBM System x has ceased further EDAC driver development.

Comment 10 IBM Bug Proxy 2009-12-09 01:40:42 UTC
------- Comment From sglass.com 2009-12-08 20:34 EDT-------
This was quit in devtrack so doing the same here.