Bug 547762 - PCI AER: HEST FIRMWARE FIRST support
Summary: PCI AER: HEST FIRMWARE FIRST support
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.5
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 5.5
Assignee: Prarit Bhargava
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks: 496328
Reported: 2009-12-15 16:21 UTC by Prarit Bhargava
Modified: 2010-10-12 15:42 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-03-30 07:24:20 UTC
Target Upstream Version:
Embargoed:


Attachments
RHEL5 fix for this issue [1/2] (13.77 KB, patch)
2009-12-16 18:43 UTC, Prarit Bhargava
RHEL5 fix for this issue [2/2] (2.07 KB, patch)
2009-12-16 18:43 UTC, Prarit Bhargava


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2010:0178 | 0 | normal | SHIPPED_LIVE | Important: Red Hat Enterprise Linux 5.5 kernel security and bug fix update | 2010-03-29 12:18:21 UTC

Description Prarit Bhargava 2009-12-15 16:21:46 UTC
Description of problem:

Dell has realized that PCI AER support is going to be in RHEL5.5.  Their new Dell PowerEdge 11G systems require HEST FIRMWARE FIRST support, which was added to the upstream kernel after the initial PCI AER snapshot.

Without this support, PCI AER errors reported on domains other than 0 would not be handled correctly on their hardware.

Additionally, this request adds a PCI AER on/off switch for users who experience problems with PCI AER.

Version-Release number of selected component (if applicable): 2.6.18-180.el5

Comment 2 RHEL Program Management 2009-12-15 16:52:28 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 7 Prarit Bhargava 2009-12-16 18:19:32 UTC
Upstream patch (for RHKL reviewers):

commit 0584396157ad2d008e2cc76b4ed6254151183a25
Author: Matt Domsch <Matt_Domsch>
Date:   Mon Nov 2 11:51:24 2009 -0600

    PCI: PCIe AER: honor ACPI HEST FIRMWARE FIRST mode

    Feedback from Hidetoshi Seto and Kenji Kaneshige incorporated.  This
    correctly handles PCI-X bridges, PCIe root ports and endpoints, and
    prints debug messages when invalid/reserved types are found in the
    HEST.  PCI devices not in domain/segment 0 are not represented in
    HEST, thus will be ignored.

    Today, the PCIe Advanced Error Reporting (AER) driver attaches itself
    to every PCIe root port for which BIOS reports it should, via ACPI
    _OSC.

    However, _OSC alone is insufficient for newer BIOSes.  Part of ACPI
    4.0 is the new APEI (ACPI Platform Error Interfaces) which is a way
    for OS and BIOS to handshake over which errors for which components
    each will handle.  One table in ACPI 4.0 is the Hardware Error Source
    Table (HEST), where BIOS can define that errors for certain PCIe
    devices (or all devices), should be handled by BIOS ("Firmware First
    mode"), rather than be handled by the OS.

    Dell PowerEdge 11G server BIOS defines Firmware First mode in HEST, so
    that it may manage such errors, log them to the System Event Log, and
    possibly take other actions.  The aer driver should honor this, and
    not attach itself to devices noted as such.

    Furthermore, Kenji Kaneshige reminded us to disallow changing the AER
    registers when respecting Firmware First mode.  Platform firmware is
    expected to manage these, and if changes to them are allowed, it could
    break that firmware's behavior.

    The HEST parsing code may be replaced in the future by a more
    feature-rich implementation.  This patch provides the minimum needed
    to prevent breakage until that implementation is available.

    Reviewed-by: Kenji Kaneshige <kaneshige.kenji.com>
    Reviewed-by: Hidetoshi Seto <seto.hidetoshi.com>
    Signed-off-by: Matt Domsch <Matt_Domsch>
    Signed-off-by: Jesse Barnes <jbarnes>
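
For reviewers, the decision described in the commit message above can be sketched in plain C. This is a simplified userspace model under stated assumptions, not the code from the patch: the names `hest_aer_entry`, `pci_dev_id` and `aer_firmware_first` are hypothetical, and the source-type and flag values are assumed to match the ACPI 4.0 HEST definition. A device is treated as Firmware First when a HEST entry of the same AER source type has the FIRMWARE_FIRST flag set and either is global or names the device's bus/device/function. Following the commit message, devices outside segment 0 are never matched.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed HEST source types for PCIe AER (ACPI 4.0 numbering). */
#define HEST_AER_ROOT_PORT 6
#define HEST_AER_ENDPOINT  7
#define HEST_AER_BRIDGE    8

/* Assumed HEST AER flag bits. */
#define HEST_FIRMWARE_FIRST (1u << 0) /* firmware handles errors for this source */
#define HEST_GLOBAL         (1u << 1) /* entry applies to every device of this type */

/* Hypothetical, trimmed-down view of one HEST AER entry. */
struct hest_aer_entry {
    uint16_t type;
    uint8_t  flags;
    uint8_t  bus;
    uint8_t  device;
    uint8_t  function;
};

/* Hypothetical description of the PCIe device being probed. */
struct pci_dev_id {
    uint16_t domain;
    uint8_t  bus;
    uint8_t  device;
    uint8_t  function;
    uint16_t type; /* one of the HEST_AER_* values above */
};

/*
 * Return true when firmware claims AER handling for dev, meaning the
 * OS AER driver should not attach to it or modify its AER registers.
 */
bool aer_firmware_first(const struct hest_aer_entry *tbl, size_t n,
                        const struct pci_dev_id *dev)
{
    size_t i;

    /* Per the commit message, only segment 0 is represented in HEST. */
    if (dev->domain != 0)
        return false;

    for (i = 0; i < n; i++) {
        const struct hest_aer_entry *e = &tbl[i];

        if (e->type != dev->type)
            continue;
        if (!(e->flags & HEST_FIRMWARE_FIRST))
            continue;
        if (e->flags & HEST_GLOBAL)
            return true;
        if (e->bus == dev->bus && e->device == dev->device &&
            e->function == dev->function)
            return true;
    }
    return false;
}
```

With a table holding one Firmware First root-port entry for 00:01.0, the function returns true for that port in domain 0 and false for the same bus/device/function in domain 1, so in this model the OS AER driver would skip only the former.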

Comment 8 Prarit Bhargava 2009-12-16 18:43:03 UTC
Created attachment 378815
RHEL5 fix for this issue [1/2]

Comment 9 Prarit Bhargava 2009-12-16 18:43:33 UTC
Created attachment 378816
RHEL5 fix for this issue [2/2]

Comment 14 Marizol Martinez 2010-02-12 20:09:49 UTC
Partners -- Please wait for *Snapshot 1* to test this feature. Thanks!

Comment 17 errata-xmlrpc 2010-03-30 07:24:20 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html

