Bug 798382 - error parsing HEST for firmware_first
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Hardware: All
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Lenny Szubowicz
Red Hat Kernel QE team
Depends On:
Reported: 2012-02-28 13:51 EST by Stuart Hayes
Modified: 2013-12-06 09:57 EST (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2013-12-06 09:57:47 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments
proposed patch (643 bytes, patch)
2012-02-28 14:13 EST, Stuart Hayes

Description Stuart Hayes 2012-02-28 13:51:17 EST
Description of problem:

When a PCI device is set up in pci_setup_device, one of the things it does is set pdev->aer_firmware_first to 1 if the ACPI HEST table indicates that the system firmware is going to take care of errors on that device.

The function that actually parses the HEST is acpi_hest_firmware_first() in drivers/acpi/hest.c, which loops through each HEST entry and checks it.  However, this function fails to update a pointer on each iteration, so it treats every entry as the same type of HEST entry as the first one, and the table is not parsed correctly (except for the first entry).
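
To illustrate the class of bug described above, here is a minimal user-space sketch. It is not the RHEL kernel code and not the attached patch. The entry types, the entry sizes, and the names entry_len(), read_type(), build_demo_table() and count_type_b() are all hypothetical and exist only for this example. The table holds variable-length entries that each begin with a 16-bit type; the walker either re-reads the type at every entry or, as in the buggy case, reads it once and reuses it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical entry types and sizes; real HEST structures differ. */
enum { TYPE_A = 1, TYPE_B = 2 };

static size_t entry_len(uint16_t type)
{
    return type == TYPE_A ? 4 : 8;
}

/* Read the 16-bit type field at the start of an entry. */
static uint16_t read_type(const uint8_t *p)
{
    uint16_t t;

    memcpy(&t, p, sizeof(t));
    return t;
}

/* Build a demo table with three entries: A, B, B. Returns its size (20). */
static size_t build_demo_table(uint8_t *buf)
{
    static const uint16_t types[3] = { TYPE_A, TYPE_B, TYPE_B };
    size_t off = 0;
    int i;

    for (i = 0; i < 3; i++) {
        memset(buf + off, 0, entry_len(types[i]));
        memcpy(buf + off, &types[i], sizeof(types[i]));
        off += entry_len(types[i]);
    }
    return off;
}

/*
 * Count TYPE_B entries among the first n entries.
 * refresh == 0 models the bug: the type is read once, before the loop,
 * so every entry looks like the first one.
 * refresh != 0 models the fix: the type is re-read at each entry.
 */
static int count_type_b(const uint8_t *tbl, int n, int refresh)
{
    const uint8_t *p = tbl;
    uint16_t type = read_type(p);
    int i, hits = 0;

    for (i = 0; i < n; i++) {
        if (refresh)
            type = read_type(p);
        if (type == TYPE_B)
            hits++;
        p += entry_len(type);  /* a stale type also gives a wrong stride */
    }
    return hits;
}
```

With the demo table, the stale-type walk finds no TYPE_B entries, while the refreshed walk finds both. The stale type also makes the walker step by the wrong entry size, so later entries are not even visited at the correct offsets.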

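As a result, aer_firmware_first is left at 0 for devices whose errors the firmware is supposed to handle, and PCIe error reporting bits get enabled in their device control registers when they shouldn't be.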

Version-Release number of selected component (if applicable):

RHEL 5.7 -- 2.6.18-308.el5 kernel

How reproducible:

every time, if you have a setup that is susceptible to the issue

Steps to Reproduce:
1. install a pci card that's behind a bridge (I am using a qlogic QLE2462 fibre channel card)... notice that BIOS sets the device control register (offset 8 in the pci express capability structure) to 0x4814 (correctable, non-fatal, and unsupported request error reporting are all disabled)
2. boot into rhel5.7
3. use lspci -vvv (or -xxx) to see that this register was changed to 0x481f (because the qla2xxx driver calls pci_enable_pcie_error_reporting(), and aer_firmware_first for this device was 0)
Actual results:
the device control register (offset 8 in the pci express capability structure for the qlogic card) is changed from 0x4814 to 0x481f when the qla2xxx driver loads

Expected results:
the device control register should be left at 0x4814
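
The two register values can be checked against the low four bits of the PCI Express Device Control register. The sketch below is a simplified user-space model, not the kernel's pci_enable_pcie_error_reporting(). The bit definitions mirror the PCI_EXP_DEVCTL_* names in the kernel's pci_regs.h, while AER_FLAGS and model_enable_reporting() are hypothetical names used only here, and the firmware_first check is an assumption based on the behaviour described in this report.

```c
#include <stdint.h>

/* Error reporting enable bits in the PCIe Device Control register. */
#define PCI_EXP_DEVCTL_CERE  0x0001  /* Correctable Error Reporting Enable */
#define PCI_EXP_DEVCTL_NFERE 0x0002  /* Non-Fatal Error Reporting Enable */
#define PCI_EXP_DEVCTL_FERE  0x0004  /* Fatal Error Reporting Enable */
#define PCI_EXP_DEVCTL_URRE  0x0008  /* Unsupported Request Reporting Enable */

/* Hypothetical name for the set of bits a driver asks to enable. */
#define AER_FLAGS (PCI_EXP_DEVCTL_CERE | PCI_EXP_DEVCTL_NFERE | \
                   PCI_EXP_DEVCTL_FERE | PCI_EXP_DEVCTL_URRE)

/*
 * Simplified model: when firmware_first is set, the register is left
 * alone; otherwise all four error reporting bits are turned on.
 */
static uint16_t model_enable_reporting(uint16_t devctl, int firmware_first)
{
    if (firmware_first)
        return devctl;
    return devctl | AER_FLAGS;
}
```

In this model, model_enable_reporting(0x4814, 0) yields 0x481f, matching the actual results, and model_enable_reporting(0x4814, 1) leaves the value at 0x4814, matching the expected results. In 0x4814 only the fatal error reporting bit is set among the four, which agrees with the BIOS setting described in step 1.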

Additional info:
I have a trivial patch to fix this; I will attach it to this bug.
Comment 1 Stuart Hayes 2012-02-28 14:13:21 EST
Created attachment 566382 [details]
proposed patch

Here's a patch that I've tested.
Comment 2 Lenny Szubowicz 2013-12-06 09:57:47 EST
This Bugzilla has been reviewed by Red Hat and is not planned on being
addressed in Red Hat Enterprise Linux 5, and therefore is being closed.

If this bug is critical to production systems, please contact your Red
Hat support representative and provide a sufficient business
justification in order to re-open it.

