Bug 798382 - error parsing HEST for firmware_first
Summary: error parsing HEST for firmware_first
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.8
Hardware: All
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Lenny Szubowicz
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-02-28 18:51 UTC by Stuart Hayes
Modified: 2013-12-06 14:57 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-12-06 14:57:47 UTC
Target Upstream Version:
Embargoed:


Attachments
proposed patch (643 bytes, patch)
2012-02-28 19:13 UTC, Stuart Hayes

Description Stuart Hayes 2012-02-28 18:51:17 UTC
Description of problem:

When a PCI device is set up in pci_setup_device(), one of the things that function does is set pdev->aer_firmware_first to 1 if the ACPI HEST table indicates that the system firmware is going to take care of errors on that device.

The function that actually parses the HEST is acpi_hest_firmware_first() in drivers/acpi/hest.c, which loops through each HEST entry and checks it.  However, this function fails to update a pointer on each iteration, so it treats every entry as the same type of HEST entry as the first one, and the table is not parsed correctly (except for the first entry).
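
To illustrate the class of bug described above, here is a minimal standalone sketch. It is not the RHEL 5 kernel code and does not use the real ACPI HEST layout: the two entry types, their sizes, and the names entry_size(), read_type() and count_entries() are all hypothetical. It only shows how reading the entry type once, instead of on every iteration, makes a walker treat every entry like the first one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical entry types; the real HEST defines different ones. */
enum { TYPE_A = 0, TYPE_B = 1 };

/* In this toy format the entry size depends only on the entry type. */
static size_t entry_size(uint16_t type)
{
    switch (type) {
    case TYPE_A: return 4;
    case TYPE_B: return 8;
    default:     return 0; /* unknown type: stop parsing */
    }
}

/* Read the 16-bit type field at the start of an entry. */
static uint16_t read_type(const uint8_t *p)
{
    uint16_t t;
    memcpy(&t, p, sizeof(t));
    return t;
}

/*
 * Count entries of type `wanted` among the first `count` entries.
 * With refresh_type == 0 the type is read only once, before the loop,
 * which mimics the stale-pointer behaviour described in this report.
 * With refresh_type != 0 the type is re-read for each entry.
 */
static int count_entries(const uint8_t *table, size_t len, int count,
                         uint16_t wanted, int refresh_type)
{
    const uint8_t *p = table;
    const uint8_t *end = table + len;
    uint16_t type;
    int found = 0;

    assert(table != NULL);
    if (len < sizeof(uint16_t))
        return 0;

    type = read_type(p); /* type of the first entry */
    for (int i = 0; i < count; i++) {
        if ((size_t)(end - p) < sizeof(uint16_t))
            break;
        if (refresh_type)
            type = read_type(p); /* the fix: look at the current entry */

        size_t size = entry_size(type);
        if (size == 0 || size > (size_t)(end - p))
            break;
        if (type == wanted)
            found++;
        p += size;
    }
    return found;
}
```

With a table holding one TYPE_A entry followed by one TYPE_B entry, the stale variant never sees the TYPE_B entry, while the refreshed variant finds it.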

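As a result, aer_firmware_first can be left at 0 for devices whose errors the firmware is supposed to handle, and error reporting bits get enabled on those devices when they shouldn't be.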


Version-Release number of selected component (if applicable):

RHEL 5.7 -- 2.6.18-308.el5 kernel


How reproducible:

every time, if you have a setup that is susceptible to the issue


Steps to Reproduce:
1. Install a PCI card that's behind a bridge (I am using a QLogic QLE2462 Fibre Channel card). Notice that the BIOS sets the device control register (offset 8 in the PCI Express capability structure) to 0x4814 (correctable, non-fatal, and unsupported request error reporting are all disabled).
2. Boot into RHEL 5.7.
3. Use lspci -vvv (or -xxx) to see that this register was changed to 0x481f (because the qla2xxx driver calls pci_enable_pcie_error_reporting(), and aer_firmware_first for this device was 0).
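
The two register values in the steps above can be decoded with a small standalone sketch. The bit positions are the four error reporting enable bits in the low nibble of the PCI Express Device Control register; the macro and function names here (DEVCTL_*, devctl_err_bits(), describe_devctl()) are made up for this example and are not the kernel's identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Error reporting enable bits in the PCIe Device Control register. */
#define DEVCTL_CERE     0x0001 /* Correctable Error Reporting Enable */
#define DEVCTL_NFERE    0x0002 /* Non-Fatal Error Reporting Enable */
#define DEVCTL_FERE     0x0004 /* Fatal Error Reporting Enable */
#define DEVCTL_URRE     0x0008 /* Unsupported Request Reporting Enable */
#define DEVCTL_ERR_MASK (DEVCTL_CERE | DEVCTL_NFERE | DEVCTL_FERE | DEVCTL_URRE)

/* Return only the four error reporting enable bits. */
static unsigned devctl_err_bits(uint16_t devctl)
{
    return devctl & DEVCTL_ERR_MASK;
}

/* Write a one-line summary of the four enable bits into buf. */
static void describe_devctl(uint16_t devctl, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "CERE=%d NFERE=%d FERE=%d URRE=%d",
             !!(devctl & DEVCTL_CERE), !!(devctl & DEVCTL_NFERE),
             !!(devctl & DEVCTL_FERE), !!(devctl & DEVCTL_URRE));
}
```

For 0x4814 only the fatal error reporting bit is set; for 0x481f all four are set, so the three bits that differ are exactly the correctable, non-fatal, and unsupported request enables mentioned in step 1.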
  
Actual results:
the device control register (offset 8 in the pci express capability structure for the qlogic card) is changed from 0x4814 to 0x481f when the qla2xxx driver loads

Expected results:
the device control register should be left at 0x4814

Additional info:
I have a trivial patch to fix this; I will attach it to this bug.

Comment 1 Stuart Hayes 2012-02-28 19:13:21 UTC
Created attachment 566382
proposed patch

Here's a patch that I've tested.

Comment 2 Lenny Szubowicz 2013-12-06 14:57:47 UTC
This Bugzilla has been reviewed by Red Hat and is not planned on being
addressed in Red Hat Enterprise Linux 5, and therefore is being closed.

If this bug is critical to production systems, please contact your Red
Hat support representative and provide a sufficient business
justification in order to re-open it.

                               -Lenny.

