Bug 462416 - [QLogic 5.3 bug] Update qla2xxx - PCI EE error handling support
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
All Linux
Priority: high, Severity: urgent
: beta
: ---
Assigned To: Marcus Barrow
Martin Jenner
: OtherQA
Depends On:
Blocks: 415811
Reported: 2008-09-15 22:19 EDT by Marcus Barrow
Modified: 2009-06-20 01:28 EDT

Doc Type: Bug Fix
Last Closed: 2009-01-20 14:59:36 EST

Attachments
EEH for rhel5.3 (9.77 KB, patch)
2008-09-15 22:39 EDT, Marcus Barrow
V2 (9.72 KB, patch)
2008-09-29 19:13 EDT, Marcus Barrow

Description Marcus Barrow 2008-09-15 22:19:16 EDT
Provide support for PCI Enhanced Error Recovery.

This was originally approved for 5.2 and an update to the driver was provided, but issues found during testing prevented its release in 5.2. In the meantime, IBM contributed kernel fixes and an updated patch for our driver.

The original BZ was 253267. This patch has been tested at IBM before being provided here and it applies to kernel-2.6.18-110 and builds cleanly.
Comment 1 Marcus Barrow 2008-09-15 22:39:19 EDT
Created attachment 316804
EEH for rhel5.3
Comment 2 RHEL Product and Program Management 2008-09-16 08:43:33 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 3 Richard A Lary 2008-09-16 14:55:48 EDT
This patch provides EEH support for PPC platforms that was originally requested in RHEL5.2, but had to be removed due to issues found during testing.

EEH provides the ability for a QLogic adapter to transparently recover from PCIe
protocol errors detected by IBM EEH hardware on Power Series platforms.

Without this functionality, any protocol error detected would cause I/O to
fail and the adapter to be shut down immediately.
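
For readers unfamiliar with the recovery flow, here is a minimal userspace sketch of the sequence a driver typically goes through: error detected, slot reset, resume. It is not taken from the attached patch. All names (sim_adapter, sim_error_detected, sim_slot_reset, sim_resume, sim_recover) are hypothetical and only loosely modeled on the callbacks in the Linux kernel's struct pci_error_handlers; the reset_ok field is a made-up knob to simulate a failed re-initialisation.

```c
/* Hypothetical userspace stand-ins, loosely modeled on the kernel's
 * channel-state and recovery-result enums. These are NOT the kernel
 * definitions and NOT the code from the qla2xxx patch. */
enum channel_state {
    CHANNEL_IO_NORMAL,
    CHANNEL_IO_FROZEN,        /* I/O to the slot is blocked by the platform */
    CHANNEL_IO_PERM_FAILURE   /* the platform has given up on the slot */
};

enum ers_result {
    ERS_NEED_RESET,   /* driver asks the platform to reset the slot */
    ERS_DISCONNECT,   /* driver gives up on the adapter */
    ERS_RECOVERED     /* adapter is usable again */
};

struct sim_adapter {
    int io_blocked;   /* 1 while new I/O must be held off */
    int online;       /* 0 once the adapter has been given up on */
    int reset_ok;     /* simulation knob: does re-init after reset succeed? */
};

/* Step 1: the platform reports an error. Hold off new I/O, then either
 * request a slot reset or give up if the failure is permanent. */
enum ers_result sim_error_detected(struct sim_adapter *a, enum channel_state s)
{
    a->io_blocked = 1;
    if (s == CHANNEL_IO_PERM_FAILURE) {
        a->online = 0;
        return ERS_DISCONNECT;
    }
    return ERS_NEED_RESET;
}

/* Step 2: the slot has been reset. Re-initialise the adapter; if that
 * fails, mark it offline. */
enum ers_result sim_slot_reset(struct sim_adapter *a)
{
    if (!a->reset_ok) {
        a->online = 0;
        return ERS_DISCONNECT;
    }
    return ERS_RECOVERED;
}

/* Step 3: recovery is complete. Allow I/O to flow again. */
void sim_resume(struct sim_adapter *a)
{
    a->io_blocked = 0;
}

/* Plays the role of the platform recovery code: runs the three steps in
 * order. Returns 1 if the adapter recovered, 0 if it was disconnected. */
int sim_recover(struct sim_adapter *a, enum channel_state s)
{
    if (sim_error_detected(a, s) != ERS_NEED_RESET)
        return 0;
    if (sim_slot_reset(a) != ERS_RECOVERED)
        return 0;
    sim_resume(a);
    return 1;
}
```

Calling sim_recover() on an adapter with reset_ok set and a frozen channel walks all three steps and leaves I/O unblocked. With reset_ok cleared, or with a permanent failure, the adapter ends up offline, which corresponds to the "shut down immediately" case described above.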

IBM will conduct testing of this patch and update this bug with results.
Comment 4 Marcus Barrow 2008-09-29 19:13:03 EDT
Created attachment 318023
V2

Updated the patch to remove a change carried over from upstream. The previous version uncommented a line of code that released the firmware on module unload. That line is now commented out again.
Comment 5 Don Zickus 2008-10-06 11:56:16 EDT
in kernel-2.6.18-118.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 7 Richard A Lary 2008-10-07 11:47:27 EDT
(In reply to comment #5)
> in kernel-2.6.18-118.el5
> You can download this test kernel from http://people.redhat.com/dzickus/el5

Thanks, will download, build and test today.
Comment 8 Chris Ward 2008-10-21 09:10:33 EDT
Attention Partners! 

RHEL 5.3 public Beta will be released soon. This URGENT severity bug
should have a fix in place in the recently released Partner Alpha drop,
available at ftp://partners.redhat.com. If you haven't had a chance yet to test
this bug, please do so at your earliest convenience, to ensure the highest
possible quality bits in the upcoming Beta drop.

Thanks, more information about Beta testing to come.

 - Red Hat QE Partner Management
Comment 9 Chris Ward 2008-11-14 09:05:36 EST
~~~ Attention Partners! ~~~

Please test this URGENT / HIGH priority bug at your earliest convenience to ensure it makes it into the upcoming RHEL 5.3 release. The fix should be present in the Partner Snapshot #2 (kernel*-122), available NOW at ftp://partners.redhat.com. As we are approaching the end of the RHEL 5.3 test cycle, it is critical that you report back testing results as soon as possible. 

If you have VERIFIED the fix, please add PartnerVerified to the Bugzilla Keywords field to indicate this. If you find that this issue has not been properly fixed, set the bug status to ASSIGNED with a comment describing the issues you encountered.

All NEW issues encountered (not part of this bug fix) should have a new bug created with the proper keywords and flags set to trigger a review for their inclusion in the upcoming RHEL 5.3 or other future release. Post a link in this bugzilla pointing to the new issue to ensure it is not overlooked.

For any additional questions, speak with your Partner Manager.
Comment 10 Chris Ward 2008-11-18 13:14:08 EST
~~ Snapshot 3 is now available ~~ 

Snapshot 3 is now available for Partner Testing, which should contain a fix that resolves this bug. ISOs are available as usual at ftp://partners.redhat.com. Your testing feedback is vital! Please let us know if you encounter any NEW issues (file a new bug) or if you have VERIFIED the fix is present and functioning as expected (add PartnerVerified Keyword).

Ping your Partner Manager with any additional questions. Thanks!
Comment 11 Chris Ward 2008-11-28 01:45:37 EST
~~ Attention ~~ Snapshot 4 is now available for testing @ partners.redhat.com ~~

Partners, it is vital that we get your testing feedback on this important bug fix / feature request. If you are unable to test, please clearly indicate this in a comment to this bug or directly with your partner manager. If we do not receive your test feedback, this bug is at risk of being dropped from the release.

If you have VERIFIED the fix, please add PartnerVerified to the Bugzilla Keywords field, along with a description of the test results. 

If you encounter a new bug, CLONE this bug and ask your Partner Manager to review it. We are no longer accepting new bugs into the release, bar critical regressions.
Comment 12 Chris Ward 2008-12-04 05:19:10 EST
IBM, QLogic, what is the current status of this bug fix? The fix should be present in the latest RHEL5.3 Snapshot. Please test and send feedback ASAP.
Comment 13 Marcus Barrow 2008-12-04 10:08:15 EST
I believe the current status is as follows: Without this patch the system will panic when it encounters a PCI Bus Error. When this patch is applied, the system will recover from several PCI Bus Errors. On about the 6th error the system will panic.

The current status seems to be a substantial improvement, but it needs more work. Work will continue and will hopefully yield a solution, delivered as a KMOD, before the actual availability of 5.3. However, there is higher priority work for FCoE, which is also desired for 5.3.
Comment 14 Richard A Lary 2008-12-04 10:11:43 EST
The code for support of Enhanced Error Recovery is in current RHEL5.3 snapshots.
This function is intended to allow QLogic adapters to recover on IBM Power platforms in the unexpected case where a PCI hardware error is detected.  The current qla2xxx code has as yet unresolved issues which prevent a complete recovery.

The issues with the qla2xxx EEH code will not affect normal operation of IBM Power platforms; the code is only invoked in the unexpected case where a PCI hardware error is detected.

The recommendation is to retarget this work to RHEL5.4.
Comment 15 Chris Ward 2008-12-04 10:25:36 EST
If I read you right, the fix present in 5.3 is sufficient to call this bug VERIFIED. However, additional fixes are requested for a future release (i.e., 5.4).

If this is the case, please add the PartnerVerified keyword and move this bug to VERIFIED. Then CLONE this bug or open a fresh new bug and describe the issues you would like fixed in an upcoming RHEL release.
Comment 16 Richard A Lary 2008-12-04 12:05:41 EST
Added PartnerVerified to keywords; will open a new Bugzilla for RHEL5.4 with information on the specific outstanding issues.
Comment 18 errata-xmlrpc 2009-01-20 14:59:36 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

