Bug 598946 - [NetApp 5.6 bug] QLogic FC firmware errors seen on RHEL 5.5
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.5
Hardware: All Linux
Priority: high Severity: high
Target Milestone: rc
Target Release: 5.6
Assigned To: Chad Dupuis (Cavium)
QA Contact: Red Hat Kernel QE team
Keywords: OtherQA, Reopened, ZStream
Duplicates: 604134
Depends On:
Blocks: 557597 613688
Reported: 2010-06-02 07:57 EDT by Martin George
Modified: 2015-05-18 05:59 EDT
CC List: 11 users

Doc Type: Bug Fix
Last Closed: 2011-01-13 16:35:19 EST


Attachments
QLogic firmware dump (299.88 KB, application/x-gzip)
2010-06-02 08:02 EDT, Martin George
/var/log/messages for the above issue (422.06 KB, application/octet-stream)
2010-06-02 08:03 EDT, Martin George


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2011:0017
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 5.6 kernel security and bug fix update
Last Updated: 2011-01-13 05:37:42 EST

Description Martin George 2010-06-02 07:57:11 EDT
Description of problem:
On a RHEL 5.5 host with QLogic 8G FC adapters (QLE2562), firmware dump errors regularly appear in /var/log/messages during IO runs, as shown below:

kernel: qla2xxx 0000:06:00.0: ISP System Error - mbx1=5af4h mbx2=10h mbx3=3h.
kernel: qla2xxx 0000:06:00.0: Firmware dump saved to temp buffer (0/ffffc20000022000).
kernel: qla2xxx 0000:06:00.0: Mailbox command timeout occured. Issuing ISP abort.
kernel: qla2xxx 0000:06:00.0: Performing ISP error recovery - ha= ffff81007fe4c4f8.
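
To check how often a host hits this, the "ISP System Error" lines can be pulled out of a log with a short script. The following is only a sketch: the regular expression is derived solely from the single log line quoted above, so other driver versions may format the message differently, and the function name find_isp_errors is made up for illustration.

```python
import re

# Pattern built from the one "ISP System Error" line quoted in this report.
# Assumption: other qla2xxx versions may word or format this line differently.
ISP_ERROR_RE = re.compile(
    r"qla2xxx (?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-9a-fA-F]): "
    r"ISP System Error - "
    r"mbx1=(?P<mbx1>[0-9a-fA-F]+)h "
    r"mbx2=(?P<mbx2>[0-9a-fA-F]+)h "
    r"mbx3=(?P<mbx3>[0-9a-fA-F]+)h"
)


def find_isp_errors(lines):
    """Return one dict per ISP System Error line, mailbox values parsed as hex."""
    events = []
    for line in lines:
        match = ISP_ERROR_RE.search(line)
        if match is None:
            continue  # not an ISP System Error line
        events.append({
            "pci": match.group("pci"),
            "mbx1": int(match.group("mbx1"), 16),
            "mbx2": int(match.group("mbx2"), 16),
            "mbx3": int(match.group("mbx3"), 16),
        })
    return events


# Demo on two of the lines quoted in this report.
SAMPLE = [
    "kernel: qla2xxx 0000:06:00.0: ISP System Error - mbx1=5af4h mbx2=10h mbx3=3h.",
    "kernel: qla2xxx 0000:06:00.0: Performing ISP error recovery - ha= ffff81007fe4c4f8.",
]

for event in find_isp_errors(SAMPLE):
    print(event["pci"], hex(event["mbx1"]), hex(event["mbx2"]), hex(event["mbx3"]))
```

To scan a real log, pass an open file object (for example the attached /var/log/messages) to find_isp_errors instead of SAMPLE.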

Version-Release number of selected component (if applicable):
RHEL 5.5 Errata (2.6.18-194.3.1.el5)
QLE2562 FW:v4.04.09 DVR:v8.03.01.04.05.05-k

How reproducible:
Consistently
Comment 1 Martin George 2010-06-02 08:02:26 EDT
Created attachment 419001
QLogic firmware dump
Comment 2 Martin George 2010-06-02 08:03:17 EDT
Created attachment 419002
/var/log/messages for the above issue
Comment 3 Andrew Vasquez 2010-06-02 16:01:22 EDT
QLogic has released OOT (out-of-train) drivers with 5.03.02 firmware
for RHEL5.x (8.03.01.06):

http://driverdownloads.qlogic.com/QLogicDriverDownloads_UI/SearchByProduct.aspx?ProductCategory=39&Product=1044&Os=65

Patches are also being pushed for RHEL5.6 that include the updated
firmware.
Comment 4 Martin George 2010-06-02 16:14:32 EDT
(In reply to comment #3)
> Patches are also being pushed for RHEL5.6 that include the updated
> firmware.    

Any plans of including this updated driver/firmware in 5.5.z itself? If so, that would be really helpful.
Comment 5 Andrius Benokraitis 2010-06-02 16:33:58 EDT
(In reply to comment #4)
> (In reply to comment #3)
> > Patches are also being pushed for RHEL5.6 that include the updated
> > firmware.    
> 
> Any plans of including this updated driver/firmware in 5.5.z itself? If so,
> that would be really helpful.    

No, firmware is not distributed via z-stream. Any firmware updates should come from QLogic; Red Hat has no visibility into, and makes no support statements about, specific firmware versions. Red Hat defers to QLogic on matters like this.
Comment 6 Martin George 2010-06-03 08:10:39 EDT
(In reply to comment #5)
> No, firmware is not distributed via z-stream. Any updates to the firmware
> should be made through Qlogic - Red Hat has no visibility/support statements
> regarding certain firmware versions. Red Hat defers to Qlogic on stuff like
> this.    

But it seems QLogic now bundles the firmware with the native inbox driver. The current 4.04.xx firmware has issues that cause errors on the RHEL 5.5 host, as reported above.

Since RHEL 5.6 is a long way off, we are requesting the upgraded native inbox FC driver v8.03.01.06.05.06-k (which packages the fixed firmware v5.03.02) for 5.5.z itself, since the current native RHEL 5.5 QLogic FC solution is buggy.
Comment 7 Andrius Benokraitis 2010-06-03 10:34:26 EDT
(In reply to comment #6)
> (In reply to comment #5)
> > No, firmware is not distributed via z-stream. Any updates to the firmware
> > should be made through Qlogic - Red Hat has no visibility/support statements
> > regarding certain firmware versions. Red Hat defers to Qlogic on stuff like
> > this.    
> 
> But seems QLogic bundles the firmware along with the native inbox driver now.
> The current 4.04.xx firmware has issues causing errors on the RHEL 5.5 host as
> reported above. 
> 
> Since RHEL 5.6 is a long way off, we are requesting for the upgraded native
> inbox FC driver v8.03.01.06.05.06-k (which packages the fixed firmware v5.03.02
> in it) for 5.5.z itself - since the current native RHEL 5.5 QLogic FC solution
> is buggy.    

Not quite - upgrading of the firmware inbox or out-of-box doesn't present the same issues as using an inbox or out-of-box driver. Firmware is a blob that RH doesn't have source code for, so it doesn't matter where it comes from. Red Hat defers to QLogic anyway for a firmware issue, even if it is distributed by Red Hat. Customers can (and should) grab the latest firmware from QLogic since it is released much more often.
Comment 8 Andrius Benokraitis 2010-06-03 10:35:13 EDT
Closing - QLogic will update firmware as part of the normal process in RHEL 5.6.
Comment 9 Andrius Benokraitis 2010-06-14 16:12:23 EDT
At the request of QLogic - will open up additional dialogue with NetApp.

NetApp - how frequently are you hitting this?
Comment 10 Martin George 2010-06-15 07:59:07 EDT
(In reply to comment #9)
> At the request of QLogic - will open up additional dialogue with NetApp.
> 
> NetApp - how frequently are you hitting this?    

Seen consistently during heavy IO.
Comment 11 Andrius Benokraitis 2010-06-15 14:31:31 EDT
*** Bug 604134 has been marked as a duplicate of this bug. ***
Comment 12 Andrius Benokraitis 2010-06-15 14:32:06 EDT
Re-opening -- QLogic to look into this.
Comment 13 Tanvi 2010-06-21 05:42:28 EDT
QLogic, any update here?
Comment 14 Andrius Benokraitis 2010-06-21 10:03:10 EDT
Tanvi,

QLogic is investigating if the reward outweighs the risk. Updating a full-blown firmware via z-stream will be largely *untested* by Red Hat and will introduce additional risk. Stay tuned.

Andrius.
Comment 16 Rob Evers 2010-06-23 13:10:22 EDT
QLogic to provide testing information for the firmware update under the RHEL 5.5 environment.
Comment 17 RHEL Product and Program Management 2010-06-23 13:18:24 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 18 Andrius Benokraitis 2010-06-23 23:58:01 EDT
Based on discussions with Tom Coughlan and Rob Evers, and weighing the risk/reward and seriousness of this defect, RH Engineering is recommending this be proposed for RHEL 5.5.z.
Comment 25 Tom Coughlan 2010-06-29 11:22:25 EDT
You can get a test 5.5.z kernel with the new firmware here:

http://people.redhat.com/coughlan/bz598946/
Comment 26 Martin George 2010-07-06 08:11:03 EDT
The test kernel looks good - we don't see the QLogic firmware errors now.
Comment 29 Jarod Wilson 2010-07-28 10:28:18 EDT
in kernel-2.6.18-209.el5
You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.
Comment 31 Chris Ward 2010-12-02 10:28:47 EST
Reminder! There should be a fix present for this BZ in snapshot 3 -- unless otherwise noted in a previous comment.

Please test and update this BZ with test results as soon as possible.
Comment 33 errata-xmlrpc 2011-01-13 16:35:19 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html
Comment 34 masanari iida 2011-12-16 05:45:11 EST
Andrew,
Does this symptom happen on ISP2532 (HPAK344A) with fw=4.04.05?
One customer saw this "ISP System Error" only once.

I found the firmware release notes, but I have no idea which error
report corresponds to this issue.
http://filedownloads.qlogic.com/files/software/75742/release.pdf

Thanks
Comment 35 Chad Dupuis (Cavium) 2011-12-16 11:52:38 EST
(In reply to comment #34)
> Andrew,
> Does this symptom happen on ISP2532 (HPAK344A) with fw=4.04.05?
> One customer saw this "ISP System Error" only once.
> 
> I found the firmware release not, but I have no idea which error
> report correspond to this issue.
> http://filedownloads.qlogic.com/files/software/75742/release.pdf
> 
> Thanks

It's possible.  The problem in this BZ was originally with the 4.04.09 firmware.  I'd recommend using the latest RHEL 5 release (RHEL 5.7 as of this date) as it has the updated firmware.
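
For anyone checking a host against the versions named in this report, a version string in the "FW:v... DVR:v..." form used in the description can be parsed and compared with the 5.03.02 firmware that comments 3 and 6 describe as fixed. This is a rough sketch: the function names are made up for illustration, and a firmware version older than 5.03.02 is not proof that a host will hit this bug (this report only establishes the problem for 4.04.09).

```python
import re

# Firmware version that comments 3 and 6 describe as carrying the fix.
FIXED_FW = (5, 3, 2)


def parse_versions(text):
    """Extract (firmware tuple, driver string) from a 'FW:v... DVR:v...' string."""
    fw = re.search(r"FW:v(\d+(?:\.\d+)*)", text)
    drv = re.search(r"DVR:v(\S+)", text)
    if fw is None or drv is None:
        raise ValueError("expected 'FW:v<version> DVR:v<version>' in: %r" % text)
    # "4.04.09" becomes (4, 4, 9) so versions compare numerically, not as text.
    fw_tuple = tuple(int(part) for part in fw.group(1).split("."))
    return fw_tuple, drv.group(1)


def predates_fixed_firmware(fw_tuple, fixed=FIXED_FW):
    """True if the firmware is older than the version described as fixed."""
    return fw_tuple < fixed


# Demo with the version line from the original description.
fw_version, driver_version = parse_versions("QLE2562 FW:v4.04.09 DVR:v8.03.01.04.05.05-k")
print(fw_version, driver_version, predates_fixed_firmware(fw_version))
```

Comparing tuples of integers avoids the pitfalls of comparing dotted version strings as text.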
