Bug 467153
Summary: | [QLogic 5.3 bug] latest qlogic driver takes several minutes to find LUNs on older qla2xx controller | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Doug Chapman <dchapman> |
Component: | kernel | Assignee: | Marcus Barrow <mbarrow> |
Status: | CLOSED ERRATA | QA Contact: | Martin Jenner <mjenner> |
Severity: | urgent | Docs Contact: | |
Priority: | medium | ||
Version: | 5.3 | CC: | andrew.vasquez, andriusb, berthiaume_wayne, coughlan, cward, dwa, dzickus, kueda, m-ikeda, mikeda, qlogic-redhat-ext, rpacheco |
Target Milestone: | rc | Keywords: | Regression |
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Fixed In Version: | Doc Type: | Bug Fix | |
Last Closed: | 2009-01-20 20:16:10 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 415811 | ||
Description
Doug Chapman
2008-10-16 02:33:03 UTC
Created attachment 320509 [details]
messages with timestamps
I found I can reproduce this easily at runtime by rmmod qla2xxx / modprobe qla2xxx. I captured the logs from /var/log/messages so you can see the timestamps and can see exactly where the long 4+ minute pause is.
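For anyone reading a timestamped messages file like the one attached, the pause can be located mechanically by looking for the largest jump between consecutive timestamps. The following is only a rough Python sketch, assuming classic syslog stamps of the form "Oct 15 22:31:07" at the start of each line. The function name `largest_gap` and the `year` parameter are illustrative and not part of any existing tool; syslog stamps carry no year, so one is supplied.

```python
from datetime import datetime


def largest_gap(lines, year=2008):
    """Return (gap_seconds, line_before, line_after) for the longest pause.

    Assumes each log line starts with a 15-character syslog timestamp.
    The year is an assumption supplied by the caller, since syslog omits it.
    """
    best = (0.0, None, None)
    prev_time, prev_line = None, None
    for line in lines:
        try:
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not begin with a timestamp
        if prev_time is not None:
            gap = (stamp - prev_time).total_seconds()
            if gap > best[0]:
                best = (gap, prev_line, line)
        prev_time, prev_line = stamp, line
    return best
```

Calling `largest_gap(open("/var/log/messages"))` after an rmmod/modprobe cycle should return the two lines that bracket the delay, which is where the extended error logging is most worth reading.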
If you could try reproducing this with "ql2xextended_error_logging=1" appended to the modprobe line, that would be a big help. The log shows that there was a timeout and that error recovery was performed. Perhaps more logging would point to a clue...

Created attachment 320550 [details]
logs with ql2xextended_error_logging=1
Created attachment 320596 [details]
Don't do NPIV table init for older HBA's
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

in kernel-2.6.18-120.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Attention Partners! RHEL 5.3 public Beta will be released soon. This URGENT severity bug should have a fix in place in the recently released Partner Alpha drop, available at ftp://partners.redhat.com. If you haven't had a chance yet to test this bug, please do so at your earliest convenience, to ensure the highest possible quality bits in the upcoming Beta drop. Thanks, more information about Beta testing to come.
- Red Hat QE Partner Management

The situation seems to have gotten a little worse. kernel-2.6.18-120.el5 can't find all LUNs connected to the 4G-FC HBAs. I used a QLA2460 and a QLA2462 on an NEC ia64 machine. The machine also has 2G-FC HBAs (QCP2340, QLA2342). kernel-2.6.18-118.el5 took 4-5 minutes to find the LUNs connected to them, but kernel-2.6.18-120.el5 now works well with them. So a new issue has appeared for the 4G-FC HBAs even though the issue for the 2G-FC HBAs is solved.

On my test bed, this driver has been finding LUNs. Can you re-run your test after loading the driver with "ql2xextended_error_logging=1" set and attach the log? Also, could you describe your storage setup, including how many LUNs should be found? Thanks.

Created attachment 321456 [details]
dmesg loading with ql2xextended_error_logging=1 on an NEC machine
The attachment is the dmesg output after running
# modprobe qla2xxx ql2xextended_error_logging=1
on an NEC machine.
The FC storage should appear as sdg through sdaj (30 LUNs), but it does not.
Storage on the machine is set up as follows.
(S1) LSI Logic LSI22320 (Ultra320 SCSI) --> HDD 0-0, 0-1
(S2) LSI Logic LSI22320 (Ultra320 SCSI) --> DAT
(S3) LSI Logic LSI22320 (Ultra320 SCSI) --> HDD 1-0, 1-1
(S4) LSI Logic LSI22320 (Ultra320 SCSI) --> HDD 1-2, 1-3
(F1) Qlogic QLA2460 (4G-FC, single) --> FC-Storage
(F2) Qlogic QLA2462 (4G-FC, dual): port1 --> FC-Storage
: port2 --> (N/C)
(F1) and (F2) are connected to the same FC-Storage and form a two-path multipath configuration.
The FC-Storage has 15 LUNs, so the kernel should recognize 30 LUNs on the FC-Storage.
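As a quick cross-check of the numbers above, the expected count (15 LUNs seen over 2 paths) can be compared with the size of the sdg through sdaj device range. This is a small Python sketch, assuming the usual Linux sd naming order (sda ... sdz, sdaa, sdab, ...); `sd_index` is a hypothetical helper written for this illustration.

```python
def sd_index(name):
    """Map an sd device name to a zero-based index (sda -> 0, sdz -> 25, sdaa -> 26)."""
    index = 0
    for ch in name[2:]:  # letters after the "sd" prefix
        index = index * 26 + (ord(ch) - ord("a") + 1)
    return index - 1


luns, paths = 15, 2
expected = luns * paths                           # LUNs visible across both paths
span = sd_index("sdaj") - sd_index("sdg") + 1     # devices in sdg..sdaj inclusive
print(expected, span)
```

Both values come out to 30, so the device range quoted in the report matches 15 LUNs on each of the two paths.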
Partners, this bug should be fixed in the latest RHEL 5.3 Snapshot. We believe that you have some interest in its correct functionality, so we're making a friendly request to send us some testing feedback. If you have a chance to test it, please share with us your findings. If you have successfully VERIFIED the fix, please add PartnerVerified to the Bugzilla keywords, along with a description of the results. Thanks!

Patch verified in 2.6.18-123.el5.

The second problem described in this BZ, reported by Munehiro IKEDA, is resolved in a newer BZ, BZ 471269.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html