Bug 632195
| Summary: | [NetApp 5.6 bug] SCSI devices offlined on 5.5 FC host during IO with fabric faults | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Martin George <marting> |
| Component: | kernel | Assignee: | Mike Christie <mchristi> |
| Status: | CLOSED DUPLICATE | QA Contact: | Storage QE <storage-qe> |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | | |
| Version: | 5.5.z | CC: | andriusb, bdonahue, coughlan, mchristi, xdl-redhat-bugzilla |
| Target Milestone: | rc | Keywords: | OtherQA, Regression |
| Target Release: | 5.6 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2010-11-09 15:28:51 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 557597 | | |
Description
Martin George
2010-09-09 11:34:00 UTC
Created attachment 446233 [details]
/var/log/messages for the SCSI offline scenario
Above logs taken with lpfc log verbose set to 0x1004
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

After discussions with Mike, we came to the following conclusions:

1) Firstly the SCSI offline is hit due to a regression introduced in RHEL 5.5 where the offlined SCSI devices were prevented from moving back to running state (in certain scenarios like the one mentioned above in comment #0). This issue is also seen in RHEL6 - tracked in bug 643237.

2) There is also a race bug in the timeout code for FC drivers which may also trigger the SCSI offline issue. This is present in all RHEL4/RHEL5/RHEL6 kernels. But that's a different issue altogether and not tracked here.

Created attachment 454776 [details]
Mike Christie's reverted block state debug patch addressing the RHEL 5.5 regression
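
For readers trying to confirm whether a host is showing the offline symptom discussed in this bug, the following is a minimal sketch (not part of the original report) that lists SCSI devices whose sysfs state is anything other than `running`. It assumes the standard Linux sysfs layout `/sys/class/scsi_device/<H:C:T:L>/device/state`; the function name `find_offline_scsi_devices` is hypothetical and chosen only for illustration.

```python
from pathlib import Path


def find_offline_scsi_devices(sysfs_root="/sys/class/scsi_device"):
    """Return {H:C:T:L: state} for every SCSI device not in the 'running' state.

    sysfs_root is a parameter so the function can be pointed at a test
    directory; the default is the usual location on a Linux host.
    """
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        # Not a Linux host, or sysfs is not mounted: nothing to report.
        return result
    for entry in sorted(root.iterdir()):
        state_file = entry / "device" / "state"
        try:
            state = state_file.read_text().strip()
        except OSError:
            # Device disappeared or state attribute is unreadable; skip it.
            continue
        if state != "running":
            result[entry.name] = state
    return result


if __name__ == "__main__":
    for hctl, state in find_offline_scsi_devices().items():
        print(f"{hctl}: {state}")
```

A manual recovery is normally attempted by writing `running` to the same `state` file as root. Whether that succeeds on a kernel carrying the regression described in this bug is not something the thread states.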
(In reply to comment #5)
> After discussions with Mike, we came to the following conclusions:
>
> 1) Firstly the SCSI offline is hit due to a regression introduced in RHEL 5.5
> where the offlined SCSI devices were prevented from moving back to running
> state (in certain scenarios like the one mentioned above in comment #0).

This is addressed by Mike's patch in comment #6.

Mike,

Could you also attach the actual patch here (the non debug one)?

(In reply to comment #8)
> Mike,
>
> Could you also attach the actual patch here (the non debug one)?

It is actually in a kernel you can test already:
https://bugzilla.redhat.com/show_bug.cgi?id=641193#c3

*** This bug has been marked as a duplicate of bug 641193 ***