Bug 643237
| Summary: | [NetApp 6.1 bug] regression: allow offlined devs to be set to running | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Mike Christie <mchristi> |
| Component: | kernel | Assignee: | Mike Christie <mchristi> |
| Status: | CLOSED ERRATA | QA Contact: | Gris Ge <fge> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 6.0 | CC: | bdonahue, coughlan, dhoward, fge, jwest, marting, xdl-redhat-bugzilla |
| Target Milestone: | rc | Keywords: | ZStream |
| Target Release: | 6.1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | kernel-2.6.32-91.el6 | Doc Type: | Bug Fix |
| Doc Text: | Prior to this update, when using Red Hat Enterprise Linux 6 with a qla4xxx driver and FC (Fibre Channel) drivers using the fc class, a device might have been put in the offline state due to a transport problem. Once the transport problem was resolved, the device was not usable until a user manually corrected the state. This update enables the transition from the offline state to the running state, thus fixing the problem. | Story Points: | --- |
| Clone Of: | 641193 | Environment: | |
| Last Closed: | 2011-05-19 12:21:14 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 641193 | | |
| Bug Blocks: | 660590 | | |
| Attachments: | | | |
Comment 2
RHEL Program Management
2010-10-15 03:28:38 UTC
Created attachment 454673 [details]
revert patch that prevents changing device state from offline
As for the IO-stuck-in-a-queue problem: if the device was offlined by the scsi
layer and the fc class then tried to delete it due to dev_loss_tmo, the IO
could get stuck in the scsi/block layer queue. The device could have gone from
blocked to offline, after which it should have gone to the cancel and device
delete states. Due to a bug in the scsi layer it did not, and the queues were
not started again, so IO in the queue would be stuck with no timer running on
it to unjam us.
Attached is a patch that reverts the patch that caused the problem (note it is
slightly different than the RHEL 5 one).
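To make the state sequence described above easier to follow, here is a minimal Python sketch of a device state machine with and without the offline-to-running transition. It is a simplified model for discussion only: the state labels, the `FIXED` and `REGRESSED` tables, and the helper names are assumptions made for illustration, and they are not taken from the kernel source or from the attached patch.

```python
from enum import Enum


class SdevState(Enum):
    """Device states named in this report (labels are illustrative)."""
    RUNNING = "running"
    BLOCKED = "blocked"
    OFFLINE = "offline"
    CANCEL = "cancel"
    DELETED = "deleted"


# Hypothetical transition table: target state -> states it may be entered from.
# This is a simplified model, not a copy of the kernel's table.
FIXED = {
    SdevState.RUNNING: {SdevState.BLOCKED, SdevState.OFFLINE},
    SdevState.BLOCKED: {SdevState.RUNNING},
    SdevState.OFFLINE: {SdevState.RUNNING, SdevState.BLOCKED},
    SdevState.CANCEL: {SdevState.RUNNING, SdevState.BLOCKED, SdevState.OFFLINE},
    SdevState.DELETED: {SdevState.CANCEL},
}

# Model of the regression: the same table with offline -> running removed.
REGRESSED = {target: set(sources) for target, sources in FIXED.items()}
REGRESSED[SdevState.RUNNING].discard(SdevState.OFFLINE)


def can_transition(table, old, new):
    """Return True if the table allows entering `new` from `old`."""
    return old in table.get(new, set())


def walk(table, start, path):
    """Apply each requested state in order; stop at the first rejected one."""
    current = start
    for target in path:
        if not can_transition(table, current, target):
            break
        current = target
    return current


if __name__ == "__main__":
    # Transport problem, device offlined, transport recovers.
    path = [SdevState.BLOCKED, SdevState.OFFLINE, SdevState.RUNNING]
    print(walk(REGRESSED, SdevState.RUNNING, path).value)
    print(walk(FIXED, SdevState.RUNNING, path).value)
```

In this model the regressed table leaves the device in the offline state after the transport recovers, while the table that permits offline to running returns it to running.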
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Mike - any idea when this is getting POSTed?
Requesting 6.0.z because this is blocking NetApp's 6.0 GA cert.
(In reply to comment #6)
> Mike - any idea when this is getting POSTed?
I was trying to work on a proper upstream fix first. I think I can send the patch in this bz that just reverts the bad patch now and then open a new bz for the proper fix. So tomorrow/this-weekend.
(In reply to comment #9)
> (In reply to comment #6)
> > Mike - any idea when this is getting POSTed?
>
> I was trying to work on a proper upstream fix first. I think I can send the
> patch in this bz that just reverts the bad patch now and then open a new bz for
> the proper fix. So tomorrow/this-weekend.
Yeah, I don't think NetApp can wait until the whole upstream set is set for 6.1 since this is blocking 6.0 cert (and needs 6.0.z), so it sounds like if you could POST just the patch that fixes this ASAP that would be great.
Patch(es) available on kernel-2.6.32-91.el6
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
Prior to this update, when using Red Hat Enterprise Linux 6 with a qla4xxx driver and FC (Fibre Channel) drivers using the fc class, a device might have been put in the offline state due to a transport problem. Once the transport problem was resolved, the device was not usable until a user manually corrected the state. This update enables the transition from the offline state to the running state, thus fixing the problem.
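For readers wondering what "manually corrected the state" can look like in practice, the following is a hedged Python sketch that writes `running` to a SCSI device's sysfs `state` attribute. The note does not say which procedure users followed, so treat this as one plausible approach. The helper name `set_scsi_device_state` and its `sysfs_root` parameter are hypothetical (the parameter exists so the function can be tried against a scratch directory), and writing to the real attribute requires root.

```python
from pathlib import Path


def set_scsi_device_state(device, state="running", sysfs_root="/sys"):
    """Write `state` to <sysfs_root>/block/<device>/device/state.

    Returns the state that was read before the write. `device` is a block
    device name such as "sdb"; `sysfs_root` is a hypothetical parameter
    added so the helper can be exercised outside a real sysfs tree.
    """
    state_file = Path(sysfs_root) / "block" / device / "device" / "state"
    previous = state_file.read_text().strip()
    if previous != state:
        # Only write when the state actually needs to change.
        state_file.write_text(state + "\n")
    return previous
```

A call such as `set_scsi_device_state("sdb")` would attempt to move the hypothetical device `sdb` back to running. With the fix described in this note, that manual step should no longer be needed after a transport problem clears.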
NetApp and Qlogic verified this patch.
Code reviewed. Patch has been applied into kernel-2.6.32-120. Set as Sanity only.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2011-0542.html