Bug 231319

Summary: [QLogic 4.6 bug] Qlogic driver handles RSCN updates in a problematic way
Product: Red Hat Enterprise Linux 4 Reporter: Josef Bacik <jbacik>
Component: kernel Assignee: Marcus Barrow <mbarrow>
Status: CLOSED ERRATA QA Contact: Martin Jenner <mjenner>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.0 CC: andrew.vasquez, andriusb, casmith, coughlan, dmair, emcnabb, hgarcia, jbaron, mbarrow, mceci, mchristi, michael.hagmann, pan_haifeng, poelstra, qlogic-redhat-ext
Target Milestone: --- Keywords: OtherQA
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2007-0791 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-11-15 16:21:30 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 217099    
Attachments:
Description Flags
time based failover for dm multipath none
use did imm retry instead of did bus busy none

Description Josef Bacik 2007-03-07 18:33:31 UTC
I'm putting this in bugzilla because we have _many_ customers who hit this 
issue, and it's never completely resolved because of the inability to debug and 
other such roadblocks.  I recently got a customer who reproduced the problem 
with extended_error_logging enabled, which has narrowed down the issue.  
I'm going to do my best to explain the situation, and I apologize if I 
misspeak or use the wrong terminology.

The qlogic driver in RHEL4 maintains its own scsi command queue, which tracks 
all of the scsi commands that it is currently in charge of.  When the qlogic 
driver gets an RSCN update, it sets LOOP_RESYNC_NEEDED on the host's flags.  
This in turn means that qla2x00_loop_resync gets run, and subsequently 
qla2x00_restart_queues gets run.  This runs through the HA's pending queue and 
retry queue, marks all of the scsi commands with DID_BUS_BUSY and kicks them 
back up to the scsi midlayer.  Generally this isn't a problem, since the scsi 
midlayer will just retry the command and everything goes along its merry way.  
But when using DM multipathing or EMC PowerPath, REQ_FAILFAST gets set on the 
request, which means the command is never retried and is sent all the way back 
up to the multipathing layer.  DM multipath only handles things at the BIO 
level, so the only error it sees is -EIO.  It has no way to tell what actually 
happened, so it cannot check the error and possibly retry, which means adding 
some sort of check in DM multipath isn't an option.  Upstream, Andrew posted 
this patch, 

commit f4f051ebb40e74ad0ba02d2cb3a6c16b0393472b
Author:  <andrew.vasquez>
Date:   Sun Apr 17 15:02:26 2005 -0500
    
    [PATCH] qla2xxx: remove internal queuing...

which removes all of this internal queueing crap.  Unfortunately it also 
depends on this patch

commit 8482e118afa0cb4321ab3d30b1100d27d63130c0
Author:  <andrew.vasquez>
Date:   Sun Apr 17 15:04:54 2005 -0500

    [PATCH] qla2xxx: add remote port codes...

which in turn relies on a few other patches.

So RSCN updates will cause the current path to fail, and if all of your paths 
are hooked into switches that receive RSCN updates at the same time, all of 
your paths will be failed and your filesystem will be remounted.  In DM 
multipath there are things that you can do to get around this, i.e. the 
queue_if_no_path option, but AFAIK there is no such thing for EMC.  
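
To make the failure chain above concrete, here is a toy model of it.  This is 
only a sketch, not kernel code: the enum and both function names are made up 
for illustration, and it assumes every path has at least one command in flight 
when the RSCN arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins only; these are NOT the kernel's definitions. */
enum disposition { RETRY_IN_MIDLAYER, FAIL_TO_MULTIPATH };

/* What happens to one command the driver hands back with DID_BUS_BUSY. */
enum disposition on_bus_busy(bool failfast)
{
    /* With failfast set the midlayer does not retry; the error goes up. */
    return failfast ? FAIL_TO_MULTIPATH : RETRY_IN_MIDLAYER;
}

/*
 * Usable paths left after every path sees an RSCN at the same time,
 * assuming each path has at least one command in flight.
 */
size_t paths_left_after_rscn(size_t npaths, bool failfast)
{
    size_t left = 0;

    assert(npaths > 0);
    for (size_t i = 0; i < npaths; i++) {
        /* The multipath layer only sees -EIO, so it fails the path. */
        if (on_bus_busy(failfast) != FAIL_TO_MULTIPATH)
            left++;
    }
    return left;
}
```

With failfast set, paths_left_after_rscn(2, true) returns 0, which is the 
all-paths-failed case described above.  Without failfast both paths survive 
because the midlayer retries the commands itself.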

Mike Christie has offered up a patch that will retry any BIO that comes back 
with an error for a limited amount of time.  The problem with this is that 
there is no way to distinguish what kind of error occurred, so if a true error 
happens, you get the retry delay instead of failing over.  And again this 
leaves customers with EMC PowerPath (who are numerous and large) without a 
solution.

So this bugzilla is meant to facilitate some sort of permanent solution to 
this problem.  I believe the best solution is to keep the qlogic driver from 
setting DID_BUS_BUSY in these kinds of scenarios.  Hopefully through this 
bugzilla we can determine the best course of action.

Comment 1 Josef Bacik 2007-03-07 18:39:51 UTC
Created attachment 149474 [details]
time based failover for dm multipath

This is the time-based failover patch that Mike Christie suggested on RHKL in
reference to this problem.

Comment 2 Mike Christie 2007-03-07 19:40:22 UTC
(In reply to comment #0)  
> Now in DM multipath there are things that you can do to get around this, ie 
> the queue if no path option, but AFAIK there is no such thing for EMC.  
> 

Just my 2 cents on this. If EMC does not have the exact same thing it is because
how you handle errors is implementation specific. If we can get some traces from
EMC, they have their own path testing and failback scheme. DM decided to handle
the problem partially in userspace.

Also the problem of errors being propagated back to the FS layer when there are
no paths is not limited to the qla2xxx RSCN problem. It occurs with any driver
and any transport if there is a single point of failure and the multipath layer
decides to fail IO to the FS layer instead of retrying it.

For iSCSI we have the same problem. If you put all your cables through one
switch and reboot the switch, you will get errors on all paths, and if
no_path_retry is set to fail the IO, the IO will be failed when there are no paths.
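
For reference, a multipath.conf fragment along these lines is how queueing is
usually turned on instead of failing. This is only a sketch: it assumes the
installed multipath tools support no_path_retry, and queueing forever has its
own cost, since IO will hang until a path comes back.

```
# /etc/multipath.conf (fragment)
defaults {
        # Queue IO instead of failing it when no paths are left.
        no_path_retry   queue
}
```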

Comment 3 Mike Christie 2007-03-09 21:04:08 UTC
Created attachment 149734 [details]
use did imm retry instead of did bus busy

Here is the patch from Andrew Vasquez.

From Andrew:

Essentially it's a backport of changes done in our standard driver which swap
DID_BUS_BUSY statuses for DID_IMM_RETRY statuses in 'select' logic
paths -- those where the driver uses command recycling during topology
disruptions.

Of course the usage of DID_IMM_RETRY requires some care, so as to avoid infinite
retries.  But, given the use of qla2xxx's own internal dev-loss-tmo timers,
command recycling will not proceed ad infinitum.

I'd suggest RH seriously consider this for their RHEL4 qla2xxx driver.
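
As a rough picture of why the recycling Andrew describes stays bounded, here is
a toy model. It is a sketch only and not the driver's code: the function name,
the one-retry-per-tick loop and the tick-based timer are all assumptions made
for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative outcomes only; these are NOT driver or kernel definitions. */
enum outcome { CMD_COMPLETED, CMD_FAILED_DEV_LOSS };

/*
 * Hypothetical model: the command is recycled once per tick while the
 * topology is disrupted.  recover_at is the tick at which the port comes
 * back (negative means never); dev_loss_tmo is the tick budget.
 */
enum outcome recycle_until_timeout(int recover_at, int dev_loss_tmo,
                                   int *retries)
{
    assert(dev_loss_tmo > 0 && retries != NULL);

    *retries = 0;
    for (int tick = 0; tick < dev_loss_tmo; tick++) {
        if (recover_at >= 0 && tick >= recover_at)
            return CMD_COMPLETED;   /* port is back, command goes out */
        (*retries)++;               /* stand-in for DID_IMM_RETRY */
    }
    /* The timer expired, so the command is failed instead of recycled. */
    return CMD_FAILED_DEV_LOSS;
}
```

If the port returns at tick 3 the command completes after 3 recycles. If it
never returns, the command is failed once the budget is used up, so the retry
count can never exceed dev_loss_tmo.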

Comment 4 Mike Christie 2007-03-09 21:06:35 UTC
I think this patch should be fine because, as Andrew pointed out, the driver
has timers so the command is not retried forever, and he stated that:

    RSCN processing is typically very fast.  The worst case fabric timeout
    one must worry about for any type of extended-link-service fabric
    command is 2 * R_A_TOV, where R_A_TOV is typically 10 seconds.

So commands would not sit too long.
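
Spelling out the arithmetic from Andrew's note, with the typical 10 second
R_A_TOV he quotes, the worst case for one fabric command works out to 20
seconds. The helper below is only an illustration and its name is made up.

```c
#include <assert.h>

/*
 * Worst-case wait for one extended-link-service fabric command,
 * per the 2 * R_A_TOV rule quoted above.  Hypothetical helper.
 */
unsigned int els_worst_case_seconds(unsigned int r_a_tov_seconds)
{
    assert(r_a_tov_seconds > 0);
    return 2u * r_a_tov_seconds;
}
```

els_worst_case_seconds(10) gives 20, so a recycled command waits on the order
of tens of seconds at most for a single fabric command to time out.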

Comment 5 RHEL Program Management 2007-05-09 05:50:25 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 8 Tom Coughlan 2007-05-31 01:04:07 UTC
Marcus,

Is this in your queue for 4.6? If not, please consider it a high priority.

Tom

Comment 9 Marcus Barrow 2007-05-31 15:50:23 UTC
I will put it in the queue. QLogic was not sure if opinion fell in favor of
including this.



Comment 11 Issue Tracker 2007-06-13 14:57:22 UTC
Internal Status set to 'Resolved'
Status set to: Closed by Client
Resolution set to: 'Closed by Client'

This event sent from IssueTracker by robert.wehner 
 issue 119734

Comment 14 Marcus Barrow 2007-07-13 23:11:19 UTC
The "use did imm retry" patch was submitted for RHEL4.6.


Comment 16 Don Howard 2007-07-17 19:53:14 UTC
A patch addressing this issue has been included in kernel-2.6.9-55.19.EL.

Comment 23 Andrius Benokraitis 2007-07-23 19:56:41 UTC
The reason that kernel package isn't signed is because it is an unofficial build
on the way to RHEL 4.6 Beta. If you require an officially supported kernel with
this fix prior to RHEL 4.6, please request a hotfix.

Comment 29 Pan Haifeng 2007-08-10 19:00:11 UTC
*** Bug 180212 has been marked as a duplicate of this bug. ***

Comment 30 John Poelstra 2007-08-29 04:24:38 UTC
A fix for this issue should have been included in the packages contained in the
RHEL4.6 Beta released on RHN (also available at partners.redhat.com).  

Requested action: Please verify that your issue is fixed to ensure that it is
included in this update release.

After you (Red Hat Partner) have verified that this issue has been addressed,
please perform the following:
1) Change the *status* of this bug to VERIFIED.
2) Add *keyword* of PartnerVerified (leaving the existing keywords unmodified)

If this issue is not fixed, please add a comment describing the most recent
symptoms of the problem you are having and change the status of the bug to FAILS_QA.

If you cannot access bugzilla, please reply with a message to Issue Tracker and
I will change the status for you.  If you need assistance accessing
ftp://partners.redhat.com, please contact your Partner Manager.

Comment 32 John Poelstra 2007-08-29 22:31:19 UTC
Thanks for your update.

Comment 37 errata-xmlrpc 2007-11-15 16:21:30 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0791.html