Bug 300871

Summary: Busy status on tape write results in incorrect residual returned to tape driver
Product: Red Hat Enterprise Linux 5 Reporter: Jonathan Lim <jolim>
Component: kernel Assignee: Jonathan Lim <jolim>
Status: CLOSED ERRATA QA Contact: Martin Jenner <mjenner>
Severity: medium Docs Contact:
Priority: medium    
Version: 5.1 CC: coughlan, dledford, gbeshers, jh, martinez, mdr, peterm, prarit, tee
Target Milestone: --- Keywords: Reopened
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2008-0314 Doc Type: Bug Fix
Last Closed: 2008-05-21 14:55:50 UTC Type: ---
Bug Blocks: 425461    
Attachments:
Description Flags
linux-2.6-scsi-clear-resid.patch none

Description Jonathan Lim 2007-09-21 17:45:35 UTC
Description of problem:

A SCSI I/O request to a tape drive can receive a "stale" residual when an
immediate-mode rewind request is in progress. The I/O request receives a BUSY
response from the tape drive. The low-level driver sets the residual field of
the SCSI request to the request length and returns

  (DID_OK << 16) | scsi_status

in the result field. The result field contains 8 (0x08, the raw BUSY status
byte from the drive), which breaks down as status_byte(result) == 4, i.e. BUSY.
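
As a quick check of the arithmetic above, the following userspace sketch packs
and unpacks a result word. The macros are local copies modeled on the 2.6-era
definitions in include/scsi/scsi.h; make_result() and check_busy_result() are
hypothetical helpers that exist only for this illustration.

```c
#include <assert.h>

/* Userspace copies modeled on 2.6-era include/scsi/scsi.h. */
#define DID_OK         0x00  /* host byte: no transport error */
#define SAM_STAT_BUSY  0x08  /* raw status byte sent by the device */
#define BUSY           0x04  /* legacy code, i.e. the raw status shifted right by one */

#define status_byte(result) (((result) >> 1) & 0x7f)
#define host_byte(result)   (((result) >> 16) & 0xff)

/* Hypothetical helper: build a result word the way the report describes. */
static unsigned int make_result(unsigned int host, unsigned int scsi_status)
{
        return (host << 16) | scsi_status;
}

/* Hypothetical self-check: a BUSY completion with no host error. */
static void check_busy_result(void)
{
        unsigned int result = make_result(DID_OK, SAM_STAT_BUSY);

        assert(result == 8);                /* the value seen in the report */
        assert(host_byte(result) == DID_OK);
        assert(status_byte(result) == BUSY);
}
```

Calling check_busy_result() from a test program should pass silently if the
breakdown given above holds.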

scsi_softirq_done() calls scsi_decide_disposition() which returns
ADD_TO_MLQUEUE.  scsi_softirq_done() then calls scsi_queue_insert() which, on
the way to resubmitting the request to the driver, calls scsi_init_cmd_errh().

The following patch modifies scsi_init_cmd_errh() to zero the residual before
the command is (re)submitted.  This results in the correct residual being
returned when the command finally completes.

The patch was posted to the linux-scsi mailing list on September 17, 2007, and
applies to 2.6.23-rc6-git7.  It should apply (possibly with offsets) to
rhel5-latest.

--- linux-2.6.23-rc6-git7.orig/drivers/scsi/scsi_lib.c  2007-09-17 14:02:03.000000000 -0700
+++ linux-2.6.23-rc6-git7/drivers/scsi/scsi_lib.c       2007-09-17 14:05:51.000000000 -0700
@@ -443,6 +443,7 @@ EXPORT_SYMBOL_GPL(scsi_execute_async);
 static void scsi_init_cmd_errh(struct scsi_cmnd *cmd)
 {
        cmd->serial_number = 0;
+       cmd->resid = 0;
        memset(cmd->sense_buffer, 0, sizeof cmd->sense_buffer);
        if (cmd->cmd_len == 0)
                cmd->cmd_len = COMMAND_SIZE(cmd->cmnd[0]);
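
The effect of the one-line change can be illustrated with a small userspace
model. This is only a sketch, not kernel code: struct demo_cmnd and every
demo_* function are hypothetical stand-ins, and the model assumes that the
low-level driver leaves the residual untouched when the retried command
completes successfully, which is what would make the earlier value "stale".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the few struct scsi_cmnd fields involved. */
struct demo_cmnd {
        unsigned long serial_number;
        unsigned int request_len;        /* bytes requested by the tape driver */
        int resid;                       /* bytes reported as not transferred */
        unsigned int result;
        unsigned char sense_buffer[96];  /* size chosen for the sketch */
};

/* Model of scsi_init_cmd_errh(); clear_resid != 0 mimics the patched version. */
static void demo_init_cmd_errh(struct demo_cmnd *cmd, int clear_resid)
{
        cmd->serial_number = 0;
        if (clear_resid)
                cmd->resid = 0;
        memset(cmd->sense_buffer, 0, sizeof cmd->sense_buffer);
}

/*
 * Hypothetical low-level driver completion. On BUSY it reports the whole
 * request as residual; on success it does not write resid (assumption).
 */
static void demo_lld_complete(struct demo_cmnd *cmd, int device_busy)
{
        if (device_busy) {
                cmd->resid = (int)cmd->request_len;
                cmd->result = 0x08;      /* (DID_OK << 16) | BUSY status */
        } else {
                cmd->result = 0;
        }
}

/* One BUSY completion, a requeue, then a successful completion. */
static int demo_write_with_busy_retry(unsigned int len, int clear_resid)
{
        struct demo_cmnd cmd;

        memset(&cmd, 0, sizeof cmd);
        cmd.request_len = len;

        demo_init_cmd_errh(&cmd, clear_resid);  /* initial submission */
        demo_lld_complete(&cmd, 1);             /* drive is rewinding: BUSY */
        demo_init_cmd_errh(&cmd, clear_resid);  /* requeue path */
        demo_lld_complete(&cmd, 0);             /* retry succeeds */
        return cmd.resid;                       /* what the tape driver sees */
}

/* Hypothetical self-check comparing unpatched and patched behavior. */
static void demo_check(void)
{
        assert(demo_write_with_busy_retry(65536, 0) == 65536); /* stale */
        assert(demo_write_with_busy_retry(65536, 1) == 0);     /* cleared */
}
```

Under these assumptions the unpatched path hands the full request length back
as the residual even though the retry wrote everything, while the patched path
returns 0.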

Version-Release number of selected component (if applicable):

How reproducible:

During a tape write while an immediate-mode rewind request is in progress.

Steps to Reproduce:

See description above.
  
Actual results:

Incorrect residual value returned.

Expected results:

Residual value 0 returned.

Additional info:

A diff to the RHEL5.1 kernel source will be posted here once the patch has been
confirmed to be upstream.

Comment 2 RHEL Program Management 2007-09-26 18:54:19 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 3 Jonathan Lim 2007-09-26 18:55:19 UTC
Created attachment 207371 [details]
linux-2.6-scsi-clear-resid.patch

Diff against 2.6.18-47.el5 kernel source.

Comment 4 Jonathan Lim 2007-09-27 18:39:46 UTC
Changed status per instructions from Don Zickus.


Comment 6 Don Zickus 2007-11-29 17:07:00 UTC
The patch is in 2.6.18-58.el5.
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 7 Jonathan Lim 2008-01-04 21:37:02 UTC
Fix verified at SGI.

Comment 11 errata-xmlrpc 2008-05-21 14:55:50 UTC
An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html