Bug 165825 - Inquiry (sg) command hang after a write to tape with mptscsi driver
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: All
OS: Linux
Priority: medium
Severity: high
Assigned To: Tom Coughlan
QA Contact: Brian Brock
Blocks: 168424
Reported: 2005-08-12 12:24 EDT by Wendy Cheng
Modified: 2010-10-21 23:15 EDT
Fixed In Version: RHSA-2006-0144
Doc Type: Bug Fix
Last Closed: 2006-03-15 11:24:00 EST


Attachments
strace and scsi trace (233.85 KB, text/plain), 2005-08-12 12:30 EDT, Wendy Cheng
U6-beta.strace.txt.data-3-1 (2.40 KB, text/plain), 2005-08-12 12:50 EDT, Wendy Cheng
U6-beta-messages-data-3-2 (34.85 KB, text/plain), 2005-08-12 12:51 EDT, Wendy Cheng
U6-beta-sgdump-data-3-3 (448.58 KB, text/plain), 2005-08-12 12:52 EDT, Wendy Cheng
data requested by dledford - 2-1: dmesg file (15.56 KB, text/plain), 2005-08-15 16:28 EDT, Wendy Cheng
data requested by dledford - 2-2: messages file (43.19 KB, text/plain), 2005-08-15 16:29 EDT, Wendy Cheng
patch, to release buffer after write completes (1.01 KB, patch), 2005-09-16 10:43 EDT, Tom Coughlan

Description Wendy Cheng 2005-08-12 12:24:07 EDT
Description of problem:

Problem reported by Jim Kam (jim.kam@hp.com -  HP 3rd level escalation engineer)
Red Hat TAM: Chris Williams

[Hardware]
The machine is a Proliant Server (DL 360) with an LSI Ultra 320 controller
(which uses the mptscsi driver), to which an HP Ultrium 460 tape drive is
attached. The box runs RHEL 3; U3, U4, and U6-beta have all been tried, with
the same problem.

[Application]
The application is Legato Tape backup software with a tape diags utility that
lets users send SCSI commands to the device. They can send an Inquiry and it
works fine. However once they do a write to tape then follow it with an Inquiry,
the tape device "hangs" until a signal is sent to the device.
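
For readers unfamiliar with the sg pass-through path, here is a minimal sketch of how a diagnostic tool might issue an Inquiry through the sg driver's SG_IO ioctl (the call where Comment 6 shows the process blocking). This report does not show how the tape diags utility is implemented, so the function names, the device path argument, and the timeout value are assumptions made for illustration only. It assumes a Linux system that provides <scsi/sg.h>.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <scsi/sg.h>

#define INQUIRY_OPCODE 0x12   /* SCSI INQUIRY operation code */

/* Hypothetical helper: fill in a 6-byte INQUIRY CDB and the sg_io_hdr
 * that describes it. Nothing is sent to the device here. */
void build_inquiry_hdr(sg_io_hdr_t *hdr, unsigned char cdb[6],
                       unsigned char *data, unsigned char data_len,
                       unsigned char *sense, unsigned char sense_len,
                       unsigned int timeout_ms)
{
    assert(hdr && cdb && data && sense);

    memset(cdb, 0, 6);
    cdb[0] = INQUIRY_OPCODE;
    cdb[4] = data_len;                 /* allocation length */

    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id    = 'S';        /* required by the sg v3 interface */
    hdr->dxfer_direction = SG_DXFER_FROM_DEV;
    hdr->cmd_len         = 6;
    hdr->cmdp            = cdb;
    hdr->dxfer_len       = data_len;
    hdr->dxferp          = data;
    hdr->mx_sb_len       = sense_len;
    hdr->sbp             = sense;
    hdr->timeout         = timeout_ms; /* milliseconds */
}

/* Hypothetical helper: send an INQUIRY to the device node at sg_path.
 * Returns 0 on success and -1 on any error. */
int send_inquiry(const char *sg_path, unsigned char *data,
                 unsigned char data_len)
{
    unsigned char cdb[6];
    unsigned char sense[32];
    sg_io_hdr_t hdr;
    int rc;

    int fd = open(sg_path, O_RDWR);
    if (fd < 0)
        return -1;

    build_inquiry_hdr(&hdr, cdb, data, data_len, sense, sizeof(sense), 5000);

    /* SG_IO is synchronous: the caller sleeps in the kernel until the
     * request completes or the wait is interrupted by a signal. */
    rc = ioctl(fd, SG_IO, &hdr);
    close(fd);
    if (rc < 0)
        return -1;

    return ((hdr.info & SG_INFO_OK_MASK) == SG_INFO_OK) ? 0 : -1;
}
```

Because SG_IO blocks the calling process, a request that is never dispatched to the adapter looks to the user exactly like the "hang" described above.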
                                                                               
                      
[Additional Info]
When an Adaptec controller is used, the operation completes normally. In talking
to the Adaptec folk (between HP and Adaptec) some time ago, they had mentioned
that they bypassed some of the mid layer in favor of their own routines.

With the straces as well as SCSI traces, Jim observed that after the tape has
done the write, nothing else gets sent to the driver.

Uploading two sets of data sent by Jim Kam. The first is an strace log showing
the difference between an Adaptec controller and the LSI controller on U3. The
second repeats the experiment with the U6-beta kernel (2.4.21-34.ELsmp).

Version-Release number of selected component (if applicable):
smp-2.4.21-34.EL.i686
Comment 2 Wendy Cheng 2005-08-12 12:30:19 EDT
Created attachment 117676
strace and scsi trace
Comment 3 Wendy Cheng 2005-08-12 12:50:06 EDT
Created attachment 117679
U6-beta.strace.txt.data-3-1
Comment 4 Wendy Cheng 2005-08-12 12:51:21 EDT
Created attachment 117680
U6-beta-messages-data-3-2
Comment 5 Wendy Cheng 2005-08-12 12:52:13 EDT
Created attachment 117681
U6-beta-sgdump-data-3-3
Comment 6 Wendy Cheng 2005-08-12 12:59:25 EDT
Highlight of Jim's trace: the process hung at #788, __wait_event_interruptible(),
whenever an INQ command follows a tape write on the mptscsi driver:

    772     case SG_IO:
    773         {
    774             int blocking = 1;   /* ignore O_NONBLOCK flag */
    775
    776             if (sdp->detached)
    777                 return -ENODEV;
    778             if(! scsi_block_when_processing_errors(sdp->device) )
    779                 return -ENXIO;
    780             result = verify_area(VERIFY_WRITE, (void *)arg, SZ_SG_IO_HDR);
    781             if (result) return result;
    782             result = sg_new_write(sfp, (const char *)arg, SZ_SG_IO_HDR,
    783                                   blocking, read_only, &srp);
    784             if (result < 0) return result;
    785             srp->sg_io_owned = 1;
    786             while (1) {
    787                 result = 0;  /* following macro to beat race condition */
    788                 __wait_event_interruptible(sfp->read_wait,
    789                        (sdp->detached || sfp->closed || srp->done), result);
    790                 if (sdp->detached)
    791                     return -ENODEV;
    792                 if (sfp->closed)
    793                     return 0;       /* request packet dropped already */
    794                 if (0 == result)
(excerpt from drivers/scsi/sg.c, lines 772-794)
Comment 7 Wendy Cheng 2005-08-15 16:28:20 EDT
Created attachment 117772
data requested by dledford - 2-1: dmesg file

data generated via:

echo "scsi log mlqueue 3" > /proc/scsi/scsi
echo "scsi log mlcomplete 3" > /proc/scsi/scsi
Comment 8 Wendy Cheng 2005-08-15 16:29:45 EDT
Created attachment 117773
data requested by dledford - 2-2: messages file
Comment 9 Wendy Cheng 2005-08-23 14:55:41 EDT
From Jim:

On the CD, I have included some rpms - nsrserv-7.1A00-04.i386.rpm and
nsrdiag.rpm. I believe I also included openmotif as well. Before nsrserv
can be installed, openmotif must first be installed. Then install
nsrserv, then nsrdiag.

To duplicate the problem, run the following:

/opt/nsr/diag/tapediag -vvv /dev/st0

From here you can send commands to the tape drive. For instance, from
the prompt you can send an

inq

command to do an inquiry.

Here is the sequence of commands.

inq                /* send it an initial inquiry to make sure that it is
                      talking to the tape drive */
readonly off       /* set it so that it can write to tape */
open wr            /* open for writing */
write              /* write to tape */
inq                /* Here's where it will "hang" until you issue a ^C.
                      Prior inquiries are fine. Once you do the
                      write, inq will result in the hang */


If you have any questions, please contact me by email or phone. If I do
not pick up the phone, please feel free to page me.

Thanks,
Jimk 

Jim M. Kam

Engineering Problem Resolution
* Jim.Kam@hp.com
 * 281-518-1076, Pager 713-710-6504
Comment 10 Tom Coughlan 2005-08-31 18:39:26 EDT
The reason aic79xx works is because it always has a queue depth of at
least 2. From the aic79xx code:

           /*
            * We allow the OS to queue 2 untagged transactions to
            * us at any time even though we can only execute them
            * serially on the controller/device.  This should
            * remove some latency.
            */
            scsi_adjust_queue_depth(dev->scsi_device,
                             /*NON-TAGGED*/0,
                             /*queue depth*/2);

mpt fusion sets queue_depth to 1 for tapes. This is appropriate, since tapes do
not support SCSI tagged commands. Other drivers probably do this as well.
Unfortunately, it causes a problem with the way the st driver does asynchronous
writes (the default behavior).

The st driver holds on to the write command structure after the write
completes. Then, when the next request comes along, the st driver frees the
old command structure (in write_behind_check) and issues the new command.
Everything works fine, even if there is only one command structure. This is why
a write followed by another write, or followed by a read, works fine even on mpt
fusion.

The problem comes up when other commands, like Inquiry and Rewind, are issued
through the sg driver instead of the st driver. If sg finds that the one and
only command structure is still being held by the st driver, the process hangs
waiting for a free command structure.

As a test, I changed the min. mpt fusion queue_depth to 2, and now I see
the same behavior for both mpt fusion and aic79xx.

Unfortunately, this is not an appropriate long-term solution because,
as I mentioned above, a queue depth of one is appropriate for non-tagged
devices. It would be very difficult to ensure that all the drivers issue just
one command at a time to non-tagged devices even though we increased the queue
depth to >1.

So, the right solution is to modify st driver so that it releases the command
structure after the write is complete, and not wait for the next request to
cause it to do so. I am currently investigating such a patch. 
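
The interaction described above can be sketched as a small user-space model. This is not kernel code and not the actual patch: the struct, its fields, and the function names are hypothetical, and the only thing modeled is a pool of queue_depth command structures shared by st and sg. A function returns false at the point where the real process would sleep waiting for a free command structure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one tape device and its command structures. */
struct tape_model {
    int  queue_depth;            /* structures the device may have allocated */
    int  in_use;                 /* structures currently allocated */
    bool st_holds_write_cmd;     /* st kept the structure of a finished write */
    bool release_on_completion;  /* true models the proposed st change */
};

static bool alloc_cmd(struct tape_model *t)
{
    if (t->in_use >= t->queue_depth)
        return false;            /* the real caller would block here */
    t->in_use++;
    return true;
}

static void free_cmd(struct tape_model *t)
{
    assert(t->in_use > 0);
    t->in_use--;
}

/* Models write_behind_check: st frees the structure it kept from the
 * previous asynchronous write before issuing anything new. */
static void st_write_behind_check(struct tape_model *t)
{
    if (t->st_holds_write_cmd) {
        free_cmd(t);
        t->st_holds_write_cmd = false;
    }
}

/* Asynchronous write through st. */
bool st_write(struct tape_model *t)
{
    st_write_behind_check(t);
    if (!alloc_cmd(t))
        return false;
    /* ... the write completes ... */
    if (t->release_on_completion)
        free_cmd(t);                     /* proposed behavior */
    else
        t->st_holds_write_cmd = true;    /* current behavior */
    return true;
}

/* Inquiry through sg, which never runs st's write_behind_check. */
bool sg_inquiry(struct tape_model *t)
{
    if (!alloc_cmd(t))
        return false;            /* models the hang seen with mptscsi */
    free_cmd(t);                 /* the Inquiry completes */
    return true;
}
```

In this model, with queue_depth 1 and release_on_completion false, st_write followed by st_write succeeds but st_write followed by sg_inquiry returns false. Setting queue_depth to 2, or keeping it at 1 and setting release_on_completion to true, lets sg_inquiry succeed.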

One of the issues that we will need to deal with is to determine whether this
problem exists in the upstream 2.6 kernel, and if so, to get the fix reviewed
and approved by them. Have you tried this test on RHEL-4, or some other 2.6
kernel? Would you be able to? If you cannot, can you give me a version of
nsrserv and nsrdiag that works on 2.6 (I have not tried the ones I have)?

Thanks.

Tom

Comment 11 Wendy Cheng 2005-09-01 20:25:44 EDT
Email from Jim:

In answer to Tom's question -

We did test it with RHEL 4, and the problem did not show up. The Legato files
that I had sent do work with the 2.6 kernel.

BTW we do appreciate the sterling job that RH has done (in particular what Tom
Coughlan and Doug Ledford have done to get to the source of the issue). Thank
you so much for your help.

jimk
Comment 13 Tom Coughlan 2005-09-16 10:43:01 EDT
Created attachment 118894
patch, to release buffer after write completes

Here is a patch to fix the problem, thanks to Doug Ledford. Initial testing
looks good.

Jim, please test this thoroughly and let us know the results.
Comment 14 Tom Coughlan 2005-10-05 15:39:12 EDT
On Sept. 20 I received the following from Jim Kam:

"I have been doing some testing on it, and thus far it looks great. I need to
test it on an Ultrium drive, since that is what the customer has. I should have
that done sometime tomorrow. Plus we are also going to do some large backup jobs
to make sure that it works properly with no other discernable issues."

Please let us know the results of these tests. If the patch is accepted, it will
go in RHEL 3 U7. It can be provided as a hotfix as needed prior to U7.

Comment 16 Ernie Petrides 2005-10-20 01:44:59 EDT
A fix for this problem has just been committed to the RHEL3 U7
patch pool this evening (in kernel version 2.4.21-37.6.EL).
Comment 19 Red Hat Bugzilla 2006-03-15 11:24:00 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0144.html
