158459 – RHEL 3 configures non-existent SCSI target devices

Summary: RHEL 3 configures non-existent SCSI target devices

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 3
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	3.0
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Doug Ledford
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	156320

Reported:	2005-05-22 15:43 UTC by Tom Coughlan
Modified:	2007-11-30 22:07 UTC
CC List:	7 users
Fixed In Version:	RHSA-2005-663
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2005-09-28 15:11:24 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2005:663	0	qe-ready	SHIPPED_LIVE	Important: Updated kernel packages available for Red Hat Enterprise Linux 3 Update 6	2005-09-28 04:00:00 UTC

Description Tom Coughlan 2005-05-22 15:43:38 UTC

Emulex resolved the following problem for Morgan Stanley. 

For some types of storage device (Clariion is one example) an extra,
non-existent target device is configured. This can also cause real LUNs to not
be configured. The problem is a side-effect of a change made in U4. 

There is a one-line fix, in scan_scsis_single: 

        if (lun != 0 && (scsi_result[0] >> 5) == 1) {
                scsi_release_request(SRpnt);
+               scsi_release_commandblocks(SDpnt);
                return 0;
        }

I have reproduced the problem. It will require 2 hours to prepare the patch and
test it. Additional regression testing will be needed, but the risk is low.

Detailed analysis from Emulex:

We found the "ghost" issue on RHEL3U4 with Morgan Stanley.

It's an issue in scsi_scan.c when the device returns a PQ = 1.

Attached is a summary of the bug, and I've also attached some debugging 
that we did internally, including a 1 line fix that resolves it. Note: 
one thing I didn't point out is that once you hit this PQ error, if the 
next target exists, it will actually have an incorrect LUN 0 created (bad 
inquiry data, state hosed, etc.), as it's using the old command data. 
Things don't straighten out until LUN 1 is probed on the next target.

I'd assume it's in U5 as well.


We believe we understand the "Ghost Target" issue (issue #3) and 
understand how this could be causing the non-detection of luns as well
(issue #2).

This is a midlayer bug, but we happen to have a workaround in the
driver already that can be utilized. We will be working with Red Hat
and the upstream kernel to resolve the bug.

Please test with the workaround suggested and let us know the results.

Update 4/7/05:
  The array in question is returning a Peripheral Qualifier (PQ) value
  of 001b (not present, but could be). The SCSI midlayer has a bug
  whereby it exits the scan when it sees this value, but does not free
  command blocks. When it attempts to scan the next target/lun,
  it reuses the command blocks, which still hold the old address
  information. The midlayer thinks it's scanning the next tgt/lun
  combination, but the request to the driver is scanning the old
  address that returned the PQ value of 1. Additionally, the midlayer
  makes exceptions for Lun 0 if it returns a PQ value of 1, allowing
  it to be added to the system (thus the potential for a ghost target).

  Note: If the array returns a PQ value of 011b (not present), then
  the midlayer takes a similar code path, but it frees the command
  blocks. As such, the addressing information for the next scan will
  always be correct.

  As the PQ value returned for an unconfigured lun is device-specific
  (and both are allowed by SCSI spec), some devices may exhibit the
  "ghost" target, while others will not.
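The qualifier values discussed above can be read straight out of byte 0 of the INQUIRY response. The following is a minimal user-space sketch, not midlayer code: the enum and the helper names (inquiry_pq, inquiry_type, skips_lun) are made up for illustration. Only the shift by 5 and the "lun != 0 and PQ == 1" test come from the report.

```c
#include <assert.h>
#include <stdint.h>

/* Peripheral qualifier values as described in this report. */
enum scsi_pq {
    PQ_PRESENT          = 0x0, /* 000b: device present */
    PQ_NOT_PRESENT_YET  = 0x1, /* 001b: not present, but could be */
    PQ_NOT_PRESENT      = 0x3  /* 011b: not present */
};

/* The upper three bits of INQUIRY byte 0 hold the peripheral qualifier. */
static unsigned inquiry_pq(const uint8_t *inq)
{
    assert(inq);
    return (inq[0] >> 5) & 0x7;
}

/* The lower five bits of INQUIRY byte 0 hold the peripheral device type. */
static unsigned inquiry_type(const uint8_t *inq)
{
    assert(inq);
    return inq[0] & 0x1f;
}

/* Same condition as the check quoted from scan_scsis_single. */
static int skips_lun(unsigned lun, const uint8_t *inq)
{
    return lun != 0 && inquiry_pq(inq) == PQ_NOT_PRESENT_YET;
}
```

A response whose first byte is 0x20 decodes to PQ 001b with device type 0, so skips_lun() is true for any non-zero LUN and false for LUN 0, which is the exception that lets a ghost LUN 0 through.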

Work Around:
  The lpfc driver has a parameter that will cause us to replace 
  Peripheral Qualifier values of 001b with 011b. To enable this
  feature, turn on the following options for the lpfc driver in
  modules.conf:
       options lpfc lpfc_inq_pqb_filter=1

  Note: We instituted this feature as we had encountered lun
  skip issues in the past and had noticed that the Qlogic adapter
  was silently replacing PQ values before handing the results to
  the driver. We default this parameter to 0 (off) so that traffic
  remains unmodified unless instructed by the admin.
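For readers who want to see what such a filter amounts to, here is a rough sketch of the substitution. It is not taken from the lpfc source: the function name pq_filter and its "enabled" argument are hypothetical stand-ins for whatever the driver does when lpfc_inq_pqb_filter=1.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not lpfc code): when enabled, rewrite a
 * peripheral qualifier of 001b to 011b in INQUIRY byte 0 and keep
 * the five device-type bits unchanged.
 */
static void pq_filter(uint8_t *inq, int enabled)
{
    assert(inq);
    if (enabled && (inq[0] >> 5) == 0x1)
        inq[0] = (uint8_t)((0x3 << 5) | (inq[0] & 0x1f));
}
```

With the filter on, a first byte of 0x20 becomes 0x60, so the midlayer sees "not present" and takes the path that frees the command blocks. With the filter off, the byte is passed through untouched.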

Formally Fixing the Issue:
  We will be communicating this problem to Red Hat and potentially
  to the 2.4 kernel maintainers.


Detailed Example of the bug:

  Configuration:  1 target (tgt #0), luns 0 and 2

  Midlayer starts scan:
    Allocates temporary device struct. No command blocks allocated.
  Midlayer scans Target 0 lun 0 :
    As no command blocks, allocate and initialize to tgt 0 lun 0
    Inquiry sent, returns PQ value of 0 (present)
    Device struct linked into system, new temporary struct allocated
      (no command blocks for it allocated).
  Midlayer scans Target 0 lun 1 :
    As no command blocks, allocate and initialize to tgt 0 lun 1
    Inquiry sent, returns PQ value of 1 (not present, but could be)
    As not lun 0 and PQ=1, exit scan  (bug here)
  Midlayer scans Target 0 lun 2 :
    As command blocks exist, don't allocate new (thus they still
       have the old address info)
    Inquiry sent. Note: Driver sees address tgt 0 lun 1 in the
       command blocks. As such, it sends it to tgt 0 lun 1, which
       responds with PQ=1 again.
    As not lun 0 and PQ=1, exit scan  (bug here)

  ... This continues until the midlayer cycles to the next target id

  Midlayer scans Target 1 lun 0 :
    As command blocks exist, don't allocate new (thus they still
       have the old address info)
    Inquiry sent. Note: Driver sees address tgt 0 lun 1 in the
       command blocks. As such, it sends it to tgt 0 lun 1, which
       responds with PQ=1 again.
    As the midlayer believes it is lun 0 and PQ=1, device struct is
       linked into system, and a new temporary struct is allocated
       (with no command blocks).
  Midlayer scans Target 1 lun 1 :
    As no command blocks, allocate and initialize to tgt 1 lun 1
    (everything is valid at this point and normal discovery resumes)


  The result of the above is:
    Target 0 lun 0 is found
    Target 0 lun 2 is not found as we never sent it an i/o
    Target 1 lun 0 is erroneously found.
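The walk-through above can be replayed with a small user-space model. This is only a sketch under stated assumptions, not kernel code: struct scan_dev, device_pq, scan_single, and scan_bus are invented names; the toy array has target 0 with luns 0 and 2 configured and no target 1; every lun is probed, whereas the real scan has additional early-exit rules; and a failed INQUIRY is assumed to free the command blocks. The with_fix flag stands for the one-line scsi_release_commandblocks addition.

```c
#include <assert.h>
#include <string.h>

#define MAX_TGT 2
#define MAX_LUN 4

/* Temporary device struct used while scanning (toy model). */
struct scan_dev {
    int has_cmdblocks;   /* set once command blocks are built */
    int cb_tgt, cb_lun;  /* address stored in the command blocks */
};

/* Toy array: PQ reported by the device at tgt/lun, -1 if nothing answers. */
static int device_pq(int tgt, int lun)
{
    if (tgt == 0 && (lun == 0 || lun == 2))
        return 0;                     /* configured lun */
    return tgt == 0 ? 1 : -1;         /* unconfigured lun / absent target */
}

static void build_cmdblocks(struct scan_dev *d, int tgt, int lun)
{
    if (d->has_cmdblocks)
        return;                       /* "only init things once" */
    d->cb_tgt = tgt;
    d->cb_lun = lun;
    d->has_cmdblocks = 1;
}

/* Returns 1 if a device gets linked into the system for (tgt, lun). */
static int scan_single(struct scan_dev *d, int tgt, int lun, int with_fix)
{
    int pq;

    assert(d);
    build_cmdblocks(d, tgt, lun);
    pq = device_pq(d->cb_tgt, d->cb_lun);  /* driver uses the stored address */
    if (pq < 0 || pq == 3) {
        d->has_cmdblocks = 0;         /* these paths free the command blocks */
        return 0;
    }
    if (lun != 0 && pq == 1) {
        if (with_fix)
            d->has_cmdblocks = 0;     /* the one-line fix */
        return 0;                     /* without it, stale blocks survive */
    }
    d->has_cmdblocks = 0;             /* linked; fresh temp struct, no blocks */
    return 1;
}

static void scan_bus(int found[MAX_TGT][MAX_LUN], int with_fix)
{
    struct scan_dev d = {0, 0, 0};
    int tgt, lun;

    memset(found, 0, sizeof(int) * MAX_TGT * MAX_LUN);
    for (tgt = 0; tgt < MAX_TGT; tgt++)
        for (lun = 0; lun < MAX_LUN; lun++)
            found[tgt][lun] = scan_single(&d, tgt, lun, with_fix);
}
```

Calling scan_bus(found, 0) reproduces the result listed above: found[0][0] is set, found[0][2] is not, and found[1][0] is set even though target 1 does not exist. Calling scan_bus(found, 1) finds luns 0 and 2 on target 0 and nothing on target 1.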


============================================

The mid-layer starts its bus scan.  It calls scan_scsis which goes 
into a loop scanning all channels, targets, and luns.  For each unique 
channel:target:lun it calls scan_scsis_single.  At this level the 
mid-layer is doing the correct thing.  For each channel and target it's looping 
from lun 0 to 255.  The problem is somewhere between scan_scsis_single 
and queuecommand.  When scan_scsis_single is called with 0:0:3 or 
anything greater, lpfc_queuecommand is ALWAYS being called with 0:0:2.  Here 
is a section from the log:

JIMP: scan_scsis_single - channel: 0, dev: 0, lun: 3, lun0_scsi_level: 4, max_dev_lun: 256, sparse_lun: 1
JIMP: lpfc_queuecommand - cmd: 0x12, channel: 0, target: 0, lun: 2
lpfc0:0205:DIi:Create SCSI LUN 2 on Target 0
lpfc0:0729:FPw:FCP cmd x12 failed, x0 x2, status: x1 result: x3f Data: xd x5
lpfc0:0730:FPw:FCP command failed: RSP Data: x8 x0 x3f x0 x0 x0
lpfc0:0716:FPi:FCP Read Underrun, expected 256, residual 63 Data: x100 x12 x0

Notice that scan_scsis_single is being called with lun 3 but 
lpfc_queuecommand is being called with lun 2.  Now, it gets really interesting 
when the mid-layer tries to scan 0:1:0.  Here is the logging for that:

JIMP: scan_scsis_single - channel: 0, dev: 1, lun: 0, lun0_scsi_level: 3, max_dev_lun: 1, sparse_lun: 0
JIMP: lpfc_queuecommand - cmd: 0x12, channel: 0, target: 0, lun: 2
lpfc0:0205:DIi:Create SCSI LUN 2 on Target 0
lpfc0:0729:FPw:FCP cmd x12 failed, x0 x2, status: x1 result: x3f Data: x10a x102
lpfc0:0730:FPw:FCP command failed: RSP Data: x8 x0 x3f x0 x0 x0
lpfc0:0716:FPi:FCP Read Underrun, expected 256, residual 63 Data: x100 x12 x0
  Vendor: DGC       Model:                   Rev: 0207
  Type:   Direct-Access                      ANSI SCSI revision: 04

Notice the midlayer is trying to scan channel 0, target 1, lun 0 but 
the inquiry is sent to lpfc_queuecommand with channel 0, target 0, lun 2.  
Because the inquiry to 0:0:2 completes successfully, the mid-layer assumes 
there is a target 1 and creates an sd device.  So the mid-layer is obviously 
using stale data somewhere.  In looking at the mid-layer diffs I noticed a 
small change made between RHEL3U3 and RHEL3U4 in 
scsi_build_commandblocks.  The change was:

+        /*
+         * Only init things once.
+         */
+        if (SDpnt->has_cmdblocks)
+                return;

This causes scsi_build_commandblocks to exit at the top of the routine 
if the scsi device has existing scsi_cmnd blocks, which means we won't 
create new scsi_cmnd blocks with the correct lun, target, channel.  We 
seem to have hit a condition in the mid-layer where the has_cmdblocks 
bit is not properly cleared.  As a result the scsi_cmnd blocks being 
used are still populated with 0:0:2.  When we eventually "discover" the 
ghost target as seen above, this condition clears.

Looking at the scan_scsis_single function there is only one place where 
we can bail out without clearing the has_cmdblocks bit after issuing an 
INQUIRY.  That's where we check the peripheral qualifier field:

        if (lun != 0 && (scsi_result[0] >> 5) == 1) {
                scsi_release_request(SRpnt);
                return 0;
        }

I think this should really be:

        if (lun != 0 && (scsi_result[0] >> 5) == 1) {
                scsi_release_request(SRpnt);
+               scsi_release_commandblocks(SDpnt);
                return 0;
        }

So setting the lpfc_inq_pqb_filter parameter to 1 really works by 
accident in this case by forcing us down a different code path.  I tested a 
custom kernel with the one line addition above and it solves the 
problem.

Comment 4 Ernie Petrides 2005-06-09 03:31:14 UTC

A fix for this problem has just been committed to the RHEL3 U6
patch pool this evening (in kernel version 2.4.21-32.7.EL).

Comment 5 Tom Coughlan 2005-06-16 17:48:39 UTC

I did a test to confirm this fix. I used Emulex attached to Clariion. Saw the
problem on 2.4.21-32.ELsmp. Saw it was fixed on 2.4.21-32.8.ELsmp.

Comment 9 Eric Sandeen 2005-07-18 22:41:36 UTC

FWIW, we saw a related problem, with the same fix.  Wish I'd found this bug
before I debugged it independently :)

In our case, we have a raid in RDAC mode, where luns 1-3 belong to
one controller, and luns 4-6 belong to another controller.

When sending the INQUIRY command to lun 1, owned by the opposite controller,
we got back the offline status.

As above, the commandblocks would be re-used, and the original offline lun
would continue to be queried, even when we should have moved on to the
luns which -are- available on this controller.

So, if lun 1 was offline, no luns were found on that channel.

So in our case, we did not see ghost luns, but rather had missing luns.

Glad to see you've fixed it, eagerly awaiting U6 :)

-Eric

Comment 14 Tom Coughlan 2005-08-29 18:35:32 UTC

Yes. It is kernel version 2.4.21-32.7.EL, and later.

Comment 16 Red Hat Bugzilla 2005-09-28 15:11:28 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-663.html
