Bug 27456 - qla2x00 sees double luns
Summary: qla2x00 sees double luns
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.1
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Doug Ledford
QA Contact: Brock Organ
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2001-02-13 23:52 UTC by Danny Trinh
Modified: 2007-04-18 16:31 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2001-03-20 21:51:10 UTC
Embargoed:


Attachments
/proc/scsi/scsi file from the server has Qlogic card QLA2200 installed (2.02 KB, text/plain)
2001-02-13 23:57 UTC, Danny Trinh
Test patch against the scsi_scan.c file (2.80 KB, patch)
2001-03-15 03:46 UTC, Doug Ledford

Description Danny Trinh 2001-02-13 23:52:06 UTC
After installing fisher, I checked the /proc/scsi/scsi file and saw that
scsi1 (qla2x00) is connected to 8 LUNs instead of the 4 LUNs that the Qlogic card
(QLA2200) reports when the server boots.
Below is the text I copied from /proc/scsi/scsi:

Attached devices: 
Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: DELL     Model: 1x6 U2W SCSI BP  Rev: 5.43
  Type:   Processor                        ANSI SCSI revision: 02
Host: scsi0 Channel: 01 Id: 06 Lun: 00
  Vendor: DELL     Model: 1x2 U2W SCSI BP  Rev: 5.22
  Type:   Processor                        ANSI SCSI revision: 02
Host: scsi0 Channel: 04 Id: 00 Lun: 00
  Vendor: MegaRAID Model: LD0 RAID5 38924R Rev: h132
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 04 Id: 00 Lun: 01
  Vendor: MegaRAID Model: LD1 RAID0  8682R Rev: h132
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: DGC      Model: DISK             Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 01
  Vendor: DGC      Model: RAID 1           Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 02
  Vendor: DGC      Model: RAID 5           Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 03
  Vendor: DGC      Model: RAID 10          Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 04
  Vendor: DGC      Model:                  Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 05
  Vendor: DGC      Model:                  Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 06
  Vendor: DGC      Model:                  Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 07
  Vendor: DGC      Model:                  Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi2 Channel: 00 Id: 05 Lun: 00
  Vendor: NEC      Model: CD-ROM DRIVE:466 Rev: 1.06
  Type:   CD-ROM                           ANSI SCSI revision: 02

Comment 1 Danny Trinh 2001-02-13 23:57:30 UTC
Created attachment 9964
/proc/scsi/scsi file from the server has Qlogic card QLA2200 installed

Comment 2 Michael Fulbright 2001-02-14 04:00:40 UTC
Is the problem that a different kernel module was used during the install
process than was used when you booted the installed system?

Comment 3 Tesfamariam Michael 2001-02-14 15:50:55 UTC
The same kernel is used to install and to reboot the system. Even the installer 
shows additional /dev/sdX devices, where X goes beyond the disks/RAID volumes 
present in the system. In the /proc/scsi/scsi listing above, there is nothing at 
LUN 04 to LUN 07, but they show up during installation and get added 
to /proc/scsi/scsi.

Comment 4 Glen Foster 2001-02-15 23:08:49 UTC
We (Red Hat) should really try to resolve this before next release.

Comment 5 Michael E Brown 2001-02-23 17:03:28 UTC
This issue is not about seeing double Luns. The problem is this:

echo "scsi add-single-device 3 0 0 n" > /proc/scsi/scsi

where 3 is your qlogic host adapter, 0 is the channel, and you have an
_existing_  device at id 0. This command will add lun 'n' regardless of whether
or not that lun actually exists, for all values of 'n' from 0 to 222.

summary: LUN detection is broken in the new qla2x00 driver. LUNS are always
detected as being present.
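
The command above can be generated for a range of LUNs with a short script. This is a minimal sketch and not part of the original report: the function names are hypothetical, and the first two functions only build the command strings. Writing a command to /proc/scsi/scsi requires root and is kept as a separate, explicit step.

```python
def add_single_device_cmd(host, channel, target, lun):
    """Return the text that /proc/scsi/scsi expects for adding one device."""
    return f"scsi add-single-device {host} {channel} {target} {lun}"


def probe_commands(host, channel, target, luns):
    """Build one add-single-device command per LUN in `luns`."""
    return [add_single_device_cmd(host, channel, target, lun) for lun in luns]


def send_command(cmd, path="/proc/scsi/scsi"):
    """Write one command to the SCSI proc interface (needs root)."""
    with open(path, "w") as f:
        f.write(cmd + "\n")
```

For example, `probe_commands(3, 0, 0, range(8))` yields the eight commands for LUNs 0 through 7 on host 3, channel 0, id 0. Comparing /proc/scsi/scsi before and after sending each one would show whether a LUN that does not exist was still added.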

Comment 6 Michael Fulbright 2001-02-23 17:21:57 UTC
Sounds like a kernel issue.

Comment 7 Arjan van de Ven 2001-02-24 11:28:07 UTC
Sounds like something needs to be on the LUN blacklist. Doug ?

Comment 8 Doug Ledford 2001-02-26 18:44:01 UTC
It's not the LUN blacklist that's the problem here.  When detecting drives on a
high LUN, the controller should return some sort of failure if that LUN isn't
present.  That failure is dictated by the drive though (specifically, the drive
doesn't know which LUN you want to talk to until it accepts the command, so it
can't simply ignore the command, it has to take the command, check which LUN the
command is against, then either perform the command or return an error, usually
a DEVICE_NOT_PRESENT sense code if there is no active device at the requested
LUN).  Now, the question is why isn't that sense code getting returned?  Do we
know when the last working qla2x00 driver was?  If we do, then we can do a
search of the changes to the driver to try and isolate the cause of the problem.
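
The expectation described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `SimulatedTarget` and `scan_luns` are hypothetical names, nothing here is kernel or driver code, and a failed command is modelled simply as `None`. It shows why a scan that trusts any returned data registers every LUN when the error never arrives.

```python
class SimulatedTarget:
    """Toy stand-in for a SCSI target; not a real driver interface."""

    def __init__(self, present_luns, reports_errors=True):
        self.present_luns = set(present_luns)
        self.reports_errors = reports_errors

    def inquiry(self, lun):
        """Return INQUIRY-like data, or None to model a failed command."""
        if lun in self.present_luns:
            return {"vendor": "DGC", "model": f"LUN {lun}"}
        if self.reports_errors:
            return None  # models the error a scan expects for an absent LUN
        # Misbehaving path: data comes back even though nothing is there.
        return {"vendor": "DGC", "model": ""}


def scan_luns(target, max_luns=8):
    """Return the LUNs a naive scan would register on this target."""
    found = []
    for lun in range(max_luns):
        if target.inquiry(lun) is not None:
            found.append(lun)
    return found
```

With `SimulatedTarget(range(4))` the scan registers LUNs 0 through 3. With `reports_errors=False` it registers all eight, which mirrors the symptom in the listing.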

On a second note though, the phantom drives in the listing above are *not*
missing.  They are there as far as linux is concerned.  They have a vendor name,
they have a revision, they have a SCSI version, they have a correct type.  They
have all the info they are supposed to have, they just happen to have a series
of spaces for a name instead of a descriptive string.  This makes me think that
either A) the mid layer SCSI code didn't zero out the command buffer before
sending the command and this is left over information from the previous SCSI
INQUIRY, or B) that the PV array chassis is sending this information through
before failing the command outright.  In either case, it's valid return data and
unless linux gets an error code to tell it not to use it, it will show up as a
drive.
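
The blank-name entries described above can be picked out of a /proc/scsi/scsi dump mechanically. The following is a minimal sketch, assuming the layout shown in the listing in the description; the function names and regular expressions are illustrative and were not part of the original discussion.

```python
import re

# One "Host:" line starts each device entry in /proc/scsi/scsi.
HOST_RE = re.compile(
    r"Host:\s*(\S+)\s+Channel:\s*(\d+)\s+Id:\s*(\d+)\s+Lun:\s*(\d+)"
)
# The following line carries vendor, model and revision; model may be blank.
VENDOR_RE = re.compile(r"Vendor:\s*(.*?)\s*Model:\s*(.*?)\s*Rev:\s*(.*)")


def parse_proc_scsi(text):
    """Parse /proc/scsi/scsi text into a list of device dicts."""
    devices = []
    current = None
    for line in text.splitlines():
        m = HOST_RE.search(line)
        if m:
            current = {
                "host": m.group(1),
                "channel": int(m.group(2)),
                "id": int(m.group(3)),
                "lun": int(m.group(4)),
            }
            devices.append(current)
            continue
        m = VENDOR_RE.search(line)
        if m and current is not None:
            current["vendor"] = m.group(1).strip()
            current["model"] = m.group(2).strip()
            current["rev"] = m.group(3).strip()
    return devices


def blank_model_luns(devices):
    """Return (host, lun) pairs whose model string is empty."""
    return [(d["host"], d["lun"]) for d in devices if d.get("model") == ""]
```

Run against the listing in the description, this would flag LUNs 4 through 7 on scsi1, the entries that carry a vendor and revision but only spaces for a model.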

Comment 9 Michael K. Johnson 2001-03-01 03:48:01 UTC
Doug seems to be the right person to deal with this.

Comment 10 Michael E Brown 2001-03-06 16:19:50 UTC
I don't know which version this bug first showed up in, but I will check fisher.

Still present in RC2 -- kernel 2.4.2-0.1.19


Comment 11 Michael E Brown 2001-03-06 16:28:48 UTC
Symptoms of bug have changed slightly in RC2. On initial qla2x00 module load, we
are only detecting Luns 0 and 1 of a 10 lun device.  (Two test systems, one with
10 luns and one with 4 luns, haven't checked the 4 lun device, yet.)

In RC1, we were detecting luns 0-7 no matter what.  (even on a 4 lun device).
What is the same, though, is that if you
"echo 'scsi add-single-device w x y z' > /proc/scsi/scsi"
for a nonexistent "z", you will still see a 'ghost' lun in /proc/scsi/scsi. 



Comment 12 Danny Trinh 2001-03-06 16:41:20 UTC
With the same configuration, I now see only the first 2 LUNs when I boot from 
the bootnet.img diskette and try to use "expert noprobe" mode with the FTP 
installation method.


Comment 13 Tesfamariam Michael 2001-03-14 21:08:56 UTC
This issue also exists with the megaraid driver. 
When installing RC2 and later on PERC2/DC which has 4 logical volumes of RAID1 
(9GB each), it detects only two (sda and sdb). With RC1 and earlier, it detects 
all the logical volumes. I expect PERC3/DC, PERC3/DCL, PERC3/QC, and PERC2/SC 
to behave the same.

Comment 14 Doug Ledford 2001-03-15 03:45:05 UTC
The bug was in the mid layer SCSI code.  I sent a test patch to the linux-kernel
mailing list.  Before I can integrate the patch with any degree of certainty
that it won't break other devices, it needs to be tested on autochanger tape
backups and on multidisc CD changers.  It should also be tested on other RAID
controllers (especially ones like the MegaRAID).  I'm attaching the same test
patch here so that people so inclined may test it out.

Comment 15 Doug Ledford 2001-03-15 03:46:43 UTC
Created attachment 12702
Test patch against the scsi_scan.c file

Comment 16 Tesfamariam Michael 2001-03-20 16:15:41 UTC
After recompiling a kernel with this patch, the correct number of LUNs were 
detected on both qla2200 and megaRAID cards.

Comment 17 Danny Trinh 2001-03-20 16:32:53 UTC
The scsi_scan.c patch is not on qa0319 ISO images yet.

Comment 18 Matt Domsch 2001-03-20 21:51:06 UTC
Doug indicated that he would commit this patch to Red Hat's kernel today.

Comment 19 Arjan van de Ven 2001-03-21 12:50:08 UTC
Patch is in the tree -> closing the bug.

