After installing fisher, I checked /proc/scsi/scsi and saw that scsi1 (qla2x00) is attached to 8 LUNs instead of the 4 LUNs that the Qlogic card (QLA2200) sees when the server boots. Below is the text I copied from /proc/scsi/scsi:
Attached devices:
Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: DELL     Model: 1x6 U2W SCSI BP  Rev: 5.43
  Type:   Processor                        ANSI SCSI revision: 02
Host: scsi0 Channel: 01 Id: 06 Lun: 00
  Vendor: DELL     Model: 1x2 U2W SCSI BP  Rev: 5.22
  Type:   Processor                        ANSI SCSI revision: 02
Host: scsi0 Channel: 04 Id: 00 Lun: 00
  Vendor: MegaRAID Model: LD0 RAID5 38924R Rev: h132
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 04 Id: 00 Lun: 01
  Vendor: MegaRAID Model: LD1 RAID0 8682R  Rev: h132
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: DGC      Model: DISK             Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 01
  Vendor: DGC      Model: RAID 1           Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 02
  Vendor: DGC      Model: RAID 5           Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 03
  Vendor: DGC      Model: RAID 10          Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 04
  Vendor: DGC      Model:                  Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 05
  Vendor: DGC      Model:                  Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 06
  Vendor: DGC      Model:                  Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 07
  Vendor: DGC      Model:                  Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi2 Channel: 00 Id: 05 Lun: 00
  Vendor: NEC      Model: CD-ROM DRIVE:466 Rev: 1.06
  Type:   CD-ROM                           ANSI SCSI revision: 02
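For anyone comparing listings across builds, here is a minimal Python sketch that counts LUNs per host in /proc/scsi/scsi-style text and flags entries whose Model field is blank, as with LUNs 04-07 above. The function names and the sample text are hypothetical, and the field layout is assumed from the listing in this report, not from any kernel documentation.

```python
import re
from collections import Counter

# Field layout assumed from the listing in this report: a "Host:" header
# followed by "Vendor: ... Model: ... Rev: ...". \s+ also spans line breaks,
# so the pattern works whether each device is on one line or several.
DEVICE_RE = re.compile(
    r"Host:\s*(?P<host>\S+)\s+Channel:\s*(?P<channel>\d+)\s+"
    r"Id:\s*(?P<id>\d+)\s+Lun:\s*(?P<lun>\d+)\s+"
    r"Vendor:\s*(?P<vendor>.*?)\s*Model:\s*(?P<model>.*?)\s*Rev:\s*(?P<rev>\S+)"
)


def parse_scsi_listing(text):
    """Return one dict per device entry found in the text."""
    devices = []
    for m in DEVICE_RE.finditer(text):
        devices.append({
            "host": m["host"],
            "channel": int(m["channel"]),
            "id": int(m["id"]),
            "lun": int(m["lun"]),
            "vendor": m["vendor"],
            "model": m["model"],
            "rev": m["rev"],
        })
    return devices


def luns_per_host(devices):
    """Count how many entries each host adapter reports."""
    return Counter(d["host"] for d in devices)


def blank_model_luns(devices):
    """List (host, channel, id, lun) for entries with an empty Model field."""
    return [(d["host"], d["channel"], d["id"], d["lun"])
            for d in devices if not d["model"]]


# Hypothetical two-entry sample in the same layout as the report.
SAMPLE = """Attached devices:
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: DGC      Model: DISK             Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 04
  Vendor: DGC      Model:                  Rev: 0511
  Type:   Direct-Access                    ANSI SCSI revision: 02
"""

devices = parse_scsi_listing(SAMPLE)
counts = luns_per_host(devices)
blank = blank_model_luns(devices)
print(counts, blank)
```

A blank Model field is only a hint that an entry may be a phantom; it is what the listing in this report happens to show, not a general rule.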
Created attachment 9964: /proc/scsi/scsi file from the server that has the Qlogic QLA2200 card installed
Is the problem that a different kernel module was used during the install process than was used when you booted the installed system?
It is the same kernel for the install and for the reboot. Even the installer shows additional /dev/sdX devices beyond the disks/RAID volumes actually present in the system. In the /proc/scsi/scsi output above there is nothing at LUN 04 through LUN 07, yet they show up during installation and get added to /proc/scsi/scsi.
We (Red Hat) should really try to resolve this before next release.
This issue is not about seeing double LUNs. The problem is this:
  echo "scsi add-single-device 3 0 0 n" > /proc/scsi/scsi
where 3 is your qlogic host adapter, 0 is the channel, and you have an _existing_ device at id 0. This command will add LUN 'n' regardless of whether or not that LUN actually exists, for all values of 'n' from 0 to 222. Summary: LUN detection is broken in the new qla2x00 driver. LUNs are always detected as being present.
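To make the reproduction easier to script, here is a small Python sketch that only builds the command strings described in the comment above for a range of LUNs. The helper name is hypothetical, and the sketch deliberately does not write to /proc/scsi/scsi, since that requires root and a kernel exposing this interface.

```python
def add_single_device_commands(host, channel, target_id, luns):
    """Build 'scsi add-single-device' strings for the given LUN numbers.

    The string format is taken from the comment in this report; nothing
    here touches /proc/scsi/scsi.
    """
    for name, value in (("host", host), ("channel", channel),
                        ("target_id", target_id)):
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
    commands = []
    for lun in luns:
        if not isinstance(lun, int) or lun < 0:
            raise ValueError("lun must be a non-negative integer")
        commands.append(
            f"scsi add-single-device {host} {channel} {target_id} {lun}")
    return commands


# Host 3, channel 0, id 0, as in the example from the comment.
cmds = add_single_device_commands(3, 0, 0, range(4, 8))
for cmd in cmds:
    print(cmd)
```

Each printed string would then be echoed into /proc/scsi/scsi by hand, followed by a check of whether a new entry appears for a LUN that does not exist.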
Sounds like a kernel issue.
Sounds like something needs to be on the LUN blacklist. Doug ?
It's not the LUN blacklist that's the problem here. When detecting drives on a high LUN, the controller should return some sort of failure if that LUN isn't present. That failure is dictated by the drive though: the drive doesn't know which LUN you want to talk to until it accepts the command, so it can't simply ignore the command. It has to take the command, check which LUN the command is against, then either perform the command or return an error, usually a DEVICE_NOT_PRESENT sense code if there is no active device at the requested LUN. Now, the question is why isn't that sense code getting returned? Do we know when the last working qla2x00 driver was? If we do, then we can search the changes to the driver to try to isolate the cause of the problem.
On a second note though, the phantom drives in the listing above are *not* missing. They are there as far as linux is concerned. They have a vendor name, they have a revision, they have a SCSI version, they have a correct type. They have all the info they are supposed to have; they just happen to have a series of spaces for a name instead of a descriptive string. This makes me think that either A) the mid layer SCSI code didn't zero out the command buffer before sending the command and this is left over information from the previous SCSI INQUIRY, or B) the PV array chassis is sending this information through before failing the command outright. In either case, it's valid return data, and unless linux gets an error code to tell it not to use it, it will show up as a drive.
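As background to the point that the phantom entries carry otherwise valid INQUIRY data, here is a minimal Python sketch that decodes the first 36 bytes of standard INQUIRY data: the peripheral qualifier and device type in byte 0, vendor in bytes 8-15, product in bytes 16-31, and revision in bytes 32-35. A target can also report "no device at this LUN" through the peripheral qualifier rather than a sense code. The function names and sample buffers are hypothetical, and this is not the patch nor a description of what scsi_scan.c does.

```python
def decode_inquiry(data):
    """Decode the fixed fields of a standard INQUIRY response (36 bytes min)."""
    if len(data) < 36:
        raise ValueError("standard INQUIRY data is at least 36 bytes")

    def text(chunk):
        # Fields are space-padded ASCII; also drop NULs from zeroed buffers.
        return chunk.decode("ascii", "replace").strip(" \x00")

    return {
        "peripheral_qualifier": data[0] >> 5,   # top 3 bits of byte 0
        "device_type": data[0] & 0x1F,          # low 5 bits of byte 0
        "vendor": text(data[8:16]),
        "model": text(data[16:32]),
        "rev": text(data[32:36]),
    }


def lun_looks_connected(info):
    """Treat only qualifier 0 (device connected) as a usable LUN.

    Qualifier 0b011 with device type 0x1F means the target cannot
    support a device on this LUN.
    """
    return info["peripheral_qualifier"] == 0


# Hypothetical buffer resembling a phantom entry: qualifier 0, type 0
# (Direct-Access), vendor and revision filled in, model all spaces.
ghost = bytes([0x00, 0, 2, 2, 31, 0, 0, 0]) + b"DGC     " + b" " * 16 + b"0511"

# Hypothetical buffer for an unsupported LUN: byte 0 = 0x7F.
absent = bytes([0x7F]) + bytes(35)

ghost_info = decode_inquiry(ghost)
absent_info = decode_inquiry(absent)
print(ghost_info, lun_looks_connected(ghost_info))
print(absent_info, lun_looks_connected(absent_info))
```

The first buffer decodes as a connected Direct-Access device with an empty model string, which is exactly why such a response ends up listed as a drive unless an error or a non-zero qualifier says otherwise.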
Doug seems to be the right person to deal with this.
I don't know which version this bug first showed up in, but I will check fisher. Still present in RC2 (kernel 2.4.2-0.1.19).
Symptoms of the bug have changed slightly in RC2. On initial qla2x00 module load, we now detect only LUNs 0 and 1 of a 10-LUN device. (We have two test systems, one with 10 LUNs and one with 4 LUNs; I haven't checked the 4-LUN device yet.) In RC1, we detected LUNs 0-7 no matter what, even on a 4-LUN device. What is the same, though, is that if you run echo "scsi add-single-device w x y z" > /proc/scsi/scsi for a nonexistent "z", you will still see a 'ghost' LUN in /proc/scsi/scsi.
With the same configuration, I now see only the first 2 LUNs when I boot from the bootnet.img diskette and use "expert noprobe" mode with the FTP installation method.
This issue also exists with the megaraid driver. When installing RC2 and later on a PERC2/DC that has 4 RAID1 logical volumes (9GB each), only two are detected (sda and sdb). With RC1 and earlier, all the logical volumes are detected. I expect PERC3/DC, PERC3/DCL, PERC3/QC, and PERC2/SC to behave the same.
The bug was in the mid layer SCSI code. I sent a test patch to the linux-kernel mailing list: before I can integrate it with any degree of certainty that it won't break other devices, it needs to be tested on autochanger tape backups and on multidisc CD changers. It should also be tested on other RAID controllers (especially ones like the MegaRAID). I'm attaching the same test patch here so that people so inclined may test it out.
Created attachment 12702: Test patch against the scsi_scan.c file
After recompiling a kernel with this patch, the correct number of LUNs was detected on both the qla2200 and MegaRAID cards.
The scsi_scan.c patch is not in the qa0319 ISO images yet.
Doug indicated that he would commit this patch to Red Hat's kernel today.
Patch is in the tree -> closing the bug.