On RedHat 7.1, the kernel is shipped with the following flag turned off:

    # CONFIG_SCSI_MULTI_LUN is not set

As a result, only the first device (LUN 0) on each target is queried/configured for each SCSI or Fibre Channel adapter. In a GPFS environment this is not suitable, since the storage sub-systems attached to the storage nodes have more than one device configured. I had to re-build the kernel after setting CONFIG_SCSI_MULTI_LUN=y in the kernel config file.

As it stands now, GPFS customers cannot use the stock kernel that ships with RedHat 7.1. They will need to re-build the kernel in order for GPFS to have access to all the disks on a node. The question that needs to be addressed is: are we going to ask/expect the customers to re-build the kernel?
We tried shipping a kernel with CONFIG_SCSI_MULTI_LUN=y before and it will *NEVER* happen again. There are simply *way* too many broken devices out there that lock up when you scan them for possible devices on luns > 0. There are three other ways to solve this problem besides the nightmare you are proposing.

1) Use the scsi-add-single-device command:

    echo "scsi-add-single-device a b c d" > /proc/scsi/scsi

where a == SCSI host #, b == SCSI Channel # on SCSI host, c == Target ID #, and d == LUN #. If there is a device at the given address, this will add it to the running kernel so it can be accessed. There is also an analogous scsi-remove-single-device command that is operated in exactly the same way.

2) If, and only if, the system uses *only* devices that can withstand probes for luns > 0, then you can pass the option max_scsi_luns=8 to the scsi_mod.o module. To do this, modify the /etc/modules.conf file by adding the line:

    options scsi_mod max_scsi_luns=8

then remake all of the initrd images on the system, re-run lilo to activate the new initrd images, and reboot the system. At that point, all of the devices on luns 0 through 7 should have been scanned and discovered. If you have only fiber channel devices in the system, and not scsi devices, then it's possible to change that number to something higher (whatever the fiber channel controllers will support, for example, 255 on the QLogic qla2x00 driven cards).

3) Add an entry to the SCSI blacklist in scsi_scan.c that tags your particular attached storage hardware as being multi-lun and forces a lun scan on your devices. Look in the file scsi_scan.c for examples of the usage of BLIST_FORCELUN and BLIST_SPARSELUN to see what I'm referring to. The FORCELUN flag is used on devices that always allocate their luns sequentially (e.g., Dell Percraid controllers) while the SPARSELUN flag is used on devices that don't always allocate luns sequentially (e.g., lots of EMC raid arrays).

I'm closing this out as NOTABUG.
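For option 1, a small wrapper along these lines can save some typing when several addresses need to be added by hand. This is only a sketch: the function name scsi_add_single and the SCSI_PROC override variable are made up for illustration and are not part of any shipped tool. It checks that all four arguments are plain numbers and then writes the scsi-add-single-device command shown above. Only point it at addresses on hardware known to tolerate probes on luns > 0.

```shell
#!/bin/sh
# Hypothetical helper around the scsi-add-single-device interface.
# SCSI_PROC is an illustrative override (not a kernel or distro setting)
# so the function can be tried against a scratch file first.
SCSI_PROC="${SCSI_PROC:-/proc/scsi/scsi}"

# Usage: scsi_add_single HOST CHANNEL ID LUN
scsi_add_single() {
    if [ "$#" -ne 4 ]; then
        echo "usage: scsi_add_single host channel id lun" >&2
        return 2
    fi
    # Reject anything that is not a non-negative integer.
    for n in "$@"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "not a number: $n" >&2
                return 2
                ;;
        esac
    done
    # Same command string as in option 1: host, channel, target id, lun.
    echo "scsi-add-single-device $1 $2 $3 $4" > "$SCSI_PROC"
}

# Example (run as root): add host 1, channel 0, target 2, lun 3.
# scsi_add_single 1 0 2 3
```

Writing to /proc/scsi/scsi needs root, and the change only lasts until the next reboot, so the call would have to be repeated from a boot script if the device should always be present.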
If you want your devices added to the scsi blacklist, then attach the information on the drive's VENDOR and MODEL strings to this bug report (preferably, configure multiple types of array on the device, and at least one pass-through disk device, then attach the /proc/scsi/scsi file contents so that we can see what all of the possible device names look like).
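To pull just the VENDOR and MODEL strings out of /proc/scsi/scsi for attaching here, something like the following should do. It is a rough sketch that assumes the usual layout where each device has a line carrying "Vendor:", "Model:" and "Rev:" together; the function name scsi_vendor_model is made up for illustration. It prints one VENDOR|MODEL pair per line with duplicates removed.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct Vendor/Model pairs found in a
# /proc/scsi/scsi style file (defaults to /proc/scsi/scsi itself).
# Assumes lines of the form:
#   Vendor: <vendor>  Model: <model>  Rev: <rev>
scsi_vendor_model() {
    sed -n 's/^ *Vendor: \(.*[^ ]\) *Model: \(.*[^ ]\) *Rev: .*$/\1|\2/p' \
        "${1:-/proc/scsi/scsi}" | sort -u
}

# Example: scsi_vendor_model > vendor-model.txt
```

The full /proc/scsi/scsi contents are still the more useful attachment, since they also show which host, channel, id and lun each name appeared on.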