Bug 123331
| Summary: | LUN i not getting registered | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 3 | Reporter: | Satish Mohan <smohan> |
| Component: | kernel | Assignee: | Tom Coughlan <coughlan> |
| Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.0 | CC: | hinz, petrides, riel, thomas.zhang |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | ia64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | RHSA-2005-663 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2005-09-28 14:23:22 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 156320 | | |
| Attachments: | | | |
Description
Satish Mohan
2004-05-17 09:16:26 UTC
How big is the LUN?

Different sizes. Min is 2Gb, max as of today is 15Gb. This is a data center with 150 machines running multiple operating systems.

Please attach the /var/log/messages that show the Qlogic driver being loaded and configured. Did you try an rmmod then modprobe after the system was booted? You could also try

/*
 * Usage: echo "scsi scan-new-devices" >/proc/scsi/scsi
 *
 * Scans all host adapters again to see if there are any
 * new devices.
 */

Tom

Created attachment 100318
output log of IA 64 machine with qlogic module loading
contains the following output
1. qlogic module loading at boot time
2. qlogic module loading using modprobe
3. output after scsi scan-new-devices
4. /etc/modules.conf
Your current LUN numbering is

LUN 0 - the Hitachi processor device - used for managing the box
LUN 14 - presumably the disk device you are trying to configure

Please try the following command to configure the disk device:

echo "scsi add-single-device 2 0 0 14" >/proc/scsi/scsi

Also, are you able to try a test with the disk device at LUN 1?

echo "scsi add-single-device 2 0 0 14" >/proc/scsi/scsi

This attaches the device. How can we automate this to attach devices at boot time (I mean different systems, multiple LUNs)? Can we follow similar steps for IA32 also?

Yes, this will work on any architecture. The problem is caused by the gap in the LUN numbering. The system stops scanning when it hits a gap. It is not immediately clear why this is happening in your case because your storage device is listed in scsi_scan.c as being okay with sparse LUNs.

As a temporary workaround, you could put a script like rescan-scsi-bus.sh in rc.local. See "Rescan SCSI bus" on http://www.garloff.de/kurt/linux/

The Qlogic driver has been updated several times since 6.06.00b11. Please re-test with RHEL 3 Update 4. Post the results here. Thanks.

The bug still exists with qla 7.03.00 (original Qlogic or -RH) and kernel 2.4.21-27.0.2.ELsmp. I suggest changing the SUMMARY to "LUNs not found with Qlogic-FC-adapters". On the other hand, I cannot say whether it's a general issue or just qla-specific. My configuration is qlogic on a dell pv136t library connected via fibre-channel.
When loading the driver only the FC-Connector of the library (LUN 0) is found:

qla2x00_set_info starts at address = f89ca060
qla2x00: Found VID=1077 DID=2312 SSVID=1077 SSDID=100
scsi(1): Found a QLA2312 @ bus 10, device 0x3, irq 77, iobase 0xf895d000
scsi(1): 64 Bit PCI Addressing Enabled.
scsi(1): Allocated 4096 SRB(s).
scsi(1): Configure NVRAM parameters...
scsi(1): Verifying loaded RISC code...
scsi(1): Verifying chip...
scsi(1): Waiting for LIP to complete...
scsi(1): LOOP UP detected.
scsi(1): Port database changed.
scsi(1): Topology - (N_Port-to-N_Port), Host Loop address 0x0
scsi(1): Failed SNS login: loop_id=80 mb[0]=4005 mb[1]=5 mb[2]=0 mb[6]=600 mb[7]=0
scsi-qla0-adapter-node=200000e08b1b91c4\;
scsi-qla0-adapter-port=210000e08b1b91c4\;
scsi-qla0-tgt-0-di-0-port=200100308c036c30\;
qla2x00_detect num_hosts=0
scsi1 : QLogic QLA2312 PCI to Fibre Channel Host Adapter: bus 10 device 3 irq 77
Firmware version: 3.03.01, Driver version 7.01.01
scsi(1): Waiting for LIP to complete...
scsi(1): Topology - (N_Port-to-N_Port), Host Loop address 0x0
blk: queue f762c618, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
scsi: unknown type 12
Vendor: DELL Model: PV-136T-SNC2 Rev: 42b1
Type: Unknown ANSI SCSI revision: 03
blk: queue f762c418, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
scsi(1:0:0:0): Enabled tagged queuing, queue depth 32.
Attached scsi generic sg2 at scsi1, channel 0, id 0, lun 0, type 12
resize_dma_pool: unknown device type 12

When you take a look at the other FC-devices you see:

backup07:~# cat /proc/scsi/qla2300/1
QLogic PCI to Fibre Channel Host Adapter for QLA2340:
Firmware version: 3.03.01, Driver version 7.01.01
Entry address = f89ca060
HBA: QLA2312 , Serial# S19793
Request Queue = 0x37000000, Response Queue = 0x36ff0000
Request Queue count= 512, Response Queue count= 512
Total number of active commands = 0
Total number of interrupts = 124
Total number of IOCBs (used/max) = (0/600)
Total number of queued commands = 0
Device queue depth = 0x20
Number of free request entries = 493
Number of mailbox timeouts = 0
Number of ISP aborts = 0
Number of loop resyncs = 1
Number of retries for empty slots = 0
Number of reqs in pending_q= 0, retry_q= 0, done_q= 0, scsi_retry_q= 0
Host adapter:loop state= <READY>, flags= 0x860813
Dpc flags = 0x1000000
MBX flags = 0x0
SRB Free Count = 4096
Link down Timeout = 000
Port down retry = 045
Login retry count = 045
Commands retried with dropped frame(s) = 0
Configured characteristic impedence: 50 ohms
Configured data rate: 1-2 Gb/sec auto-negotiate

SCSI Device Information:
scsi-qla0-adapter-node=200000e08b1b91c4;
scsi-qla0-adapter-port=210000e08b1b91c4;
scsi-qla0-target-0=200100308c036c30;

SCSI LUN Information:
(Id:Lun) * - indicates lun is not registered with the OS.
( 0: 0): Total reqs 1, Pending reqs 0, flags 0x0*, 0:0:01,
( 0: 2): Total reqs 14, Pending reqs 0, flags 0x0, 0:0:01,
( 0: 4): Total reqs 1, Pending reqs 0, flags 0x0*, 0:0:01,
( 0: 5): Total reqs 1, Pending reqs 0, flags 0x0*, 0:0:01,

So you see LUNs 2, 4 and 5 are other devices not detected by the kernel.
If you "workaround" ala:

backup07:~# cat workaround.sh
echo "scsi add-single-device 1 0 0 2" > /proc/scsi/scsi
echo "scsi add-single-device 1 0 0 4" > /proc/scsi/scsi
echo "scsi add-single-device 1 0 0 5" > /proc/scsi/scsi

Those devices get detected:

scsi singledevice 1 0 0 2
blk: queue c3544c18, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
Vendor: DELL Model: PV-136T Rev: 3.11
Type: Medium Changer ANSI SCSI revision: 02
blk: queue c37b2c18, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
scsi(1:0:0:0): Enabled tagged queuing, queue depth 32.
Attached scsi generic sg3 at scsi1, channel 0, id 0, lun 2, type 8
resize_dma_pool: unknown device type 12
scsi singledevice 1 0 0 4
Vendor: IBM Model: ULTRIUM-TD2 Rev: 37RH
Type: Sequential-Access ANSI SCSI revision: 03
blk: queue f6f56418, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
scsi(1:0:0:0): Enabled tagged queuing, queue depth 32.
resize_dma_pool: unknown device type 12
scsi singledevice 1 0 0 5
Vendor: IBM Model: ULTRIUM-TD2 Rev: 37RH
Type: Sequential-Access ANSI SCSI revision: 03
blk: queue f6e93c18, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
scsi(1:0:0:0): Enabled tagged queuing, queue depth 32.
resize_dma_pool: unknown device type 12

But that workaround is not really a satisfying solution....

Joerg

BTW I saw my posting showed the old 7.01.01 driver. With the recent 7.03.00 it's the same:

qla2x00_set_info starts at address = f89ca060
qla2x00: Found VID=1077 DID=2312 SSVID=1077 SSDID=100
scsi(1): Found a QLA2312 @ bus 10, device 0x3, irq 77, iobase 0xf895f000
scsi(1): 64 Bit PCI Addressing Enabled.
scsi(1): Allocated 4096 SRB(s).
scsi(1): Configure NVRAM parameters...
scsi(1): Verifying loaded RISC code...
scsi(1): Verifying chip...
scsi(1): Waiting for LIP to complete...
scsi(1): LOOP UP detected.
scsi(1): Port database changed.
scsi(1): Topology - (N_Port-to-N_Port), Host Loop address 0x0
scsi(1): Failed SNS login: loop_id=80 mb[0]=4005 mb[1]=5 mb[2]=0 mb[6]=47b1 mb[7]=f89e
scsi-qla0-adapter-node=200000e08b1b91c4\;
scsi-qla0-adapter-port=210000e08b1b91c4\;
scsi-qla0-tgt-0-di-0-port=200100308c036c30\;
qla2x00_detect num_hosts=0
scsi1 : QLogic QLA2312 PCI to Fibre Channel Host Adapter: bus 10 device 3 irq 77
Firmware version: 3.03.06, Driver version 7.03.00
scsi(1): Waiting for LIP to complete...
scsi(1): Topology - (N_Port-to-N_Port), Host Loop address 0x0
blk: queue f6e93a18, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
scsi: unknown type 12
Vendor: DELL Model: PV-136T-SNC2 Rev: 42b1
Type: Unknown ANSI SCSI revision: 03
blk: queue f6e93818, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
scsi(1:0:0:0): Enabled tagged queuing, queue depth 32.
Attached scsi generic sg2 at scsi1, channel 0, id 0, lun 0, type 12
resize_dma_pool: unknown device type 12
(no other devices found)

We think the problem is kernel- and not qlogic-related.

Joerg

You are right, the problem is not related to the QLogic driver. By default the system does not scan LUNs greater than zero. You can over-ride this by adding the following to /etc/modules.conf:

options scsi_mod max_scsi_luns=256

Re-make the initrd, and reboot. Please do this, if you have not already.

This will cause the system to scan LUNs sequentially until there is no response. Your LUNs are 0, 2, 4, 5, so the system will stop scanning when LUN 1 does not answer. If you can re-number the LUNs sequentially, this will be the simplest fix. Otherwise, in order for the system to scan past gaps in the LUN number space, your device must be listed in scsi_scan.c with the BLIST_SPARSELUN flag set. See the attached patch. If you confirm that this patch solves the problem I will include it in an RHEL 3 update.

Created attachment 112415
add pv-136 to sparselun list
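
The scanning behaviour described in the comment above can be illustrated with a small shell sketch. This is only a toy model written for this thread, not the logic in scsi_scan.c: the function name `scan_luns` and its arguments are made up for illustration. It probes LUN numbers in order up to a limit (standing in for max_scsi_luns) and stops at the first LUN that does not answer, unless the device is treated as having sparse LUNs.

```shell
#!/bin/sh
# Toy model of the LUN scan described in this thread; NOT the kernel code.
# scan_luns SPARSE MAX_LUNS LUN...
#   SPARSE    1 if the device is treated as sparse-LUN capable, else 0
#   MAX_LUNS  number of LUNs to probe at most (stands in for max_scsi_luns)
#   LUN...    the LUN numbers that would answer a probe
scan_luns() {
    sparse=$1
    max_luns=$2
    shift 2
    present=" $* "   # pad with spaces so whole numbers can be matched
    found=""
    lun=0
    while [ "$lun" -lt "$max_luns" ]; do
        case "$present" in
            *" $lun "*)
                found="$found $lun"
                ;;
            *)
                # No answer: a non-sparse scan gives up at the first gap.
                if [ "$sparse" -eq 0 ]; then
                    break
                fi
                ;;
        esac
        lun=$((lun + 1))
    done
    # Print the LUNs that were found, without the leading space.
    echo "${found# }"
}

scan_luns 0 256 0 2 4 5   # without the sparse flag: prints "0"
scan_luns 1 256 0 2 4 5   # with the sparse flag:    prints "0 2 4 5"
```

With LUNs 0, 2, 4 and 5 the first call stops at the missing LUN 1, which matches what was observed with the PV-136T before the patch. The second call models a device that is listed as sparse.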
> If you confirm that this patch solves the problem I will
> include it in an RHEL 3 update.
Yes, the patch solved the problem.
So you can put this into the next update to support PV-136T libraries with LUN
gaps.
Today I found that re-numbering the LUNs sequentially is ONLY possible with
the Windows Dell SNC-Manager... NOT via the serial console...
What a .... ;->
Thanks for the patch.
Joerg
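
For systems that cannot re-number the LUNs and do not yet run a patched kernel, the add-single-device workaround shown earlier in this thread could be generated per LUN instead of hand-written, for example from rc.local. The following is a minimal sketch under stated assumptions: the function names and the `APPLY` variable are hypothetical and not part of any existing tool, and the host, channel, id and LUN values must be filled in for each system. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch only: builds the "scsi add-single-device" lines used in this thread.
# add_single_device_cmds HOST CHANNEL ID LUN...
add_single_device_cmds() {
    host=$1
    channel=$2
    id=$3
    shift 3
    for lun in "$@"; do
        echo "scsi add-single-device $host $channel $id $lun"
    done
}

# Hypothetical wrapper. Dry run unless APPLY=1 is set (an assumed variable
# name); with APPLY=1 it writes to /proc/scsi/scsi and must run as root.
register_luns() {
    add_single_device_cmds "$@" | while read -r cmd; do
        if [ "${APPLY:-0}" = "1" ]; then
            echo "$cmd" > /proc/scsi/scsi
        else
            echo "would run: echo \"$cmd\" > /proc/scsi/scsi"
        fi
    done
}

# Values taken from the PV-136T example in this thread: host 1, LUNs 2, 4, 5.
register_luns 1 0 0 2 4 5
```

Keeping the command generation separate from the write to /proc/scsi/scsi makes it possible to check the output on each machine before anything is registered.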
BTW you might close this bug, since it's not really a bug but a generic problem of the linux kernel? What about a new kernel parameter, scan_max_luns=1 to force the scsi_scan.c to scan up to the max_scsi_luns-Parameter?

Joerg

Thanks for testing the patch. It is too late for RHEL 3 U5, so this will go in U6. I'll keep the bug open to track status.

There have been discussions about adding more dynamic controls over the LUN scanning behavior. The prevailing opinion seems to be that it is too late for the 2.4 kernel, and scanning is done differently in 2.6, where the Report LUNs command is used if it is supported by the device.

Tom

I am testing EM64T on HP DL380 with QLogic 2312. I got same error. Can you tell me how to fix this problem please?

Thomas, please post /var/log/messages that shows the messages when the qla2xxx driver loads. Also let me know what the LUN numbers are. Did you try

echo "scsi add-single-device 2 0 0 14" >/proc/scsi/scsi

with the appropriate values filled in?

U6 status update: will do. One hour of work.

A fix for this problem has just been committed to the RHEL3 U6 patch pool this evening (in kernel version 2.4.21-33.EL).

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-663.html