Bug 123331
| Field | Value |
|---|---|
| Summary | LUN i not getting registered |
| Product | Red Hat Enterprise Linux 3 |
| Component | kernel |
| Version | 3.0 |
| Hardware | ia64 |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | medium |
| Reporter | Satish Mohan <smohan> |
| Assignee | Tom Coughlan <coughlan> |
| QA Contact | Brian Brock <bbrock> |
| CC | hinz, petrides, riel, thomas.zhang |
| Fixed In Version | RHSA-2005-663 |
| Doc Type | Bug Fix |
| Last Closed | 2005-09-28 14:23:22 UTC |
| Bug Blocks | 156320 |

How big is the LUN?

Different sizes. Min 2Gb, max as of today is 15Gb. This is a data center with 150 machines running multiple operating systems.

Please attach the /var/log/messages that show the Qlogic driver being loaded and configured.
Did you try an rmmod then modprobe after the system was booted?
You could also try
/*
* Usage: echo "scsi scan-new-devices" >/proc/scsi/scsi
*
* Scans all host adapters again to see if there are any
* new devices.
*/
Tom
Created attachment 100318 [details]
output log of IA 64 machine with qlogic module loading
contains the following output
1. qlogic module loading at boot time
2. qlogic module loading using modprobe
3. output after scan scsi_new_devices
4. /etc/modules.conf
Your current LUN numbering is
LUN 0 - the Hitachi processor device - used for managing the box
LUN 14 - presumably the disk device you are trying to configure
Please try the following command to configure the disk device:
echo "scsi add-single-device 2 0 0 14" >/proc/scsi/scsi
Also, are you able to try a test with the disk device at LUN 1?

echo "scsi add-single-device 2 0 0 14" >/proc/scsi/scsi
This attaches the device. How can we automate this so that devices are attached at boot time (I mean on different systems, with multiple LUNs)? Can we follow similar steps for IA32 as well?

Yes, this will work on any architecture. The problem is caused by the gap in the LUN numbering. The system stops scanning when it hits a gap. It is not immediately clear why this is happening in your case because your storage device is listed in scsi_scan.c as being okay with sparse LUNs. As a temporary workaround, you could put a script like rescan-scsi-bus.sh in rc.local. See "Rescan SCSI bus" on http://www.garloff.de/kurt/linux/

The Qlogic driver has been updated several times since 6.06.00b11. Please re-test with RHEL 3 Update 4. Post the results here. Thanks.

The bug still exists with qla 7.03.00 (original Qlogic or -RH) and kernel 2.4.21-27.0.2.ELsmp. I suggest changing the SUMMARY to "LUNs not found with Qlogic-FC-adapters". On the other hand, I cannot say whether it's a general issue or just qla-specific.
My configuration is a QLogic HBA connected to a Dell PV-136T library via Fibre Channel.
When loading the driver, only the FC connector of the library (LUN 0) is found:
qla2x00_set_info starts at address = f89ca060
qla2x00: Found VID=1077 DID=2312 SSVID=1077 SSDID=100
scsi(1): Found a QLA2312 @ bus 10, device 0x3, irq 77, iobase 0xf895d000
scsi(1): 64 Bit PCI Addressing Enabled.
scsi(1): Allocated 4096 SRB(s).
scsi(1): Configure NVRAM parameters...
scsi(1): Verifying loaded RISC code...
scsi(1): Verifying chip...
scsi(1): Waiting for LIP to complete...
scsi(1): LOOP UP detected.
scsi(1): Port database changed.
scsi(1): Topology - (N_Port-to-N_Port), Host Loop address 0x0
scsi(1): Failed SNS login: loop_id=80 mb[0]=4005 mb[1]=5 mb[2]=0 mb[6]=600 mb[7]=0
scsi-qla0-adapter-node=200000e08b1b91c4\;
scsi-qla0-adapter-port=210000e08b1b91c4\;
scsi-qla0-tgt-0-di-0-port=200100308c036c30\;
qla2x00_detect num_hosts=0
scsi1 : QLogic QLA2312 PCI to Fibre Channel Host Adapter: bus 10 device 3 irq 77
Firmware version: 3.03.01, Driver version 7.01.01
scsi(1): Waiting for LIP to complete...
scsi(1): Topology - (N_Port-to-N_Port), Host Loop address 0x0
blk: queue f762c618, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
scsi: unknown type 12
Vendor: DELL Model: PV-136T-SNC2 Rev: 42b1
Type: Unknown ANSI SCSI revision: 03
blk: queue f762c418, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
scsi(1:0:0:0): Enabled tagged queuing, queue depth 32.
Attached scsi generic sg2 at scsi1, channel 0, id 0, lun 0, type 12
resize_dma_pool: unknown device type 12
When you take a look at the other FC devices you see:
backup07:~# cat /proc/scsi/qla2300/1
QLogic PCI to Fibre Channel Host Adapter for QLA2340:
Firmware version: 3.03.01, Driver version 7.01.01
Entry address = f89ca060
HBA: QLA2312 , Serial# S19793
Request Queue = 0x37000000, Response Queue = 0x36ff0000
Request Queue count= 512, Response Queue count= 512
Total number of active commands = 0
Total number of interrupts = 124
Total number of IOCBs (used/max) = (0/600)
Total number of queued commands = 0
Device queue depth = 0x20
Number of free request entries = 493
Number of mailbox timeouts = 0
Number of ISP aborts = 0
Number of loop resyncs = 1
Number of retries for empty slots = 0
Number of reqs in pending_q= 0, retry_q= 0, done_q= 0, scsi_retry_q= 0
Host adapter:loop state= <READY>, flags= 0x860813
Dpc flags = 0x1000000
MBX flags = 0x0
SRB Free Count = 4096
Link down Timeout = 000
Port down retry = 045
Login retry count = 045
Commands retried with dropped frame(s) = 0
Configured characteristic impedence: 50 ohms
Configured data rate: 1-2 Gb/sec auto-negotiate
SCSI Device Information:
scsi-qla0-adapter-node=200000e08b1b91c4;
scsi-qla0-adapter-port=210000e08b1b91c4;
scsi-qla0-target-0=200100308c036c30;
SCSI LUN Information:
(Id:Lun) * - indicates lun is not registered with the OS.
( 0: 0): Total reqs 1, Pending reqs 0, flags 0x0*, 0:0:01,
( 0: 2): Total reqs 14, Pending reqs 0, flags 0x0, 0:0:01,
( 0: 4): Total reqs 1, Pending reqs 0, flags 0x0*, 0:0:01,
( 0: 5): Total reqs 1, Pending reqs 0, flags 0x0*, 0:0:01,
So you see that LUNs 2, 4 and 5 are other devices not detected by the kernel.
If you work around it like this:
backup07:~# cat workaround.sh
echo "scsi add-single-device 1 0 0 2" > /proc/scsi/scsi
echo "scsi add-single-device 1 0 0 4" > /proc/scsi/scsi
echo "scsi add-single-device 1 0 0 5" > /proc/scsi/scsi
Those devices get detected:
scsi singledevice 1 0 0 2
blk: queue c3544c18, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
Vendor: DELL Model: PV-136T Rev: 3.11
Type: Medium Changer ANSI SCSI revision: 02
blk: queue c37b2c18, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
scsi(1:0:0:0): Enabled tagged queuing, queue depth 32.
Attached scsi generic sg3 at scsi1, channel 0, id 0, lun 2, type 8
resize_dma_pool: unknown device type 12
scsi singledevice 1 0 0 4
Vendor: IBM Model: ULTRIUM-TD2 Rev: 37RH
Type: Sequential-Access ANSI SCSI revision: 03
blk: queue f6f56418, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
scsi(1:0:0:0): Enabled tagged queuing, queue depth 32.
resize_dma_pool: unknown device type 12
scsi singledevice 1 0 0 5
Vendor: IBM Model: ULTRIUM-TD2 Rev: 37RH
Type: Sequential-Access ANSI SCSI revision: 03
blk: queue f6e93c18, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
scsi(1:0:0:0): Enabled tagged queuing, queue depth 32.
resize_dma_pool: unknown device type 12
But that workaround is not really a satisfying solution....
Joerg
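The manual workaround above can be generated from the driver's own proc listing instead of being typed by hand. What follows is a minimal sketch, assuming the /proc/scsi/qla2300/<n> format quoted in this report, where a `*` after the flags value marks a LUN that the driver sees but the OS has not registered. The function names `unregistered_luns` and `emit_add_commands` are made up for illustration, and the host and channel numbers must be supplied by the caller. The sketch only prints the commands; it writes nothing to /proc/scsi/scsi.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; the input format is the
# "SCSI LUN Information" section quoted in this report.

# Read a qla2300 proc listing on stdin and print "<id> <lun>" for every
# entry whose flags value is followed by '*' (not registered with the OS).
unregistered_luns() {
    sed -n 's/^ *( *\([0-9][0-9]*\): *\([0-9][0-9]*\)):.*flags 0x[0-9a-fA-F]*\*.*/\1 \2/p'
}

# Turn "<id> <lun>" pairs on stdin into add-single-device commands for the
# given host and channel. The commands are printed, not executed.
emit_add_commands() {
    host=$1
    channel=$2
    while read -r id lun; do
        printf 'echo "scsi add-single-device %s %s %s %s" > /proc/scsi/scsi\n' \
            "$host" "$channel" "$id" "$lun"
    done
}
```

For example, `unregistered_luns < /proc/scsi/qla2300/1 | emit_add_commands 1 0` would print one command per starred entry. In the listing above LUN 0 is starred as well, even though it was attached as sg2, so the output should be reviewed before any of it is placed in rc.local.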
BTW, I saw that my posting showed the old 7.01.01 driver.
With the recent 7.03.00 it's the same:
qla2x00_set_info starts at address = f89ca060
qla2x00: Found VID=1077 DID=2312 SSVID=1077 SSDID=100
scsi(1): Found a QLA2312 @ bus 10, device 0x3, irq 77, iobase 0xf895f000
scsi(1): 64 Bit PCI Addressing Enabled.
scsi(1): Allocated 4096 SRB(s).
scsi(1): Configure NVRAM parameters...
scsi(1): Verifying loaded RISC code...
scsi(1): Verifying chip...
scsi(1): Waiting for LIP to complete...
scsi(1): LOOP UP detected.
scsi(1): Port database changed.
scsi(1): Topology - (N_Port-to-N_Port), Host Loop address 0x0
scsi(1): Failed SNS login: loop_id=80 mb[0]=4005 mb[1]=5 mb[2]=0 mb[6]=47b1 mb[7]=f89e
scsi-qla0-adapter-node=200000e08b1b91c4\;
scsi-qla0-adapter-port=210000e08b1b91c4\;
scsi-qla0-tgt-0-di-0-port=200100308c036c30\;
qla2x00_detect num_hosts=0
scsi1 : QLogic QLA2312 PCI to Fibre Channel Host Adapter: bus 10 device 3 irq 77
Firmware version: 3.03.06, Driver version 7.03.00
scsi(1): Waiting for LIP to complete...
scsi(1): Topology - (N_Port-to-N_Port), Host Loop address 0x0
blk: queue f6e93a18, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
scsi: unknown type 12
Vendor: DELL Model: PV-136T-SNC2 Rev: 42b1
Type: Unknown ANSI SCSI revision: 03
blk: queue f6e93818, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
scsi(1:0:0:0): Enabled tagged queuing, queue depth 32.
Attached scsi generic sg2 at scsi1, channel 0, id 0, lun 0, type 12
resize_dma_pool: unknown device type 12
(no other devices found)
We think the problem is kernel- and not qlogic-related.
Joerg
You are right, the problem is not related to the QLogic driver. By default the system does not scan LUNs greater than zero. You can override this by adding the following to /etc/modules.conf:
options scsi_mod max_scsi_luns=256
Re-make the initrd, and reboot. Please do this, if you have not already. This will cause the system to scan LUNs sequentially until there is no response. Your LUNs are 0, 2, 4, 5, so the system will stop scanning when LUN 1 does not answer.

If you can re-number the LUNs sequentially, this will be the simplest fix. Otherwise, in order for the system to scan past gaps in the LUN number space, your device must be listed in scsi_scan.c with the BLIST_SPARSELUN flag set. See the attached patch. If you confirm that this patch solves the problem I will include it in an RHEL 3 update.

Created attachment 112415 [details]
add pv-136 to sparselun list
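
The scanning rule described above can be modelled in a few lines. This is a simplified sketch of the behaviour as explained in this thread, not the code in scsi_scan.c. The function name `scan_luns` and its arguments are made up for illustration. The model assumes that a sequential scan stops at the first LUN that does not answer, and that a device treated as sparse-LUN capable is probed across the whole range up to the limit.

```shell
#!/bin/sh
# Simplified model of the LUN scan described above (hypothetical helper,
# not kernel code).
#   $1 max_luns : number of LUNs to consider (cf. max_scsi_luns)
#   $2 sparse   : 1 if the device is treated as sparse-LUN capable, else 0
#   $3 present  : space-separated list of LUNs that actually answer
# Prints the LUNs the scan would find, space-separated.
scan_luns() {
    max_luns=$1
    sparse=$2
    present=" $3 "
    found=""
    lun=0
    while [ "$lun" -lt "$max_luns" ]; do
        case "$present" in
            *" $lun "*)
                found="$found $lun"
                ;;
            *)
                # A gap ends a sequential scan; a sparse scan keeps going.
                [ "$sparse" -eq 1 ] || break
                ;;
        esac
        lun=$((lun + 1))
    done
    echo "${found# }"
}
```

With the LUNs from this report, `scan_luns 256 0 "0 2 4 5"` prints `0`, while `scan_luns 256 1 "0 2 4 5"` prints `0 2 4 5`, which is the effect the BLIST_SPARSELUN entry is meant to have.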
> If you confirm that this patch solves the problem I will
> include it in an RHEL 3 update.
Yes, the patch solved the problem.
So you can put this into the next update to support PV-136T libraries with LUN
gaps.
Today I found that re-numbering the LUNs sequentially is ONLY possible with
the Windows Dell SNC-Manager... NOT via the serial console...
What a .... ;->
Thanks for the patch.
Joerg
BTW you might close this bug, since it's not really a bug but a generic problem of the Linux kernel? What about a new kernel parameter, scan_max_luns=1, to force scsi_scan.c to scan up to the max_scsi_luns parameter?
Joerg

Thanks for testing the patch. It is too late for RHEL 3 U5, so this will go in U6. I'll keep the bug open to track status. There have been discussions about adding more dynamic controls over the LUN scanning behavior. The prevailing opinion seems to be that it is too late for the 2.4 kernel, and scanning is done differently in 2.6, where the Report LUNs command is used if it is supported by the device.
Tom

I am testing EM64T on HP DL380 with QLogic 2312. I got the same error. Can you tell me how to fix this problem please?

Thomas, please post /var/log/messages that shows the messages when the qla2xxx driver loads. Also let me know what the LUN numbers are. Did you try
echo "scsi add-single-device 2 0 0 14" >/proc/scsi/scsi
with the appropriate values filled in?

U6 status update: will do. One hour of work.

A fix for this problem has just been committed to the RHEL3 U6 patch pool this evening (in kernel version 2.4.21-33.EL).

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2005-663.html
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040207 Firefox/0.8

Description of problem:
The customer's current setup consists of machines (IA64 and IA32) with QLogic HBA cards attached to a 20 TB Hitachi SAN. The SAN is shared by different operating systems, and LUNs have been assigned to all of them. On Linux the QLogic driver can see the LUNs allocated to it, but cannot register them with the OS as devices. I am attaching the proc details of qla2300. The LUN numbering is random. We tried the ghost LUN and max LUNs options with scsi_mod, but with no results. The info pasted here is from an IA64 machine; if needed I will post the IA32 output as well.

QLogic PCI to Fibre Channel Host Adapter for QLA2340:
Firmware version: 3.02.13, Driver version 6.06.00b11
Entry address = a000000000135a50
HBA: QLA2312 , Serial# J56045
Request Queue = 0x81b0000, Response Queue = 0x81a0000
Request Queue count= 512, Response Queue count= 512
Total number of active commands = 0
Total number of interrupts = 3
Total number of IOCBs (used/max) = (0/600)
Total number of queued commands = 0
Device queue depth = 0x20
Number of free request entries = 510
Number of mailbox timeouts = 0
Number of ISP aborts = 0
Number of loop resyncs = 0
Number of retries for empty slots = 0
Number of reqs in pending_q= 0, retry_q= 0, done_q= 0, scsi_retry_q= 0
Host adapter:loop state= <READY>, flags= 0xc048e0813
Dpc flags = 0x0
MBX flags = 0x0
SRB Free Count = 4096
Link down Timeout = 000
Port down retry = 030
Login retry count = 030
Commands retried with dropped frame(s) = 0
SCSI Device Information:
scsi-qla0-adapter-node=200000e08b0e8d96;
scsi-qla0-adapter-port=210000e08b0e8d96;
scsi-qla0-target-0=500060e8027b4d14;
SCSI LUN Information:
(Id:Lun) * - indicates lun is not registered with the OS.
( 0: 0): Total reqs 1, Pending reqs 0, flags 0x0*, 0:0:81,
( 0:14): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:81,

Version-Release number of selected component (if applicable):
2.4.21-9

How reproducible:
Always

Steps to Reproduce:
1. Install RHEL3
2. Try updates also
3.

Actual Results:
LUNs are not getting registered, whereas all other (non-Linux) operating systems are able to register them.

Expected Results:
The LUN should be registered and the user should see it as a device.

Additional info: