Description of problem:
Full SCSI tag queue / BUSY errors are causing spurious DM multipath path failures due to incorrect SCSI error handling. This is in a large clustered SAN environment where the storage controllers experience very heavy IOPS loads. The HBA driver is qla2xxx; the work-around is to lower ql2xmaxqdepth (or to enable verbose logging, which alters the timing).

Version-Release number of selected component (if applicable):
All versions of RHEL5, including 5.4-beta. This issue did not occur on identical hardware with RHEL4, so it is believed to be a regression in RHEL5.

How reproducible:
Always; the customer has a strong grasp of the issue -- see their lengthy analysis in the related IT.

Steps to Reproduce:
A large clustered SAN environment with a tailored workload is required. The customer has suitable test suites to reproduce the problem and is willing to cooperate to help resolve it.

Actual results:
Multipath paths go down and fail over, causing performance disruption.

Expected results:
Correct SCSI error handling rather than DM multipath failover.
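The queue-depth work-around above can be applied as a qla2xxx module option. A minimal sketch, assuming the stock RHEL5 modprobe.conf mechanism; the value 16 is illustrative, not a recommendation (the right value depends on the array's per-port queue limits):

```shell
# /etc/modprobe.conf fragment (RHEL5): cap the per-LUN queue depth so
# the array's tag queue is not driven into QUEUE FULL / BUSY.
# The value 16 is illustrative only.
options qla2xxx ql2xmaxqdepth=16

# Rebuild the initrd and reboot (or reload the module) for the new
# depth to take effect; on most kernels the active value can then be
# read back from:
#   /sys/module/qla2xxx/parameters/ql2xmaxqdepth
```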
Created attachment 357281 [details] Customer analysis
Created attachment 360041 [details]
RHEL5.3-based patch to make sd_max_retries tunable via sysfs. Creates /sys/module/sd_mod/parameters/sd_max_retries with a default value of 5.
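With the attached patch applied, the retry count would be adjusted through sysfs at runtime. A minimal sketch wrapping the write in a small shell helper; the parameter path is the one the attachment description names, while the helper name and the value 10 are illustrative:

```shell
# Hypothetical helper: write a new value to a sysfs parameter file and
# echo it back for confirmation. Only the path below comes from the
# patch description; everything else is illustrative.
set_sd_max_retries() {
    local param_file=$1 retries=$2
    echo "$retries" > "$param_file"   # write the new retry count
    cat "$param_file"                 # read it back for confirmation
}

# On a patched system (default is 5 per the attachment description):
#   set_sd_max_retries /sys/module/sd_mod/parameters/sd_max_retries 10
```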
Mike (and QLogic engineers): the HBAs are stock ISP2432 cards:

04:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 02)
04:00.1 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 02)

04:00.0 0c04: 1077:2432 (rev 02)
	Subsystem: 1077:0138
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 0, Cache Line Size: 128 bytes
	Interrupt: pin A routed to IRQ 169
	Region 0: I/O ports at 3000 [size=256]
	Region 1: Memory at dc300000 (64-bit, non-prefetchable) [size=16K]
	[virtual] Expansion ROM at d1000000 [disabled] [size=256K]
	Capabilities: [44] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [4c] Express Endpoint IRQ 0
		Device: Supported: MaxPayload 1024 bytes, PhantFunc 0, ExtTag-
		Device: Latency L0s <4us, L1 <1us
		Device: AtnBtn+ AtnInd+ PwrInd+
		Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
		Device: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
		Device: MaxPayload 128 bytes, MaxReadReq 2048 bytes
		Link: Supported Speed 2.5Gb/s, Width x4, ASPM L0s, Port 0
		Link: Latency L0s <4us, L1 unlimited
		Link: ASPM Disabled RCB 64 bytes CommClk- ExtSynch+
		Link: Speed 2.5Gb/s, Width x4
	Capabilities: [64] Message Signalled Interrupts: 64bit+ Queue=0/4 Enable-
		Address: 0000000000000000  Data: 0000
	Capabilities: [74] Vital Product Data
	Capabilities: [7c] MSI-X: Enable- Mask- TabSize=16
		Vector table: BAR=1 offset=00002000
		PBA: BAR=1 offset=00003000
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [138] Power Budgeting

The driver reports the following during initialization:

QLogic Fibre Channel HBA Driver
ACPI: PCI Interrupt 0000:04:00.0[A] -> GSI 16 (level, low) -> IRQ 169
qla2xxx 0000:04:00.0: Found an ISP2432, irq 169, iobase 0xffffc2000001a000
qla2xxx 0000:04:00.0: Configuring PCI space...
PCI: Setting latency timer of device 0000:04:00.0 to 64
qla2xxx 0000:04:00.0: Configure NVRAM parameters...
qla2xxx 0000:04:00.0: Verifying loaded RISC code...
qla2xxx 0000:04:00.0: Allocated (64 KB) for EFT...
qla2xxx 0000:04:00.0: Allocated (1413 KB) for firmware dump...
scsi1 : qla2xxx
qla2xxx 0000:04:00.0: QLogic Fibre Channel HBA Driver: 8.02.00.06.05.03-k
  QLogic QLE2462 - PCI-Express to 4Gb FC, Dual Channel
  ISP2432: PCIe (2.5Gb/s x4) @ 0000:04:00.0 hdma+, host#=1, fw=4.04.05 [IP] [Multi-ID] [84XX]
qla2xxx 0000:04:00.0: LIP reset occured (f8f7).
qla2xxx 0000:04:00.0: LIP occured (f8f7).
qla2xxx 0000:04:00.0: LIP reset occured (f7f7).
qla2xxx 0000:04:00.0: LOOP UP detected (4 Gbps).
GSI 21 sharing vector 0xD1 and IRQ 21
ACPI: PCI Interrupt 0000:04:00.1[B] -> GSI 17 (level, low) -> IRQ 209
qla2xxx 0000:04:00.1: Found an ISP2432, irq 209, iobase 0xffffc2000001c000
qla2xxx 0000:04:00.1: Configuring PCI space...
PCI: Setting latency timer of device 0000:04:00.1 to 64
qla2xxx 0000:04:00.1: Configure NVRAM parameters...
qla2xxx 0000:04:00.1: Verifying loaded RISC code...
  Vendor: transtec  Model: T6100F08R1-E  Rev: 347B
  Type: Direct-Access  ANSI SCSI revision: 03<6>qla2xxx 0000:04:00.1: Allocated (64 KB) for EFT...
qla2xxx 0000:04:00.1: Allocated (1413 KB) for firmware dump...

We have some details of the FC switches, SAN topology, and RAID controllers too if needed.

Thanks -- Mark
Created attachment 364406 [details]
Fix queue full handling

Here is a patch from QLogic, which they think should fix this: they were mishandling queue fulls as underruns.
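The mishandling described above can be pictured as follows. This is a minimal sketch in shell pseudologic, not the actual driver code (the function name and labels are illustrative), showing why a completion carrying SCSI status TASK SET FULL (0x28, "queue full") should be retried rather than treated as a data underrun just because the firmware also reports a residual byte count:

```shell
# Illustrative classification of a SCSI command completion from its
# status byte and residual count. 40 decimal is 0x28, TASK SET FULL
# in the SAM status codes; all names here are hypothetical.
SAM_STAT_TASK_SET_FULL=40

classify_completion() {
    local scsi_status=$1 resid=$2
    if [ "$scsi_status" -eq "$SAM_STAT_TASK_SET_FULL" ]; then
        echo queue-full      # retry later; do not fail the path
    elif [ "$resid" -gt 0 ]; then
        echo underrun        # genuine short transfer
    else
        echo ok
    fi
}

# A queue-full completion often carries a residual too; the bug was
# letting the residual win and reporting an underrun instead.
classify_completion 40 512   # queue-full
classify_completion 0 512    # underrun
classify_completion 0 0      # ok
```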
> Qlogic is already sending the patch I attached with its RHEL 5.5 update.

Do you know whether QLogic was able to reproduce the problem (and thus demonstrate the fix)?

Thanks -- Mark Goodwin
The patch in Comment #28 that QLogic believes will fix this problem is included in Bug 519447, the planned 5.5 driver update. I will close this BZ as a duplicate of 519447. Any test results the customer can provide will be much appreciated.

*** This bug has been marked as a duplicate of bug 519447 ***