Bug 175385
Summary: | HP StorageWorks DAT 72 doesn't work with mptscsih driver | |
---|---|---|---
Product: | Red Hat Enterprise Linux 3 | Reporter: | David Milburn <dmilburn>
Component: | kernel | Assignee: | Tom Coughlan <coughlan>
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Brian Brock <bbrock>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 3.0 | CC: | coldwell, petrides, tao
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | i686 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2006-12-21 15:55:51 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 170417 | |
Attachments: | | |
Description
David Milburn
2005-12-09 18:21:12 UTC
Here is a portion of the dmesg output for the 2.4.21-20.0.1.ELsmp kernel with debugging turned on in the mptscsih driver. The tape device (scsi0, id 3, lun 0) reports that it is busy, which leads to the bus resets and aborts:

mptscsih: ioc0: ScsiDone (mf=c81806c0,mr=c818c480,sc=c8570000,idx=18)
Uh-Oh! (0:3:0) mf=c81806c0, mr=c818c480, sc=c8570000
IOCStatus=0000h, SCSIState=01h, SCSIStatus=02h, IOCLogInfo=00000000h
sc->result set to 00000002h
scsi0 channel 0 : resetting for second half of retries.
SCSI bus is being reset for host 0 channel 0.
mptscsih: OldReset scheduling BUS_RESET (sc=c8570000)
scsi : aborting command due to timeout : pid 4, scsi0, channel 0, id 3, lun 0 Inquiry 00 00 00 ff 00
mptscsih: OldAbort scheduling ABORT SCSI IO (sc=c8570000)
scsi : aborting command due to timeout : pid 4, scsi0, channel 0, id 3, lun 0 Inquiry 00 00 00 ff 00
mptscsih: OldAbort scheduling ABORT SCSI IO (sc=c8570000)
scsi : aborting command due to timeout : pid 4, scsi0, channel 0, id 3, lun 0 Inquiry 00 00 00 ff 00
mptscsih: OldAbort scheduling ABORT SCSI IO (sc=c8570000)

Here is a bit of status so far, but no conclusion yet. I ran 2.4.21-20.ELsmp and had no problem at all:

kernel: Fusion MPT base driver 2.05.16
kernel: Copyright (c) 1999-2004 LSI Logic Corporation
kernel: mptbase: Initiating ioc0 bringup
kernel: ioc0: 53C1030: Capabilities={Initiator,Target}
kernel: mptbase: Initiating ioc1 bringup
kernel: ioc1: 53C1030: Capabilities={Initiator,Target}
kernel: mptbase: 2 MPT adapters found, 2 installed.
kernel: Fusion MPT SCSI Host driver 2.05.16
kernel: scsi2 : ioc0: LSI53C1030, FwRev=01032740h, Ports=1, MaxQ=255, IRQ=19
kernel: scsi3 : ioc1: LSI53C1030, FwRev=01032740h, Ports=1, MaxQ=255, IRQ=19
kernel: blk: queue f7600e18, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
kernel: Vendor: SEAGATE Model: DAT DAT72-000 Rev: A060
kernel: Type: Sequential-Access ANSI SCSI revision: 03
kernel: blk: queue f6ec5618, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
kernel: Attached scsi tape st0 at scsi2, channel 0, id 6, lun 0
kernel: resize_dma_pool: unknown device type 12
kernel: SCSI device sdd: 0 512-byte hdwr sectors (0 MB)
kernel: blk: queue f6ec5218, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
kernel: resize_dma_pool: unknown device type 12
kernel: SCSI device sdd: 0 512-byte hdwr sectors (0 MB)

This driver and kernel version are slightly different from the customer's. My next step is to re-test with the driver the customer has. It might be helpful to get the tape model and revision info from the customer's system, so we can see if it matches what I have. It will be in dmesg on one of the earlier kernels that works. The driver was updated in U6, so if the customer could try that (or U7 beta) it may also provide some useful info.

Here is the tape model and revision from the 2.4.21-4.ELsmp dmesg:

Fusion MPT base driver 2.05.05+
Copyright (c) 1999-2002 LSI Logic Corporation
mptbase: Initiating ioc0 bringup
ioc0: 53C1030: Capabilities={Initiator,Target}
mptbase: Initiating ioc1 bringup
ioc1: 53C1030: Capabilities={Initiator,Target}
mptbase: 2 MPT adapters found, 2 installed.
Fusion MPT SCSI Host driver 2.05.05+
scsi0 : ioc0: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=255, IRQ=24
scsi1 : ioc1: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=255, IRQ=25
Starting timer : 0 0
blk: queue c6d70e18, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
Vendor: HP Model: C7438A Rev: V312
Type: Sequential-Access ANSI SCSI revision: 03
Starting timer : 0 0
blk: queue c6d70c18, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
mptscsih: ioc0: scsi0: Id=3 Lun=0: Queue depth=1

Okay, I tested the same kernel and driver as the customer (2.4.21-32.0.1.ELsmp, driver 2.05.16.02) on the DAT 72 tape model that I have. No problem. I also tested 2.4.21-20.ELsmp (mentioned above) and 2.4.21-38.ELsmp (U7 beta, with driver version 2.06.16.01). No problem.

So, the issue may be specific to the particular HBA model, the HBA FW version (01032700h vs. my 01032740h), the difference between the HP and Seagate versions of this drive, or something else in the kernel. Sometimes when all the commands issued to the drive time out, it is due to an interrupt routing problem.

Can the customer try the U7 beta, just so they are running the latest? Next, ask them to post the full dmesg after booting a kernel that works and one that does not work, so we can look at interrupt issues, etc. Are they willing to run some test drivers for us? Also ask them to check with HP and see if they have the latest drive firmware. I guess I'll need to ask HP to summarize the differences between the drive models.

Created attachment 123306 [details]
dmesg from 2.4.21-4
Created attachment 123307 [details]
dmesg from 2.4.21-38
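The comparison made in the description (driver version, HBA firmware revision, and drive vendor/model/revision on the customer's system versus the test system) could be automated roughly as follows. This is only a sketch: the two excerpts are copied from the dmesg output quoted in this report, while the names `PATTERNS`, `summarize`, and `differences` are hypothetical helpers invented for illustration. It takes the first match of each field, so on a full dmesg with several SCSI devices the input would first need to be narrowed to the lines for the tape drive.

```python
import re

# Excerpts copied from the dmesg output quoted in this report.
# Customer system, 2.4.21-4.ELsmp (a kernel on which the drive is detected).
CUSTOMER_DMESG = """\
Fusion MPT SCSI Host driver 2.05.05+
scsi0 : ioc0: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=255, IRQ=24
Vendor: HP Model: C7438A Rev: V312
"""

# Test system, 2.4.21-20.ELsmp.
TEST_DMESG = """\
kernel: Fusion MPT SCSI Host driver 2.05.16
kernel: scsi2 : ioc0: LSI53C1030, FwRev=01032740h, Ports=1, MaxQ=255, IRQ=19
kernel: Vendor: SEAGATE Model: DAT DAT72-000 Rev: A060
"""

# Hypothetical field extractors; each captures one value per dmesg text.
PATTERNS = {
    "driver": re.compile(r"Fusion MPT SCSI Host driver (\S+)"),
    "hba_fw": re.compile(r"FwRev=([0-9A-Fa-f]+h)"),
    "vendor": re.compile(r"Vendor:\s*(\S+)"),
    "model": re.compile(r"Model:\s*(.+?)\s+Rev:"),
    "drive_rev": re.compile(r"Rev:\s*(\S+)"),
}


def summarize(dmesg_text):
    """Return the first match for each field, or None if it is absent."""
    summary = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(dmesg_text)
        summary[name] = match.group(1) if match else None
    return summary


def differences(a, b):
    """Return {field: (value_a, value_b)} for every field that differs."""
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


customer = summarize(CUSTOMER_DMESG)
test_system = summarize(TEST_DMESG)

# List each field on which the two systems disagree.
for field, (cust_value, test_value) in differences(customer, test_system).items():
    print(f"{field}: customer={cust_value} test={test_value}")
```

Running the same summary over the full dmesg files attached above, once narrowed to the tape drive's lines, would show which of these variables still differ when the failing kernel is booted.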