Bug 175385 - HP StorageWorks DAT 72 doesn't work with mptscsih driver
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i686 Linux
Priority: medium
Severity: medium
Assigned To: Tom Coughlan
QA Contact: Brian Brock
Depends On:
Blocks: 170417
Reported: 2005-12-09 13:21 EST by David Milburn
Modified: 2007-11-30 17:07 EST
Doc Type: Bug Fix
Last Closed: 2006-12-21 10:55:51 EST


Attachments
dmesg from 2.4.21-4 (22.43 KB, text/plain)
2006-01-17 11:32 EST, Brad Hinson
dmesg from 2.4.21-38 (23.39 KB, text/plain)
2006-01-17 11:33 EST, Brad Hinson

Description David Milburn 2005-12-09 13:21:12 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc3 Firefox/1.0.7

Description of problem:
During startup, the mptscsih driver resets the SCSI bus and then repeatedly aborts the timed-out INQUIRY command sent to the tape drive.

Here is a portion of dmesg:

mptbase: 2 MPT adapters found, 2 installed.
Fusion MPT SCSI Host driver 2.05.16.02
scsi0 : ioc0: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=255, IRQ=24
scsi1 : ioc1: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=255, IRQ=25
blk: queue c8158e18, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
scsi0 channel 0 : resetting for second half of retries.
SCSI bus is being reset for host 0 channel 0.
mptscsih: OldReset scheduling BUS_RESET (sc=c8158000)
scsi : aborting command due to timeout : pid 4, scsi0, channel 0, id 3, lun 0 Inquiry 00 00 00 ff 00
mptscsih: OldAbort scheduling ABORT SCSI IO (sc=c8158000)
scsi : aborting command due to timeout : pid 4, scsi0, channel 0, id 3, lun 0 Inquiry 00 00 00 ff 00
mptscsih: OldAbort scheduling ABORT SCSI IO (sc=c8158000)
scsi : aborting command due to timeout : pid 4, scsi0, channel 0, id 3, lun 0 Inquiry 00 00 00 ff 00
mptscsih: OldAbort scheduling ABORT SCSI IO (sc=c8158000)
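
To see at a glance which target the timeouts are hitting, the abort messages can be tallied per device. The following is a minimal sketch, assuming the message format shown in the dmesg excerpt above; the function name count_aborts and the SAMPLE text are illustrative only and are not part of any kernel tool.

```python
import re
from collections import Counter

# Matches the 2.4-era timeout message as it appears in the dmesg excerpt.
ABORT_RE = re.compile(
    r"aborting command due to timeout : pid (?P<pid>\d+), "
    r"scsi(?P<host>\d+), channel (?P<channel>\d+), "
    r"id (?P<id>\d+), lun (?P<lun>\d+)"
)


def count_aborts(dmesg_text):
    """Return a Counter keyed by (host, channel, id, lun) of timed-out commands."""
    counts = Counter()
    for match in ABORT_RE.finditer(dmesg_text):
        key = tuple(
            int(match.group(name)) for name in ("host", "channel", "id", "lun")
        )
        counts[key] += 1
    return counts


# Illustrative input copied from the excerpt above (hypothetical variable name).
SAMPLE = """\
scsi : aborting command due to timeout : pid 4, scsi0, channel 0, id 3, lun 0 Inquiry 00 00 00 ff 00
mptscsih: OldAbort scheduling ABORT SCSI IO (sc=c8158000)
scsi : aborting command due to timeout : pid 4, scsi0, channel 0, id 3, lun 0 Inquiry 00 00 00 ff 00
"""

for target, n in count_aborts(SAMPLE).items():
    print("host %d channel %d id %d lun %d: %d aborts" % (target + (n,)))
```

If every abort lands on the same (host, channel, id, lun), as it does here, the problem is likely confined to that one device rather than the whole bus.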




Version-Release number of selected component (if applicable):
kernel-2.4.21-32.0.1.ELsmp

How reproducible:
Always

Steps to Reproduce:
1. Connect HP StorageWorks DAT 72 Tape Drive to LSI Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI controller.
2. Boot system.
  

Actual Results:  Tape drive is not accessible

Expected Results:  Tape drive should be accessible.

Additional info:

This breaks in 2.4.21-20.0.1.EL

2.4.21-4.ELsmp - works
2.4.21-9.0.3.ELsmp - works
2.4.21-15.0.4.ELsmp - works
2.4.21-20.0.1.ELsmp - no longer works

These kernels were tested on the same system.
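
The test results above bracket the regression between two releases. A rough sketch of deriving that window from a results table follows; release_key is a simplified numeric comparison written for this example, not the full RPM version comparison algorithm, and the RESULTS table simply restates the four kernels listed above.

```python
import re


def release_key(kernel):
    """Split a release such as '2.4.21-15.0.4.ELsmp' into comparable integers.

    Only the numeric components before '.EL' are compared. This is a
    simplification and not the full RPM version comparison algorithm.
    """
    numeric_part = kernel.split(".EL")[0]
    return tuple(int(n) for n in re.findall(r"\d+", numeric_part))


# True means the tape drive was accessible on that kernel.
RESULTS = {
    "2.4.21-4.ELsmp": True,
    "2.4.21-9.0.3.ELsmp": True,
    "2.4.21-15.0.4.ELsmp": True,
    "2.4.21-20.0.1.ELsmp": False,
}


def regression_window(results):
    """Return (last_good, first_bad) in release order; either may be None."""
    last_good = None
    for kernel in sorted(results, key=release_key):
        if results[kernel]:
            last_good = kernel
        else:
            return last_good, kernel
    return last_good, None


print(regression_window(RESULTS))
```

Any kernel released between the two returned versions would be the next candidate to test in order to narrow the window.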
Comment 1 David Milburn 2005-12-09 13:27:17 EST
Here is a portion of the dmesg output for 2.4.21-20.0.1.ELsmp with debugging
turned on in the mptscsih driver. The tape device (scsi0, id 3, lun 0) reports
that it is busy, leading to the bus resets and aborts.

mptscsih: ioc0: ScsiDone (mf=c81806c0,mr=c818c480,sc=c8570000,idx=18)
  Uh-Oh! (0:3:0) mf=c81806c0, mr=c818c480, sc=c8570000
  IOCStatus=0000h, SCSIState=01h, SCSIStatus=02h, IOCLogInfo=00000000h
  sc->result set to 00000002h
scsi0 channel 0 : resetting for second half of retries.
SCSI bus is being reset for host 0 channel 0.
mptscsih: OldReset scheduling BUS_RESET (sc=c8570000)
scsi : aborting command due to timeout : pid 4, scsi0, channel 0, id 3, lun 0
Inquiry 00 00 00 ff 00
mptscsih: OldAbort scheduling ABORT SCSI IO (sc=c8570000)
scsi : aborting command due to timeout : pid 4, scsi0, channel 0, id 3, lun 0
Inquiry 00 00 00 ff 00
mptscsih: OldAbort scheduling ABORT SCSI IO (sc=c8570000)
scsi : aborting command due to timeout : pid 4, scsi0, channel 0, id 3, lun 0
Inquiry 00 00 00 ff 00
mptscsih: OldAbort scheduling ABORT SCSI IO (sc=c8570000)
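
For reading the status fields in the debug output, here is a small sketch that pulls the hex fields out of the "IOCStatus=..." line and names the SCSI status. It assumes the driver's SCSIStatus field carries the raw status byte as defined in the SCSI Architecture Model; the table lists only a few common codes, and decode_status_line is a name made up for this example. Under that assumption, 02h reads as CHECK CONDITION, whereas BUSY would be 08h, so the status may be worth double-checking.

```python
import re

# A few standard SCSI status byte values (SCSI Architecture Model).
SAM_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x04: "CONDITION MET",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
    0x28: "TASK SET FULL",
}

# Matches fields of the form Name=1234h as printed in the debug line.
FIELD_RE = re.compile(r"(\w+)=([0-9A-Fa-f]+)h")


def decode_status_line(line):
    """Parse Name=<hex>h fields and add a readable name for SCSIStatus."""
    fields = {name: int(value, 16) for name, value in FIELD_RE.findall(line)}
    status = fields.get("SCSIStatus")
    fields["SCSIStatusName"] = SAM_STATUS.get(status, "unknown")
    return fields


LINE = "  IOCStatus=0000h, SCSIState=01h, SCSIStatus=02h, IOCLogInfo=00000000h"
print(decode_status_line(LINE))
```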
Comment 3 Tom Coughlan 2006-01-07 14:35:11 EST
Here is a bit of status so far, but no conclusion yet.

I ran 2.4.21-20.ELsmp and had no problem at all:

kernel: Fusion MPT base driver 2.05.16
kernel: Copyright (c) 1999-2004 LSI Logic Corporation
kernel: mptbase: Initiating ioc0 bringup
kernel: ioc0: 53C1030: Capabilities={Initiator,Target}
kernel: mptbase: Initiating ioc1 bringup
kernel: ioc1: 53C1030: Capabilities={Initiator,Target}
kernel: mptbase: 2 MPT adapters found, 2 installed.
kernel: Fusion MPT SCSI Host driver 2.05.16
kernel: scsi2 : ioc0: LSI53C1030, FwRev=01032740h, Ports=1, MaxQ=255, IRQ=19
kernel: scsi3 : ioc1: LSI53C1030, FwRev=01032740h, Ports=1, MaxQ=255, IRQ=19
kernel: blk: queue f7600e18, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
kernel:   Vendor: SEAGATE   Model: DAT    DAT72-000  Rev: A060
kernel:   Type:   Sequential-Access                  ANSI SCSI revision: 03
kernel: blk: queue f6ec5618, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
kernel: Attached scsi tape st0 at scsi2, channel 0, id 6, lun 0
kernel: resize_dma_pool: unknown device type 12
kernel: SCSI device sdd: 0 512-byte hdwr sectors (0 MB)
kernel: blk: queue f6ec5218, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
kernel: resize_dma_pool: unknown device type 12
kernel: SCSI device sdd: 0 512-byte hdwr sectors (0 MB)


This driver and kernel version are slightly different from the customer's. My
next step is to re-test with the driver the customer has.

It might be helpful to get the tape model and revision info from the customer's
system, so we can see if it matches what I have. It will be in dmesg on one of
the earlier kernels that works.

The driver was updated in U6, so if the customer could try that (or U7 beta) it
may also provide some useful info. 
Comment 5 David Milburn 2006-01-09 12:07:14 EST
Here is the tape model and revision from the 2.4.21-4.ELsmp dmesg:

Fusion MPT base driver 2.05.05+
Copyright (c) 1999-2002 LSI Logic Corporation
mptbase: Initiating ioc0 bringup
ioc0: 53C1030: Capabilities={Initiator,Target}
mptbase: Initiating ioc1 bringup
ioc1: 53C1030: Capabilities={Initiator,Target}
mptbase: 2 MPT adapters found, 2 installed.
Fusion MPT SCSI Host driver 2.05.05+
scsi0 : ioc0: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=255, IRQ=24
scsi1 : ioc1: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=255, IRQ=25
Starting timer : 0 0
blk: queue c6d70e18, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
  Vendor: HP        Model: C7438A            Rev: V312
  Type:   Sequential-Access                  ANSI SCSI revision: 03
Starting timer : 0 0
blk: queue c6d70c18, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
mptscsih: ioc0: scsi0: Id=3 Lun=0: Queue depth=1
Comment 6 Tom Coughlan 2006-01-15 20:00:02 EST
Okay, I tested the same kernel and driver as the customer:

2.4.21-32.0.1.ELsmp 
2.05.16.02

on the DAT 72 tape model that I have. No problem. I also tested 2.4.21-20.ELsmp
(mentioned above), and 2.4.21-38.ELsmp (U7 beta, with driver version
2.06.16.01). No problem. 

So, the issue may be specific to the particular HBA model, HBA FW version
(01032700h vs my 01032740h), the difference between the HP and Seagate version
of this drive, or something else in the kernel. Sometimes when all the commands
issued to the drive time out, it is due to an interrupt routing problem.

Can the customer try the U7 beta, just so they are running the latest? Next,
ask them to post the full dmesg after booting a kernel that works and one that
does not work, so we can look at interrupt issues, etc. Are they willing to run
some test drivers for us? Also ask them to check with HP and see if they have
the latest drive firmware.

I guess I'll need to ask HP to summarize the differences between the drive models. 
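
When comparing two dmesg captures (a working and a non-working boot, or two different systems), the adapter banner lines carry the firmware revision and IRQ for each ioc. The sketch below extracts and diffs those fields, assuming the banner format seen earlier in this bug; the function names and the two sample strings are illustrative, with the sample lines copied from the dmesg output posted above.

```python
import re

# Matches the mptscsih adapter banner, with or without a "kernel: " prefix.
ADAPTER_RE = re.compile(
    r"scsi(?P<host>\d+) : (?P<ioc>ioc\d+): (?P<chip>\w+), "
    r"FwRev=(?P<fwrev>[0-9A-Fa-f]+)h, Ports=(?P<ports>\d+), "
    r"MaxQ=(?P<maxq>\d+), IRQ=(?P<irq>\d+)"
)


def adapters(dmesg_text):
    """Map ioc name to its chip, firmware revision and IRQ."""
    found = {}
    for m in ADAPTER_RE.finditer(dmesg_text):
        found[m.group("ioc")] = {
            "chip": m.group("chip"),
            "fwrev": m.group("fwrev").lower(),
            "irq": int(m.group("irq")),
        }
    return found


def differences(a, b):
    """List (ioc, field, value_a, value_b) for every field that differs."""
    diffs = []
    for ioc in sorted(set(a) & set(b)):
        for field in ("chip", "fwrev", "irq"):
            if a[ioc][field] != b[ioc][field]:
                diffs.append((ioc, field, a[ioc][field], b[ioc][field]))
    return diffs


CUSTOMER = """\
scsi0 : ioc0: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=255, IRQ=24
scsi1 : ioc1: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=255, IRQ=25
"""
TEST_SYSTEM = """\
kernel: scsi2 : ioc0: LSI53C1030, FwRev=01032740h, Ports=1, MaxQ=255, IRQ=19
kernel: scsi3 : ioc1: LSI53C1030, FwRev=01032740h, Ports=1, MaxQ=255, IRQ=19
"""

for diff in differences(adapters(CUSTOMER), adapters(TEST_SYSTEM)):
    print("%s %s: %s vs %s" % diff)
```

Run against the full dmesg from a working and a failing kernel on the same machine, a change in the IRQ field between the two boots would point toward the interrupt routing theory.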
Comment 7 Brad Hinson 2006-01-17 11:32:04 EST
Created attachment 123306 [details]
dmesg from 2.4.21-4
Comment 8 Brad Hinson 2006-01-17 11:33:20 EST
Created attachment 123307 [details]
dmesg from 2.4.21-38
