Bug 29304 - [aic7xxx scsi] timeout + reset issues
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.1
Hardware: i386 Linux
Priority: medium  Severity: high
Assigned To: Doug Ledford
QA Contact: Brock Organ
Depends On:
Blocks:
Reported: 2001-02-24 17:44 EST by Jeremy Katz
Modified: 2007-04-18 12:31 EDT

Doc Type: Bug Fix
Last Closed: 2002-02-12 15:12:46 EST


Attachments
RH 8.0B1 traceback with 4x CDR Mashu (65.92 KB, text/plain)
2002-02-12 15:12 EST, R P Herrold

Description Jeremy Katz 2001-02-24 17:44:51 EST
Seeing scsi timeouts with lots of disk access on an Adaptec 2930U2 (aic7xxx
driver) under 2.4.1-0.1.9, 2.4.1-0.1.9smp, and 2.4.1-0.1.13smp.

dmesg of errors (this is from 2.4.1-0.1.13smp)
scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun
0 Read (10) 00 00 d5 f9 7d 00 00 08 00
(scsi0:0:0:0) SCSISIGI 0xe6, SEQADDR 0x95, SSTAT0 0x2, SSTAT1 0x13
(scsi0:0:0:0) SG_CACHEPTR 0x6, SSTAT2 0x40, STCNT 0x400
scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun
0 Read (10) 00 00 d5 f9 85 00 00 08 00
scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun
0 Read (10) 00 00 d5 f9 8d 00 00 08 00
SCSI host 0 abort (pid 0) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
(scsi0:0:0:0) Synchronous at 80.0 Mbyte/sec, offset 31.
scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun
0 Read (10) 00 00 d6 4f 65 00 00 08 00
(scsi0:0:0:0) SCSISIGI 0xe6, SEQADDR 0x95, SSTAT0 0x2, SSTAT1 0x13
(scsi0:0:0:0) SG_CACHEPTR 0x6, SSTAT2 0x40, STCNT 0x200
scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun
0 Read (10) 00 00 d6 4f 6d 00 00 08 00
scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun
0 Read (10) 00 00 d6 4f 75 00 00 08 00
SCSI host 0 abort (pid 0) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
(scsi0:0:0:0) Synchronous at 80.0 Mbyte/sec, offset 31.
(scsi0:0:6:0) Synchronous at 80.0 Mbyte/sec, offset 15.
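
For readers unfamiliar with the hex bytes after "Read (10)" in the log above, here is a minimal sketch in C of how they might be decoded. It assumes the log prints the nine CDB bytes that follow the opcode (flags, 4-byte big-endian LBA, group number, 2-byte transfer length, control), which is the standard READ(10) layout. The struct and function names are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of interest from a READ(10) command descriptor block. */
struct read10 {
    uint32_t lba;     /* logical block address (big-endian in the CDB) */
    uint16_t blocks;  /* transfer length in logical blocks */
};

/*
 * Decode the nine bytes printed after "Read (10)".
 * Assumed layout (the opcode byte is represented by the log's label):
 *   tail[0]     flags
 *   tail[1..4]  LBA
 *   tail[5]     group number
 *   tail[6..7]  transfer length
 *   tail[8]     control
 */
static struct read10 decode_read10_tail(const uint8_t tail[9])
{
    struct read10 r;
    r.lba = ((uint32_t)tail[1] << 24) | ((uint32_t)tail[2] << 16) |
            ((uint32_t)tail[3] << 8)  |  (uint32_t)tail[4];
    r.blocks = (uint16_t)(((unsigned)tail[6] << 8) | tail[7]);
    return r;
}

/* Self-check against the first two timed-out commands in the log. */
static void check_log_examples(void)
{
    const uint8_t a[9] = {0x00, 0x00, 0xd5, 0xf9, 0x7d, 0x00, 0x00, 0x08, 0x00};
    const uint8_t b[9] = {0x00, 0x00, 0xd5, 0xf9, 0x85, 0x00, 0x00, 0x08, 0x00};
    struct read10 ra = decode_read10_tail(a);
    struct read10 rb = decode_read10_tail(b);

    assert(ra.lba == 0x00d5f97dU && ra.blocks == 8);
    assert(rb.lba == ra.lba + ra.blocks);  /* back-to-back 8-block reads */
}
```

Read this way, each group of timed-out commands is a run of consecutive 8-block reads (4 KB each with the 512-byte sectors reported for sda), which fits the report of failures under heavy sequential disk access.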

lspci gives:
00:12.0 SCSI storage controller: Adaptec 2930U2
        Subsystem: Adaptec: Unknown device 0181
        Flags: bus master, medium devsel, latency 64, IRQ 10
        BIST result: 00
        I/O ports at e800 [size=256]
        Memory at febff000 (64-bit, non-prefetchable) [size=4K]
        Expansion ROM at febc0000 [disabled] [size=128K]


SCSI init looks like
(scsi0) <Adaptec AHA-293X Ultra2 SCSI host adapter> found at PCI 0/18/0
(scsi0) Wide Channel, SCSI ID=7, 32/255 SCBs
(scsi0) Downloading sequencer code... 398 instructions downloaded
scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.2.3/5.2.0
       <Adaptec AHA-293X Ultra2 SCSI host adapter>
(scsi0:0:0:0) Synchronous at 80.0 Mbyte/sec, offset 63.
  Vendor: WDIGTL    Model: WD183 ULTRA2      Rev: 1.00
  Type:   Direct-Access                      ANSI SCSI revision: 02
  Vendor: TEAC      Model: CD-R56S           Rev: 1.0F
  Type:   CD-ROM                             ANSI SCSI revision: 02
  Vendor: IBM       Model: DDRS-34560D       Rev: DC1B
  Type:   Direct-Access                      ANSI SCSI revision: 02
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi disk sdb at scsi0, channel 0, id 6, lun 0
SCSI device sda: 35761710 512-byte hdwr sectors (18310 MB)
 sda: sda1 sda2 sda3 sda4 < sda5 >
(scsi0:0:6:0) Synchronous at 80.0 Mbyte/sec, offset 15.
SCSI device sdb: 8925000 512-byte hdwr sectors (4570 MB)
 sdb: sdb1 sdb2


Been running with this in various kernels with all of the above disks for
close to a year now, close to two years with all but the WD drive.  Haven't
had any problems and didn't have any problems under the Fisher kernel
(although the later 2.4.0-0.99.23smp oopsed on module load, but since that
was a different driver, should be unrelated)
Comment 1 Glen Foster 2001-02-26 18:55:25 EST
This defect is considered MUST-FIX for Florence Gold release
Comment 2 Doug Ledford 2001-02-26 23:29:18 EST
After talking to the bug poster in separate email, testing identified a change
that appears to fix this.  My request to him was:

Can you recompile an aic7xxx module and test it for me?  If so, then go to
line 9438 in the function aic7xxx_configure_bugs().  Change it from:

        case AHC_AIC7890:
                if(pci_rev == 0)
                {
                        p->bugs |= AHC_BUG_AUTOFLUSH | AHC_BUG_CACHETHEN;
                }

to

        case AHC_AIC7890:
                p->bugs |= AHC_BUG_AUTOFLUSH;
                if(pci_rev == 0)
                {
                        p->bugs |= AHC_BUG_CACHETHEN;
                }

and see if that doesn't solve your problem.

Did indeed solve the problem.  This is being integrated into CVS as of Mon Feb
26th and will be in the next release following RC1.
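
To make the effect of the change quoted above concrete, here is a small self-contained sketch in C that models only the lines shown. The enum, the flag values, and the function names are hypothetical stand-ins rather than the driver's real definitions, and the rest of aic7xxx_configure_bugs() is not modeled.

```c
#include <assert.h>

/* Hypothetical stand-ins for the driver's chip id and bug flags. */
enum chip { AHC_AIC7890 = 1, AHC_OTHER = 2 };
#define AHC_BUG_AUTOFLUSH 0x01u  /* illustrative value only */
#define AHC_BUG_CACHETHEN 0x02u  /* illustrative value only */

/* Logic before the change: both workarounds only on PCI revision 0. */
static unsigned bugs_before(enum chip c, unsigned pci_rev)
{
    unsigned bugs = 0;
    if (c == AHC_AIC7890 && pci_rev == 0)
        bugs |= AHC_BUG_AUTOFLUSH | AHC_BUG_CACHETHEN;
    return bugs;
}

/* Logic after the change: AUTOFLUSH on every AIC7890, CACHETHEN on rev 0. */
static unsigned bugs_after(enum chip c, unsigned pci_rev)
{
    unsigned bugs = 0;
    if (c == AHC_AIC7890) {
        bugs |= AHC_BUG_AUTOFLUSH;
        if (pci_rev == 0)
            bugs |= AHC_BUG_CACHETHEN;
    }
    return bugs;
}

/* The two versions differ only for AIC7890 parts with a nonzero revision. */
static void check_difference(void)
{
    assert(bugs_before(AHC_AIC7890, 0) == bugs_after(AHC_AIC7890, 0));
    assert(bugs_before(AHC_AIC7890, 1) == 0);
    assert(bugs_after(AHC_AIC7890, 1) == AHC_BUG_AUTOFLUSH);
}
```

In other words, the change extends the autoflush workaround to later revisions of the chip while leaving the CACHETHEN workaround restricted to revision 0.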
Comment 3 Jeremy Katz 2001-03-05 15:19:03 EST
Ugh, still seeing this in 2.4.2-0.1.19 although it's harder to trigger.  Doing a
test hard drive install yesterday triggered it extensively throughout the
install (reading the cd images off of scsi drive, installing to a test partition
on the ide drive).  New ideas?
Comment 4 Doug Ledford 2001-04-03 04:43:00 EDT
aic7xxx_new is the next option.  See if that works better for you (in the latest
ISOs we should have aic7xxx_new version 6.1.7 which should work pretty reliably
at this point).
Comment 5 R P Herrold 2002-02-12 15:11:33 EST
Hi Doug -- 80B1 testing 

First I did a cold RH 7.1 gold install with the CD drive (my unit oldpokey, 4x
SCSI CDR Matsushita, 3.99 Dom; Adaptec AHA-2940UW controller, on the SCSI-II
channel) -- absolutely no problems doing the RH 7.1 install on a wiped IBM 4G UW
drive.

Cut over to a cleanly verifying 80B1 set, first cd, and get reset errors out the
wazoo ... same hardware.  This may be an Anaconda error on reset detection.

Swapped out the drive and am trying with another SCSI CD drive (non CDR) ... testing

Trace in a moment.
Comment 6 R P Herrold 2002-02-12 15:12:41 EST
Created attachment 45423 [details]
RH 8.0B1 traceback with 4x CDR Mashu
Comment 7 Doug Ledford 2002-02-13 17:15:46 EST
This actually looks like a generic case of "this CD drive doesn't like this
particular CD media", not a driver issue.  Try with a new disk instead of with a
new drive and see if that helps.  Also, this problem you just reported is
decidedly different than the original problem in this bug report.  I'm closing
this bug out (it's horribly stale at this point anyway).  Please open a new bug
if you still can't get your CD drive to read the CD *and* if it doesn't seem to
be a media issue instead of a driver issue.
