29304 – [aic7xxx scsi] timeout + reset issues

Summary: [aic7xxx scsi] timeout + reset issues

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	kernel
Sub Component:
Version:	7.1
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Doug Ledford
QA Contact:	Brock Organ
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2001-02-24 22:44 UTC by Jeremy Katz
Modified:	2007-04-18 16:31 UTC
CC List:	1 user
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2002-02-12 20:12:46 UTC
Embargoed:

Attachments
RH 8.0B1 traceback with 4x CDR Mashu (65.92 KB, text/plain) 2002-02-12 20:12 UTC, R P Herrold

Description Jeremy Katz 2001-02-24 22:44:51 UTC

Seeing scsi timeouts with lots of disk access on an Adaptec 2930U2 (aic7xxx
driver) under 2.4.1-0.1.9, 2.4.1-0.1.9smp, and 2.4.1-0.1.13smp.

dmesg of errors (this is from 2.4.1-0.1.13smp)
scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun
0 Read (10) 00 00 d5 f9 7d 00 00 08 00
(scsi0:0:0:0) SCSISIGI 0xe6, SEQADDR 0x95, SSTAT0 0x2, SSTAT1 0x13
(scsi0:0:0:0) SG_CACHEPTR 0x6, SSTAT2 0x40, STCNT 0x400
scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun
0 Read (10) 00 00 d5 f9 85 00 00 08 00
scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun
0 Read (10) 00 00 d5 f9 8d 00 00 08 00
SCSI host 0 abort (pid 0) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
(scsi0:0:0:0) Synchronous at 80.0 Mbyte/sec, offset 31.
scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun
0 Read (10) 00 00 d6 4f 65 00 00 08 00
(scsi0:0:0:0) SCSISIGI 0xe6, SEQADDR 0x95, SSTAT0 0x2, SSTAT1 0x13
(scsi0:0:0:0) SG_CACHEPTR 0x6, SSTAT2 0x40, STCNT 0x200
scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun
0 Read (10) 00 00 d6 4f 6d 00 00 08 00
scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun
0 Read (10) 00 00 d6 4f 75 00 00 08 00
SCSI host 0 abort (pid 0) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
(scsi0:0:0:0) Synchronous at 80.0 Mbyte/sec, offset 31.
(scsi0:0:6:0) Synchronous at 80.0 Mbyte/sec, offset 15.

lspci gives:
00:12.0 SCSI storage controller: Adaptec 2930U2
        Subsystem: Adaptec: Unknown device 0181
        Flags: bus master, medium devsel, latency 64, IRQ 10
        BIST result: 00
        I/O ports at e800 [size=256]
        Memory at febff000 (64-bit, non-prefetchable) [size=4K]
        Expansion ROM at febc0000 [disabled] [size=128K]


SCSI init looks like
(scsi0) <Adaptec AHA-293X Ultra2 SCSI host adapter> found at PCI 0/18/0
(scsi0) Wide Channel, SCSI ID=7, 32/255 SCBs
(scsi0) Downloading sequencer code... 398 instructions downloaded
scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.2.3/5.2.0
       <Adaptec AHA-293X Ultra2 SCSI host adapter>
(scsi0:0:0:0) Synchronous at 80.0 Mbyte/sec, offset 63.
  Vendor: WDIGTL    Model: WD183 ULTRA2      Rev: 1.00
  Type:   Direct-Access                      ANSI SCSI revision: 02
  Vendor: TEAC      Model: CD-R56S           Rev: 1.0F
  Type:   CD-ROM                             ANSI SCSI revision: 02
  Vendor: IBM       Model: DDRS-34560D       Rev: DC1B
  Type:   Direct-Access                      ANSI SCSI revision: 02
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi disk sdb at scsi0, channel 0, id 6, lun 0
SCSI device sda: 35761710 512-byte hdwr sectors (18310 MB)
 sda: sda1 sda2 sda3 sda4 < sda5 >
(scsi0:0:6:0) Synchronous at 80.0 Mbyte/sec, offset 15.
SCSI device sdb: 8925000 512-byte hdwr sectors (4570 MB)
 sdb: sdb1 sdb2


Been running with this in various kernels with all of the above disks for
close to a year now, and close to two years with all but the WD drive.  Haven't
had any problems, and didn't have any problems under the Fisher kernel
(although the later 2.4.0-0.99.23smp oopsed on module load; since that
was a different driver, it should be unrelated).

Comment 1 Glen Foster 2001-02-26 23:55:25 UTC

This defect is considered MUST-FIX for Florence Gold release

Comment 2 Doug Ledford 2001-02-27 04:29:18 UTC

After talking to the bug poster in separate email, the following change (quoted
from that email) was tried:

Can you recompile an aic7xxx module and test it for me?  If so, then go to
line 9438 in the function aic7xxx_configure_bugs().  Change it from:

        case AHC_AIC7890:
                if(pci_rev == 0)
                {
                        p->bugs |= AHC_BUG_AUTOFLUSH | AHC_BUG_CACHETHEN;
                }

to

        case AHC_AIC7890:
                p->bugs |= AHC_BUG_AUTOFLUSH;
                if(pci_rev == 0)
                {
                        p->bugs |= AHC_BUG_CACHETHEN;
                }

and see if that doesn't solve your problem.

Did indeed solve the problem.  This is being integrated into CVS as of Mon Feb
26th and will be in the next release following RC1.

Comment 3 Jeremy Katz 2001-03-05 20:19:03 UTC

Ugh, still seeing this in 2.4.2-0.1.19 although it's harder to trigger.  Doing a
test hard drive install yesterday triggered it extensively throughout the
install (reading the cd images off of scsi drive, installing to a test partition
on the ide drive).  New ideas?

Comment 4 Doug Ledford 2001-04-03 08:43:00 UTC

aic7xxx_new is the next option.  See if that works better for you (in the latest
ISOs we should have aic7xxx_new version 6.1.7 which should work pretty reliably
at this point).

Comment 5 R P Herrold 2002-02-12 20:11:33 UTC

Hi Doug -- 8.0B1 testing

First I did a cold RH 7.1 gold install with the CD drive (my unit oldpokey, 4x
SCSI CDR Matsushita, 3.99 Dom; Adaptec AHA-2940UW controller, on the SCSI-II
channel) -- absolutely no problems doing the RH 7.1 install on a wiped IBM 4G UW
drive.

Cut over to a cleanly verifying 8.0B1 set, first CD, and got reset errors out the
wazoo ... same hardware.  This may be an Anaconda error on reset detection.

Swapped out the drive and am trying with another SCSI CD drive (non-CDR) ... testing

Trace in a moment.

Comment 6 R P Herrold 2002-02-12 20:12:41 UTC

Created attachment 45423
RH 8.0B1 traceback with 4x CDR Mashu

Comment 7 Doug Ledford 2002-02-13 22:15:46 UTC

This actually looks like a generic case of "this CD drive doesn't like this
particular CD media", not a driver issue.  Try with a new disk instead of with a
new drive and see if that helps.  Also, this problem you just reported is
decidedly different than the original problem in this bug report.  I'm closing
this bug out (it's horribly stale at this point anyway).  Please open a new bug
if you still can't get your CD drive to read the CD *and* if it doesn't seem to
be a media issue instead of a driver issue.
