Bug 59573 - scsi bus reset when using aic7xxx driver to access Ultra Wide drive
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.2
Hardware: i686 Linux
Priority: medium
Severity: high
Assigned To: Doug Ledford
QA Contact: Brian Brock
Reported: 2002-02-10 14:50 EST by Dana Hudes
Modified: 2007-04-18 12:40 EDT
Doc Type: Bug Fix
Last Closed: 2003-06-07 21:19:21 EDT

Description Dana Hudes 2002-02-10 14:50:32 EST
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Q312461)

Description of problem:
The current aic7xxx driver does not work properly with, at a minimum, any Ultra 
drives: you get SCSI bus resets all over the place, eventually resulting in data 
loss (I lost tens of gigabytes of data that I thought was safe on a RAID5 array 
thanks to this).
The system in question has an AIC7895 controller. Connected is a system disk 
which is narrow @ 20MB/sec and new drives I add are Ultra Wide 40MB/sec.  
Attempts to access the new drives result in frequent bus resets due to timeout.
Discussion with Justin Gibbs, now of Adaptec, indicates that the drives are 
compatible; the problem is that, despite their being out for over a year, 
RedHat has not put the official Adaptec drivers into the kernel. I now have to 
rebuild the kernel with Justin's patches and see whether it works.
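
As a quick check on the 20MB/sec and 40MB/sec figures quoted above: synchronous SCSI throughput is the transfer clock multiplied by the bus width in bytes. The following is a minimal Python sketch, assuming the 20 MHz Ultra (Fast-20) clock; the function name is made up for illustration.

```python
def sync_rate_mb_per_s(clock_mhz, bus_bits):
    """Peak synchronous transfer rate in MB/s for a given clock and bus width."""
    return clock_mhz * bus_bits / 8


# Narrow (8-bit) Ultra device, like the system disk in this report.
print(sync_rate_mb_per_s(20, 8))
# Wide (16-bit) Ultra device, like the new drives.
print(sync_rate_mb_per_s(20, 16))
```

These are peak bus rates only; sustained throughput of a single drive is normally lower.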


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
Install an Ultra Wide disk. If you can get it partitioned you're doing well.
This problem has gotten worse with the new kernel: I had drives from my RAID 
array under 7.0 and hooked them up under 7.2, but could not partition and format 
even a single Fujitsu 18 GB drive under 7.2.

I have four 23 GB Seagate Ultra Wide SCSI drives connected to bus A
of the 7895.
I configured them for RAID 0, then ran raidstart. I then used the 100baseT LAN to 
copy a few gigabytes of files across using Samba from a Windows XP laptop.

Actual Results:  Feb 10 13:31:20 harmony kernel: scsi : aborting command due to 
timeout : pid 0, scsi0, channel 0, id 0, lun 0 Write (10) 00 00 cb 3e
 c9 00 00 02 00
Feb 10 13:31:20 harmony kernel: scsi : aborting command due to timeout : pid 0, 
scsi0, channel 0, id 0, lun 0 Write (10) 00 00 0a 20
 01 00 00 02 00
Feb 10 13:31:20 harmony kernel: scsi : aborting command due to timeout : pid 0, 
scsi0, channel 0, id 0, lun 0 Write (10) 00 00 a2 2b
 06 00 00 04 00
Feb 10 13:31:20 harmony kernel: scsi : aborting command due to timeout : pid 0, 
scsi0, channel 0, id 0, lun 0 Write (10) 00 00 00 80
 24 00 00 02 00
Feb 10 13:31:20 harmony kernel: SCSI host 0 abort (pid 0) timed out - resetting
Feb 10 13:31:20 harmony kernel: SCSI bus is being reset for host 0 channel 0.
Feb 10 13:31:20 harmony kernel: (scsi0:0:0:0) Synchronous at 20.0 Mbyte/sec, 
offset 15.
Feb 10 13:31:20 harmony kernel: (scsi0:0:4:0) Synchronous at 40.0 Mbyte/sec, 
offset 8.
Feb 10 13:31:20 harmony kernel: (scsi0:0:6:0) Synchronous at 40.0 Mbyte/sec, 
offset 8.
Feb 10 13:31:20 harmony kernel: SCSI host 0 abort (pid 0) timed out - resetting
Feb 10 13:31:20 harmony kernel: SCSI bus is being reset for host 0 channel 0.
Feb 10 13:31:20 harmony kernel: SCSI host 0 abort (pid 0) timed out - resetting
Feb 10 13:31:21 harmony kernel: SCSI bus is being reset for host 0 channel 0.
Feb 10 13:31:21 harmony kernel: (scsi0:0:0:0) Synchronous at 20.0 Mbyte/sec, 
offset 15.
Feb 10 13:31:21 harmony kernel: (scsi0:0:4:0) Synchronous at 40.0 Mbyte/sec, 
offset 8.
Feb 10 13:31:21 harmony kernel: SCSI host 0 channel 0 reset (pid 0) timed out - 
trying harder
Feb 10 13:31:21 harmony kernel: SCSI bus is being reset for host 0 channel 0.
Feb 10 13:31:21 harmony kernel: (scsi0:0:0:0) Synchronous at 20.0 Mbyte/sec, 
offset 15.
Feb 10 13:31:21 harmony kernel: (scsi0:0:4:0) Synchronous at 40.0 Mbyte/sec, 
offset 8.
Feb 10 13:31:21 harmony kernel: scsi : aborting command due to timeout : pid 0, 
scsi0, channel 0, id 0, lun 0 Write (10) 00 00 07 ca
 0b 00 00 1c 00
Feb 10 13:31:21 harmony kernel: SCSI host 0 abort (pid 0) timed out - resetting
Feb 10 13:31:21 harmony kernel: SCSI bus is being reset for host 0 channel 0.
Feb 10 13:31:21 harmony kernel: (scsi0:0:0:0) Synchronous at 20.0 Mbyte/sec, 
offset 15.
Feb 10 13:31:21 harmony kernel: (scsi0:0:4:0) Synchronous at 40.0 Mbyte/sec, 
offset 8.
Feb 10 13:31:21 harmony kernel: (scsi0:0:6:0) Synchronous at 40.0 Mbyte/sec, 
offset 8.
Feb 10 13:31:21 harmony kernel: (scsi0:0:2:0) Synchronous at 40.0 Mbyte/sec, 
offse
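
The hex bytes printed after "Write (10)" in the timeout lines above are the remainder of the 10-byte SCSI command block, so they show which blocks the stalled command was addressing. The following is a minimal Python sketch, assuming the kernel prints the nine bytes that follow the opcode; the helper name is made up for illustration.

```python
def decode_rw10(hex_bytes):
    """Decode the nine bytes logged after 'Read (10)' or 'Write (10)'.

    Layout after the opcode: flags (1 byte), logical block address
    (4 bytes, big-endian), group (1), transfer length (2, big-endian),
    control (1).
    """
    raw = bytes.fromhex(hex_bytes)  # fromhex skips the spaces
    if len(raw) != 9:
        raise ValueError("expected 9 bytes after the opcode")
    lba = int.from_bytes(raw[1:5], "big")
    blocks = int.from_bytes(raw[6:8], "big")
    return lba, blocks


# First timed-out command in the log above.
print(decode_rw10("00 00 cb 3e c9 00 00 02 00"))  # (13319881, 2)
```

Read this way, the aborted writes to id 0 are small transfers of two or four blocks each.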


Expected Results:  files should have copied without resets.

Additional info:

It is URGENT that RedHat incorporate the Adaptec driver into the release tree. 
My machine and those of many other RedHat customers (including my 15 students 
at Yeshiva University)
are suffering badly.
Comment 1 Arjan van de Ven 2002-02-10 15:00:07 EST
Justin seems to really not know what he is talking about. We include BOTH the
"old" and the "new" driver, and have been for over a year, ever since his driver
came out. The "old" driver is the default because the "new" driver is giving
quite a lot of problems. However you can easily switch by changing the driver
that is used.
Just edit the /etc/modules.conf file to use "aic7xxx_mod" instead of "aic7xxx".
You will have to recreate the initrd file (I assume you already know how to do
that, if not just ask)
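
The modules.conf change described above amounts to rewriting one alias line. The following is a minimal Python sketch, assuming the adapter is loaded through a conventional "alias scsi_hostadapter aic7xxx" entry; the function name and the sample eth0 line are illustrative only, and the sketch works on text rather than on the real file.

```python
def switch_aic7xxx_driver(conf_text, new_module="aic7xxx_mod"):
    """Return modules.conf text with aic7xxx host-adapter aliases repointed."""
    out = []
    for line in conf_text.splitlines():
        fields = line.split()
        is_adapter_alias = (
            len(fields) == 3
            and fields[0] == "alias"
            and fields[1].startswith("scsi_hostadapter")
            and fields[2] == "aic7xxx"
        )
        if is_adapter_alias:
            fields[2] = new_module
            line = " ".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical file contents; only the scsi_hostadapter line changes.
sample = "alias eth0 eepro100\nalias scsi_hostadapter aic7xxx\n"
print(switch_aic7xxx_driver(sample), end="")
```

The initrd still has to be recreated afterwards, as noted above, so that the new module is the one loaded at boot.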
Comment 2 Justin T. Gibbs 2002-02-10 16:43:27 EST
>Justin seems to really not know what he is talking about. We include BOTH the
>"old" and the "new" driver, and have been for over a year, ever since his driver
>came out. The "old" driver is the default because the "new" driver is giving
>quite a lot of problems.

Try telling a newbie how to install using the "New and still marked
experimental" driver under RH7.1 or 7.2:

LILO: expert noprobe
Select "New Experimental aic7xxx driver" from the SCSI drivers menu.  Most
people don't even know that it exists because it doesn't show up next to the
"old" driver.

Continue install
Before quitting the install, switch to the shell VTY, manually edit the
soon to be /etc/modules.conf to remove any references to the *wrong* aic7xxx
module, rebuild your initrd to ensure the system won't try to load both 
modules, and tell the user never to add additional cards to the system because
kudzu will spam improper entries to their /etc/modules.conf.

Switching to the other driver is certainly trivial, especially if you've
never created an initrd, nor installed RedHat before.

As to instability issues, please point me to the appropriate bugzilla
entries so I can fix them.  I haven't received a bug report in my mailbox
from a RedHat employee in perhaps a year.  Bugs that aren't reported, can't
be fixed.
Comment 3 Dana Hudes 2002-02-10 16:55:30 EST
I am in the process of downloading the kernel SRPM. I realize that I do not need
to rebuild the kernel in order to make the change, but I started the download
before the information regarding aic7xxx_mod arrived and will wait for it to complete before 
messing with things.

Meanwhile the remarks from Gibbs regarding the process for switching over
are indeed dead on target. 
I'm afraid indeed of what Kudzu will do to me if I add even a USB device.
It will see that the other module is there and try to put the 'right' one in.
Preventing this will require somehow hacking the Kudzu db to change the name of 
the module for an aic7xxx adapter.

Rebuilding initrd is not trivial.
I do think I know how to make it happen, but I pity anyone who hasn't got
18 years of UNIX administration on a half-dozen flavors plus a Master's in 
Computer Science trying to dope this out.
I certainly don't think your typical RHCE is going to manage this stunt.
If RedHat feel there's a good reason to offer both drivers, at least make
the choice available.
Comment 4 Dana Hudes 2002-02-10 23:27:24 EST
I changed /etc/modules.conf and rebuilt the initrd for 2.4.9-21smp.
The resets and timeouts went away, but the system still hangs if I copy
a hundred megabytes or so of data via Samba on the 100baseT LAN (full duplex; I've 
never seen more than 30 Mbit/second though).
Monitoring the logs shows that Samba is losing its connection and then reconnecting.
iostat -d 30 shows good stuff (hundreds of writes/second) for a few periods
and then nothing. The data source quits with an error that it lost the network 
resource. The Linux system is unresponsive, albeit still churning out iostat -d 
output with 0 activity, and you can still ping it.


From kernel.log:
Feb 10 22:11:40 harmony kernel: SCSI subsystem driver Revision: 1.00
Feb 10 22:11:40 harmony kernel: ahc_pci:0:15:1: Using left over BIOS settings
Feb 10 22:11:40 harmony kernel: ahc_pci:0:15:0: Using left over BIOS settings
Feb 10 22:11:40 harmony kernel: scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA 
DRIVER, Rev 6.2.1
Feb 10 22:11:40 harmony kernel:         <Adaptec aic7895 Ultra SCSI adapter>
Feb 10 22:11:40 harmony kernel:         aic7895: Ultra Wide Channel A, SCSI 
Id=7, 255 SCBs
Feb 10 22:11:40 harmony kernel:
Feb 10 22:11:40 harmony kernel: scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA 
DRIVER, Rev 6.2.1
Feb 10 22:11:40 harmony kernel:         <Adaptec aic7895 Ultra SCSI adapter>
Feb 10 22:11:40 harmony kernel:         aic7895: Ultra Wide Channel B, SCSI 
Id=7, 255 SCBs
Feb 10 22:11:40 harmony kernel:
Feb 10 22:11:40 harmony kernel: blk: queue cff65e18, I/O limit 4095Mb (mask 
0xffffffff)
Feb 10 22:11:40 harmony kernel:   Vendor: MICROP    Model: 3391 -22          
Rev: P409
Feb 10 22:11:41 harmony kernel:   Type:   Direct-Access                      
ANSI SCSI revision: 02
Feb 10 22:11:41 harmony kernel: blk: queue cff65c18, I/O limit 4095Mb (mask 
0xffffffff)
Feb 10 22:11:41 harmony kernel:   Vendor: SEAGATE   Model: SX4234514         
Rev: 9E21
Feb 10 22:11:41 harmony kernel:   Type:   Direct-Access                      
ANSI SCSI revision: 02

Feb 10 22:11:42 harmony kernel: md: raid0 personality registered as nr 2
Feb 10 22:11:42 harmony kernel: Journalled Block Device driver loaded
Feb 10 22:11:42 harmony kernel: md: Autodetecting RAID arrays.
Feb 10 22:11:42 harmony kernel: md: autorun ...
Feb 10 22:11:42 harmony kernel: md: ... autorun DONE.

Feb 10 22:11:44 harmony kernel: raid0: checking sdb ... contained as device 0
Feb 10 22:11:44 harmony kernel:   (22661248) is smallest!.
Feb 10 22:11:44 harmony kernel: raid0: checking sdc ... contained as device 1
Feb 10 22:11:44 harmony kernel: raid0: checking sdd ... contained as device 2
Feb 10 22:11:44 harmony kernel: raid0: checking sde ... contained as device 3
Feb 10 22:11:44 harmony kernel: raid0: zone->nb_dev: 4, size: 90644992
Feb 10 22:11:44 harmony kernel: raid0: current zone offset: 22661248
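
The raid0 lines above are internally consistent: with four members in a single zone, the zone size is the smallest member size times the number of members. The following is a minimal Python sketch of that arithmetic; the function name is made up, and the conversion to bytes assumes md is reporting sizes in 1 KiB blocks.

```python
def raid0_zone_size(smallest_member_blocks, members):
    """Size of a RAID 0 zone striped across equally sized members."""
    return smallest_member_blocks * members


zone = raid0_zone_size(22661248, 4)
print(zone)  # 90644992, matching "size: 90644992" in the log

# Assumption: sizes are in 1 KiB blocks.
print(round(zone * 1024 / 1e9, 1), "GB")
```

Under that assumption each member is a little over 23 GB, which agrees with the drive size given in the description.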


In order to see if perhaps a different drive, and non-RAID, makes a difference I 
hooked up the SCA Fujitsu 18 GB drive. Not good:
Feb 10 22:23:15 harmony kernel: (scsi1:A:9): 40.000MB/s transfers (20.000MHz, 
offset 8, 16bit)
Feb 10 22:23:15 harmony kernel: SCSI device sdf: 35680750 512-byte hdwr sectors 
(18269 MB)
Feb 10 22:23:15 harmony kernel:  sdf:(scsi1:A:9:0): parity error detected in 
Data-in phase. SEQADDR(0x18f) SCSIRATE(0x88)
Feb 10 22:23:15 harmony kernel: (scsi1:A:9:0): parity error detected in Data-in 
phase. SEQADDR(0x8d) SCSIRATE(0x88)
Feb 10 22:23:15 harmony kernel: (scsi1:A:9:0): parity error detected in Data-in 
phase. SEQADDR(0x8d) SCSIRATE(0x88)
Feb 10 22:23:15 harmony kernel: (scsi1:A:9:0): parity error detected in Data-in 
phase. SEQADDR(0x8e) SCSIRATE(0x88)
Feb 10 22:23:15 harmony kernel: (scsi1:A:9:0): parity error detected in Data-in 
phase. SEQADDR(0x8d) SCSIRATE(0x88)
Feb 10 22:23:15 harmony kernel: SCSI disk error : host 1 channel 0 id 9 lun 0 
return code = 8000002
Feb 10 22:23:15 harmony kernel: Current sd08:50: sense key Aborted Command
Feb 10 22:23:15 harmony kernel: Additional sense indicates Initiator detected 
error message received
Feb 10 22:23:15 harmony kernel:  I/O error: dev 08:50, sector 0
Feb 10 22:23:15 harmony kernel:  unable to read partition table
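
The "dev 08:50" in the I/O error above can be tied back to sdf. The following is a minimal Python sketch, assuming the kernel prints major:minor in hexadecimal and that SCSI disk major 8 allots 16 minor numbers per disk; the function name is made up for illustration.

```python
def sd_name(dev):
    """Map a 'major:minor' hex pair for SCSI disk major 8 to an sdX name."""
    major_hex, minor_hex = dev.split(":")
    major, minor = int(major_hex, 16), int(minor_hex, 16)
    if major != 8:
        raise ValueError("only SCSI disk major 8 is handled in this sketch")
    disk, partition = divmod(minor, 16)  # 16 minors per disk
    name = "sd" + chr(ord("a") + disk)
    return name if partition == 0 else name + str(partition)


print(sd_name("08:50"))  # sdf
```

Minor 0x50 is the whole-disk node, so the failing read of sector 0 is the partition-table read for the newly attached Fujitsu drive.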

Feb 10 22:27:05 harmony kernel: (scsi1:A:9:0): parity error detected in Data-in 
phase. SEQADDR(0x8d) SCSIRATE(0x88)
Feb 10 22:27:05 harmony kernel: (scsi1:A:9:0): parity error detected in Data-in 
phase. SEQADDR(0x8d) SCSIRATE(0x88)
Feb 10 22:28:05 harmony kernel: scsi1:0:9:0: Attempting to queue an ABORT 
message
Feb 10 22:28:05 harmony kernel: scsi1:0:9:0: Device is active, asserting ATN
Feb 10 22:28:05 harmony kernel: Recovery code sleeping
Feb 10 22:28:10 harmony kernel: Recovery code awake
Feb 10 22:28:10 harmony kernel: Timer Expired
Feb 10 22:28:10 harmony kernel: aic7xxx_abort returns 8195
Feb 10 22:28:10 harmony kernel: scsi1:0:9:0: Attempting to queue a TARGET RESET 
message
Feb 10 22:28:10 harmony kernel: aic7xxx_dev_reset returns 8195
Feb 10 22:28:10 harmony kernel: Recovery SCB completes
Feb 10 22:28:15 harmony kernel: (scsi1:A:9:0): parity error detected in Data-in 
phase. SEQADDR(0x18f) SCSIRATE(0x88)
Feb 10 22:28:15 harmony kernel: (scsi1:A:9:0): parity error detected in Data-in 
phase. SEQADDR(0x18f) SCSIRATE(0x88)
Feb 10 22:28:15 harmony kernel: (scsi1:A:9:0): parity error detected in Data-in 
phase. SEQADDR(0x8e) SCSIRATE(0x88)
Feb 10 22:28:15 harmony kernel: (scsi1:A:9:0): parity error detected in Data-in 
phase. SEQADDR(0x8d) SCSIRATE(0x88)
Feb 10 22:28:15 harmony kernel: (scsi1:A:9:0): parity error detected in Data-in 
phase. SEQADDR(0x18f) SCSIRATE(0x88)
Feb 10 22:28:15 harmony kernel: (scsi1:A:9:0): parity error detected in Data-in 
phase. SEQADDR(0x18f) SCSIRATE(0x88)
Feb 10 22:28:15 harmony kernel: SCSI disk error : host 1 channel 0 id 9 lun 0 
return code = 8000002
Feb 10 22:28:15 harmony kernel: Current sd08:50: sense key Hardware Error
Feb 10 22:28:15 harmony kernel:  I/O error: dev 08:50, sector 0

Feb 10 22:29:15 harmony kernel: scsi1:0:9:0: Device is active, asserting ATN
Feb 10 22:29:15 harmony kernel: Recovery code sleeping
Feb 10 22:29:15 harmony kernel: Recovery code awake
Feb 10 22:29:15 harmony kernel: aic7xxx_abort returns 8194
Feb 10 22:29:25 harmony kernel: scsi1:0:9:0: Attempting to queue an ABORT 
message
Feb 10 22:29:25 harmony kernel: scsi1:0:9:0: Cmd aborted from QINFIFO
Feb 10 22:29:25 harmony kernel: aic7xxx_abort returns 8194
Feb 10 22:29:25 harmony kernel: scsi1:0:9:0: Attempting to queue a TARGET RESET 
message
Feb 10 22:29:25 harmony kernel: aic7xxx_dev_reset returns 8195
Feb 10 22:29:25 harmony kernel: Recovery SCB completes
Feb 10 22:29:30 harmony kernel: (scsi1:A:9:0): parity error detected in Data-in 
phase. SEQADDR(0x8e) SCSIRATE(0x88)
Feb 10 22:29:30 harmony kernel: (scsi1:A:9:0): parity error detected in Data-in 
phase. SEQADDR(0x8d) SCSIRATE(0x88)
Feb 10 22:30:30 harmony kernel: scsi1:0:9:0: Attempting to queue an ABORT 
message
Feb 10 22:30:30 harmony kernel: scsi1:0:9:0: Device is active, asserting ATN
Feb 10 22:30:30 harmony kernel: Recovery code sleeping
Feb 10 22:30:30 harmony kernel: Recovery code awake
Feb 10 22:30:30 harmony kernel: aic7xxx_abort returns 8194
Feb 10 22:30:30 harmony kernel: scsi: device set offline - not ready or command 
retry failed after bus reset: host 1 channel 0 id 9
lun 0
Feb 10 22:30:30 harmony kernel: SCSI disk error : host 1 channel 0 id 9 lun 0 
return code = 83f0000
Feb 10 22:30:30 harmony kernel: Current sd08:50: sense key Aborted Command
Feb 10 22:30:30 harmony kernel: Additional sense indicates Initiator detected 
error message received
Feb 10 22:30:30 harmony kernel:  I/O error: dev 08:50, sector 0
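
The "return code" values in these logs can be split into their component bytes. The following is a minimal Python sketch, assuming the usual 2.4 SCSI midlayer packing of the result word (status in the low byte, then message, host and driver bytes); the function name is made up, and no meaning is assigned here to values the thread does not explain.

```python
def split_result(code_hex):
    """Split a logged SCSI result word into its four bytes."""
    value = int(code_hex, 16)
    return {
        "driver": (value >> 24) & 0xFF,
        "host": (value >> 16) & 0xFF,
        "message": (value >> 8) & 0xFF,
        "status": value & 0xFF,
    }


for code in ("8000002", "83f0000"):
    print(code, split_result(code))
```

For 8000002 the status byte is 0x02, which is CHECK CONDITION in the SCSI standard and fits the sense key lines that follow each error.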


At this point I do not have a kernel build environment (I'm using the 686 SMP 
kernel). Scrounging disk space is a problem (I have lots of disks I can't use 
because of this problem; over 100 GB of Ultra SCSI disks could be online...).

Comment 5 Arjan van de Ven 2002-02-11 03:57:07 EST
Feb 10 22:23:15 harmony kernel: (scsi1:A:9:0): parity error detected in Data-in 
phase. SEQADDR(0x8d) SCSIRATE(0x88)

Hmmmmm this doesn't look like a driver problem but more like a physical issue.
(eg cabling)


Comment 6 Dana Hudes 2002-02-11 08:54:05 EST
Indeed, the Fujitsu drive may or may not be defective. Without a working Linux 
kernel with Ultra Wide support there is no way to be sure, since I don't have 
another controller elsewhere.

At this time the system is running with aic7xxx_mod, and if one so much
as copies a single file (locally, via cp, between the Ultra drive RAID array and the 
narrow system disk) the entire machine locks up.
This is worse than aic7xxx, which at least, if you stop messing with the Ultra 
drive, gets on with its business and also allows some writing to the Ultra 
drive (which is how the files got there).
I also can no longer boot the other kernels; presumably I have to put 
modules.conf back first.
Comment 7 Doug Ledford 2002-02-13 17:29:46 EST
This is definitely a hardware cabling/termination issue.  In order to be of any
help at all, I would have to know how the machine is both cabled and terminated.
 Without that information, I'm afraid you are mostly on your own to figure out
the problem :-(
Comment 8 Dana Hudes 2002-02-13 22:49:07 EST
This is not a cabling issue on the Seagates; the Fujitsu could well be a 
defective drive, and until the Seagates work reliably I am putting it aside. 
Just because I teach at a university does not mean I am a stupid and clueless 
academic. I have extensive experience in a variety of areas including 
electronics. I've been a UNIX admin since 1983; see 
http://www.tcp-ip.info/resume.html 
The Seagates worked for a long while under RH7.0; then one day there were massive 
SCSI timeouts and I lost 30 GB of data on my RAID5 array.
I disconnected everything until I could return to the issue. With the new RH7.2 
and the new, improved (per Neil Brown) RAID code I thought I'd give it a shot, 
especially since I really need the disk space; the 9 GB system disk is awfully cramped.

FYI, the 4 seagates are in an external disk tower. It has 2 external Ultra Wide 
connectors (and all cabling inside is Ultra Wide done by the dealer's 
technician). One connector is a 3ft cable (there was an 8ft cable, I 
experienced some timeouts and replaced with 3ft and the problem went away).
The other is an external terminator, whose LED is currently green and which 
blinks when the system issues the bus reset. The external cable connects to an 
external connector on the PC. That is wired on a ribbon cable, purchased new 
last summer, which goes to the Ultra Wide connector on Bus A (I have tried both 
bus A and bus B). 

Bus A has also the Micropolis narrow SCSI drive which works until the Ultra 
Wide is added to the system. The Micropolis is on a ribbon cable with the 
Yamaha SCSI CDRW drive, both are connected to the narrow connector on Bus A.


Currently the SCSI BIOS is set for host adapter termination on for both high 
and low bytes. 

If there is anything else you need, I will gladly tell you. If you need me to 
run any test, I will do so. If you have a debug version of a module, I can run 
that if you give it to me.

At this time I have hundreds of dollars worth of drives sitting doing nothing 
and I need them online. 
While I myself haven't purchased the support contract, my client (Guyana Net) 
has a multisystem support contract covering my machine too.

Comment 9 Dana Hudes 2002-02-14 03:21:50 EST
I just tried the badblocks test myself. I unmounted and stopped the RAID0 array 
(not a typo) and then did 
 badblocks /dev/sdc

which produced SCSI errors as I expected:
Feb 14 03:11:14 harmony kernel: md: export_rdev(sdb)
Feb 14 03:13:26 harmony kernel: scsi : aborting command due to timeout : pid 0, 
scsi0, channel 0, id 3, lun 0 Read (10) 00 00 00 01
b8 00 00 60 00
Feb 14 03:13:26 harmony kernel: (scsi0:0:3:0) SCSISIGI 0x44, SEQADDR 0x111, 
SSTAT0 0x0, SSTAT1 0x2
Feb 14 03:13:26 harmony kernel: (scsi0:0:3:0) SG_CACHEPTR 0x0, SSTAT2 0x0, 
STCNT 0x2
Feb 14 03:13:27 harmony kernel: SCSI host 0 abort (pid 0) timed out - resetting
Feb 14 03:13:27 harmony kernel: SCSI bus is being reset for host 0 channel 0.
Feb 14 03:13:29 harmony kernel: (scsi0:0:0:0) Synchronous at 20.0 Mbyte/sec, 
offset 15.
Feb 14 03:13:29 harmony kernel: (scsi0:0:3:0) Synchronous at 40.0 Mbyte/sec, 
offset 8.
Feb 14 03:14:29 harmony kernel: scsi : aborting command due to timeout : pid 0, 
scsi0, channel 0, id 3, lun 0 Read (10) 00 00 00 44
d8 00 00 60 00
Feb 14 03:14:29 harmony kernel: (scsi0:0:3:0) SCSISIGI 0x44, SEQADDR 0x111, 
SSTAT0 0x0, SSTAT1 0x2
Feb 14 03:14:29 harmony kernel: (scsi0:0:3:0) SG_CACHEPTR 0x0, SSTAT2 0x0, 
STCNT 0x2
Feb 14 03:14:30 harmony kernel: SCSI host 0 abort (pid 0) timed out - resetting
Feb 14 03:14:30 harmony kernel: SCSI bus is being reset for host 0 channel 0.
Feb 14 03:14:32 harmony kernel: (scsi0:0:0:0) Synchronous at 20.0 Mbyte/sec, 
offset 15.
Feb 14 03:14:32 harmony kernel: (scsi0:0:3:0) Synchronous at 40.0 Mbyte/sec, 
offset 8.
Feb 14 03:15:32 harmony kernel: scsi : aborting command due to timeout : pid 0, 
scsi0, channel 0, id 3, lun 0 Read (10) 00 00 00 62
18 00 00 60 00
Feb 14 03:15:32 harmony kernel: (scsi0:0:3:0) SCSISIGI 0x44, SEQADDR 0x111, 
SSTAT0 0x0, SSTAT1 0x2
Feb 14 03:15:32 harmony kernel: (scsi0:0:3:0) SG_CACHEPTR 0x0, SSTAT2 0x0, 
STCNT 0x2
Feb 14 03:15:32 harmony kernel: scsi : aborting command due to timeout : pid 0, 
scsi0, channel 0, id 0, lun 0 Write (10) 00 00 07 09
 77 00 00 02 00
Feb 14 03:15:35 harmony kernel: scsi : aborting command due to timeout : pid 0, 
scsi0, channel 0, id 0, lun 0 Write (10) 00 00 07 a3
 21 00 00 02 00
Feb 14 03:15:35 harmony kernel: SCSI host 0 abort (pid 0) timed out - resetting
Feb 14 03:15:35 harmony kernel: SCSI bus is being reset for host 0 channel 0.
Feb 14 03:15:35 harmony kernel: (scsi0:0:0:0) Synchronous at 20.0 Mbyte/sec, 
offset 15.
Feb 14 03:15:35 harmony kernel: (scsi0:0:3:0) Synchronous at 40.0 Mbyte/sec, 
offset 8.

Comment 10 Doug Ledford 2002-03-20 12:14:21 EST
> The seagates worked for a long while under RH7.0, one day it had massive scsi 
> timeouts and I lost 30Gb of data on my RAID5 array.

This is still a hardware issue.  What you have described is exactly what I've
seen *numerous* times when a terminator on a controller dies suddenly.  At this
point, you are likely trying to run the SCSI bus with just one terminator
instead of two and that's the probable cause of your problem.  Replace your
Adaptec card and most likely the problems will go away.
Comment 11 Alan Cox 2003-06-07 21:19:21 EDT
No reply for a year, closing
