Bug 170699 - sata_sil: SiI 3112 data corruption / hang
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: All Linux
Priority: medium Severity: medium
Assigned To: Kernel Maintainer List
Brian Brock
 
Reported: 2005-10-13 16:35 EDT by Joe Orton
Modified: 2007-11-30 17:07 EST
Doc Type: Bug Fix
Last Closed: 2005-10-26 11:32:00 EDT

Attachments
/var/log/dmesg (11.35 KB, text/plain), 2005-10-13 16:35 EDT, Joe Orton
"lspci -v" output (3.85 KB, text/plain), 2005-10-13 16:36 EDT, Joe Orton
current dmesg output (54.96 KB, text/plain), 2005-10-13 16:38 EDT, Joe Orton

Description Joe Orton 2005-10-13 16:35:07 EDT
Description of problem:
Using a SiI 3112A card, I'm getting data corruption, and eventually the devices
hang.  Reproduced with a pair of Hitachi 250 GB disks (7K250).

Version-Release number of selected component (if applicable):
kernel-2.6.9-22.EL

How reproducible:
always

Steps to Reproduce:
1. mdadm -Cv /dev/md0 -l0 -n2 -c128 /dev/sd{a,b}1
2. mke2fs -m0 -R stride=64 /dev/md0
3. mount /dev/md0 /foo
4. copy lots of data onto /foo
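
One way to confirm that step 4 really corrupts data is to checksum the source
tree and the copy on /foo and compare the two lists. The following is only a
minimal sketch, not part of the original report: the function name
compare_trees is hypothetical, and it assumes sha256sum is installed (md5sum
would serve the same purpose on older systems).

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): compare two directory
# trees by per-file checksum.
# Usage: compare_trees SRC_DIR DST_DIR
# Prints "OK" and returns 0 if the checksum lists match, otherwise prints
# "MISMATCH" and returns 1. The body runs in a subshell so that the
# variables and the cd calls do not leak into the caller.
compare_trees() (
    src=$1
    dst=$2
    # Checksum with relative paths so both lists are directly comparable,
    # and sort by file name so the order does not depend on the filesystem.
    src_sums=$(cd "$src" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2)
    dst_sums=$(cd "$dst" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2)
    if [ "$src_sums" = "$dst_sums" ]; then
        echo "OK"
        exit 0
    fi
    echo "MISMATCH"
    exit 1
)
```

A read-back straight after the copy may be served from the page cache rather
than from the disks, so unmounting and remounting /foo between the copy and
the comparison gives a more meaningful result.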
  
Actual results:
The drives initially produced lots of errors like:

ata1: status=0x51 { DriveReady SeekComplete Error }
ata1: error=0x04 { DriveStatusError }
ata1: status=0x51 { DriveReady SeekComplete Error }
ata1: error=0x04 { DriveStatusError }
ata2: status=0x51 { DriveReady SeekComplete Error }
ata2: error=0x04 { DriveStatusError }
ata1: status=0x51 { DriveReady SeekComplete Error }

and data written is corrupted.  Eventually, the drives

ata1: command 0x35 timeout, stat 0xd8 host_stat 0x1
ata1: status=0xd8 { Busy }
SCSI error : <0 0 0 0> return code = 0x8000002
Current sda: sense key Aborted Command
Additional sense: Scsi parity error
end_request: I/O error, dev sda, sector 289145151
ATA: abnormal status 0xD8 on port 0xD082A087
ATA: abnormal status 0xD8 on port 0xD082A087
ATA: abnormal status 0xD8 on port 0xD082A087
ata1: command 0x35 timeout, stat 0xd8 host_stat 0x1
ata1: status=0xd8 { Busy }
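
The status bytes in the log above can be decoded by hand. The following is a
minimal sketch, assuming the standard ATA status-register layout (BSY 0x80,
DRDY 0x40, DF 0x20, DSC 0x10, DRQ 0x08, ERR 0x01). The function name
decode_ata_status is hypothetical, the labels for bits seen in this log copy
the kernel's wording, and the remaining labels are illustrative. For
reference, command 0x35 is WRITE DMA EXT and error bit 0x04 is ABRT (command
aborted) in the ATA command set.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): decode an ATA status
# byte into flag names.
# When BSY (0x80) is set the other bits are not valid, so only "Busy" is
# reported, which matches the "{ Busy }" lines in the log.
decode_ata_status() (
    stat=$(( $1 ))
    if [ $(( stat & 0x80 )) -ne 0 ]; then
        echo "Busy"
        exit 0
    fi
    out=""
    # Each entry is MASK:LABEL, listed from the highest bit to the lowest.
    for pair in 0x40:DriveReady 0x20:DeviceFault 0x10:SeekComplete \
                0x08:DataRequest 0x01:Error; do
        bit=${pair%%:*}
        name=${pair#*:}
        if [ $(( stat & $bit )) -ne 0 ]; then
            out="$out $name"
        fi
    done
    # Strip the leading space left by the first appended label.
    echo "${out# }"
)

decode_ata_status 0x51   # prints "DriveReady SeekComplete Error"
decode_ata_status 0xd8   # prints "Busy"
```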
Comment 1 Joe Orton 2005-10-13 16:35:07 EDT
Created attachment 119945
/var/log/dmesg
Comment 2 Joe Orton 2005-10-13 16:36:21 EDT
Created attachment 119946
"lspci -v" output
Comment 3 Joe Orton 2005-10-13 16:38:18 EDT
Created attachment 119947
current dmesg output

The comment should have read: "eventually the devices hang".
Comment 4 Dan Carpenter 2005-10-13 18:16:02 EDT
What firmware are you using?  I had some problems with software RAID on a
sata_sil card.  It worked fine under a lot of stress, but I couldn't install
software RAID on it until I upgraded to the 5.0.48 firmware.

Comment 5 Joe Orton 2005-10-14 04:00:32 EDT
No idea, how do I tell?
Comment 6 Dan Carpenter 2005-10-14 17:03:59 EDT
It should say in the BIOS POST output while you boot.
Comment 7 Joe Orton 2005-10-17 06:18:42 EDT
Ah, it says version 4.1.34 on boot; I can find 4.2.50 available from:

http://www.siliconimage.com/support/supportsearchresults.aspx?pid=63&cid=15&ctid=2&

but no 5.0.48.  Am I missing anything?
Comment 8 Joe Orton 2005-10-17 12:04:22 EDT
I tried flashing one of the cards to the 4.2.50 firmware and now the machine
refuses to boot with any drives attached to that card.
Comment 9 Dan Carpenter 2005-10-18 19:58:14 EDT
Gah...  That really bites.  :/

I'm going to hide in the corner.  Try pinging support@siliconimage.com to get
the old firmware back.

Comment 10 Joe Orton 2005-10-26 11:32:00 EDT
Well, moving the card to a new machine seems to have fixed it, so I think this
was some issue with the creaky old motherboard in the original test box.
