Bug 170699

Summary:

sata_sil: SiI 3112 data corruption / hang

Product:

Red Hat Enterprise Linux 4

Reporter:

Joe Orton <jorton>

Component:

kernel

Assignee:

Kernel Maintainer List <kernel-maint>

Status:

CLOSED NOTABUG

QA Contact:

Brian Brock <bbrock>

Severity:

medium

Priority:

medium

Version:

4.0

CC:

jgarzik

Hardware:

All

OS:

Linux

Doc Type:

Bug Fix

Last Closed:

2005-10-26 15:32:00 UTC

Attachments:

Description	Flags
/var/log/dmesg	none
"lspci -v" output	none
current dmesg output	none

Description Joe Orton 2005-10-13 20:35:07 UTC

Description of problem:
Using a SiI 3112A card, I'm getting data corruption and eventually the devices
hang.  Reproduced with a pair of Hitachi 250 GB disks (7K250).

Version-Release number of selected component (if applicable):
kernel-2.6.9-22.EL

How reproducible:
always

Steps to Reproduce:
1. mdadm -Cv /dev/md0 -l0 -n2 -c128 /dev/sd{a,b}1
2. mke2fs -m0 -R stride=64 /dev/md0
3. mount /dev/md0 /foo
4. copy a large amount of data onto /foo
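
Step 4 can be made checkable with something like the following. This is a minimal sketch, not what was actually run for this report: compare_trees and verify_copy are hypothetical helper names, and it assumes GNU find, sort, xargs and sha1sum are available. A read-back straight after the copy may be served from the page cache, so unmounting and remounting /foo before calling compare_trees is probably needed to check what actually reached the disks.

```shell
# Hypothetical helpers (not from the original report).

# compare_trees SRC DST: print OK if every regular file under SRC and DST
# has the same relative path and SHA-1 checksum, otherwise print MISMATCH.
compare_trees() {
    want=$(cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha1sum)
    got=$(cd "$2" && find . -type f -print0 | sort -z | xargs -0 -r sha1sum)
    if [ "$want" = "$got" ]; then
        echo "OK"
        return 0
    fi
    echo "MISMATCH"
    return 1
}

# verify_copy SRC DST: copy the contents of SRC into a fresh directory DST,
# then compare the two trees.
verify_copy() {
    mkdir -p "$2" || return 2
    cp -a "$1/." "$2/" || return 2
    compare_trees "$1" "$2"
}
```

Example use, with /usr/share as an assumed source of test data: `verify_copy /usr/share /foo/copytest`. After a remount, `compare_trees /usr/share /foo/copytest` repeats the check against the on-disk copy.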
  
Actual results:
The drives initially produced lots of errors like:

ata1: status=0x51 { DriveReady SeekComplete Error }
ata1: error=0x04 { DriveStatusError }
ata1: status=0x51 { DriveReady SeekComplete Error }
ata1: error=0x04 { DriveStatusError }
ata2: status=0x51 { DriveReady SeekComplete Error }
ata2: error=0x04 { DriveStatusError }
ata1: status=0x51 { DriveReady SeekComplete Error }

and data written is corrupted.  Eventually, the drives hang:

ata1: command 0x35 timeout, stat 0xd8 host_stat 0x1
ata1: status=0xd8 { Busy }
SCSI error : <0 0 0 0> return code = 0x8000002
Current sda: sense key Aborted Command
Additional sense: Scsi parity error
end_request: I/O error, dev sda, sector 289145151
ATA: abnormal status 0xD8 on port 0xD082A087
ATA: abnormal status 0xD8 on port 0xD082A087
ATA: abnormal status 0xD8 on port 0xD082A087
ata1: command 0x35 timeout, stat 0xd8 host_stat 0x1
ata1: status=0xd8 { Busy }
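
For reading the status= values in the log above, the sketch below decodes an ATA status byte into the same labels the kernel printed. decode_ata_status is a hypothetical helper; the bit positions follow the standard ATA status register, and the labels for bits that do not appear in this log (DeviceFault, DataRequest, CorrectedError, Index) are assumed names. Like the log output above, it reports only Busy when the busy bit is set, since the remaining bits are not meaningful in that state.

```shell
# decode_ata_status BYTE: print labels for the bits set in an ATA status byte.
# Hypothetical helper; labels not seen in the log above are assumptions.
decode_ata_status() {
    status=$(( $1 ))
    # With the busy bit (0x80) set, the other bits are not valid.
    if [ $(( status & 0x80 )) -ne 0 ]; then
        echo "Busy"
        return 0
    fi
    out=""
    for pair in 0x40:DriveReady 0x20:DeviceFault 0x10:SeekComplete \
                0x08:DataRequest 0x04:CorrectedError 0x02:Index 0x01:Error; do
        bit=${pair%%:*}
        name=${pair#*:}
        if [ $(( status & $bit )) -ne 0 ]; then
            # Append the label, space-separated.
            out="$out${out:+ }$name"
        fi
    done
    echo "$out"
}
```

For example, `decode_ata_status 0x51` prints `DriveReady SeekComplete Error` and `decode_ata_status 0xd8` prints `Busy`, matching the two status lines in the log.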

Comment 1 Joe Orton 2005-10-13 20:35:07 UTC

Created attachment 119945 [details]
/var/log/dmesg

Comment 2 Joe Orton 2005-10-13 20:36:21 UTC

Created attachment 119946 [details]
"lspci -v" output

Comment 3 Joe Orton 2005-10-13 20:38:18 UTC

Created attachment 119947 [details]
current dmesg output

The comment should have read: "eventually the devices hang".

Comment 4 Dan Carpenter 2005-10-13 22:16:02 UTC

What firmware are you using?  I had some problems with software RAID on a
sata_sil card.  It worked fine under a lot of stress, but I couldn't install
software RAID on it until I upgraded to the 5.0.48 firmware.

Comment 5 Joe Orton 2005-10-14 08:00:32 UTC

No idea, how do I tell?

Comment 6 Dan Carpenter 2005-10-14 21:03:59 UTC

It should be shown in the BIOS POST output while you boot.

Comment 7 Joe Orton 2005-10-17 10:18:42 UTC

Ah, it says version 4.1.34 on boot; I can find 4.2.50 available from:

http://www.siliconimage.com/support/supportsearchresults.aspx?pid=63&cid=15&ctid=2&

but no 5.0.48.  Am I missing anything?

Comment 8 Joe Orton 2005-10-17 16:04:22 UTC

I tried flashing one of the cards to the 4.2.50 firmware and now the machine
refuses to boot with any drives attached to that card.

Comment 9 Dan Carpenter 2005-10-18 23:58:14 UTC

Gah...  That really bites.  :/

I'm going to hide in the corner.  Try pinging support to get
the old firmware back.

Comment 10 Joe Orton 2005-10-26 15:32:00 UTC

Well, moving the card to a new machine seems to have fixed it, so I think this
was some issue with the creaky old motherboard in the original test box.