Bug 176107

Summary: sata-nv crashes on multiple SATA disks
Product: Red Hat Enterprise Linux 4 Reporter: Frank Bures <fbures>
Component: kernel Assignee: Jeff Garzik <jgarzik>
Status: CLOSED ERRATA QA Contact: Brian Brock <bbrock>
Severity: high Docs Contact:
Priority: medium    
Version: 4.0 CC: jasone, jbaron, linville, peterm
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: RHSA-2006-0575 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-08-10 21:46:24 UTC Type: ---
Bug Depends On:    
Bug Blocks: 181409    

Description Frank Bures 2005-12-19 14:45:29 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (OS/2; U; Warp 4.5; en-US; rv:1.8) Gecko/20051118 Firefox/1.5

Description of problem:
HW: Dual Opteron 275 on TYAN S2895 MoBo with NV-RAID and 4 WD2500KS 250GB SATA disks.  BIOS version on MoBo 1.02 (the latest).
If RAID is defined during the installation process, installation does not proceed beyond formatting.
Repeated installation on a single disk was successful.
When more than one SATA disk is accessed at the same time, the system crashes.  If a USB keyboard is used, the connection to the keyboard is lost as well.
The following messages are found in /var/log/messages:
kernel: ata3: command 0x25 timeout, stat 0x50 host_stat 0x24
kernel: ata4: command 0x25 timeout, stat 0x50 host_stat 0x24


Version-Release number of selected component (if applicable):
kernel-smp-2.6.9-22.0.1.EL

How reproducible:
Always

Steps to Reproduce:
1. Install RHEL4 x86_64 on Dual Opteron 275 with four 250GB SATA disks on NV-RAID controller (RAID disabled in BIOS).  Install on a single disk.
2. Partition the remaining three disks, with a single primary partition per disk.
3. Run mkfs -t ext3 -c /dev/sdb1 and, while it is still running,
4. Run mkfs -t ext3 -c /dev/sdc1
5. As soon as the second mkfs is started, the system crashes.  The combination of disks used for testing is irrelevant (sdb1, sdc1, sdd1).
6. If a USB keyboard is used, the keyboard becomes inaccessible.
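
For anyone retrying the steps above, the two overlapping mkfs runs can be sketched as a small Python script. This is only an illustration, not something taken from the report: the helper names build_mkfs_commands and run_concurrently are made up here, and the script defaults to a dry run because mkfs destroys whatever is on the target partition.

```python
import subprocess


def build_mkfs_commands(devices):
    """Return one 'mkfs -t ext3 -c <device>' argument list per device."""
    return [["mkfs", "-t", "ext3", "-c", dev] for dev in devices]


def run_concurrently(commands, dry_run=True):
    """Start every command at once, then wait for all of them.

    Returns the list of exit codes. With dry_run=True (the default)
    nothing is executed; the commands are only printed.
    """
    if dry_run:
        for cmd in commands:
            print("would run:", " ".join(cmd))
        return []
    # Popen returns immediately, so all mkfs processes overlap in time.
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]


if __name__ == "__main__":
    # Device names are the ones from the report; adjust before real use.
    cmds = build_mkfs_commands(["/dev/sdb1", "/dev/sdc1"])
    run_concurrently(cmds)  # dry run only; pass dry_run=False as root to execute
```

Calling run_concurrently(cmds, dry_run=False) should put the machine in the same state as steps 3 and 4, with both disks being written at the same time.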
  

Actual Results:  System crash.  Logs:
kernel: ata3: command 0x25 timeout, stat 0x50 host_stat 0x24
kernel: ata4: command 0x25 timeout, stat 0x50 host_stat 0x24
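
The hex values in these messages can be decoded with a short Python sketch. It assumes that stat is the ATA device status register and that host_stat is the bus-master IDE (BMDMA) status byte; the report does not say so explicitly. The function names and the two-entry command table are illustrative only, and the obsolete status bits 0x04 and 0x02 are left out.

```python
import re

# Matches the libata timeout message quoted in this report.
LINE_RE = re.compile(
    r"ata(?P<port>\d+): command 0x(?P<cmd>[0-9a-fA-F]+) timeout, "
    r"stat 0x(?P<stat>[0-9a-fA-F]+) host_stat 0x(?P<host>[0-9a-fA-F]+)"
)

# Only the two opcodes that appear in this bug report.
ATA_COMMANDS = {0x25: "READ DMA EXT", 0x35: "WRITE DMA EXT"}

# ATA status register bits, highest first.
STATUS_BITS = [(0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"),
               (0x10, "DSC"), (0x08, "DRQ"), (0x01, "ERR")]

# BMDMA status register bits (assumed meaning of host_stat).
BMDMA_BITS = [(0x01, "ACTIVE"), (0x02, "ERROR"), (0x04, "INTR"),
              (0x20, "DRV0_DMA_CAP"), (0x40, "DRV1_DMA_CAP"), (0x80, "SIMPLEX")]


def decode_flags(value, table):
    """Return the names of all bits from table that are set in value."""
    return [name for mask, name in table if value & mask]


def parse_timeout(line):
    """Decode one timeout line into a dict, or return None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    cmd = int(m.group("cmd"), 16)
    return {
        "port": int(m.group("port")),
        "command": ATA_COMMANDS.get(cmd, "unknown (0x%02x)" % cmd),
        "status": decode_flags(int(m.group("stat"), 16), STATUS_BITS),
        "host_status": decode_flags(int(m.group("host"), 16), BMDMA_BITS),
    }


if __name__ == "__main__":
    print(parse_timeout(
        "kernel: ata3: command 0x25 timeout, stat 0x50 host_stat 0x24"))
```

Under these assumptions, stat 0x50 is DRDY plus DSC with neither BSY nor ERR set, and host_stat 0x24 has the interrupt bit set without the error bit.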

Expected Results:  File systems should be created on the disks.

Additional info:

MoBo TYAN S2895, BIOS 1.02 (the latest).  Four WD2500KS 250GB SATA disks.

Comment 1 Frank Bures 2005-12-19 19:51:29 UTC
Just an addition to eliminate HW considerations:
I installed WinXP64 on one disk of the said machine and then proceeded to long
format the three remaining SATA disks simultaneously.  There were no problems.
That would suggest that the problem is indeed inherent to Linux.



Comment 2 Jason Evans 2006-01-29 20:12:24 UTC
I'm seeing the same errors, but with a different HDD layout.

Hardware: Tyan S2895 (BIOS v1.02), dual Opteron 275, 3 SATA HDDs (150 GB, 150 GB, 400 GB).

Installation of RHEL WS 4 x86_64 update 2 proceeds without apparent issues.  '/' is on sda1, '/home' is on
sda2, and '/space' is on sda3.  During first boot, initialization gets as far as xfs, then hangs.  Console
messages start printing every 10-15 seconds:

ata1: command 0x25 timeout, stat 0x50 host_stat 0x24
ata1: command 0x35 timeout, stat 0x50 host_stat 0x25
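
To compare logs like these with the ones in the original description, the timeouts can be tallied per port and command. The following is a rough Python sketch; tally_timeouts is a made-up helper name and the input is just the two lines quoted above.

```python
import re
from collections import Counter

# Picks the port number and command opcode out of a libata timeout line.
TIMEOUT_RE = re.compile(r"ata(\d+): command 0x([0-9a-fA-F]+) timeout")


def tally_timeouts(lines):
    """Count timeout messages per (port, command opcode) pair."""
    counts = Counter()
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            counts[(int(m.group(1)), int(m.group(2), 16))] += 1
    return counts


if __name__ == "__main__":
    log = [
        "ata1: command 0x25 timeout, stat 0x50 host_stat 0x24",
        "ata1: command 0x35 timeout, stat 0x50 host_stat 0x25",
    ]
    for (port, cmd), n in sorted(tally_timeouts(log).items()):
        print("ata%d command 0x%02x: %d timeout(s)" % (port, cmd, n))
```

For the lines above, every timeout is on ata1, whereas the original description shows timeouts on both ata3 and ata4.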

After the first boot attempt, further attempts only get as far as finishing "hardware initialization".

I reproduced this behavior in 3 out of 3 installation attempts.

Comment 3 Jason Evans 2006-02-03 03:32:02 UTC
(In reply to comment #2)

My troubles turned out to be related to a bad SATA cable connection.  After replacing the faulty cable, I 
had no further problems.

Comment 4 Jason Baron 2006-05-01 15:52:51 UTC
Committed in stream U4 build 34.27. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 5 Frank Bures 2006-05-01 16:01:32 UTC
Unfortunately, I currently have no hardware available to test it on.



Comment 6 Dan Carpenter 2006-05-02 07:23:53 UTC
My experience is that the 1.02 BIOS breaks the onboard SATA on the 2895 on
RHEL4u2 and RHEL4u3.  The 1.03 BIOS also breaks the onboard SATA.  The 1.01
BIOS works with the onboard SATA.

I haven't tested the RAID on that card.



Comment 10 Red Hat Bugzilla 2006-08-10 21:46:24 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0575.html