I am filing this on behalf of Jeff Garzik, based on his posting to RH-Kernel. Jeff, please provide additional details as necessary.

The AHCI driver was incorrectly resetting the hardware on error. People haven't been screaming about this the way they did about the oops on the error path. I honestly don't know the effects of the incorrect reset, but the fix is the right thing to do, as verified by testing in the field. That's why I put it in the "if there is a respin" category rather than the "this really wants another respin" category.

Serious but not critical fix. No specific RHEL3/RHEL4 testing, just upstream testing. But it's obviously applicable to both, and doesn't change/break anything.

--- linux-2.4.21/drivers/scsi/ahci.c	2005-03-14 18:53:39.000000000 -0500
+++ libata-2.4/drivers/scsi/ahci.c	2005-03-14 18:54:12.000000000 -0500
@@ -176,6 +176,7 @@
 static int ahci_port_start(struct ata_port *ap);
 static void ahci_port_stop(struct ata_port *ap);
 static void ahci_host_stop(struct ata_host_set *host_set);
+static void ahci_tf_read(struct ata_port *ap, struct ata_taskfile *tf);
 static void ahci_qc_prep(struct ata_queued_cmd *qc);
 static u8 ahci_check_status(struct ata_port *ap);
 static u8 ahci_check_err(struct ata_port *ap);
@@ -209,6 +210,8 @@
 	.check_err		= ahci_check_err,
 	.dev_select		= ata_noop_dev_select,
 
+	.tf_read		= ahci_tf_read,
+
 	.phy_reset		= ahci_phy_reset,
 
 	.qc_prep		= ahci_qc_prep,
@@ -462,6 +465,14 @@
 	return (readl(mmio + PORT_TFDATA) >> 8) & 0xFF;
 }
 
+static void ahci_tf_read(struct ata_port *ap, struct ata_taskfile *tf)
+{
+	struct ahci_port_priv *pp = ap->private_data;
+	u8 *d2h_fis = pp->rx_fis + RX_FIS_D2H_REG;
+
+	ata_tf_from_fis(d2h_fis, tf);
+}
+
 static void ahci_fill_sg(struct ata_queued_cmd *qc)
 {
 	struct ahci_port_priv *pp = qc->ap->private_data;
@@ -538,7 +549,7 @@
 
 	/* stop DMA */
 	tmp = readl(port_mmio + PORT_CMD);
-	tmp &= PORT_CMD_START | PORT_CMD_FIS_RX;
+	tmp &= ~PORT_CMD_START;
 	writel(tmp, port_mmio + PORT_CMD);
 
 	/* wait for engine to stop.  TODO: this could be
@@ -570,11 +581,11 @@
 
 	/* re-start DMA */
 	tmp = readl(port_mmio + PORT_CMD);
-	tmp |= PORT_CMD_START | PORT_CMD_FIS_RX;
+	tmp |= PORT_CMD_START;
 	writel(tmp, port_mmio + PORT_CMD);
 	readl(port_mmio + PORT_CMD); /* flush */
 
-	printk(KERN_WARNING "ata%u: error occurred, port reset\n", ap->port_no);
+	printk(KERN_WARNING "ata%u: error occurred, port reset\n", ap->id);
 }
 
 static void ahci_eng_timeout(struct ata_port *ap)
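
For anyone reviewing the "stop DMA" hunk, here is a minimal standalone sketch of what the removed and added mask operations do to a register value, taking the removed line exactly as it appears in the diff above. The bit positions are an assumption taken from the AHCI specification's PxCMD layout (ST in bit 0, FRE in bit 4); the helper names and the sample value 0x17 are hypothetical and exist only for illustration. It is not driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions (AHCI PxCMD: ST = bit 0, FRE = bit 4).
 * Names mirror the driver's, but this is a userspace sketch. */
#define PORT_CMD_START   (1u << 0)  /* command list DMA engine running */
#define PORT_CMD_FIS_RX  (1u << 4)  /* FIS receive enabled */

/* Removed line, as shown in the diff: the mask is not inverted, so only
 * the two named bits survive and START is NOT cleared. */
static uint32_t stop_dma_old(uint32_t cmd)
{
	return cmd & (PORT_CMD_START | PORT_CMD_FIS_RX);
}

/* Added line: clears START and leaves every other bit, including
 * FIS_RX, untouched. */
static uint32_t stop_dma_new(uint32_t cmd)
{
	return cmd & ~PORT_CMD_START;
}

/* Hypothetical self-check on a made-up register value. */
static void check_stop_dma(void)
{
	uint32_t cmd = 0x17; /* bits 0, 1, 2 and 4 set (made-up value) */

	assert(stop_dma_old(cmd) == 0x11); /* START still set, bits 1-2 lost */
	assert(stop_dma_new(cmd) == 0x16); /* only START cleared */
}
```

Calling check_stop_dma() from a small test program passes both assertions. With the added line, FIS reception stays enabled across the reset, which is consistent with the "re-start DMA" hunk setting only PORT_CMD_START again.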
A fix for this problem was committed to the RHEL3 U6 patch pool on 20-Apr-2005 (in kernel version 2.4.21-32.1.EL).
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-663.html