Description of problem:
I have a Promise PCI SATA card, with two SATA disks (see lspci output below). Sometimes during heavy I/O (or just at seemingly random times), the machine will hang/lock up. I have had this problem since installing the PCI card and disks, both on FC7 kernels and FC9 kernels (fresh install), but it is brand new hardware (and from googling it seems other people have had this issue, so I suspect it's not a hardware problem).

When it hangs it often stops responding to sysrq etc, requiring a manual reset. I have just hooked up a serial cable to get any errors from the machine. Here is the error message from the last time it happened:

ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x180000 action 0x6 frozen
ata2: SError: { 10B8B Dispar }
ata2.00: cmd 35/00:00:d9:05:6e/00:04:01:00:00/e0 tag 0 dma 524288 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2.00: status: { DRDY }
ata2: hard resetting link

Have also seen it with:

ata2.00: cmd 35/00:00:d9:cf:c5/00:04:07:00:00/e0 tag 0 dma 524288 out
ata2.00: cmd 35/00:80:59:1f:ad/00:03:00:00:00/e0 tag 0 dma 458752 out

The above error seems like it should be recoverable (read timeout; try again, or at least fail and let the other disk in my RAID take over), but it never comes back from trying to hard reset it.

I have just added the nmi_watchdog and acpi=off options to see if they do anything.

Version-Release number of selected component (if applicable):
kernel-2.6.26.5-45.fc9.i686

How reproducible:
Hard to reproduce -- happens only when you don't want it to. At the moment I have a nearly 500GB md raid1 partition with an unsynced disk, which usually only gets a few percent through recovering before my system locks up (so it starts again on the next reboot).

It does seem related to multiple concurrent disk access - when I try now to run two `hdparm -t` commands at once the problem can be triggered quite quickly.

Additional info:

# lspci -v
...
00:0a.0 RAID bus controller: Promise Technology, Inc. PDC20376 (FastTrak 376) (rev 02)
        Subsystem: Promise Technology, Inc. PDC20376 (FastTrak 376)
        Flags: bus master, 66MHz, medium devsel, latency 96, IRQ 11
        I/O ports at c400 [size=64]
        I/O ports at c800 [size=16]
        I/O ports at cc00 [size=128]
        Memory at e3020000 (32-bit, non-prefetchable) [size=4K]
        Memory at e3000000 (32-bit, non-prefetchable) [size=128K]
        [virtual] Expansion ROM at 50010000 [disabled] [size=64K]
        Capabilities: [60] Power Management version 2
        Kernel driver in use: sata_promise
        Kernel modules: sata_promise
...

# lsmod | grep ata
ata_generic             8452  0
pata_acpi               7680  0
pata_via               11140  0
sata_promise           13700  4
libata                132456  4 ata_generic,pata_acpi,pata_via,sata_promise
scsi_mod              122876  4 sr_mod,sg,libata,sd_mod

# dmesg
...
libata version 3.00 loaded.
sata_promise 0000:00:0a.0: version 2.12
ACPI: PCI Interrupt 0000:00:0a.0[A] -> Link [LNKC] -> GSI 11 (level, low) -> IRQ 11
scsi0 : sata_promise
scsi1 : sata_promise
scsi2 : sata_promise
ata1: SATA max UDMA/133 mmio m4096@0xe3020000 ata 0xe3020200 irq 11
ata2: SATA max UDMA/133 mmio m4096@0xe3020000 ata 0xe3020280 irq 11
ata3: PATA max UDMA/133 mmio m4096@0xe3020000 ata 0xe3020300 irq 11
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-8: WDC WD5000AACS-00G8B0, 05.04C05, max UDMA/133
ata1.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata1.00: configured for UDMA/133
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATA-8: WDC WD5000AACS-00G8B0, 05.04C05, max UDMA/133
ata2.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata2.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access ATA WDC WD5000AACS-0 05.0 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2
sd 0:0:0:0: [sda] Attached SCSI disk
scsi 1:0:0:0: Direct-Access ATA WDC WD5000AACS-0 05.0 PQ: 0 ANSI: 5
sd 1:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 1:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sdb: sdb1 sdb2
sd 1:0:0:0: [sdb] Attached SCSI disk
...

# grep 11 /proc/interrupts
 11:      22987    XT-PIC-XT        ehci_hcd:usb1, uhci_hcd:usb4, sata_promise, VIA8233
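
For anyone trying to read the "ata2.00: cmd ..." lines in the report, here is a rough Python sketch that decodes them. It assumes the libata field order command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, and a 512-byte sector size as shown in the dmesg output. The function names decode_cmd and decode_serror and the two lookup tables are made up for illustration and cover only a handful of opcodes and SError bits. Under those assumptions, opcode 0x35 is WRITE DMA EXT, which would mean the timed-out commands were writes (consistent with the "out" direction in the log).

```python
import re

# Matches the taskfile dump in a libata "cmd" line (assumed field order, see above).
CMD_RE = re.compile(
    r"cmd (?P<command>[0-9a-f]{2})"
    r"/(?P<cur>(?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/(?P<hob>(?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/(?P<device>[0-9a-f]{2})"
)

# Small illustrative subset of ATA opcodes, not a complete table.
OPCODES = {
    0x25: "READ DMA EXT",
    0x35: "WRITE DMA EXT",
    0xC8: "READ DMA",
    0xCA: "WRITE DMA",
}

# Subset of SError diagnostic bits, using the short names libata prints.
SERROR_BITS = {
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
}

SECTOR_SIZE = 512  # assumption: 512-byte sectors, as reported in dmesg


def decode_cmd(line):
    """Decode opcode, 48-bit LBA and transfer size from a libata cmd line."""
    m = CMD_RE.search(line)
    if m is None:
        raise ValueError("no cmd taskfile found in line")
    command = int(m["command"], 16)
    _feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in m["cur"].split(":"))
    _hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in m["hob"].split(":")
    )
    # The "hob" (high order byte) registers carry the upper halves of the
    # 48-bit LBA and 16-bit sector count. A count of 0 is not special-cased here.
    sectors = (hob_nsect << 8) | nsect
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    return {
        "opcode": OPCODES.get(command, "unknown (0x%02x)" % command),
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * SECTOR_SIZE,
    }


def decode_serror(value):
    """Return the names of the known SError bits set in value."""
    return [name for bit, name in sorted(SERROR_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    print(decode_cmd("ata2.00: cmd 35/00:00:d9:05:6e/00:04:01:00:00/e0 tag 0 dma 524288 out"))
    print(decode_serror(0x180000))
```

Decoded this way, the sector counts of the first and third commands work out to 524288 and 458752 bytes, matching the "dma" sizes printed in the log, and SErr 0x180000 corresponds to bits 19 and 20, i.e. the "10B8B Dispar" pair shown in the SError line.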
Should be fixed in 2.6.27.4-24 and later: https://admin.fedoraproject.org/updates/kernel-2.6.27.4-24.fc9
Hi,

Unfortunately I no longer have the machine in question to test the new kernel (I could only put up with a flaky server for so long). It is likely that I will use the card in a different machine at some point though, so it's good to know that it should be fixed. Out of curiosity, do you know which kernel patch fixed it?

Thanks.
kernel-2.6.27.4-26.fc9 has been submitted as an update for Fedora 9. http://admin.fedoraproject.org/updates/kernel-2.6.27.4-26.fc9
kernel-2.6.27.4-26.fc9 has been pushed to the Fedora 9 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update kernel'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-9467
kernel-2.6.27.5-32.fc9 has been submitted as an update for Fedora 9. http://admin.fedoraproject.org/updates/kernel-2.6.27.5-32.fc9
kernel-2.6.27.5-32.fc9 has been pushed to the Fedora 9 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update kernel'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-9583
kernel-2.6.27.5-37.fc9 has been submitted as an update for Fedora 9. http://admin.fedoraproject.org/updates/kernel-2.6.27.5-37.fc9
kernel-2.6.27.5-41.fc9 has been submitted as an update for Fedora 9. http://admin.fedoraproject.org/updates/kernel-2.6.27.5-41.fc9
Marking as CANTFIX, because I no longer have the hardware combination to test the updated kernels.
kernel-2.6.27.5-41.fc9 has been pushed to the Fedora 9 stable repository. If problems still persist, please make note of it in this bug report.