Description of problem:
Generation 3 white MacBook running Fedora 14 64-bit.
CPU: Intel(R) Core(TM)2 Duo CPU T8300 @ 2.40GHz

Version-Release number of selected component (if applicable):
Kernel: 2.6.35.10-74.fc14.x86_64
OS: Fedora 14 64-bit

How reproducible:
Consistently

Steps to Reproduce:
1. Boot Fedora 14 on the MacBook and log in.
2. Suspend via the menu or by closing the lid.
3. Resume the system and perform normal operations.
4. Repeat steps 2 and 3 until the following appears in the system logs.
5. Once the error occurs, I/O performance is seriously degraded because DMA is no longer available.

Actual results:
[ 5172.307016] irq 18: nobody cared (try booting with the "irqpoll" option)
[ 5172.307022] Pid: 0, comm: swapper Tainted: P 2.6.35.10-74.fc14.x86_64 #1
[ 5172.307024] Call Trace:
[ 5172.307026] <IRQ> [<ffffffff810a6fdb>] __report_bad_irq.clone.1+0x3d/0x8b
[ 5172.307035] [<ffffffff810a7143>] note_interrupt+0x11a/0x17f
[ 5172.307039] [<ffffffff810a7c23>] handle_fasteoi_irq+0xa8/0xce
[ 5172.307043] [<ffffffff8100c2ea>] handle_irq+0x88/0x90
[ 5172.307046] [<ffffffff8146fb44>] do_IRQ+0x5c/0xb4
[ 5172.307050] [<ffffffff8146a093>] ret_from_intr+0x0/0x11
[ 5172.307051] <EOI> [<ffffffff8128f900>] ? raw_local_irq_enable+0x10/0x12
[ 5172.307058] [<ffffffff81290526>] acpi_idle_enter_c1+0x98/0xb6
[ 5172.307062] [<ffffffff81394201>] cpuidle_idle_call+0x8b/0xe9
[ 5172.307066] [<ffffffff81008325>] cpu_idle+0xaa/0xcc
[ 5172.307069] [<ffffffff81451906>] rest_init+0x8a/0x8c
[ 5172.307074] [<ffffffff81ba1c49>] start_kernel+0x40b/0x416
[ 5172.307077] [<ffffffff81ba12c6>] x86_64_start_reservations+0xb1/0xb5
[ 5172.307080] [<ffffffff81ba13c2>] x86_64_start_kernel+0xf8/0x107
[ 5172.307082] handlers:
[ 5172.307083] [<ffffffff81314106>] (ata_bmdma_interrupt+0x0/0x1a)
[ 5172.307088] [<ffffffff813335a4>] (usb_hcd_irq+0x0/0x7c)
[ 5172.307092] Disabling IRQ #18
[ 5200.736090] ata3: lost interrupt (Status 0x51)
[ 5200.736123] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 5200.736131] ata3.00: BMDMA stat 0x6, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0,
[ 5200.736140] ata3.00: failed command: READ DMA EXT
[ 5200.736155] ata3.00: cmd 25/00:00:7a:9d:29/00:01:2d:00:00/e0 tag 0 dma 131072 in
[ 5200.736158] res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x24 (host bus error)
[ 5200.736166] ata3.00: status: { DRDY }
[ 5200.736189] ata3: soft resetting link
[ 5201.008176] ata3.00: configured for UDMA/133
[ 5201.008190] ata3.00: device reported invalid CHS sector 0
[ 5201.008217] ata3: EH complete
[ 5259.744199] ata3: lost interrupt (Status 0x51)
[ 5259.744235] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 5259.744244] ata3.00: BMDMA stat 0x6, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0,
[ 5259.744282] ata3.00: failed command: READ DMA EXT
[ 5259.744298] ata3.00: cmd 25/00:00:ba:15:62/00:02:2d:00:00/e0 tag 0 dma 262144 in
[ 5259.744301] res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x24 (host bus error)
[ 5259.744310] ata3.00: status: { DRDY }
[ 5259.744335] ata3: soft resetting link
[ 5260.008298] ata3.00: configured for UDMA/133
[ 5260.008311] ata3.00: device reported invalid CHS sector 0
[ 5260.008337] ata3: EH complete

Expected results:
Suspend/Resume should not cause DMA errors.

Additional info:
Once the issue has occurred, a full power cycle does not fix it unless the MacBook is booted into OS X before running Fedora again. Although DMA is restored after a reboot, the error messages above reappear after a short period and DMA is lost again. After the error has occurred, the DMA issue persists across suspend/resume, and DMA cannot be restored without a power cycle.
Output of /proc/interrupts

           CPU0       CPU1
  0:      55314      60166   IO-APIC-edge      timer
  8:          1          0   IO-APIC-edge      rtc0
  9:       7062       1161   IO-APIC-fasteoi   acpi
 16:     151300      11718   IO-APIC-fasteoi   uhci_hcd:usb4, uhci_hcd:usb5, eth1
 18:      19990       8744   IO-APIC-fasteoi   ata_piix, uhci_hcd:usb6
 19:          1          0   IO-APIC-fasteoi   firewire_ohci
 20:        241        243   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb3
 21:      28931      28675   IO-APIC-fasteoi   ata_piix, ehci_hcd:usb1, uhci_hcd:usb7
 40:          0          0   PCI-MSI-edge      pciehp
 41:          0          0   PCI-MSI-edge      pciehp
 42:          0          0   PCI-MSI-edge      pciehp
 43:       1801        838   PCI-MSI-edge      i915
 44:          1          0   PCI-MSI-edge      sky2@pci:0000:03:00.0
 45:       1631        117   PCI-MSI-edge      hda_intel
NMI:          0          0   Non-maskable interrupts
LOC:     124710     119642   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
PND:          0          0   Performance pending work
RES:       6926       8436   Rescheduling interrupts
CAL:       2087       1766   Function call interrupts
TLB:       1006       1123   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:          4          4   Machine check polls
ERR:          1
MIS:          0
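The listing above shows IRQ 18 shared between ata_piix and uhci_hcd:usb6, which matches the two handlers named in the "nobody cared" trace. As a rough aid for spotting such shared lines, here is a minimal Python sketch. It assumes the column layout shown in this report (per-CPU counters, then the controller type, then a comma-separated handler list); newer kernels may add columns. The function name `shared_irqs` and the `SAMPLE` text are illustrative only.

```python
def shared_irqs(interrupts_text):
    """Return {irq_number: [handlers]} for IRQ lines with more than one handler."""
    shared = {}
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # skip the CPU header and named rows such as NMI/LOC
        fields = rest.split()
        # Drop the per-CPU interrupt counters at the start of the row.
        while fields and fields[0].isdigit():
            fields.pop(0)
        if len(fields) < 2:
            continue
        # fields[0] is the controller type (e.g. IO-APIC-fasteoi);
        # everything after it is the comma-separated handler list.
        handlers = [h.strip() for h in " ".join(fields[1:]).split(",") if h.strip()]
        if len(handlers) > 1:
            shared[int(label)] = handlers
    return shared


# Excerpt copied from the /proc/interrupts output in this report.
SAMPLE = """\
           CPU0       CPU1
  0:      55314      60166   IO-APIC-edge      timer
 18:      19990       8744   IO-APIC-fasteoi   ata_piix, uhci_hcd:usb6
 21:      28931      28675   IO-APIC-fasteoi   ata_piix, ehci_hcd:usb1, uhci_hcd:usb7
NMI:          0          0   Non-maskable interrupts
"""

result = shared_irqs(SAMPLE)
for irq, handlers in sorted(result.items()):
    print(irq, handlers)
```

On a live system the same function could be fed the contents of /proc/interrupts instead of the sample text.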
Created attachment 475315 [details] Dmidecode output
hdparm output

hdparm -i /dev/sda

/dev/sda:

 Model=WDC WD5000BEKT-00KA9T0, FwRev=01.01A01, SerialNo=WD-WXM1E60CC325
 Config={ HardSect NotMFM HdSw>15uSec SpinMotCtl Fixed DTR>5Mbs FmtGapReq }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=50
 BuffType=unknown, BuffSize=16384kB, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=976773168
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio3 pio4
 DMA modes: mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
 AdvancedPM=yes: unknown setting WriteCache=enabled
 Drive conforms to: Unspecified: ATA/ATAPI-1,2,3,4,5,6,7

 * signifies the current active mode
Created attachment 475316 [details] Output of sdparm -a /dev/sda before the error occurs.
Created attachment 475317 [details] Output of smartctl -a /dev/sda before the issue occurs
APM status of the disk before we suspend

[root@macdora steve]# hdparm -B /dev/sda

/dev/sda:
 APM_level = 128
Tried booting the kernel with various combinations of irqpoll and noacpi, neither of which resolved the issue.
Had the same issue with a Seagate ST9500420ASG drive.

Using https://bugzilla.redhat.com/show_bug.cgi?id=549981 to try to troubleshoot this. It looks like a different problem.

Checking for NCQ, which isn't enabled:

cat /sys/block/sd[abc]/device/queue_depth
1

Output from lspci:

00:00.0 Host bridge: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub (rev 03)
00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 03)
00:02.1 Display controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 03)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #4 (rev 03)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #5 (rev 03)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #2 (rev 03)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 03)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 03)
00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 5 (rev 03)
00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6 (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #1 (rev 03)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #2 (rev 03)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #3 (rev 03)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev f3)
00:1f.0 ISA bridge: Intel Corporation 82801HEM (ICH8M) LPC Interface Controller (rev 03)
00:1f.1 IDE interface: Intel Corporation 82801HBM/HEM (ICH8M/ICH8M-E) IDE Controller (rev 03)
00:1f.2 IDE interface: Intel Corporation 82801HBM/HEM (ICH8M/ICH8M-E) SATA IDE Controller (rev 03)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 03)
02:00.0 Network controller: Broadcom Corporation BCM4321 802.11a/b/g/n (rev 03)
03:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8058 PCI-E Gigabit Ethernet Controller (rev 13)
04:03.0 FireWire (IEEE 1394): Agere Systems FW322/323 (rev 61)
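The queue_depth check in this comment can also be scripted. The following is a minimal Python sketch, not a definitive test: it assumes the sysfs layout used by the `cat` command above, and it follows this comment in treating a queue depth of 1 as "NCQ not in use". The names `queue_depths` and `ncq_in_use` are hypothetical helpers introduced for illustration.

```python
import glob
import os


def queue_depths(sys_block="/sys/block"):
    """Return {disk: queue_depth} for every sd* disk found under sys_block."""
    depths = {}
    pattern = os.path.join(sys_block, "sd*", "device", "queue_depth")
    for path in sorted(glob.glob(pattern)):
        # Path shape is <sys_block>/<disk>/device/queue_depth.
        disk = path.split(os.sep)[-3]
        with open(path) as fh:
            depths[disk] = int(fh.read().strip())
    return depths


def ncq_in_use(depths):
    """Assumption: a queue depth greater than 1 means NCQ is being used."""
    return {disk: depth > 1 for disk, depth in depths.items()}


if __name__ == "__main__":
    for disk, active in ncq_in_use(queue_depths()).items():
        print(disk, "NCQ in use" if active else "NCQ not in use")
```

With the value reported above (a depth of 1 for the disk), `ncq_in_use` would report the disk as not using NCQ.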
(In reply to comment #7)
> Tried booting kernel with various combinations of irqpoll and noacpi neither of
> which resolved the issue.

Based on https://bugzilla.redhat.com/show_bug.cgi?id=462425#c80 I actually tried noapic. I didn't change acpi.
Got a fresh trace after two suspend/resume events and plugging the laptop into mains:

Jan 26 13:35:39 macdora kernel: [ 3754.946362] irq 18: nobody cared (try booting with the "irqpoll" option)
Jan 26 13:35:39 macdora kernel: [ 3754.946367] Pid: 0, comm: swapper Tainted: P 2.6.35.10-74.fc14.x86_64 #1
Jan 26 13:35:39 macdora kernel: [ 3754.946369] Call Trace:
Jan 26 13:35:39 macdora kernel: [ 3754.946371] <IRQ> [<ffffffff810a6fdb>] __report_bad_irq.clone.1+0x3d/0x8b
Jan 26 13:35:39 macdora kernel: [ 3754.946381] [<ffffffff810a7143>] note_interrupt+0x11a/0x17f
Jan 26 13:35:39 macdora kernel: [ 3754.946384] [<ffffffff810a7c23>] handle_fasteoi_irq+0xa8/0xce
Jan 26 13:35:39 macdora kernel: [ 3754.946388] [<ffffffff8100c2ea>] handle_irq+0x88/0x90
Jan 26 13:35:39 macdora kernel: [ 3754.946392] [<ffffffff8146fb44>] do_IRQ+0x5c/0xb4
Jan 26 13:35:39 macdora kernel: [ 3754.946396] [<ffffffff8146a093>] ret_from_intr+0x0/0x11
Jan 26 13:35:39 macdora kernel: [ 3754.946397] <EOI> [<ffffffff8128f900>] ? raw_local_irq_enable+0x10/0x12
Jan 26 13:35:39 macdora kernel: [ 3754.946404] [<ffffffff81290526>] acpi_idle_enter_c1+0x98/0xb6
Jan 26 13:35:39 macdora kernel: [ 3754.946408] [<ffffffff81394201>] cpuidle_idle_call+0x8b/0xe9
Jan 26 13:35:39 macdora kernel: [ 3754.946412] [<ffffffff81008325>] cpu_idle+0xaa/0xcc
Jan 26 13:35:39 macdora kernel: [ 3754.946416] [<ffffffff81451906>] rest_init+0x8a/0x8c
Jan 26 13:35:39 macdora kernel: [ 3754.946420] [<ffffffff81ba1c49>] start_kernel+0x40b/0x416
Jan 26 13:35:39 macdora kernel: [ 3754.946423] [<ffffffff81ba12c6>] x86_64_start_reservations+0xb1/0xb5
Jan 26 13:35:39 macdora kernel: [ 3754.946426] [<ffffffff81ba13c2>] x86_64_start_kernel+0xf8/0x107
Jan 26 13:35:39 macdora kernel: [ 3754.946428] handlers:
Jan 26 13:35:39 macdora kernel: [ 3754.946430] [<ffffffff81314106>] (ata_bmdma_interrupt+0x0/0x1a)
Jan 26 13:35:39 macdora kernel: [ 3754.946434] [<ffffffff813335a4>] (usb_hcd_irq+0x0/0x7c)
Jan 26 13:35:39 macdora kernel: [ 3754.946438] Disabling IRQ #18
Jan 26 13:36:08 macdora kernel: [ 3783.776065] ata3: lost interrupt (Status 0x51)
Jan 26 13:36:08 macdora kernel: [ 3783.776091] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 26 13:36:08 macdora kernel: [ 3783.776095] ata3.00: BMDMA stat 0x6, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0,
Jan 26 13:36:08 macdora kernel: [ 3783.776102] ata3.00: failed command: READ DMA EXT
Jan 26 13:36:08 macdora kernel: [ 3783.776112] ata3.00: cmd 25/00:00:b2:a8:9f/00:01:2e:00:00/e0 tag 0 dma 131072 in
Jan 26 13:36:08 macdora kernel: [ 3783.776119] res 40/00:00:09:4f:c2/00:00:00:00:00/00 Emask 0x24 (host bus error)
Jan 26 13:36:08 macdora kernel: [ 3783.776122] ata3.00: status: { DRDY }
Jan 26 13:36:08 macdora kernel: [ 3783.776133] ata3: soft resetting link
Jan 26 13:36:08 macdora kernel: [ 3784.046178] ata3.00: configured for UDMA/133
Jan 26 13:36:08 macdora kernel: [ 3784.046193] ata3: EH complete
Jan 26 13:37:10 macdora kernel: [ 3846.708082] ata3: lost interrupt (Status 0x51)
Jan 26 13:37:10 macdora kernel: [ 3846.708110] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 26 13:37:10 macdora kernel: [ 3846.708115] ata3.00: BMDMA stat 0x6, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0,
Jan 26 13:37:10 macdora kernel: [ 3846.708122] ata3.00: failed command: READ DMA EXT
Jan 26 13:37:10 macdora kernel: [ 3846.708132] ata3.00: cmd 25/00:00:32:3a:a6/00:02:2e:00:00/e0 tag 0 dma 262144 in
Jan 26 13:37:10 macdora kernel: [ 3846.708134] res 40/00:00:09:4f:c2/00:00:00:00:00/00 Emask 0x24 (host bus error)
Jan 26 13:37:10 macdora kernel: [ 3846.708139] ata3.00: status: { DRDY }
Jan 26 13:37:10 macdora kernel: [ 3846.708157] ata3: soft resetting link
Jan 26 13:37:11 macdora kernel: [ 3846.963154] ata3.00: configured for UDMA/133
Jan 26 13:37:11 macdora kernel: [ 3846.963172] ata3: EH complete
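Both traces in this report follow the same pattern: the kernel disables IRQ 18, and ata3 then reports lost interrupts. To check whether a later log shows the same sequence, a small Python sketch along these lines might help. It only matches the two message forms that appear in this report ("Disabling IRQ #N" and "ataN: lost interrupt"); the function name `summarize_irq_failures` and the `SAMPLE_LOG` excerpt are illustrative.

```python
import re
from collections import Counter


def summarize_irq_failures(log_text):
    """Return (disabled_irq_numbers, {ata_port: lost_interrupt_count})."""
    # IRQ lines the kernel gave up on, in the order they were logged.
    disabled = [int(n) for n in re.findall(r"Disabling IRQ #(\d+)", log_text)]
    # Number of "lost interrupt" messages per ATA port.
    lost = Counter(re.findall(r"\b(ata\d+): lost interrupt", log_text))
    return disabled, dict(lost)


# Three lines copied from the trace in this comment.
SAMPLE_LOG = """\
Jan 26 13:35:39 macdora kernel: [ 3754.946438] Disabling IRQ #18
Jan 26 13:36:08 macdora kernel: [ 3783.776065] ata3: lost interrupt (Status 0x51)
Jan 26 13:37:10 macdora kernel: [ 3846.708082] ata3: lost interrupt (Status 0x51)
"""

disabled_irqs, lost_counts = summarize_irq_failures(SAMPLE_LOG)
print("disabled IRQs:", disabled_irqs)
print("lost interrupts per port:", lost_counts)
```

The same function could be run over the text of /var/log/messages or saved dmesg output after a resume.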
Similar issue under Ubuntu:
* https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/664400
This message is a notice that Fedora 14 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 14. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At this time, all open bugs with a Fedora 'version' of '14' have been closed as WONTFIX.

(Please note: Our normal process is to give advanced warning of this occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, feel free to reopen this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that we were unable to fix it before Fedora 14 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" (top right of this page) and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping