libreport version: 2.0.7
abrt_version: 2.0.6
cmdline: BOOT_IMAGE=/vmlinuz-3.1.1-1.fc16.x86_64 root=/dev/mapper/VGsystem-LVroot ro LANG=en_US.UTF-8 rd.dm=0 KEYTABLE=us quiet SYSFONT=latarcyrheb-sun16 rhgb rd.lvm.lv=VGsystem/LVswap rd.md.uuid=c5bcc704:aad5bdaf:07ae3b56:fc61f7dc rd.lvm.lv=VGsystem/LVroot rd.luks=0 irqpoll
comment: At random intervals. Can't get a fix on what is causing this.
kernel: 3.1.1-1.fc16.x86_64
reason: [348158.080600] irq 16: nobody cared (try booting with the "irqpoll" option)
time: Tue 22 Nov 2011 15:20:34 CET
smolt_data: Text file, 3136 bytes

backtrace:
:[348158.080600] irq 16: nobody cared (try booting with the "irqpoll" option)
:[348158.080603] Pid: 0, comm: swapper Tainted: G W 3.1.1-1.fc16.x86_64 #1
:[348158.080605] Call Trace:
:[348158.080606] <IRQ> [<ffffffff810b2222>] __report_bad_irq+0x38/0xc3
:[348158.080613] [<ffffffff810b24bc>] note_interrupt+0x176/0x1fa
:[348158.080615] [<ffffffff810b0a0f>] handle_irq_event_percpu+0x15d/0x1a5
:[348158.080617] [<ffffffff810b0a92>] handle_irq_event+0x3b/0x59
:[348158.080619] [<ffffffff81078268>] ? sched_clock_cpu+0x42/0xc6
:[348158.080621] [<ffffffff810b2c7c>] handle_fasteoi_irq+0x80/0xa4
:[348158.080624] [<ffffffff81010af9>] handle_irq+0x88/0x8e
:[348158.080626] [<ffffffff814c03cd>] do_IRQ+0x4d/0xa5
:[348158.080628] [<ffffffff814b752e>] common_interrupt+0x6e/0x6e
:[348158.080629] <EOI> [<ffffffff813a5c92>] ? poll_idle+0x2f/0x65
:[348158.080634] [<ffffffff813a5c7e>] ? poll_idle+0x1b/0x65
:[348158.080636] [<ffffffff813a5fae>] cpuidle_idle_call+0xe8/0x182
:[348158.080638] [<ffffffff8100e2e3>] cpu_idle+0xa4/0xe8
:[348158.080641] [<ffffffff81494a5e>] rest_init+0x72/0x74
:[348158.080643] [<ffffffff81b76b7d>] start_kernel+0x3ab/0x3b6
:[348158.080645] [<ffffffff81b762c4>] x86_64_start_reservations+0xaf/0xb3
:[348158.080647] [<ffffffff81b76140>] ? early_idt_handlers+0x140/0x140
:[348158.080648] [<ffffffff81b763ca>] x86_64_start_kernel+0x102/0x111
:[348158.080649] handlers:
:[348158.080652] [<ffffffffa0338df3>] rtl8139_interrupt
:[348158.080653] Disabling IRQ #16

event_log:
:2011-11-22-15:22:57> Smolt profile successfully saved
:2011-11-22-15:30:22> Sending oops notification to http://submit.kerneloops.org/submitoops.php
:2011-11-22-15:31:30 Kernel oops has not been sent due to Couldn't connect to server
:2011-11-22-15:31:30* (exited with 1)
Created attachment 535047 [details] File: smolt_data
This got even worse with kernel 3.1.2-1.fc16.x86_64: now it's triggered every 10 minutes or so and it disables my Internet-connected LAN card. Also, I think it's related to bug 717211, since they appear one after another in messages.

About irqpoll: it seems that option doesn't work due to another bug (I couldn't find the reference in RH bugzilla, I found out about it on launchpad: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/855199 )
(In reply to comment #2)
> This got even worse with kernel 3.1.2-1.fc16.x86_64, now it's triggered every
> 10 minutes or so and it disables my Internet-connected LAN card. Also, I think
> it's related to bug 717211, since they appear one after another in messages.

Could you attach the dmesg and /proc/interrupts output? Also, if you boot with pcie_aspm=off, does it help?

> About irqpoll, seems like that option doesn't work due to another bug (could't
> find the reference in RH bugzilla, I found about it on launchpad:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/855199 )

There are two patches to fix irqpoll. One should already be in the latest f15/f16 kernels, and the other should be included when we rebase to 3.1.5.
Created attachment 541300 [details] dmesg as requested Attached dmesg (with some error messages, at about 1 day, 19 hours uptime).
Created attachment 541301 [details] contents of /proc/interrupts contents of /proc/interrupts at about 1 day, 19 hours uptime
This isn't related to 717211, as that is for the atl1c driver. Your issue is coming from the 8139too driver. It seems its interrupt is triggering but the interrupt handler doesn't see an interrupt in the status register, so it bails.

It would be good to know a few things.
1) Does pcie_aspm=off help on the kernel command line?
2) Have there been previous kernels where you did not see this issue? If so, what versions?
3) Have you always had to pass the irqpoll command line parameter, or does your dmesg just show that because you tried the suggestion?

If the option from #1 doesn't help, it might be beneficial to get some debug data from the driver. You can do:

    echo -n 'module 8139too +p' > /sys/kernel/debug/dynamic_debug/control

and it will enable all debug messages from the 8139too driver. This includes a printk for the interrupt status register (which may result in a lot of printks).
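For readers following along: the "nobody cared" report is produced by the kernel's spurious-interrupt accounting when almost no interrupt on a line is claimed by any handler. The sketch below is a simplified Python model of that bookkeeping, not the kernel code. It assumes a check every 100,000 interrupts and a trip point of more than 99,900 unhandled ones, and it ignores the time-based reset the real code also applies. The class and function names are hypothetical.

```python
class SpuriousIrqModel:
    """Simplified model of per-IRQ 'nobody cared' accounting (illustrative only)."""

    WINDOW = 100_000     # assumed: interrupts counted before each evaluation
    THRESHOLD = 99_900   # assumed: unhandled count above which the line is disabled

    def __init__(self):
        self.count = 0
        self.unhandled = 0
        self.disabled = False

    def note_interrupt(self, handled: bool) -> bool:
        """Record one interrupt; return True if the line is disabled by this call."""
        if self.disabled:
            return False
        if not handled:
            self.unhandled += 1
        self.count += 1
        if self.count < self.WINDOW:
            return False
        # End of the window: decide, then start a fresh window.
        stuck = self.unhandled > self.THRESHOLD
        self.count = 0
        self.unhandled = 0
        if stuck:
            self.disabled = True
        return stuck


def simulate(handled_every: int, total: int) -> bool:
    """Feed `total` interrupts; one in `handled_every` is claimed (0 means none)."""
    model = SpuriousIrqModel()
    for i in range(total):
        claimed = handled_every > 0 and i % handled_every == 0
        model.note_interrupt(handled=claimed)
    return model.disabled
```

Under these assumptions a device whose handler claims even half of the interrupts never trips the detector, which fits the description above: the 8139too handler sees nothing in its status register, so nearly every interrupt on the line counts as unhandled.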
1) It doesn't seem so; the error appeared again after I booted with pcie_aspm=off.
2) I did test some other distros. It *seems* that the 2.6.32 kernel found in the Scientific Linux 6.1 LiveCD isn't affected: the bug didn't trigger in several hours. 2.6.38 and above did show the bug (I didn't test anything between .32 and .38).
3) No, I didn't pass irqpoll before.

I enabled the debug messages; I'll post as soon as there are some in the logs.

I referred to bug 717211 because abrt connected another bug I have ("WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xf0/0x150()", which as far as I can see has the same effect, killing the card) to that one.

Maybe I should create another bug for 8139too, marking it as a duplicate of bug 702723?
(In reply to comment #7)
> 1) it doesn't seems so, the error presented again after I booted with
> pcie_aspm=off

Bummer, ok.

> 2) I did test some other distros, it *seems* that 2.6.32 kernel found in
> Scientific Linux 6.1 LiveCD isn't affected, the bug didn't trigger in several
> hours. 2.6.38 and above (but I didn't test anything between .32 and .38) did
> show the bug.

OK.

> 3) No, I didn't pass irqpoll before

OK.

> I enabled the debug messages, I'll post as soon as there are some in the logs.

Thank you. They should show up as KERN_DEBUG messages in dmesg.

> I did refer to bug 717211 because abrt did connect another bug I have
> ("WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xf0/0x150()" that has,
> as I can see, the same effects, killing the card) to that one.
>
> Maybe I should create another bug for 8139too marking it as duplicate of bug
> 702723 ?

No, I don't think we need to do that. The dev_watchdog error seems to be a direct side-effect of the interrupt being disabled, so if that gets fixed the other error should go away as well.
It's possible that some other device you don't have a driver for is generating those interrupts. Does booting with the "noirqdebug" option help? (That will just ignore the extra interrupts.) Also, please attach the output of the command 'lspci -vvv' (run as root to get the full output.)
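Since several boot options have been suggested by now (irqpoll, noirqdebug, pcie_aspm=off), it can help to confirm which ones the running kernel was actually booted with. A minimal Python sketch, assuming the text comes from /proc/cmdline; the function name and the layout of the returned dictionary are made up for illustration.

```python
def irq_boot_options(cmdline: str) -> dict:
    """Report which IRQ-related boot options appear on a kernel command line."""
    tokens = cmdline.split()
    result = {
        "irqpoll": "irqpoll" in tokens,
        "noirqdebug": "noirqdebug" in tokens,
        "pcie_aspm": None,  # stays None when the option is absent
    }
    for tok in tokens:
        if tok.startswith("pcie_aspm="):
            result["pcie_aspm"] = tok.split("=", 1)[1]
    return result
```

On a live system this would be called as `irq_boot_options(open("/proc/cmdline").read())`.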
Created attachment 542030 [details] lspci -vvv output
Created attachment 542035 [details] lspci -vvvnn output (just in case)
I'll try noirqdebug at next boot. Still haven't got any debug message from the driver, but I have a cronjob to reload it when the network is down, so maybe it was lost on reload. I changed the script, so it will reconfigure the debugging properly after reloading the module.
Created attachment 542268 [details] commented kernel debug info

I configured rsyslog to log kern.* to a separate file. The message repeated several thousand times:

    for second in 129684 129685 129686 129687 ; do grep $second\. /var/log/kernel.messages | wc -l ; done
    0
    91967
    64845
    2
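The per-second grep loop above can also be done in a single pass over the log. A minimal Python sketch, assuming each line carries the usual `[seconds.microseconds]` kernel timestamp; the function name is hypothetical.

```python
import re
from collections import Counter

# Kernel timestamp such as "[129685.123456]" or "[   62.896171]".
_TIMESTAMP = re.compile(r"\[\s*(\d+)\.\d+\]")


def messages_per_second(lines):
    """Count log lines per whole second of kernel uptime."""
    counts = Counter()
    for line in lines:
        match = _TIMESTAMP.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Calling `messages_per_second(open("/var/log/kernel.messages"))` and sorting by count would show the burst seconds without having to guess the timestamps beforehand.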
It seems that noirqdebug is working well as a workaround: the machine is now at 5 days of uptime and the bug hasn't shown up.
Your motherboard is using the ASM108x PCI bridge. There is a problem identified with this particular chip upstream that might be causing this issue: http://thread.gmane.org/gmane.linux.kernel/1245767
We're going to consolidate all of these bugs with the impacted hardware into a single bug. The latest F15 and F16 kernel updates that should hit the mirrors soon have a patch to at least fall back to the irqpoll method when this happens. Hopefully it results in a bit better experience for you.
*** Bug 770210 has been marked as a duplicate of this bug. ***
*** Bug 799106 has been marked as a duplicate of this bug. ***
*** Bug 770866 has been marked as a duplicate of this bug. ***
*** Bug 784050 has been marked as a duplicate of this bug. ***
*** Bug 773438 has been marked as a duplicate of this bug. ***
*** Bug 761699 has been marked as a duplicate of this bug. ***
*** Bug 785339 has been marked as a duplicate of this bug. ***
*** Bug 756540 has been marked as a duplicate of this bug. ***
*** Bug 784751 has been marked as a duplicate of this bug. ***
Hello, I am writing here to ask for information. I don't know whether I should comment here or open a new bug, but I think I have the same problem. Can you help me make sure?

I just bought an Asus P8P67 Evo and I get the "irq 16: nobody cared" message too! I run kernel 3.2.7-1.fc16.x86_64.

I have this ASMedia chip. Extract of lspci -v:

06:00.0 PCI bridge: ASMedia Technology Inc. Device 1080 (rev 01) (prog-if 01 [Subtractive decode])
    Flags: bus master, fast devsel, latency 0
    Bus: primary=06, secondary=07, subordinate=07, sec-latency=32
    I/O behind bridge: 0000d000-0000dfff
    Memory behind bridge: fb100000-fb1fffff
    Capabilities: [c0] Subsystem: ASMedia Technology Inc. Device 1080

Extract of /proc/interrupts:

 16:      46710          0          0          0   IO-APIC-fasteoi   p6p1, nvidia

There is a point I don't understand: my irq 16 seems to be linked to my graphics card (nvidia) and to my motherboard's Realtek ethernet card (p6p1). Is it the same problem? I am not really familiar with looking at these things.

I found this on the internet: http://www.gossamer-threads.com/lists/linux/kernel/1466185, so yesterday evening I removed my wifi PCI card and added the irqpoll option, but irq 16 still gets disabled...

Josh, are you speaking about the kernel update 3.2.7-1.fc16.x86_64? It is available this morning. I will try it.

Thanks for your help, and tell me if I need to post any other log or trace...
Kernel 3.2.9-1 since yesterday. No more irq 16 disabled but now it's irq 17.

[38142.742404] irq 17: nobody cared (try booting with the "irqpoll" option)
[38142.742407] Pid: 0, comm: swapper/0 Tainted: P O 3.2.9-1.fc16.x86_64 #1
[38142.742408] Call Trace:
[38142.742409] <IRQ> [<ffffffff810e11ad>] __report_bad_irq+0x3d/0xe0
[38142.742415] [<ffffffff810e146d>] note_interrupt+0x16d/0x220
[38142.742417] [<ffffffff8101b9c9>] ? sched_clock+0x9/0x10
[38142.742419] [<ffffffff810dec39>] handle_irq_event_percpu+0xa9/0x220
[38142.742421] [<ffffffff810dedf4>] handle_irq_event+0x44/0x70
[38142.742422] [<ffffffff810e1edf>] handle_fasteoi_irq+0x5f/0xf0
[38142.742425] [<ffffffff81016226>] handle_irq+0x46/0xb0
[38142.742427] [<ffffffff815ed5da>] do_IRQ+0x5a/0xe0
[38142.742430] [<ffffffff815e2f2e>] common_interrupt+0x6e/0x6e
[38142.742431] <EOI> [<ffffffff81093fa9>] ? enqueue_hrtimer+0x39/0xc0
[38142.742435] [<ffffffff81310b2d>] ? intel_idle+0xed/0x150
[38142.742437] [<ffffffff81310b0f>] ? intel_idle+0xcf/0x150
[38142.742440] [<ffffffff81493671>] cpuidle_idle_call+0xc1/0x280
[38142.742441] [<ffffffff8101322a>] cpu_idle+0xca/0x120
[38142.742443] [<ffffffff815bffce>] rest_init+0x72/0x74
[38142.742446] [<ffffffff81aebbfe>] start_kernel+0x3ba/0x3c5
[38142.742448] [<ffffffff81aeb347>] x86_64_start_reservations+0x132/0x136
[38142.742450] [<ffffffff81aeb140>] ? early_idt_handlers+0x140/0x140
[38142.742452] [<ffffffff81aeb44d>] x86_64_start_kernel+0x102/0x111
[38142.742453] handlers:
[38142.742461] [<ffffffffa002a270>] irq_handler
[38142.742463] [<ffffffffa010c810>] azx_interrupt
[38142.742464] Disabling IRQ #17

           CPU0       CPU1       CPU2       CPU3
 17:     200003          0          0          0   IO-APIC-fasteoi   firewire_ohci, snd_hda_intel
happens during normal boot
takes about 30 seconds to clear
sometimes it's irq 16 and sometimes irq 18

Package: kernel
OS Release: Fedora release 16 (Verne)
Now getting disabling irq 16 messages all the time. Kernel 3.2.9-2.fc16.x86_64 on 2 machines
Maybe this patch has been applied: http://www.gossamer-threads.com/lists/linux/kernel/1466185?do=post_view_threaded#1466185

And we get messages every few minutes due to IRQ 16 being disabled, polled and reenabled:

[root@server:~] $ dmesg | grep "IRQ 16"
[ 3434.929999] Disabling IRQ 16
[ 3434.939447] Polling IRQ 16
[ 3435.939812] Reenabling IRQ 16
[ 3441.763261] Disabling IRQ 16
[ 3441.773108] Polling IRQ 16
[ 3442.773471] Reenabling IRQ 16
[ 3560.440243] Disabling IRQ 16
[ 3560.449707] Polling IRQ 16
[ 3561.449072] Reenabling IRQ 16
[ 3685.798417] Disabling IRQ 16

This is annoying, so I disabled emergency messages in /etc/rsyslog.conf:

#*.emerg *
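To put a number on how often the line is being disabled, the gaps between successive "Disabling IRQ" lines can be computed from the dmesg timestamps. A minimal Python sketch, assuming the message format shown in the excerpt above (with or without a `#` before the IRQ number); the function name is hypothetical.

```python
import re

# Matches e.g. "[ 3434.929999] Disabling IRQ 16" and "... Disabling IRQ #16".
_DISABLE = re.compile(r"\[\s*(\d+\.\d+)\]\s+Disabling IRQ #?(\d+)")


def disable_gaps(lines, irq):
    """Return seconds elapsed between consecutive 'Disabling IRQ <irq>' messages."""
    times = []
    for line in lines:
        match = _DISABLE.search(line)
        if match and int(match.group(2)) == irq:
            times.append(float(match.group(1)))
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]
```

Feeding it the output of `dmesg` split into lines gives one gap per repeat, which makes it easier to compare kernels than eyeballing the log.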
(In reply to comment #30)
> Maybe this patch has been applied:
> http://www.gossamer-threads.com/lists/linux/kernel/1466185?do=post_view_threaded#1466185
>

Yes.

> This is annoying, I disabled emergency messages from /etc/rsyslog.conf:
> #*.emerg *

I've made it less verbose in the 3.2.10-1 kernel in updates-testing.
Now I have kernel 3.2.9-2.fc16.x86_64 too and I see the same things in my log (polling/reenabling).

I would like to know if this is a serious problem. These log messages seem to be related to this patch, but is it the final solution? I have removed my wifi PCI card, but this irq problem appears anyway. Why?

I didn't find any recent news about this problem (http://www.kernelhub.org/?p=2&msg=14224). Does someone know if this is a bug in the chip or a bug in the kernel? In either case, do we know if a solution will be possible? I don't understand all of the very specific discussion on this topic...

Maybe I should change my motherboard? I bought it a few days ago, but I am not really enthusiastic about that :) I imagine it is not easy (Asus might not give all the required information) and maybe users could help with a little pressure ;)

Whatever happens, I would like to thank all kernel developers and maintainers for the great job. And sorry for my bad English (but I am French lol)
Could you try this kernel and see if it functions better for you: http://koji.fedoraproject.org/koji/buildinfo?buildID=307357
Tried. It's worse here. I am spammed with "Disabling IRQ 16" now, more than 2 times per minute.

Mar 15 20:58:57 pulsar kernel: [78682.553094] Disabling IRQ 16
Mar 15 21:02:02 pulsar kernel: [ 18.424895] Disabling lock debugging due to kernel taint
Mar 15 21:02:10 pulsar kernel: [ 30.024146] Disabling IRQ 16
Mar 15 21:02:32 pulsar kernel: [ 50.353518] Disabling IRQ 16
Mar 15 21:02:49 pulsar kernel: [ 67.787900] Disabling IRQ 16
Mar 15 21:03:09 pulsar kernel: [ 87.310420] Disabling IRQ 16
Mar 15 21:03:15 pulsar kernel: [ 93.539826] Disabling IRQ 16
Mar 15 21:03:56 pulsar kernel: [ 134.335987] Disabling IRQ 16
Mar 15 21:03:57 pulsar kernel: [ 135.681313] Disabling IRQ 16
Mar 15 21:03:59 pulsar kernel: [ 137.358162] Disabling IRQ 16
Mar 15 21:04:00 pulsar kernel: [ 138.432437] Disabling IRQ 16
Mar 15 21:04:01 pulsar kernel: [ 139.636273] Disabling IRQ 16
Mar 15 21:04:14 pulsar kernel: [ 152.793792] Disabling IRQ 16
Mar 15 21:04:17 pulsar kernel: [ 155.577896] Disabling IRQ 16
Mar 15 21:04:18 pulsar kernel: [ 156.706745] Disabling IRQ 16
Mar 15 21:04:20 pulsar kernel: [ 158.823092] Disabling IRQ 16
Mar 15 21:04:42 pulsar kernel: [ 181.102471] Disabling IRQ 16
Mar 15 21:04:45 pulsar kernel: [ 183.478238] Disabling IRQ 16
Mar 15 21:05:25 pulsar kernel: [ 223.968318] Disabling IRQ 16
Mar 15 21:06:16 pulsar kernel: [ 274.571748] Disabling IRQ 16
Mar 15 21:06:42 pulsar kernel: [ 300.462351] Disabling IRQ 16
Mar 15 21:06:56 pulsar kernel: [ 314.898798] Disabling IRQ 16
Mar 15 21:06:59 pulsar kernel: [ 317.150231] Disabling IRQ 16
Mar 15 21:07:07 pulsar kernel: [ 325.163127] Disabling IRQ 16
Mar 15 21:07:12 pulsar kernel: [ 330.431224] Disabling IRQ 16
Mar 15 21:07:50 pulsar kernel: [ 368.301938] Disabling IRQ 16
Mar 15 21:07:58 pulsar kernel: [ 376.880142] Disabling IRQ 16
Mar 15 21:10:00 pulsar kernel: [ 498.048971] Disabling IRQ 16
Mar 15 21:10:20 pulsar kernel: [ 518.163637] Disabling IRQ 16
Mar 15 21:10:30 pulsar kernel: [ 527.913334] Disabling IRQ 16
Mar 15 21:11:34 pulsar kernel: [ 592.695034] Disabling IRQ 16
Mar 15 21:12:02 pulsar kernel: [ 619.925098] Disabling IRQ 16
Mar 15 21:12:13 pulsar kernel: [ 631.051405] Disabling IRQ 16
Mar 15 21:12:17 pulsar kernel: [ 635.180600] Disabling IRQ 16
Mar 15 21:12:36 pulsar kernel: [ 654.045670] Disabling IRQ 16
Mar 15 21:12:57 pulsar kernel: [ 674.964864] Disabling IRQ 16
Mar 15 21:13:02 pulsar kernel: [ 680.530997] Disabling IRQ 16
Mar 15 21:13:18 pulsar kernel: [ 695.938079] Disabling IRQ 16
Mar 15 21:13:43 pulsar kernel: [ 721.401916] Disabling IRQ 16
Mar 15 21:13:48 pulsar kernel: [ 726.409863] Disabling IRQ 16
Mar 15 21:13:54 pulsar kernel: [ 732.522519] Disabling IRQ 16
Mar 15 21:14:07 pulsar kernel: [ 745.176289] Disabling IRQ 16
Mar 15 21:14:12 pulsar kernel: [ 750.312007] Disabling IRQ 16
Mar 15 21:14:15 pulsar kernel: [ 752.763942] Disabling IRQ 16
Mar 15 21:14:18 pulsar kernel: [ 755.924274] Disabling IRQ 16
Mar 15 21:14:20 pulsar kernel: [ 757.834736] Disabling IRQ 16
Mar 15 21:15:17 pulsar kernel: [ 815.277900] Disabling IRQ 16
Mar 15 21:15:22 pulsar kernel: [ 820.473100] Disabling IRQ 16
(In reply to comment #34)
> Tried.
> It's worst here. I am spammed with "Disabling IRQ 16" now, more than 2 times
> per minute.

Erm... that's confusing. The kernel I linked to shouldn't disable repeatedly like that... what does 'uname -a' say?

Anyway, there's a scratch build going here that has another revision: http://koji.fedoraproject.org/koji/taskinfo?taskID=3898618

Testing that would be appreciated.
I took it from the FC16 update testing ... uname -a: Linux pulsar 3.2.10-1.fc16.x86_64 #1 SMP Mon Mar 12 22:34:35 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Oops, OK, your link was for 3.2.10-2. I'll check...
Tested! Fewer messages. But it seems that my screen becomes laggy just before the "IRQ 16 might be stuck" message appears. Then it gets better... and then it lags again 2 or 3 minutes after... So I would say it was better with 3.2.9-2, but with more log messages.

Mar 15 22:10:29 pulsar kernel: [ 62.896171] IRQ 16 might be stuck. Polling
Mar 15 22:11:45 pulsar kernel: [ 138.405784] IRQ 16 might be stuck. Polling
Mar 15 22:13:17 pulsar kernel: [ 230.227593] IRQ 16 might be stuck. Polling
Mar 15 22:15:30 pulsar kernel: [ 363.701856] IRQ 16 might be stuck. Polling
Mar 15 22:16:41 pulsar kernel: [ 434.726120] IRQ 16 might be stuck. Polling
Mar 15 22:16:51 pulsar kernel: [ 444.816592] IRQ 16 might be stuck. Polling
Mar 15 22:17:02 pulsar kernel: [ 454.907226] IRQ 16 might be stuck. Polling
Mar 15 22:17:12 pulsar kernel: [ 464.997802] IRQ 16 might be stuck. Polling
Mar 15 22:19:49 pulsar kernel: [ 622.362209] IRQ 16 might be stuck. Polling
Mar 15 22:20:35 pulsar kernel: [ 667.997578] IRQ 16 might be stuck. Polling
(In reply to comment #38)
> Tested!
> Less messages. But it seems that my screen is becoming laggy just before the
> "Irq 16 might be stuck" message appears. Then it become better... And then lag
> again 2 or 3 minutes afer...

The lag is somewhat to be expected. The CPU is in polling mode, which means it's eating CPU while looking for interrupts. If you're willing to test again, I can have another scratch build tomorrow that tries to poll a bit quicker but I'm not sure how much that would improve the lag.

> So I would say It was better with 3.2.9-2 but more log messages.
>
> Mar 15 22:10:29 pulsar kernel: [ 62.896171] IRQ 16 might be stuck. Polling
> Mar 15 22:11:45 pulsar kernel: [ 138.405784] IRQ 16 might be stuck. Polling
> Mar 15 22:13:17 pulsar kernel: [ 230.227593] IRQ 16 might be stuck. Polling
> Mar 15 22:15:30 pulsar kernel: [ 363.701856] IRQ 16 might be stuck. Polling
> Mar 15 22:16:41 pulsar kernel: [ 434.726120] IRQ 16 might be stuck. Polling
> Mar 15 22:16:51 pulsar kernel: [ 444.816592] IRQ 16 might be stuck. Polling
> Mar 15 22:17:02 pulsar kernel: [ 454.907226] IRQ 16 might be stuck. Polling
> Mar 15 22:17:12 pulsar kernel: [ 464.997802] IRQ 16 might be stuck. Polling
> Mar 15 22:19:49 pulsar kernel: [ 622.362209] IRQ 16 might be stuck. Polling
> Mar 15 22:20:35 pulsar kernel: [ 667.997578] IRQ 16 might be stuck. Polling

Are these being printed to the console, or do you need to run 'dmesg' to see them? If they're still going to the console I'll reduce the severity again.
I'm looking at these messages in /var/log/messages (I used grep to show only the "stuck" message). I am about to try 3.2.10-2.2.
OK, I have been running 3.2.10-2.2 for 30 minutes with no IRQ messages at all so far! I didn't see any lag, so it's better than 3.2.10-1. I'll keep it running and check again tomorrow.
hi, no IRQ message since last reboot with 3.2.10-2.2 in the link you gave. uptime is 15h.
I have 3.2.10-3.fc16 available with yum in the updates repo. Is your last change included in it?
*** Bug 804725 has been marked as a duplicate of this bug. ***
(In reply to comment #43)
> I have 3.2.10-3.fc16 available with yum in the updates repo.
> Do we find your last change in it ?

No, not really. 3.2.10-3 has the patch dropped entirely as it was broken for a number of users. The next submitted update should have it included.
(In reply to comment #45)
> (In reply to comment #43)
> > I have 3.2.10-3.fc16 available with yum in the updates repo.
> > Do we find your last change in it ?
>
> No, not really. 3.2.10-3 has the patch dropped entirely as it was broken for a
> number of users. The next submitted update should have it included.

How much more broken can it get?? It seems that 3.2.10-3 also broke the Atheros NIC for me, since it works on install with 3.0.1.7 and is gone as soon as the update runs. On the bright side, the irq errors went away with 3.2.10-3.
[mass update] kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository. Please retest with this update.
I updated to kernel-3.3.0-4.fc16. All seems OK, I don't see any IRQ log messages... Do we need to keep the irqpoll option?
Any idea when this patch will go into mainline kernel source?
(In reply to comment #51) > Any idea when this patch will go into mainline kernel source? It probably won't be. At least not the current version of it. We're still deciding if it's worth carrying.
Are there any other approaches elsewhere to workaround the issues with the ASM108x bridge? From the postings on LKML it appears that Linus is interested in having a better workaround.
*** Bug 815119 has been marked as a duplicate of this bug. ***
If you can still reproduce this in 3.4, please reopen. We believe this should be fixed with the current updates.
*** Bug 839733 has been marked as a duplicate of this bug. ***
The problem is back in the latest update, 3.4.4-4. I updated yesterday, and today I see many messages: "IRQ 16 might be stuck. Polling"
The problem came back again, and the ping times from other internal networks change with each polling episode.

#######
[root@localhost:/var/log] $ tail -f messages
Jul 14 08:04:25 localhost kernel: [432091.477327] IRQ 16 might be stuck. Polling
Jul 14 08:04:36 localhost kernel: [432102.159299] IRQ 16 might be stuck. Polling
Jul 14 08:04:48 localhost kernel: [432114.489476] IRQ 16 might be stuck. Polling
Jul 14 08:05:06 localhost kernel: [432132.336522] IRQ 16 might be stuck. Polling
Jul 14 08:05:24 localhost kernel: [432150.356137] IRQ 16 might be stuck. Polling
Jul 14 08:05:35 localhost kernel: [432161.292643] IRQ 16 might be stuck. Polling
Jul 14 08:05:46 localhost kernel: [432172.527401] IRQ 16 might be stuck. Polling
###############
C:\Program Files\Support Tools>ping 192.168.0.110 -t

Pinging 192.168.0.110 with 32 bytes of data:

Reply from 192.168.0.110: bytes=32 time=23ms TTL=64
Reply from 192.168.0.110: bytes=32 time=30ms TTL=64
Reply from 192.168.0.110: bytes=32 time=32ms TTL=64
Reply from 192.168.0.110: bytes=32 time=32ms TTL=64
Reply from 192.168.0.110: bytes=32 time=32ms TTL=64
Reply from 192.168.0.110: bytes=32 time=32ms TTL=64
Reply from 192.168.0.110: bytes=32 time<1ms TTL=64
Reply from 192.168.0.110: bytes=32 time<1ms TTL=64
Reply from 192.168.0.110: bytes=32 time<1ms TTL=64
Reply from 192.168.0.110: bytes=32 time<1ms TTL=64
Reply from 192.168.0.110: bytes=32 time<1ms TTL=64
Reply from 192.168.0.110: bytes=32 time<1ms TTL=64
Reply from 192.168.0.110: bytes=32 time<1ms TTL=64
Reply from 192.168.0.110: bytes=32 time<1ms TTL=64
Reply from 192.168.0.110: bytes=32 time=59ms TTL=64
Reply from 192.168.0.110: bytes=32 time=59ms TTL=64
Reply from 192.168.0.110: bytes=32 time=59ms TTL=64
Reply from 192.168.0.110: bytes=32 time=59ms TTL=64
Reply from 192.168.0.110: bytes=32 time=59ms TTL=64
Reply from 192.168.0.110: bytes=32 time=59ms TTL=64
Reply from 192.168.0.110: bytes=32 time=59ms TTL=64
Reply from 192.168.0.110: bytes=32 time=59ms TTL=64
Reply from 192.168.0.110: bytes=32 time=59ms TTL=64
Reply from 192.168.0.110: bytes=32 time=59ms TTL=64
Reply from 192.168.0.110: bytes=32 time<1ms TTL=64
Reply from 192.168.0.110: bytes=32 time=3ms TTL=64
Reply from 192.168.0.110: bytes=32 time=3ms TTL=64
Reply from 192.168.0.110: bytes=32 time=4ms TTL=64
I think that particular case is about as good as it's going to get. The hardware doesn't behave correctly, and the vendor is uncooperative in telling us how to work around it, so this is the best we can do.
So what can we do ? We have to change our motherboard ? :(
I have the same problem. Last updated 5 minutes ago.

# uname -a
Linux linuxlocal 3.4.4-5.fc17.x86_64 #1 SMP Thu Jul 5 20:20:59 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.4.4-5.fc17.x86_64 root=UUID=1d687255-fffd-46d0-b774-f6852fd48c9e ro quiet rhgb nouveau.modeset=0 rd.driver.blacklist=nouveau SYSFONT=False LANG=ru_RU.UTF-8 KEYTABLE=ru irqpoll

# dmesg
[ 314.460745] IRQ 16 might be stuck. Polling
[ 795.077515] IRQ 16 might be stuck. Polling
[ 827.721361] IRQ 16 might be stuck. Polling
[ 852.817196] IRQ 16 might be stuck. Polling
[ 975.831648] IRQ 16 might be stuck. Polling
[ 1106.381880] IRQ 16 might be stuck. Polling
[ 1244.393381] IRQ 16 might be stuck. Polling
[ 1279.808228] IRQ 16 might be stuck. Polling
The new Asus P8Z77-V Pro seems to have the ASMedia ASM1083 bridge chip too... What can we do? No more PCI?
Folks, I reproduced this error, but inadvertently. I stuck a 32GB USB 2.0 SanDisk Cruzer flash drive into my desktop (Shuttle XPC XS58H7 PRO) and instantly received this IRQ #16 error on all my terminal windows.

# uname -a
Linux zurich.homelinux.org 3.6.7-4.fc16.x86_64 #1 SMP Tue Nov 20 20:33:31 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

# BOOT_IMAGE=/boot/vmlinuz-3.6.7-4.fc16.x86_64 root=/dev/mapper/vg_zurich-lv_root ro rd.lvm.lv=vg_zurich/lv_swap rd.md=0 rd.dm=0 KEYTABLE=us rd.lvm.lv=vg_zurich/lv_root quiet SYSFONT=latarcyrheb-sun16 rhgb rd.luks=0 LANG=en_US.UTF-8 pci=nomsi

# less /var/log/messages
Nov 29 23:34:15 zurich kernel: [50576.198354] irq 16: nobody cared (try booting with the "irqpoll" option)
Nov 29 23:34:15 zurich kernel: [50576.198358] Pid: 0, comm: swapper/0 Tainted: P O 3.6.7-4.fc16.x86_64 #1
Nov 29 23:34:15 zurich kernel: [50576.198359] Call Trace:
Nov 29 23:34:15 zurich kernel: [50576.198360] <IRQ> [<ffffffff810eb09d>] __report_bad_irq+0x3d/0xe0
Nov 29 23:34:15 zurich kernel: [50576.198368] [<ffffffff810eb355>] note_interrupt+0x165/0x220
Nov 29 23:34:15 zurich kernel: [50576.198370] [<ffffffff810e8b29>] handle_irq_event_percpu+0xa9/0x210
Nov 29 23:34:15 zurich kernel: [50576.198373] [<ffffffff8101baf9>] ? sched_clock+0x9/0x10
Nov 29 23:34:15 zurich kernel: [50576.198374] [<ffffffff810e8cd2>] handle_irq_event+0x42/0x70
Nov 29 23:34:15 zurich kernel: [50576.198376] [<ffffffff810ebed9>] handle_fasteoi_irq+0x59/0x100
Nov 29 23:34:15 zurich kernel: [50576.198379] [<ffffffff81016150>] handle_irq+0x60/0x150
Nov 29 23:34:15 zurich kernel: [50576.198382] [<ffffffff810656d4>] ? irq_enter+0x54/0x90
Nov 29 23:34:15 zurich kernel: [50576.198384] [<ffffffff816236ca>] do_IRQ+0x5a/0xe0
Nov 29 23:34:15 zurich kernel: [50576.198387] [<ffffffff81619eea>] common_interrupt+0x6a/0x6a
Nov 29 23:34:15 zurich kernel: [50576.198388] <EOI> [<ffffffff814c6ad9>] ? poll_idle+0x49/0x90
Nov 29 23:34:15 zurich kernel: [50576.198392] [<ffffffff814c6aac>] ? poll_idle+0x1c/0x90
Nov 29 23:34:15 zurich kernel: [50576.198394] [<ffffffff814c6669>] cpuidle_enter+0x19/0x20
Nov 29 23:34:15 zurich kernel: [50576.198396] [<ffffffff814c6cfc>] cpuidle_idle_call+0xac/0x290
Nov 29 23:34:15 zurich kernel: [50576.198398] [<ffffffff8101d74f>] cpu_idle+0xcf/0x120
Nov 29 23:34:15 zurich kernel: [50576.198400] [<ffffffff815f611e>] rest_init+0x72/0x74
Nov 29 23:34:15 zurich kernel: [50576.198404] [<ffffffff81cfcc31>] start_kernel+0x3c7/0x3d4
Nov 29 23:34:15 zurich kernel: [50576.198406] [<ffffffff81cfc66a>] ? repair_env_string+0x5a/0x5a
Nov 29 23:34:15 zurich kernel: [50576.198407] [<ffffffff81cfc356>] x86_64_start_reservations+0x131/0x135
Nov 29 23:34:15 zurich kernel: [50576.198409] [<ffffffff81cfc120>] ? early_idt_handlers+0x120/0x120
Nov 29 23:34:15 zurich kernel: [50576.198411] [<ffffffff81cfc45c>] x86_64_start_kernel+0x102/0x111

The interesting thing was that the nvidia driver happens to share the interrupt with a USB controller:

# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
  0:     345880          0          0          0          0          0          0          0   IO-APIC-edge      timer
  1:          2          0          0          0          0          0          0          0   IO-APIC-edge      i8042
  8:          1          0          0          0          0          0          0          0   IO-APIC-edge      rtc0
  9:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   acpi
 12:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   xhci_hcd:usb9
 16:      45593          0          0          0          0          0          0          0   IO-APIC-fasteoi   ahci, uhci_hcd:usb3, nvidia
 17:     546704          0          0          0          0          0          0          0   IO-APIC-fasteoi   snd_hda_intel, eth0
 18:          3          0          0          0          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb8, i801_smbus, eth1
 19:      23052          0          0          0          0          0          0          0   IO-APIC-fasteoi   ahci, uhci_hcd:usb5, uhci_hcd:usb7
 21:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
 22:        626          0          0          0          0          0          0          0   IO-APIC-fasteoi   snd_hda_intel
 23:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb6
NMI:        174        123        111        116         25         21         22         21   Non-maskable interrupts
LOC:     160158     194809     173356     174282      59352      54094      53428      48579   Local timer interrupts
SPU:          0          0          0          0          0          0          0          0   Spurious interrupts
PMI:        174        123        111        116         25         21         22         21   Performance monitoring interrupts
IWI:          0          0          0          0          0          0          0          0   IRQ work interrupts
RTR:          5          0          0          0          0          0          0          0   APIC ICR read retries
RES:     148485      68071      14284       5637       2939       1589       1274       1324   Rescheduling interrupts
CAL:      16316      13558       9978      10295       7119       6619       6255       4802   Function call interrupts
TLB:          0          0          0          0          0          0          0          0   TLB shootdowns
TRM:          0          0          0          0          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0          0          0          0          0   Threshold APIC interrupts
MCE:          0          0          0          0          0          0          0          0   Machine check exceptions
MCP:          7          7          7          7          7          7          7          7   Machine check polls
ERR:          0
MIS:          0

I had to reboot, as the only observable consequence of this event was that the mouse was limping.
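Several comments in this thread turn on which drivers share an interrupt line. A minimal Python sketch for pulling that out of /proc/interrupts, assuming the layout seen in this thread (a header row of CPU names, then one counter per CPU, a single-token controller name such as IO-APIC-fasteoi, and a comma-separated handler list). The function name is hypothetical, and other kernel versions may print the controller column differently.

```python
def shared_irqs(text: str) -> dict:
    """Map IRQ number to its handler names, for lines with more than one handler."""
    shared = {}
    lines = text.splitlines()
    if not lines:
        return shared
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue  # skip NMI, LOC, ERR and other named rows
        fields = rest.split()
        # Layout assumed: ncpu counters, one controller token, then handlers.
        names = " ".join(fields[ncpu + 1:])
        handlers = [name.strip() for name in names.split(",") if name.strip()]
        if len(handlers) > 1:
            shared[int(label)] = handlers
    return shared
```

Applied to the table above, IRQ 16 would come back with `ahci`, `uhci_hcd:usb3` and `nvidia`, which is the sharing the reporter noticed.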