Bug 800520

Summary: 2.6.42.9-2.fc15.x86_64: syslog messages about disabling/polling IRQ
Product: [Fedora] Fedora
Component: kernel
Version: 15
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Harald Reindl <h.reindl>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: bb, chgonzalezg, covex, fedora, gansalmon, itamar, jeff, jeremy, jonathan, kernel-maint, leon, madhu.chinakonda, michele.mazza, mike, mikewillcook, obelov, pfessel
Doc Type: Bug Fix
Last Closed: 2012-04-11 17:25:35 UTC
Attachments:
Description                                                          Flags
dmesg output for kernel 3.2.10-2.fc16.x86_64                         none
lsmod output for kernel 3.2.10-2.fc16.x86_64                         none
/proc/interrupts from AMD X6 1075T processor, M4A89TD mobo           none
proc/interrupts from HP Pavillon Dv5-2060BR laptop (M430 Core i5)    none
Comment                                                              none

Description Harald Reindl 2012-03-06 16:23:36 UTC
2.6.42.9-2 (from koji) seems to work fine overall, but on my home server the following messages appear constantly - the only difference from the workstation is a WLAN card; both machines are identical and even the setup is cloned via dd over ssh

03:00.0 Network controller: Atheros Communications Inc. AR5008 Wireless Network Adapter (rev 01)
_________________

Message from syslogd@srv-rhsoft at Mar  6 17:08:47 ...
 kernel:Disabling IRQ 16

Message from syslogd@srv-rhsoft at Mar  6 17:10:51 ...
 kernel:Disabling IRQ 16

Mar  6 16:41:32 srv-rhsoft kernel: Disabling IRQ 16
Mar  6 16:41:32 srv-rhsoft kernel: Polling IRQ 16
Mar  6 16:41:33 srv-rhsoft kernel: Reenabling IRQ 16
Mar  6 16:45:40 srv-rhsoft kernel: Disabling IRQ 16
Mar  6 16:45:40 srv-rhsoft kernel: Polling IRQ 16
Mar  6 16:45:41 srv-rhsoft kernel: Reenabling IRQ 16
Mar  6 17:08:47 srv-rhsoft kernel: Disabling IRQ 16
Mar  6 17:08:47 srv-rhsoft kernel: Polling IRQ 16
Mar  6 17:08:48 srv-rhsoft kernel: Reenabling IRQ 16
Mar  6 17:10:51 srv-rhsoft kernel: Disabling IRQ 16
Mar  6 17:10:51 srv-rhsoft kernel: Polling IRQ 16
Mar  6 17:10:52 srv-rhsoft kernel: Reenabling IRQ 16
Mar  6 17:13:59 srv-rhsoft kernel: Disabling IRQ 16
Mar  6 17:13:59 srv-rhsoft kernel: Polling IRQ 16
Mar  6 17:14:00 srv-rhsoft kernel: Reenabling IRQ 16
Mar  6 17:16:27 srv-rhsoft kernel: Disabling IRQ 16
Mar  6 17:16:27 srv-rhsoft kernel: Polling IRQ 16
Mar  6 17:16:28 srv-rhsoft kernel: Reenabling IRQ 16
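
To quantify how often the disable/poll/re-enable cycle in the log above occurs, the syslog lines can be tallied per IRQ. The following is only a minimal sketch: the function name and the regular expression are assumptions for illustration, and the pattern is written to match just the message formats quoted in this report (with or without a bracketed kernel timestamp).

```python
import re
from collections import Counter

# Matches the kernel messages quoted in this report, e.g.
#   "Mar  6 16:41:32 srv-rhsoft kernel: Disabling IRQ 16"
#   " kernel:Disabling IRQ 16"
#   "... kernel: [ 1463.224055] Polling IRQ 11"
IRQ_EVENT = re.compile(
    r"kernel:\s*(?:\[\s*\d+\.\d+\]\s*)?(Disabling|Polling|Reenabling) IRQ (\d+)"
)


def count_irq_events(lines):
    """Count (event, irq) pairs in an iterable of syslog lines."""
    counts = Counter()
    for line in lines:
        match = IRQ_EVENT.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


sample = [
    "Mar  6 16:41:32 srv-rhsoft kernel: Disabling IRQ 16",
    "Mar  6 16:41:32 srv-rhsoft kernel: Polling IRQ 16",
    "Mar  6 16:41:33 srv-rhsoft kernel: Reenabling IRQ 16",
    "Mar  6 16:45:40 srv-rhsoft kernel: Disabling IRQ 16",
]
print(count_irq_events(sample)[("Disabling", 16)])
```

Feeding it the lines of /var/log/messages instead of the sample list would give the totals for a whole day.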

Comment 1 michele mazza 2012-03-08 19:03:32 UTC
same issue on 2.6.42.9-1
IRQ 16 is shared between mmc0 and nvidia.

01:00.0 VGA compatible controller: nVidia Corporation GT218 [NVS 3100M] (rev a2)
0d:00.0 SD Host controller: Ricoh Co Ltd MMC/SD Host Controller (rev 01)

no obvious malfunction visible.

Comment 2 Josh Boyer 2012-03-12 13:56:37 UTC
*** Bug 802131 has been marked as a duplicate of this bug. ***

Comment 3 Josh Boyer 2012-03-12 13:58:35 UTC
*** Bug 802147 has been marked as a duplicate of this bug. ***

Comment 4 Paulo Fessel 2012-03-12 15:16:10 UTC
I have tried adding the "irqpoll" option when booting, after researching these messages on the internet. On my system, this normally messes with the graphics driver and I have to cold-boot my machine through the reset button.

Comment 5 Josh Boyer 2012-03-12 19:01:58 UTC
The verbose messages should be gone with the next kernel update.

Comment 6 Harald Reindl 2012-03-14 15:02:01 UTC
> The verbose messages should be gone with the next kernel update

no, they are still there, and since these are "wall messages" they are spat into EVERY open terminal (even on remote machines) all day long

[root@srv-rhsoft:~]$ 
Message from syslogd@srv-rhsoft at Mar 14 15:37:55 ...
 kernel:Disabling IRQ 16

Message from syslogd@srv-rhsoft at Mar 14 15:38:54 ...
 kernel:Disabling IRQ 16

Message from syslogd@srv-rhsoft at Mar 14 15:44:32 ...
 kernel:Disabling IRQ 16

Message from syslogd@srv-rhsoft at Mar 14 15:45:11 ...
 kernel:Disabling IRQ 16

[root@srv-rhsoft:~]$ 
Message from syslogd@srv-rhsoft at Mar 14 15:51:17 ...
 kernel:Disabling IRQ 16

Message from syslogd@srv-rhsoft at Mar 14 15:55:56 ...
 kernel:Disabling IRQ 16

Message from syslogd@srv-rhsoft at Mar 14 15:58:42 ...
 kernel:Disabling IRQ 16
uname -r
3.2.10-1.fc16.x86_64 #1 SMP Mon Mar 12 22:34:35 UTC 2012

Comment 7 Josh Boyer 2012-03-14 15:47:24 UTC
(In reply to comment #6)
> Message from syslogd@srv-rhsoft at Mar 14 15:58:42 ...
>  kernel:Disabling IRQ 16
> uname -r
> 3.2.10-1.fc16.x86_64 #1 SMP Mon Mar 12 22:34:35 UTC 2012

Indeed.  I missed one spot.

Comment 8 Henrique Martins 2012-03-14 18:00:15 UTC
> The verbose messages should be gone with the next kernel update.

Are the messages going to be gone, or is the underlying cause going to be gone?  I.e. is the machine going to go through the enabling/probing/disabling cycle and not report it, or is the cycle not happening?

Comment 9 Josh Boyer 2012-03-14 18:36:22 UTC
(In reply to comment #8)
> > The verbose messages should be gone with the next kernel update.
> 
> Are the messages going to be gone, or is the underlying cause be gone?  I.e. is
> the machine going to go through the enabling/probing/disabling cycle and not
> report it or is the cycle not happening?

With the kernel in updates-testing, it will still cycle.  I'm looking at changing the patch to only do this on machines that have a known broken PCI bridge instead.

Comment 10 Adam Pribyl 2012-03-14 18:37:02 UTC
In my case the affected interrupt is shared between ahci and a TV tuner in one of the PCI slots. When the tuner is running, there is a flood of messages about enabling and disabling the interrupt. I noticed it because my RAID1 broke under heavy load. I'm not sure whether this may be the cause, however.

Comment 11 Henrique Martins 2012-03-14 19:05:18 UTC
> In my case the interrupt in charge is shared between 
> ahci and tv tuner in one of the PCI slots.

Didn't check what is being shared on my machine (is this info in dmesg?) but on the only machine I have where it fails, I do have a Hauppauge TV capture card ...

Comment 12 Henrique Martins 2012-03-14 19:12:00 UTC
I'm back to the kernel that doesn't report this, but if IRQ assignments are the same and listing /proc/irq/<irq number> is the way to find this out, in my case I see under IRQ18 ehci_hcd:usb1, ivtv0, uhci_hcd:usb5, uhci_hcd:usb8 with ivtv0 being the TV card.

I think I have my wireless mouse on one of the shared USBs above, as on one reboot the mouse didn't work (it did on others)

Comment 13 Michael Cronenworth 2012-03-15 05:10:40 UTC
(In reply to comment #9)
> With the kernel in updates-testing, it will still cycle.  I'm looking at
> changing the patch to only do this on machines that have a known broken PCI
> bridge instead.

Josh, you should push upstream to add MSI to ath9k. ;)

Comment 14 Harald Reindl 2012-03-15 09:45:18 UTC
here is some additional info from my machine that is spitting out the messages
all is working fine AFAIK

_____________________________

[root@srv-rhsoft:~]$ dmesg | grep IRQ
Disabling IRQ 16
Disabling IRQ 16

[root@srv-rhsoft:~]$ ls /proc/irq/16/
total 0
dr-xr-xr-x 2 root root 0 2012-03-15 10:43 ath9k
dr-xr-xr-x 2 root root 0 2012-03-15 10:43 ehci_hcd:usb1
-r-------- 1 root root 0 2012-03-15 10:43 affinity_hint
-r--r--r-- 1 root root 0 2012-03-15 10:43 node
-rw------- 1 root root 0 2012-03-15 10:43 smp_affinity
-rw------- 1 root root 0 2012-03-15 10:43 smp_affinity_list
-r--r--r-- 1 root root 0 2012-03-15 10:43 spurious

[root@srv-rhsoft:~]$ lspci
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
00:1a.0 USB Controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b4)
00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 (rev b4)
00:1c.6 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 7 (rev b4)
00:1c.7 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 8 (rev b4)
00:1d.0 USB Controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a4)
00:1f.0 ISA bridge: Intel Corporation Q67 Express Chipset Family LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller (rev 04)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 04)
01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
02:00.0 USB Controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 03)
03:00.0 Network controller: Atheros Communications Inc. AR5008 Wireless Network Adapter (rev 01)
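
As the `ls /proc/irq/16/` output above shows, each handler registered on an IRQ appears as a subdirectory (here `ath9k` and `ehci_hcd:usb1`), which answers the earlier question of how to see what shares a line. A small sketch of reading that programmatically follows; the function name and the `proc_irq_root` parameter are assumptions added for illustration and testing, not part of any kernel interface.

```python
import os


def irq_handler_names(irq, proc_irq_root="/proc/irq"):
    """Return the handler names registered on an IRQ.

    Relies on the layout seen in the listing above: every registered
    handler is a subdirectory of /proc/irq/<irq>/, next to plain files
    such as 'spurious' and 'smp_affinity'.
    """
    irq_dir = os.path.join(proc_irq_root, str(irq))
    with os.scandir(irq_dir) as entries:
        return sorted(entry.name for entry in entries if entry.is_dir())


if __name__ == "__main__":
    # Only meaningful on a Linux machine that actually has IRQ 16.
    if os.path.isdir("/proc/irq/16"):
        print(irq_handler_names(16))
```

More than one name in the result means the line is shared, as it is between ath9k and the USB controller on this machine.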

Comment 15 Adam Pribyl 2012-03-15 10:02:12 UTC
I'd like to know whether this is just an annoying message or a sign of a problem.
Which machines have a broken PCI bridge? This is an AMD E-350.

# cat /proc/interrupts
           CPU0       CPU1
  0:        301          0    XT-PIC-XT-PIC    timer
  1:       2209          0    XT-PIC-XT-PIC    i8042
  2:          0          0    XT-PIC-XT-PIC    cascade
  7:          4          0    XT-PIC-XT-PIC    ehci_hcd:usb1, ehci_hcd:usb2, ehci_hcd:usb3
  8:          1          0    XT-PIC-XT-PIC    rtc0
  9:          0          0    XT-PIC-XT-PIC    acpi
 10:   12605057          0    XT-PIC-XT-PIC    ahci, saa7133[1], saa7133[1]
 11:    7872905          0    XT-PIC-XT-PIC    ohci_hcd:usb4, ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, saa7133[0], saa7133[0]
 16:    5324158         34   PCI-MSI-edge      eth0
 17:     265053         35   PCI-MSI-edge      eth1
 18:          0         30   PCI-MSI-edge      snd_hda_intel
NMI:       2134        586   Non-maskable interrupts
LOC:   25627274    8537321   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:       2134        586   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RES:     623049   13131851   Rescheduling interrupts
CAL:        354        551   Function call interrupts
TLB:      13856      12735   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:        186        186   Machine check polls
ERR:          0
MIS:          0

And this is what I see with 2.6.42.9 when saa7133 is running:
Mar 14 14:47:24 server kernel: [ 1463.224055] Polling IRQ 11
Mar 14 14:47:25 server kernel: [ 1464.224056] Reenabling IRQ 11
Mar 14 14:47:31 server kernel: [ 1470.194167] Disabling IRQ 10
Mar 14 14:47:31 server kernel: [ 1470.203221] Polling IRQ 10
Mar 14 14:47:32 server kernel: [ 1471.203277] Reenabling IRQ 10
Mar 14 14:47:34 server kernel: [ 1473.621244] Disabling IRQ 11
Mar 14 14:47:34 server kernel: [ 1473.630306] Polling IRQ 11
Mar 14 14:47:35 server kernel: [ 1474.630054] Reenabling IRQ 11
Mar 14 14:47:38 server kernel: [ 1478.012060] Disabling IRQ 11

Comment 16 Oleg Belov 2012-03-15 11:13:56 UTC
I just had a similar problem on every movement of the USB mouse. Only uhci_hcd:usb3 was connected to IRQ 16.

After switching the mouse to another USB controller, the system works OK.

Machine on ASUS P5K P35/ICH9 chipset Core Quad, 
3.2.9-2.fc16.x86_64 #1 SMP .. x86_64 x86_64 x86_64 GNU/Linux

Comment 17 Josh Boyer 2012-03-15 11:25:37 UTC
(In reply to comment #13)
> (In reply to comment #9)
> > With the kernel in updates-testing, it will still cycle.  I'm looking at
> > changing the patch to only do this on machines that have a known broken PCI
> > bridge instead.
> 
> Josh, you should push upstream to add MSI to ath9k. ;)

That really has nothing to do with this bug.

Comment 18 Josh Boyer 2012-03-15 11:26:27 UTC
Could you try this kernel and see if it functions better for you:

http://koji.fedoraproject.org/koji/buildinfo?buildID=307357

Comment 19 Harald Reindl 2012-03-15 11:47:40 UTC
not really

[root@srv-rhsoft:~]$ 
Message from syslogd@srv-rhsoft at Mar 15 12:43:45 ...
 kernel:Disabling IRQ 16

[root@srv-rhsoft:~]$ uname -r
3.2.10-2.fc16.x86_64 #1 SMP Thu Mar 15 01:39:16 UTC 2012
_________________________________

[root@srv-rhsoft:~]$ ls /proc/irq/16/
total 0
dr-xr-xr-x 2 root root 0 2012-03-15 12:47 ath9k
dr-xr-xr-x 2 root root 0 2012-03-15 12:47 ehci_hcd:usb1
-r-------- 1 root root 0 2012-03-15 12:47 affinity_hint
-r--r--r-- 1 root root 0 2012-03-15 12:47 node
-rw------- 1 root root 0 2012-03-15 12:47 smp_affinity
-rw------- 1 root root 0 2012-03-15 12:47 smp_affinity_list
-r--r--r-- 1 root root 0 2012-03-15 12:47 spurious

Comment 20 Harald Reindl 2012-03-15 11:49:27 UTC
this one even has a stack trace, although wlan0 seems to work

[root@srv-rhsoft:~]$ dmesg -c
scsi_verify_blk_ioctl: 158 callbacks suppressed
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
scsi_verify_blk_ioctl: 6 callbacks suppressed
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
irq 16: nobody cared (try booting with the "irqpoll" option)
Pid: 0, comm: swapper/1 Tainted: G           O 3.2.10-2.fc16.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff810e11fd>] __report_bad_irq+0x3d/0xe0
 [<ffffffff810e14c5>] note_interrupt+0x175/0x240
 [<ffffffff810deba9>] handle_irq_event_percpu+0xa9/0x220
 [<ffffffff810ded64>] handle_irq_event+0x44/0x70
 [<ffffffff810e1f4f>] handle_fasteoi_irq+0x5f/0xf0
 [<ffffffff81016226>] handle_irq+0x46/0xb0
 [<ffffffff815ef51a>] do_IRQ+0x5a/0xe0
 [<ffffffff815e4e6e>] common_interrupt+0x6e/0x6e
 <EOI>  [<ffffffff81094189>] ? enqueue_hrtimer+0x39/0xc0
 [<ffffffff813125ed>] ? intel_idle+0xed/0x150
 [<ffffffff813125cf>] ? intel_idle+0xcf/0x150
 [<ffffffff814952d1>] cpuidle_idle_call+0xc1/0x280
 [<ffffffff8101322a>] cpu_idle+0xca/0x120
 [<ffffffff815d30b0>] start_secondary+0x260/0x262
handlers:
[<ffffffff81406600>] usb_hcd_irq
[<ffffffffa033f8a0>] ath_isr
Disabling IRQ 16

Comment 21 Josh Boyer 2012-03-15 13:04:45 UTC
(In reply to comment #20)
> this one has even a stack-trace while wlan0 seems to work
> 
> irq 16: nobody cared (try booting with the "irqpoll" option)
> Pid: 0, comm: swapper/1 Tainted: G           O 3.2.10-2.fc16.x86_64 #1
> Call Trace:
>  <IRQ>  [<ffffffff810e11fd>] __report_bad_irq+0x3d/0xe0
>  [<ffffffff810e14c5>] note_interrupt+0x175/0x240
>  [<ffffffff810deba9>] handle_irq_event_percpu+0xa9/0x220
>  [<ffffffff810ded64>] handle_irq_event+0x44/0x70
>  [<ffffffff810e1f4f>] handle_fasteoi_irq+0x5f/0xf0
>  [<ffffffff81016226>] handle_irq+0x46/0xb0
>  [<ffffffff815ef51a>] do_IRQ+0x5a/0xe0
>  [<ffffffff815e4e6e>] common_interrupt+0x6e/0x6e
>  <EOI>  [<ffffffff81094189>] ? enqueue_hrtimer+0x39/0xc0
>  [<ffffffff813125ed>] ? intel_idle+0xed/0x150
>  [<ffffffff813125cf>] ? intel_idle+0xcf/0x150
>  [<ffffffff814952d1>] cpuidle_idle_call+0xc1/0x280
>  [<ffffffff8101322a>] cpu_idle+0xca/0x120
>  [<ffffffff815d30b0>] start_secondary+0x260/0x262
> handlers:
> [<ffffffff81406600>] usb_hcd_irq
> [<ffffffffa033f8a0>] ath_isr
> Disabling IRQ 16

It has a stack trace because the kernel actually marked the interrupt as bad permanently.

Can you post the full dmesg and lsmod output for this machine?
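
When triaging reports like the one quoted above, the two useful facts are the IRQ number and the handlers registered on it. The sketch below pulls both out of a "nobody cared" report. The function name is hypothetical, and the parsing assumes the exact dmesg format shown in this comment (a `handlers:` line followed by `[<address>] symbol` lines); other kernel versions print handlers differently.

```python
import re

NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
HANDLER = re.compile(r"^\[<[0-9a-f]+>\]\s+(\S+)$")


def parse_bad_irq_report(dmesg_lines):
    """Return (irq, [handler names]) for the last report found, else None."""
    irq = None
    handlers = []
    in_handlers = False
    for raw in dmesg_lines:
        line = raw.strip()
        match = NOBODY_CARED.search(line)
        if match:
            # Start of a new report: reset the collected state.
            irq = int(match.group(1))
            handlers = []
            in_handlers = False
            continue
        if irq is None:
            continue
        if line == "handlers:":
            in_handlers = True
            continue
        if in_handlers:
            handler = HANDLER.match(line)
            if handler:
                handlers.append(handler.group(1))
            else:
                in_handlers = False  # end of the handler list
    if irq is None:
        return None
    return irq, handlers


report = [
    'irq 16: nobody cared (try booting with the "irqpoll" option)',
    "Call Trace:",
    " <IRQ>  [<ffffffff810e11fd>] __report_bad_irq+0x3d/0xe0",
    "handlers:",
    "[<ffffffff81406600>] usb_hcd_irq",
    "[<ffffffffa033f8a0>] ath_isr",
    "Disabling IRQ 16",
]
print(parse_bad_irq_report(report))
```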

Comment 22 Harald Reindl 2012-03-15 13:21:21 UTC
currently only "lsmod", because i did "dmesg -c" after the reboot
and /var/log/dmesg has not been touched for a long time
-rw-r--r--  1 root     root        57K 2011-06-18 13:05 dmesg

as said, i have an identical machine here as a workstation, and both are even dd-over-ssh clones; the only difference is the wlan card, and both machines have 100% identical packages, updates and running software all the time
_____________________

Module                  Size  Used by
vmnet                  55557  30
vsock                  52239  0
vmci                   80809  4 vsock
vmmon                  80511  32
tun                    22768  2
bridge                 90784  0
stp                    12823  1 bridge
fuse                   77591  3
llc                    14090  2 bridge,stp
nf_nat_ftp             12770  0
nf_conntrack_ftp       14484  1 nf_nat_ftp
ipt_LOG                12993  3
xt_limit               12711  3
xt_recent              18474  12
xt_state               12578  1134
xt_DSCP                12629  12
coretemp               13401  0
iptable_mangle         12695  1
ipt_MASQUERADE         12880  3
xt_multiport           12798  1122
iptable_nat            13383  1
nf_nat                 25322  3 nf_nat_ftp,ipt_MASQUERADE,iptable_nat
nf_conntrack_ipv4      14622  1137 iptable_nat,nf_nat
nf_conntrack           82331  7 nf_nat_ftp,nf_conntrack_ftp,xt_state,ipt_MASQUERADE,iptable_nat,nf_nat,nf_conntrack_ipv4
nf_defrag_ipv4         12673  1 nf_conntrack_ipv4
snd_hda_codec_hdmi     36277  1
snd_hda_codec_realtek   231982  1
arc4                   12529  2
snd_usb_audio         125353  1
snd_usbmidi_lib        24763  1 snd_usb_audio
snd_hda_intel          33276  3
snd_hda_codec         114652  3 snd_hda_codec_hdmi,snd_hda_codec_realtek,snd_hda_intel
snd_rawmidi            29530  1 snd_usbmidi_lib
ath9k                  95034  0
mac80211              439421  1 ath9k
snd_hwdep              17611  2 snd_usb_audio,snd_hda_codec
snd_seq                64807  0
ath9k_common           13600  1 ath9k
ath9k_hw              374492  2 ath9k,ath9k_common
ath                    23089  3 ath9k,ath9k_common,ath9k_hw
cfg80211              194802  3 ath9k,mac80211,ath
snd_seq_device         14129  2 snd_rawmidi,snd_seq
hp_wmi                 18048  0
uvcvideo               71509  0
snd_pcm                97170  5 snd_hda_codec_hdmi,snd_usb_audio,snd_hda_intel,snd_hda_codec
videodev               97776  1 uvcvideo
snd_timer              28815  2 snd_seq,snd_pcm
media                  20408  2 uvcvideo,videodev
e1000e                198082  0
snd                    74425  19 snd_hda_codec_hdmi,snd_hda_codec_realtek,snd_usb_audio,snd_usbmidi_lib,snd_hda_intel,snd_hda_codec,snd_rawmidi,snd_hwdep,snd_seq,snd_seq_device,snd_pcm,snd_timer
serio_raw              13371  0
i2c_i801               17765  0
v4l2_compat_ioctl32    16726  1 videodev
snd_page_alloc         18101  2 snd_hda_intel,snd_pcm
iTCO_wdt               17948  0
sparse_keymap          13526  1 hp_wmi
rfkill                 21410  2 cfg80211,hp_wmi
iTCO_vendor_support    13419  1 iTCO_wdt
soundcore              14484  1 snd
microcode              23240  0
raid1                  35373  1
raid10                 35261  2
wmi                    18697  1 hp_wmi
usb_storage            52101  0
i915                  452930  2
drm_kms_helper         40141  1 i915
drm                   226004  3 i915,drm_kms_helper
i2c_algo_bit           13156  1 i915
i2c_core               37955  6 videodev,i2c_i801,i915,drm_kms_helper,drm,i2c_algo_bit
video                  18932  1 i915

Comment 23 Josh Boyer 2012-03-15 13:37:12 UTC
(In reply to comment #22)
> currently only "lsmod" because did "dmesg -c" after reboot
> and /var/log/dmesg is not touched since a long time
> -rw-r--r--  1 root     root        57K 2011-06-18 13:05 dmesg

The kernel output should be stored in /var/log/messages

> as said, i have here an identical machine as workstation and both are even
> dd-over-ssh-clones while the only difference is the wlan-card and both machines
> have 100% the identical packages, updates and running software all the time
> _____________________
> 
> Module                  Size  Used by
> vmnet                  55557  30
> vsock                  52239  0
> vmci                   80809  4 vsock
> vmmon                  80511  32

Please try this without the vmware modules loaded.

Comment 24 Harald Reindl 2012-03-15 13:47:27 UTC
"/var/log/messages" is one of the things i usually keep empty and watch in a filewatch applet on the desktop, which is one reason why the messages are so annoying (including the mdadm ones) - i will provide the boot log later this afternoon after rebooting the machine

they have all been happening only since the last kernel updates, while for months there was not a single one of them

running without the vmware modules loaded is not an option, the machine is primarily a vmware host

Comment 25 Henrique Martins 2012-03-15 14:45:40 UTC
Kernel 3.2.10-2.fc16.x86_64 doesn't work for me in the sense that my mouse becomes erratic and useless.  The mouse is on the shared IRQ 18, which has these devices on it: ehci_hcd:usb1, ivtv0, uhci_hcd:usb5, uhci_hcd:usb8.

I'll attach the output of dmesg and lsmod for this machine.

Comment 26 Henrique Martins 2012-03-15 14:47:48 UTC
Created attachment 570315 [details]
dmesg output for kernel 3.2.10-2.fc16.x86_64

Comment 27 Henrique Martins 2012-03-15 14:48:31 UTC
Created attachment 570316 [details]
lsmod output for kernel 3.2.10-2.fc16.x86_64

Comment 28 Josh Boyer 2012-03-15 16:05:03 UTC
OK, neither Henrique nor Harald have machines where the quirk is even activated.  The patch I've been working on shouldn't even be in play at this point.  What was the last kernel that worked without reporting a bad IRQ?

Comment 29 Adam Pribyl 2012-03-15 16:10:35 UTC
For me this is 2.6.42.7-1.fc15.i686.PAE - this is what I am using now instead of .9.

Comment 30 Harald Reindl 2012-03-15 16:21:58 UTC
this happens AFAIK since the 3.2.9 / 2.6.42.9 kernels

Comment 31 Paulo Fessel 2012-03-15 16:30:18 UTC
On my AMD machine this has been happening since my update to kernel-3.2.9-2.fc16. I'm booting it right now on my laptop (Intel based) and will report if it shows the same issue.

Comment 32 Henrique Martins 2012-03-15 16:32:59 UTC
3.2.9-1.fc16.x86_64 works for my machine

Comment 33 Paulo Fessel 2012-03-15 16:42:13 UTC
My HP intel-based laptop (Pavillon DV5-2060BR) seems to have no issues with 3.2.9-2.fc16.x86_64. Locked the screen a few times and got no messages either on the screen or log file. Suspending and reactivating the laptop also works without problems or messages.

Comment 34 Paulo Fessel 2012-03-15 16:49:55 UTC
Created attachment 570352 [details]
/proc/interrupts from AMD X6 1075T processor, M4A89TD mobo

Comment 35 Paulo Fessel 2012-03-15 16:51:19 UTC
Created attachment 570353 [details]
proc/interrupts from HP Pavillon Dv5-2060BR laptop (M430 Core i5)

Comment 36 Henrique Martins 2012-03-15 16:56:03 UTC
I have 3.2.9-2.fc16.x86_64 running on a Lenovo ThinkPad T420, and an old Dell Optiplex GX520 and 3.2.9-2.fc16.x86_64 on an IBM ThinkPad T43, all with no problems whatsoever.  Only my (no brand) Gigabit Intel based MB home machine has this problem and somehow I think (but am not sure) it is due to my Hauppauge video capture card ...

Comment 37 Josh Boyer 2012-03-15 16:59:35 UTC
(In reply to comment #36)
> I have 3.2.9-2.fc16.x86_64 running on a Lenovo ThinkPad T420, and an old Dell
> Optiplex GX520 and 3.2.9-2.fc16.x86_64 on a IBM ThinkPad T43, all with no
> problems whatsoever.  Only my (no brand) Gigabit Intel based MB home machine
> has this problem and somehow I think (but not sure) it is due to my Hauppauge
> video capture card ...

I'd agree with that conclusion.  I have no idea why it's happening though.

Comment 38 Josh Boyer 2012-03-15 17:02:20 UTC
(In reply to comment #32)
> 3.2.9-1.fc16.x86_64 works for my machine

Harald, Adam, could you both try 3.2.9-1 and see if you have reports of bad IRQs?  That does not have the polling IRQ patch applied at all.  It will at least eliminate that as the cause of the issues you've been seeing.

Honestly, 3.2.10-2 should have already eliminated it as the cause since it only switches to the polling mode if a specific PCI bridge is found.  However, it would be good to know if the 3.2.9 kernel itself introduces issues for you as opposed to 3.2.7/8.

Comment 39 michele mazza 2012-03-15 18:15:11 UTC
2.6.42.7-1.fc15.x86_64 is my last working kernel.
running on a Lenovo ThinkPad T510i

Comment 40 Harald Reindl 2012-03-15 18:49:41 UTC
3.2.9-1.fc16.x86_64  is not affected

> Honestly, 3.2.10-2 should have already eliminated it as the 
> cause since it only switches to the polling mode if a 
> specific PCI bridge is found.

no, 3.2.10-2 is the worst kernel-build ever!

when i came home i had to realize that after https://bugzilla.redhat.com/show_bug.cgi?id=800520#c20 (i installed the kernel remotely from the office) all rear USB ports and the WLAN were dead, and it was really hard to suspend all my running virtual machines and reboot via mobile/wan/ssh

the previous builds did not affect the functioning of USB/WLAN, they only filled my logs and terminals with messages all day long

Comment 41 Josh Boyer 2012-03-15 19:32:10 UTC
(In reply to comment #40)
> 3.2.9-1.fc16.x86_64  is not affected

Wonderful.  Thank you.

> > Honestly, 3.2.10-2 should have already eliminated it as the 
> > cause since it only switches to the polling mode if a 
> > specific PCI bridge is found.
> 
> no, 3.2.10-2 is the worst kernel-build ever!
> 

I found another case where if you only have a handful of ignored interrupts it'll mark it bad very quickly even on the non-broken machines.  It used to be much more tolerant.  I believe I've fixed that in this scratch build, and I would appreciate it if you'd test when it completes:

http://koji.fedoraproject.org/koji/taskinfo?taskID=3898618

Henrique, that might also explain the issues you saw on the machine with the video capture card.  Your testing would also be appreciated.

> as i came home i had to realize that after
> https://bugzilla.redhat.com/show_bug.cgi?id=800520#c20 (i installed the kernel
> remote from the office) all rear-USB ports and WLAN was dead and it was really
> hard to handle suspend all my running virtual machines and reboot via
> mobile/wan/ssh

Like I said, it permanently disabled the IRQ.  If that was shared, which it usually is, nothing using that IRQ will work.

> the previous builds did not affect working of USB/WLAN, they only filled my
> logs and terminals with messages the whole day

That's because it fell back to polling mode instead of just disabling the IRQ.
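
The point about tolerance can be illustrated with a toy model of spurious-interrupt accounting. This is a simplified sketch and not the Fedora patch discussed here: the class and its parameters are hypothetical, and the default values (a window of 100000 interrupts, with the line marked bad only if more than 99900 of them went unhandled) are an assumption based on how the upstream detector is commonly described. The stricter settings in the usage lines are invented purely to show how a small budget trips after a handful of ignored interrupts.

```python
class SpuriousIrqModel:
    """Toy model: disable an IRQ if too many interrupts in a window go unhandled."""

    def __init__(self, window=100_000, max_unhandled=99_900):
        # Assumed defaults, see the note above; not taken from this bug.
        self.window = window
        self.max_unhandled = max_unhandled
        self.irq_count = 0
        self.irqs_unhandled = 0
        self.disabled = False

    def note_interrupt(self, handled):
        """Record one interrupt; 'handled' is False if no handler claimed it."""
        if self.disabled:
            return
        if not handled:
            self.irqs_unhandled += 1
        self.irq_count += 1
        if self.irq_count < self.window:
            return
        # End of the window: decide, then start counting afresh.
        if self.irqs_unhandled > self.max_unhandled:
            self.disabled = True
        self.irq_count = 0
        self.irqs_unhandled = 0


tolerant = SpuriousIrqModel()
strict = SpuriousIrqModel(window=10, max_unhandled=5)  # hypothetical values
for _ in range(10):
    tolerant.note_interrupt(handled=False)
    strict.note_interrupt(handled=False)
print(tolerant.disabled, strict.disabled)
```

With a shared line, disabling it takes down every device on it, which is why an overly strict budget is so disruptive.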

Comment 42 Henrique Martins 2012-03-15 19:40:57 UTC
> Henrique, that might also explain the issues you saw on the machine with the
> video capture card.  Your testing would also be appreciated.

Later tonight, or tomorrow morning (USA California time).  This is my main home machine and I'm not going to try this remotely (which I could.)

Comment 43 Harald Reindl 2012-03-15 21:23:47 UTC
Created attachment 915425 [details]
Comment

(This comment was longer than 65,535 characters and has been moved to an attachment by Red Hat Bugzilla).

Comment 44 Josh Boyer 2012-03-15 23:42:28 UTC
(In reply to comment #43)
> 3.2.10-2.2.fc16.x86_64:
> 
> * no more IRQ messages
> * WLAN works fine as AP
> * keyboard works
> * mouse works

Good.  I think I have all the kinks worked out for systems that shouldn't need this patch.

> ___________________________________-
> 
> if we sooner or later get rid of the ioctl-messages i would be happy
> (triggered by /sbin/mdadm --detail /dev/mdX)
> scsi_verify_blk_ioctl: 6 callbacks suppressed
> mdadm: sending ioctl 1261 to a partition!
> mdadm: sending ioctl 1261 to a partition!
> mdadm: sending ioctl 1261 to a partition!
> mdadm: sending ioctl 1261 to a partition!
> mdadm: sending ioctl 1261 to a partition!
> mdadm: sending ioctl 1261 to a partition!
> ___________________________________-

Those are informational only.  They should be removed at some point in the not distant future.

> however, this is the dmesg-output of the working kernel

Great.  It showed that the quirk is indeed not active on your system.  I'll get the fixed up patch committed soon.

Comment 45 Henrique Martins 2012-03-16 05:54:02 UTC
3.2.10-2.2.fc16.x86_64 seems to work fine on my machine. No wall or syslogd messages.  Mouse and capture card work.  I'll leave the system on this kernel and will report back if something funny happens.  Can post dmesg output if needed.

Comment 46 Harald Reindl 2012-03-16 10:48:17 UTC
after some hours i can confirm again: works well

but a question about the kernel-3.2.10-3.fc16 build:
is the hint just missing from the changelog, or was there an overlap so that
this build does not contain the fixes from the scratch build?

i do not like rebooting and downgrading too often :-)


http://koji.fedoraproject.org/koji/buildinfo?buildID=307526
kernel-3.2.10-3.fc16
* Thu Mar 15 2012 Justin M. Forbes <jforbes> - 3.2.10-3
- CVE-2012-1179 fix pmd_bad() triggering in code paths holding mmap_sem read mode (rhbz 803809)
* Wed Mar 14 2012 Josh Boyer <jwboyer>
- Fixup irqpoll patch to only activate on machines with ASM108x PCI bridge

Comment 47 Josh Boyer 2012-03-16 11:00:11 UTC
(In reply to comment #46)
> after some hours i can confirm again: works well
> 
> but a question to the kernel-3.2.10-3.fc16 build:
> is there only no hint in the changelog or was there an overlap and
> this build does not contain the fixes from the scratch-build?

It doesn't contain the fix in the -2.2 kernel.  Instead, it temporarily drops the patch entirely.  It should work well for your machine.  The next kernel update should have the fixed up patch and I'll be sure to note it in the changelog.

Comment 48 Harald Reindl 2012-03-16 12:40:21 UTC
i can confirm 3.2.10-3.fc16.x86_64 is working fine
in general i am surprised that most of my hardware seems to share IRQ 16
see the last dmesg grep below

could this be one reason why USB3 is buggy like hell on both of my machines? (meaning: while copying large files i get all sorts of errors like drive gone away, drive gone read-only and so on, while the same device on the front_USB2 works without any trouble)

i already noticed this in F14 when the machine was new, and wondered a little, since AFAIK linux was the first OS to support USB3 at all

[root@srv-rhsoft:~]$ dmesg | grep -i irq
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] ACPI: IRQ0 used by override.
[    0.000000] ACPI: IRQ2 used by override.
[    0.000000] ACPI: IRQ9 used by override.
[    0.000000] nr_irqs_gsi: 40
NR_IRQS:16640 nr_irqs:744 16
Enabled IRQ remapping in x2apic mode
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs *3 4 5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *0
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 *5 6 7 10 11 12 14 15)
PCI: Using ACPI for IRQ routing
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0, 0, 0, 0, 0
pnp 00:03: [irq 1]
pnp 00:04: [irq 12]
pnp 00:06: [irq 8]
pnp 00:09: [irq 13]
pnp 00:0a: [irq 4]
pci 0000:00:01.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:1c.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
pci 0000:00:1c.4: PCI INT A -> GSI 17 (level, low) -> IRQ 17
pci 0000:00:1c.6: PCI INT C -> GSI 18 (level, low) -> IRQ 18
pci 0000:00:1c.7: PCI INT D -> GSI 19 (level, low) -> IRQ 19
pci 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:1d.0: PCI INT A -> GSI 23 (level, low) -> IRQ 23
pci 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pcieport 0000:00:01.0: irq 42 for MSI/MSI-X
pcieport 0000:00:1c.0: irq 43 for MSI/MSI-X
pcieport 0000:00:1c.4: irq 44 for MSI/MSI-X
pcieport 0000:00:1c.6: irq 45 for MSI/MSI-X
pcieport 0000:00:1c.7: irq 46 for MSI/MSI-X
Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:0a: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ahci 0000:00:1f.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19
ahci 0000:00:1f.2: irq 47 for MSI/MSI-X
ata1: SATA max UDMA/133 abar m2048@0xfe725000 port 0xfe725100 irq 47
ata2: SATA max UDMA/133 abar m2048@0xfe725000 port 0xfe725180 irq 47
ata3: SATA max UDMA/133 abar m2048@0xfe725000 port 0xfe725200 irq 47
ata4: SATA max UDMA/133 abar m2048@0xfe725000 port 0xfe725280 irq 47
ata5: SATA max UDMA/133 abar m2048@0xfe725000 port 0xfe725300 irq 47
ehci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
ehci_hcd 0000:00:1a.0: irq 16, io mem 0xfe727000
ehci_hcd 0000:00:1d.0: PCI INT A -> GSI 23 (level, low) -> IRQ 23
ehci_hcd 0000:00:1d.0: irq 23, io mem 0xfe726000
xhci_hcd 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
xhci_hcd 0000:02:00.0: irq 16, io mem 0xfe500000
xhci_hcd 0000:02:00.0: irq 48 for MSI/MSI-X
xhci_hcd 0000:02:00.0: irq 49 for MSI/MSI-X
xhci_hcd 0000:02:00.0: irq 50 for MSI/MSI-X
xhci_hcd 0000:02:00.0: irq 51 for MSI/MSI-X
xhci_hcd 0000:02:00.0: irq 52 for MSI/MSI-X
xhci_hcd 0000:02:00.0: irq 53 for MSI/MSI-X
xhci_hcd 0000:02:00.0: irq 54 for MSI/MSI-X
xhci_hcd 0000:02:00.0: irq 55 for MSI/MSI-X
i8042: PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
rtc0: alarms up to one month, y3k, 114 bytes nvram, hpet irqs
i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
i915 0000:00:02.0: irq 56 for MSI/MSI-X
e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
e1000e 0000:00:19.0: irq 57 for MSI/MSI-X
ath9k 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
e1000e 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
e1000e 0000:01:00.0: irq 58 for MSI/MSI-X
e1000e 0000:01:00.0: irq 59 for MSI/MSI-X
e1000e 0000:01:00.0: irq 60 for MSI/MSI-X
ieee80211 phy0: Atheros AR5418 MAC/BB Rev:2 AR2133 RF Rev:81 mem=0xffffc90012920000, irq=16
i801_smbus 0000:00:1f.3: PCI INT C -> GSI 18 (level, low) -> IRQ 18
snd_hda_intel 0000:00:1b.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22
snd_hda_intel 0000:00:1b.0: irq 61 for MSI/MSI-X
e1000e 0000:00:19.0: irq 57 for MSI/MSI-X
e1000e 0000:00:19.0: irq 57 for MSI/MSI-X


[root@srv-rhsoft:~]$ dmesg | grep -i irq | grep 16
NR_IRQS:16640 nr_irqs:744 16
pci 0000:00:01.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:0a: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ehci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
ehci_hcd 0000:00:1a.0: irq 16, io mem 0xfe727000
xhci_hcd 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
xhci_hcd 0000:02:00.0: irq 16, io mem 0xfe500000
i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
ath9k 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
e1000e 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
ieee80211 phy0: Atheros AR5418 MAC/BB Rev:2 AR2133 RF Rev:81 mem=0xffffc90012920000, irq=16
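
Note that the plain `grep 16` above also matches unrelated lines such as `NR_IRQS:16640` and the `16550` serial driver, because it matches those digits anywhere in the line. A minimal sketch of grouping devices by IRQ instead, assuming dmesg lines of the `PCI INT x -> GSI n (...) -> IRQ n` form shown above. The function name `devices_by_irq` and the sample list are hypothetical, for illustration only:

```python
import re
from collections import defaultdict

# Matches routing lines such as:
#   ehci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
# An optional leading "[    0.000000]" dmesg timestamp is tolerated.
ROUTE_RE = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s*)?"
    r"(?P<driver>\S+) "
    r"(?P<addr>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"PCI INT (?P<pin>[A-D]) -> GSI \d+ \([^)]*\) -> IRQ (?P<irq>\d+)$"
)


def devices_by_irq(lines):
    """Map IRQ number -> {PCI address: [driver names that logged the route]}."""
    groups = defaultdict(dict)
    for line in lines:
        m = ROUTE_RE.match(line.strip())
        if not m:
            # Skips NR_IRQS, serial 16550, MSI/MSI-X and other lines.
            continue
        drivers = groups[int(m["irq"])].setdefault(m["addr"], [])
        if m["driver"] not in drivers:
            drivers.append(m["driver"])
    return {irq: dict(devs) for irq, devs in groups.items()}


if __name__ == "__main__":
    # Sample lines copied from the dmesg output above.
    sample = [
        "NR_IRQS:16640 nr_irqs:744 16",
        "pci 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16",
        "ehci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16",
        "serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A",
        "ath9k 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16",
    ]
    for addr, drivers in devices_by_irq(sample).get(16, {}).items():
        print(addr, ", ".join(drivers))
```

This only looks at the legacy INTx routing lines. Devices that later switch to MSI/MSI-X (the `irq NN for MSI/MSI-X` lines) no longer use the shared line, so the grouping shows which devices were routed to IRQ 16, not necessarily which ones still interrupt on it.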

Comment 49 Adam Pribyl 2012-03-16 13:00:14 UTC
(In reply to comment #41)

Are you going to do a fixed build for fc15 too? I'll be glad to test it. I am not sure if I can install the f16 kernel into f15, therefore I'd rather not.

Comment 50 Paulo Fessel 2012-03-16 13:19:15 UTC
I can confirm that 3.2.10-3.fc16.x86_64 has solved my problem with the IRQ messages. NVidia and Azalia drivers work ok, and I had no issues starting VirtualBox. Thanks!

Comment 51 Josh Boyer 2012-03-16 13:48:05 UTC
(In reply to comment #49)
> (In reply to comment #41)
> 
> Are you going to do a fixed build for fc15 too? I'll be glad to test it. I am
> not sure if I can install the f16 kernel into f15, therefore I'd rather not.

The latest update submitted for F15 updates-testing should have the patch temporarily dropped too.  I'll commit the fixed patch across all branches soon.

Comment 52 Adam Pribyl 2012-03-17 09:08:38 UTC
(In reply to comment #51)
> (In reply to comment #49)
> > (In reply to comment #41)
> > 
> > Are you going to do a fixed build for fc15 too? I'll be glad to test it. I am
> > not sure if I can install the f16 kernel into f15, therefore I'd rather not.
> 
> The latest update submitted for F15 updates-testing should have the patch
> temporarily dropped too.  I'll commit the fixed patch across all branches soon.

I tried 2.6.42.10-1 from testing and it is still repeatedly printing "Disabling IRQ" messages. I am not sure if this is what you meant.
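
To compare kernels, the messages can be counted per IRQ rather than judged by eye. A minimal sketch, assuming syslog lines in the format quoted in the description of this bug (`... kernel: Disabling IRQ 16`). The function name `count_irq_events` is hypothetical, and tolerating an optional `#` before the number is an assumption about other kernels' wording:

```python
import re
from collections import Counter

# Matches the three message kinds quoted in this bug, e.g.
#   Mar  6 16:41:32 srv-rhsoft kernel: Disabling IRQ 16
# The optional "#" before the number is an assumption, not taken from this report.
EVENT_RE = re.compile(r"kernel:\s*(Disabling|Polling|Reenabling) IRQ #?(\d+)")


def count_irq_events(lines):
    """Count (event, irq) pairs found in an iterable of syslog lines."""
    counts = Counter()
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            counts[(m.group(1), int(m.group(2)))] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the description of this bug.
    sample = [
        "Mar  6 16:41:32 srv-rhsoft kernel: Disabling IRQ 16",
        "Mar  6 16:41:32 srv-rhsoft kernel: Polling IRQ 16",
        "Mar  6 16:41:33 srv-rhsoft kernel: Reenabling IRQ 16",
        "Mar  6 16:45:40 srv-rhsoft kernel: Disabling IRQ 16",
    ]
    for (event, irq), n in sorted(count_irq_events(sample).items()):
        print(f"{event} IRQ {irq}: {n}")
```

The same function can be fed an open log file, since iterating over a file object yields lines. Which file holds the kernel messages depends on the syslog configuration.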

Comment 53 Josh Boyer 2012-04-11 17:25:35 UTC
This should be fixed as far as I know.

Comment 54 Harald Reindl 2012-04-11 17:34:06 UTC
confirmed on all my hardware with the last releases including 3.3.1

Comment 55 michele mazza 2012-04-12 08:15:13 UTC
I moved to fc16 in the meantime, and the problem is gone in 3.3.0-8 and 3.3.1-3