Description of problem:
Interrupts stop working.

Version-Release number of selected component (if applicable):
2.6.27-0.347.rc7.git1.fc10.i686, but the problem was introduced in recent older builds, from 337 onward.

How reproducible:
100%

Steps to Reproduce:
1. Various actions can cause the kernel to lose interrupts. One is to run glxgears with the radeon driver and watch the kernel disable interrupts for all devices on IRQ 9.

Actual results:
System becomes unstable, network stops, ACPI switches to polling mode.

Expected results:
No IRQs are disabled.

Additional info:
Hardware:
Version: ThinkPad T42
String 1: IBM ThinkPad Embedded Controller -[1RHT71WW-3.04 ]-

Usual /proc/interrupts:
           CPU0
  0:    1077489    XT-PIC-XT        timer
  1:       5975    XT-PIC-XT        i8042
  2:          0    XT-PIC-XT        cascade
  3:          1    XT-PIC-XT        ehci_hcd:usb1
  4:         18    XT-PIC-XT        serial
  5:     106052    XT-PIC-XT        yenta, Intel 82801DB-ICH4, Intel 82801DB-ICH4 Modem
  6:          3    XT-PIC-XT
  7:          0    XT-PIC-XT        parport0
  8:          1    XT-PIC-XT        rtc0
  9:       7634    XT-PIC-XT        acpi, uhci_hcd:usb2, yenta, eth0
 10:          1    XT-PIC-XT        uhci_hcd:usb4
 11:          1    XT-PIC-XT        uhci_hcd:usb3
 12:     209481    XT-PIC-XT        i8042
 14:      29888    XT-PIC-XT        ata_piix
 15:      14941    XT-PIC-XT        ata_piix
NMI:          0   Non-maskable interrupts
LOC:          0   Local timer interrupts
RES:          0   Rescheduling interrupts
CAL:          0   function call interrupts
TLB:          0   TLB shootdowns
TRM:          0   Thermal event interrupts
SPU:          0   Spurious interrupts
ERR:          0
MIS:          0

Oops output:
[ 290.197429] irq 9: nobody cared (try booting with the "irqpoll" option)
[ 290.197444] Pid: 2252, comm: X Not tainted 2.6.27-0.347.rc7.git1.fc10.i686 #1
[ 290.197455] [<c046d9ce>] __report_bad_irq+0x33/0x74
[ 290.197477] [<c046dbde>] note_interrupt+0x1cf/0x221
[ 290.197482] [<c044b395>] ? trace_hardirqs_off+0xb/0xd
[ 290.197489] [<c046d0f7>] ? handle_IRQ_event+0x4c/0x54
[ 290.197494] [<c046e286>] handle_level_irq+0x85/0xb9
[ 290.197499] [<c046e201>] ? handle_level_irq+0x0/0xb9
[ 290.197505] [<c0406fa5>] do_IRQ+0x9f/0xc9
[ 290.197513] [<c04056f8>] common_interrupt+0x28/0x30
[ 290.197519] [<c044c0dc>] ? trace_hardirqs_on+0xb/0xd
[ 290.197525] [<c044007b>] ? process_timer_rebalance+0xa7/0x171
[ 290.197533] [<c04326a8>] ? __do_softirq+0x6b/0x10f
[ 290.197538] [<c043263d>] ? __do_softirq+0x0/0x10f
[ 290.197543] [<c040704b>] do_softirq+0x7c/0xdd
[ 290.197548] [<c046e201>] ? handle_level_irq+0x0/0xb9
[ 290.197553] [<c0432300>] irq_exit+0x49/0x88
[ 290.197559] [<c0406fb9>] do_IRQ+0xb3/0xc9
[ 290.197564] [<c04056f8>] common_interrupt+0x28/0x30
[ 290.197568] =======================
[ 290.197570] handlers:
[ 290.197573] [<c055c6b4>] (acpi_irq+0x0/0x28)
[ 290.197579] [<c05ef141>] (usb_hcd_irq+0x0/0xa8)
[ 290.197584] [<f894126c>] (yenta_interrupt+0x0/0xc3 [yenta_socket])
[ 290.197608] [<f899a01b>] (e1000_intr+0x0/0x13f [e1000])
[ 290.197623] Disabling IRQ #9
[ 290.489747] INFO: trying to register non-static key.
[ 290.489755] the code is fine but needs lockdep annotation.
[ 290.489757] turning off the locking correctness validator.
[ 290.489763] Pid: 3446, comm: glxgears Not tainted 2.6.27-0.347.rc7.git1.fc10.i686 #1
[ 290.489768] [<c06e2cf4>] ? printk+0x14/0x18
[ 290.489781] [<c044aecd>] register_lock_class+0x5a/0x285
[ 290.489788] [<c044c780>] __lock_acquire+0x97/0xae6
[ 290.489792] [<c0645c3f>] ? sock_aio_read+0xc7/0xd5
[ 290.489800] [<c044d22a>] lock_acquire+0x5b/0x81
[ 290.489804] [<c043faae>] ? add_wait_queue+0x17/0x35
[ 290.489812] [<c06e5248>] _spin_lock_irqsave+0x3f/0x6f
[ 290.489817] [<c043faae>] ? add_wait_queue+0x17/0x35
[ 290.489822] [<c043faae>] add_wait_queue+0x17/0x35
[ 290.489830] [<f88e79c3>] radeon_irq_wait+0x8f/0x102 [radeon]
[ 290.489854] [<c0427d09>] ? default_wake_function+0x0/0xd
[ 290.489866] [<f88736cd>] drm_ioctl+0x1bb/0x230 [drm]
[ 290.489894] [<f88e7934>] ? radeon_irq_wait+0x0/0x102 [radeon]
[ 290.489917] [<c04a8605>] vfs_ioctl+0x55/0x6e
[ 290.489923] [<c04a886d>] do_vfs_ioctl+0x24f/0x262
[ 290.489927] [<c06e3614>] ? _cond_resched+0x8/0x32
[ 290.489933] [<c04a88c5>] sys_ioctl+0x45/0x60
[ 290.489938] [<c0404d02>] syscall_call+0x7/0xb
[ 290.489944] [<c06e007b>] ? native_cpu_up+0x49d/0x6e0
[ 290.489949] =======================
[ 290.489966] BUG: unable to handle kernel NULL pointer dereference at 00000004
[ 290.489970] IP: [<c053946e>] __list_add+0xa/0x5c
[ 290.489977] *pde = 23d18067 *pte = 00000000
[ 290.489993] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 290.490005] Modules linked in: vboxdrv autofs4 fuse sunrpc ipt_REJECT nf_conntrack_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath ppdev snd_intel8x0 snd_intel8x0m video snd_ac97_codec snd_seq_dummy output ac97_bus snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss e1000 snd_pcm iTCO_wdt i2c_i801 iTCO_vendor_support yenta_socket rsrc_nonstatic parport_pc snd_timer pcspkr snd parport soundcore joydev snd_page_alloc pata_acpi ata_generic sha256_generic cbc aes_i586 dm_crypt dm_snapshot dm_zero dm_mirror dm_log radeon drm i2c_algo_bit i2c_core [last unloaded: microcode]
[ 290.490017]
[ 290.490017] Pid: 3446, comm: glxgears Not tainted (2.6.27-0.347.rc7.git1.fc10.i686 #1)
[ 290.490017] EIP: 0060:[<c053946e>] EFLAGS: 00210082 CPU: 0
[ 290.490017] EIP is at __list_add+0xa/0x5c
[ 290.490017] EAX: e3c44f10 EBX: e3c44f04 ECX: 00000000 EDX: f6a7a108
[ 290.490017] ESI: f6a7a108 EDI: e3c44f10 EBP: e3c44ee4 ESP: e3c44ed8
[ 290.490017] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 290.490017] Process glxgears (pid: 3446, ti=e3c44000 task=dd4a14b0 task.ti=e3c44000)
[ 290.490017] Stack: e3c44f04 f6a7a0e8 00200246 e3c44ef8 c043fabe f6a7a000 f6a7a0e8 ffffe691
[ 290.490017]        e3c44f24 f88e79c3 00000002 00000000 dd4a14b0 c0427d09 00000000 00000000
[ 290.490017]        fffffff4 f64127b8 40046457 e3c44f48 f88736cd ea3b5000 f6a78000 f88e7934
[ 290.490017] Call Trace:
[ 290.490017] [<c043fabe>] ? add_wait_queue+0x27/0x35
[ 290.490017] [<f88e79c3>] ? radeon_irq_wait+0x8f/0x102 [radeon]
[ 290.490017] [<c0427d09>] ? default_wake_function+0x0/0xd
[ 290.490017] [<f88736cd>] ? drm_ioctl+0x1bb/0x230 [drm]
[ 290.490017] [<f88e7934>] ? radeon_irq_wait+0x0/0x102 [radeon]
[ 290.490017] [<c04a8605>] ? vfs_ioctl+0x55/0x6e
[ 290.490017] [<c04a886d>] ? do_vfs_ioctl+0x24f/0x262
[ 290.490017] [<c06e3614>] ? _cond_resched+0x8/0x32
[ 290.490017] [<c04a88c5>] ? sys_ioctl+0x45/0x60
[ 290.490017] [<c0404d02>] ? syscall_call+0x7/0xb
[ 290.490017] [<c06e007b>] ? native_cpu_up+0x49d/0x6e0
[ 290.490017] =======================
[ 290.490017] Code: ef ff 83 c4 14 8b 13 8b 43 04 89 42 04 89 10 c7 43 04 00 02 20 00 c7 03 00 01 10 00 8b 5d fc c9 c3 55 89 e5 57 89 c7 56 89 d6 53 <8b> 41 04 89 cb 39 d0 74 17 51 50 52 68 29 54 7c c0 6a 1a 68 de
[ 290.490017] EIP: [<c053946e>] __list_add+0xa/0x5c SS:ESP 0068:e3c44ed8
[ 290.490017] ---[ end trace 58bb180f76f64b48 ]---
[ 296.303067] ACPI: EC: missing confirmations, switch off interrupt mode.
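Since the report hinges on which devices share an interrupt line, it can help to pull the shared lines out of /proc/interrupts mechanically. The following is a minimal Python sketch, not part of the original report: the function name `shared_irqs` and the `ncpu` parameter are hypothetical, and it assumes the single-CPU XT-PIC layout quoted above (one count column, then the chip name, then a comma-separated handler list).

```python
# Sketch: list IRQ lines that have more than one registered handler.
# SAMPLE is an excerpt of the /proc/interrupts output quoted in the report.
SAMPLE = """\
           CPU0
  0:    1077489    XT-PIC-XT        timer
  5:     106052    XT-PIC-XT        yenta, Intel 82801DB-ICH4, Intel 82801DB-ICH4 Modem
  6:          3    XT-PIC-XT
  9:       7634    XT-PIC-XT        acpi, uhci_hcd:usb2, yenta, eth0
 12:     209481    XT-PIC-XT        i8042
NMI:          0   Non-maskable interrupts
"""


def shared_irqs(text, ncpu=1):
    """Return {irq: [handler, ...]} for numeric IRQ rows with 2+ handlers.

    ncpu is the number of per-CPU count columns (assumed 1 here).
    """
    shared = {}
    for line in text.splitlines():
        head, sep, rest = line.partition(":")
        if not sep or not head.strip().isdigit():
            continue  # header row, or a summary row such as NMI/LOC/ERR
        # Split off the per-CPU counts and the chip name; the remainder
        # of the row is the handler list.
        fields = rest.split(None, ncpu + 1)
        if len(fields) < ncpu + 2:
            continue  # no handler registered on this line
        handlers = [name.strip() for name in fields[-1].split(",")]
        if len(handlers) > 1:
            shared[int(head)] = handlers
    return shared


if __name__ == "__main__":
    for irq, handlers in sorted(shared_irqs(SAMPLE).items()):
        print(f"IRQ {irq}: {', '.join(handlers)}")
```

On a live system the same function could be fed the contents of /proc/interrupts, with `ncpu` set to the number of CPU columns in the header row.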
Same problem here on a ThinkPad T40 with kernel-2.6.27-0.354.rc7.git3.fc10.i686. The problem occurs when gdm starts; the driver is radeon. With this laptop's default BIOS IRQ pin assignment, all interrupts are shared on IRQ 11.
Created attachment 318107
dmesg log

dmesg until X starts. The smolt UID for this laptop is pub_8be150b9-f49c-4c5e-968e-41db78fd37e4
This seems to be related to the PCMCIA yenta socket device. At least blacklisting it stops the interrupt from being shut off.
Could this be related to this upstream commit?
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=de85422b94ddb23c021126815ea49414047c13dc;hp=2542335ccf34cfb442d3fd842d7e78ca5e649951
In my case the problem really does occur when X starts. After shuffling the interrupt assignments a bit in the BIOS (config/pci), the affected line is now IRQ 10, which has fewer interrupt handlers, and it is still disabled when X starts:

[drm] writeback test succeeded in 2 usecs
irq 10: nobody cared (try booting with the "irqpoll" option)
Pid: 2660, comm: Xorg Not tainted 2.6.27-0.372.rc8.fc10.i686 #1
 [<c046db32>] __report_bad_irq+0x33/0x74
 [<c046dd42>] note_interrupt+0x1cf/0x221
 [<c044b465>] ? trace_hardirqs_off+0xb/0xd
 [<c046d25b>] ? handle_IRQ_event+0x4c/0x54
 [<c046e3ea>] handle_level_irq+0x85/0xb9
 [<c046e365>] ? handle_level_irq+0x0/0xb9
 [<c0406fad>] do_IRQ+0x9f/0xc9
 [<c0405700>] common_interrupt+0x28/0x30
 [<c044c1ac>] ? trace_hardirqs_on+0xb/0xd
 [<c044007b>] ? process_timer_rebalance+0x1b/0x171
 [<c043270c>] ? __do_softirq+0x6b/0x10f
 [<c04326a1>] ? __do_softirq+0x0/0x10f
 [<c0407053>] do_softirq+0x7c/0xdd
 [<c0432364>] irq_exit+0x49/0x88
 [<c0415861>] smp_apic_timer_interrupt+0x73/0x81
 [<c0405805>] apic_timer_interrupt+0x2d/0x34
 [<c044c1ac>] ? trace_hardirqs_on+0xb/0xd
 [<c044007b>] ? process_timer_rebalance+0x1b/0x171
 [<c049a9df>] ? kfree+0xf2/0x102
 [<f8875700>] ? drm_ioctl+0x1ee/0x230 [drm]
 [<f8875700>] drm_ioctl+0x1ee/0x230 [drm]
 [<f88e2a39>] ? radeon_cp_setparam+0x0/0x193 [radeon]
 [<c04a8769>] vfs_ioctl+0x55/0x6e
 [<c04a89d1>] do_vfs_ioctl+0x24f/0x262
 [<c050a8a0>] ? selinux_file_ioctl+0x3a/0x3d
 [<c04a8a29>] sys_ioctl+0x45/0x60
 [<c0404d0a>] syscall_call+0x7/0xb
 =======================
handlers:
[<c05ee62f>] (usb_hcd_irq+0x0/0xa8)
[<f895801b>] (e1000_intr+0x0/0x13f [e1000])
Disabling IRQ #10

and in /var/log/Xorg.0.log:

(II) RADEON(0): [drm] failure adding irq handler, there is a device already using that irq
[drm] falling back to irq-free operation
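For readers unfamiliar with the "nobody cared" message, here is a toy Python model of how a shared interrupt line ends up disabled. It is only a sketch under stated assumptions: the window of 100,000 interrupts and the limit of 99,900 unhandled ones are my recollection of note_interrupt() in kernel/irq/spurious.c for kernels of this era, the time-based reset of the unhandled counter is left out, and the class and helper names are hypothetical. It does not claim to describe the actual radeon bug; it only illustrates why a device that raises interrupts without a registered handler on the line would take every other device on that line down with it.

```python
# Toy model of the kernel's spurious-interrupt accounting (assumed values).
WINDOW = 100_000           # interrupts per accounting window (assumption)
UNHANDLED_LIMIT = 99_900   # more unhandled than this disables the line


class SharedIrqLine:
    """One shared interrupt line with a list of registered handlers."""

    def __init__(self, handlers):
        self.handlers = list(handlers)
        self.irq_count = 0
        self.irqs_unhandled = 0
        self.disabled = False

    def raise_irq(self, source):
        """Deliver one interrupt raised by the device named `source`."""
        if self.disabled:
            return
        handled = False
        for handler in self.handlers:
            # Every handler on a shared line is called; each reports
            # whether the interrupt came from its own device.
            handled = handler(source) or handled
        if not handled:
            self.irqs_unhandled += 1
        self.irq_count += 1
        if self.irq_count < WINDOW:
            return
        # End of window: decide whether the line is stuck, then reset.
        self.irq_count = 0
        if self.irqs_unhandled > UNHANDLED_LIMIT:
            self.disabled = True  # "irq N: nobody cared", "Disabling IRQ #N"
        self.irqs_unhandled = 0


def make_handler(device):
    """Hypothetical handler that claims interrupts only from `device`."""
    return lambda source: source == device


# Handlers as listed in the IRQ 10 report; radeon has none registered.
line = SharedIrqLine([make_handler("usb"), make_handler("e1000")])
for _ in range(WINDOW):
    line.raise_irq("radeon")
print("line disabled:", line.disabled)
```

In this model, interrupts from "e1000" or "usb" are always claimed and never count against the line, so only the unclaimed source can push it over the limit.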
I can reproduce it without yenta loaded, but with radeon drm:

[ 68.385970] agpgart-intel 0000:00:00.0: AGP 2.0 bridge
[ 68.386038] agpgart-intel 0000:00:00.0: putting AGP V2 device into 4x mode
[ 68.386104] pci 0000:01:00.0: putting AGP V2 device into 4x mode
[ 70.171002] [drm] Setting GART location based on new memory map
[ 70.171054] [drm] Loading R300 Microcode
[ 70.171118] [drm] Num pipes: 1
[ 70.171131] [drm] writeback test succeeded in 1 usecs
[ 181.729237] irq 9: nobody cared (try booting with the "irqpoll" option)
[ 181.729262] Pid: 2164, comm: X Not tainted 2.6.27-0.377.rc8.git1.fc10.i686 #1
[ 181.729280] [<c0466a26>] __report_bad_irq+0x33/0x74
[ 181.729309] [<c0466c3a>] note_interrupt+0x1d3/0x225
[ 181.729322] [<c0466151>] ? handle_IRQ_event+0x61/0x69
[ 181.729337] [<c04672f6>] handle_level_irq+0x8d/0xc3
[ 181.729349] [<c0467269>] ? handle_level_irq+0x0/0xc3
[ 181.729363] [<c0406fc1>] do_IRQ+0x9f/0xc9
[ 181.729377] [<c0405710>] common_interrupt+0x28/0x30
[ 181.729393] [<c0431e60>] ? __do_softirq+0x6b/0x10f
[ 181.729411] [<c0431df5>] ? __do_softirq+0x0/0x10f
[ 181.729424] [<c0407067>] do_softirq+0x7c/0xdd
[ 181.729436] [<c0467269>] ? handle_level_irq+0x0/0xc3
[ 181.729449] [<c0431ab8>] irq_exit+0x49/0x88
[ 181.729460] [<c0406fd5>] do_IRQ+0xb3/0xc9
[ 181.729471] [<c0405710>] common_interrupt+0x28/0x30
[ 181.729487] =======================
[ 181.729491] handlers:
[ 181.729496] [<c05511b0>] (acpi_irq+0x0/0x28)
[ 181.729509] [<c05e1e06>] (usb_hcd_irq+0x0/0xa8)
[ 181.729523] [<f89b0006>] (e1000_intr+0x0/0x13f [e1000])
[ 181.729571] Disabling IRQ #9
[ 183.440880] BUG: unable to handle kernel NULL pointer dereference at 00000004
[ 183.440890] IP: [<c052ee42>] __list_add+0xa/0x5c
[ 183.440900] *pde = 268d4067 *pte = 00000000
[ 183.440912] Oops: 0000 [#1] SMP
[ 183.440919] Modules linked in: bridge stp bnep l2cap bluetooth autofs4 fuse sunrpc ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath snd_intel8x0 snd_intel8x0m snd_ac97_codec ppdev e1000 snd_seq_dummy ac97_bus video snd_seq_oss output snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss iTCO_wdt iTCO_vendor_support snd_pcm parport_pc i2c_i801 snd_timer parport snd pcspkr joydev soundcore snd_page_alloc pata_acpi ata_generic sha256_generic cbc aes_i586 dm_crypt dm_snapshot dm_zero dm_mirror dm_log radeon drm i2c_algo_bit i2c_core [last unloaded: microcode]
[ 183.440970]
[ 183.440973] Pid: 2746, comm: glxgears Not tainted (2.6.27-0.377.rc8.git1.fc10.i686 #1)
[ 183.440977] EIP: 0060:[<c052ee42>] EFLAGS: 00210046 CPU: 0
[ 183.440981] EIP is at __list_add+0xa/0x5c
[ 183.440983] EAX: e68d1f10 EBX: e68d1f04 ECX: 00000000 EDX: f6d510ec
[ 183.440986] ESI: f6d510ec EDI: e68d1f10 EBP: e68d1ee4 ESP: e68d1ed8
[ 183.440989] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 183.440992] Process glxgears (pid: 2746, ti=e68d1000 task=ebbb6600 task.ti=e68d1000)
[ 183.440995] Stack: e68d1f04 f6d510e8 00200246 e68d1ef8 c043ee13 f6d51000 f6d510e8 fffe4468
[ 183.441003]        e68d1f24 f88e699f 00000003 00000000 ebbb6600 c0427638 00000000 00000000
[ 183.441010]        fffffff4 f6a0d660 40046457 e68d1f48 f88726dc f6b2f540 f6e34800 f88e6910
[ 183.441013] Call Trace:
[ 183.441013] [<c043ee13>] ? add_wait_queue+0x27/0x35
[ 183.441013] [<f88e699f>] ? radeon_irq_wait+0x8f/0x102 [radeon]
[ 183.441013] [<c0427638>] ? default_wake_function+0x0/0xd
[ 183.441013] [<f88726dc>] ? drm_ioctl+0x1b2/0x227 [drm]
[ 183.441013] [<f88e6910>] ? radeon_irq_wait+0x0/0x102 [radeon]
[ 183.441013] [<c04a0735>] ? vfs_ioctl+0x55/0x6e
[ 183.441013] [<c04a099d>] ? do_vfs_ioctl+0x24f/0x262
[ 183.441013] [<c06d3c7a>] ? _cond_resched+0x8/0x32
[ 183.441013] [<c052c122>] ? copy_to_user+0x40/0x110
[ 183.441013] [<c04a09f5>] ? sys_ioctl+0x45/0x60
[ 183.441013] [<c0404d32>] ? syscall_call+0x7/0xb
[ 183.441013] [<c06d007b>] ? init_intel+0x1e0/0x27a
[ 183.441013] =======================
[ 183.441013] Code: ef ff 83 c4 14 8b 13 8b 43 04 89 42 04 89 10 c7 43 04 00 02 20 00 c7 03 00 01 10 00 8b 5d fc c9 c3 55 89 e5 57 89 c7 56 89 d6 53 <8b> 41 04 89 cb 39 d0 74 17 51 50 52 68 5c ac 7a c0 6a 1a 68 11
[ 183.441013] EIP: [<c052ee42>] __list_add+0xa/0x5c SS:ESP 0068:e68d1ed8
[ 183.441013] ---[ end trace 77f17ea32ebebe89 ]---
[ 194.703059] ACPI: EC: missing confirmations, switch off interrupt mode.
According to airlied, this may be a problem with vblank.
Can we get /var/log/Xorg.0.log as well, please?
Created attachment 319274
/var/log/Xorg.0.log

Here is the X log file.
Associated kernel.org bug: http://bugzilla.kernel.org/show_bug.cgi?id=11700

This now really does look like a radeon problem.
Closing; Dave has fixed this. The new radeon drm kernel module in rawhide now handles IRQs properly.