Bug 463593

Summary: [drm][radeon] kernel disables IRQ when radeon drm is used
Product: [Fedora] Fedora
Reporter: Shawn Starr <shawn.starr>
Component: xorg-x11-drv-ati
Assignee: Dave Airlie <airlied>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: high
Version: rawhide
CC: fabrice, kernel-maint, mcepl, mschmidt, xgl-maint
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2008-10-11 15:19:03 UTC
Attachments:
dmesg log (flags: none)
/var/log/Xorg.0.log (flags: none)

Description Shawn Starr 2008-09-24 01:43:30 UTC
Description of problem: interrupts stop working


Version-Release number of selected component (if applicable):
2.6.27-0.347.rc7.git1.fc10.i686, but the problem was introduced in slightly older builds, from 337 onward.

How reproducible:
100%

Steps to Reproduce:
1. Various actions can cause the kernel to lose interrupts. One is running glxgears with the radeon driver: the kernel then disables IRQ 9, oddly cutting off interrupts for every device on that line.

Actual results:
System becomes unstable, network stops, ACPI switches to polling mode

Expected results:
No IRQs are disabled.

Additional info:

hardware: 

        Version: ThinkPad T42
        String 1: IBM ThinkPad Embedded Controller -[1RHT71WW-3.04    ]-

Usual /proc/interrupts:

           CPU0
  0:    1077489    XT-PIC-XT        timer
  1:       5975    XT-PIC-XT        i8042
  2:          0    XT-PIC-XT        cascade
  3:          1    XT-PIC-XT        ehci_hcd:usb1
  4:         18    XT-PIC-XT        serial
  5:     106052    XT-PIC-XT        yenta, Intel 82801DB-ICH4, Intel 82801DB-ICH4 Modem
  6:          3    XT-PIC-XT
  7:          0    XT-PIC-XT        parport0
  8:          1    XT-PIC-XT        rtc0
  9:       7634    XT-PIC-XT        acpi, uhci_hcd:usb2, yenta, eth0
 10:          1    XT-PIC-XT        uhci_hcd:usb4
 11:          1    XT-PIC-XT        uhci_hcd:usb3
 12:     209481    XT-PIC-XT        i8042
 14:      29888    XT-PIC-XT        ata_piix
 15:      14941    XT-PIC-XT        ata_piix
NMI:          0   Non-maskable interrupts
LOC:          0   Local timer interrupts
RES:          0   Rescheduling interrupts
CAL:          0   function call interrupts
TLB:          0   TLB shootdowns
TRM:          0   Thermal event interrupts
SPU:          0   Spurious interrupts
ERR:          0
MIS:          0
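
To see at a glance which lines are shared in a listing like the one above, a short script can group handler names per IRQ. This is a minimal sketch, assuming a single CPU column as in this listing; the function name `shared_irqs` and the `SAMPLE` excerpt are illustrative only.

```python
import re

def shared_irqs(text):
    """Map IRQ number -> handler names, for lines with two or more handlers."""
    shared = {}
    for line in text.splitlines():
        # Numbered rows look like: "  9:  7634  XT-PIC-XT  acpi, uhci_hcd:usb2, ..."
        # Assumption: exactly one per-CPU count column.
        m = re.match(r"\s*(\d+):\s+\d+\s+(\S+)\s*(.*)$", line)
        if not m:
            continue  # header and summary rows (NMI, LOC, ...) are skipped
        names = [n.strip() for n in m.group(3).split(",") if n.strip()]
        if len(names) > 1:
            shared[int(m.group(1))] = names
    return shared

# Excerpt of the listing above.
SAMPLE = """\
           CPU0
  5:     106052    XT-PIC-XT        yenta, Intel 82801DB-ICH4, Intel 82801DB-ICH4 Modem
  6:          3    XT-PIC-XT
  9:       7634    XT-PIC-XT        acpi, uhci_hcd:usb2, yenta, eth0
 12:     209481    XT-PIC-XT        i8042
NMI:          0   Non-maskable interrupts
"""

if __name__ == "__main__":
    for irq, names in sorted(shared_irqs(SAMPLE).items()):
        print(irq, names)
```

On this machine IRQ 9 carries acpi, USB, yenta and eth0 together, which is why disabling that one line affects so many devices at once.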

Oops output:

[  290.197429] irq 9: nobody cared (try booting with the "irqpoll" option)                                                           
[  290.197444] Pid: 2252, comm: X Not tainted 2.6.27-0.347.rc7.git1.fc10.i686 #1                                                     
[  290.197455]  [<c046d9ce>] __report_bad_irq+0x33/0x74                                                                              
[  290.197477]  [<c046dbde>] note_interrupt+0x1cf/0x221                                                                              
[  290.197482]  [<c044b395>] ? trace_hardirqs_off+0xb/0xd                                                                            
[  290.197489]  [<c046d0f7>] ? handle_IRQ_event+0x4c/0x54                                                                            
[  290.197494]  [<c046e286>] handle_level_irq+0x85/0xb9                                                                              
[  290.197499]  [<c046e201>] ? handle_level_irq+0x0/0xb9                                                                             
[  290.197505]  [<c0406fa5>] do_IRQ+0x9f/0xc9                                                                                        
[  290.197513]  [<c04056f8>] common_interrupt+0x28/0x30                                                                              
[  290.197519]  [<c044c0dc>] ? trace_hardirqs_on+0xb/0xd                                                                             
[  290.197525]  [<c044007b>] ? process_timer_rebalance+0xa7/0x171                                                                    
[  290.197533]  [<c04326a8>] ? __do_softirq+0x6b/0x10f                                                                               
[  290.197538]  [<c043263d>] ? __do_softirq+0x0/0x10f                                                                                
[  290.197543]  [<c040704b>] do_softirq+0x7c/0xdd                                                                                    
[  290.197548]  [<c046e201>] ? handle_level_irq+0x0/0xb9                                                                             
[  290.197553]  [<c0432300>] irq_exit+0x49/0x88                                                                                      
[  290.197559]  [<c0406fb9>] do_IRQ+0xb3/0xc9                                                                                        
[  290.197564]  [<c04056f8>] common_interrupt+0x28/0x30                                                                              
[  290.197568]  =======================                                                                                              
[  290.197570] handlers:                                                                                                             
[  290.197573] [<c055c6b4>] (acpi_irq+0x0/0x28)                                                                                      
[  290.197579] [<c05ef141>] (usb_hcd_irq+0x0/0xa8)                                                                                   
[  290.197584] [<f894126c>] (yenta_interrupt+0x0/0xc3 [yenta_socket])                                                                
[  290.197608] [<f899a01b>] (e1000_intr+0x0/0x13f [e1000])                                                                           
[  290.197623] Disabling IRQ #9                                                                                                      
[  290.489747] INFO: trying to register non-static key.                                                                              
[  290.489755] the code is fine but needs lockdep annotation.                                                                        
[  290.489757] turning off the locking correctness validator.                                                                        
[  290.489763] Pid: 3446, comm: glxgears Not tainted 2.6.27-0.347.rc7.git1.fc10.i686 #1                                              
[  290.489768]  [<c06e2cf4>] ? printk+0x14/0x18                                                                                      
[  290.489781]  [<c044aecd>] register_lock_class+0x5a/0x285                                                                          
[  290.489788]  [<c044c780>] __lock_acquire+0x97/0xae6                                                                               
[  290.489792]  [<c0645c3f>] ? sock_aio_read+0xc7/0xd5                                                                               
[  290.489800]  [<c044d22a>] lock_acquire+0x5b/0x81                                                                                  
[  290.489804]  [<c043faae>] ? add_wait_queue+0x17/0x35                                                                              
[  290.489812]  [<c06e5248>] _spin_lock_irqsave+0x3f/0x6f                                                                            
[  290.489817]  [<c043faae>] ? add_wait_queue+0x17/0x35                                                                              
[  290.489822]  [<c043faae>] add_wait_queue+0x17/0x35                                                                                
[  290.489830]  [<f88e79c3>] radeon_irq_wait+0x8f/0x102 [radeon]                                                                     
[  290.489854]  [<c0427d09>] ? default_wake_function+0x0/0xd                                                                         
[  290.489866]  [<f88736cd>] drm_ioctl+0x1bb/0x230 [drm]                                                                             
[  290.489894]  [<f88e7934>] ? radeon_irq_wait+0x0/0x102 [radeon]                                                                    
[  290.489917]  [<c04a8605>] vfs_ioctl+0x55/0x6e                                                                                     
[  290.489923]  [<c04a886d>] do_vfs_ioctl+0x24f/0x262                                                                                
[  290.489927]  [<c06e3614>] ? _cond_resched+0x8/0x32                                                                                
[  290.489933]  [<c04a88c5>] sys_ioctl+0x45/0x60                                                                                     
[  290.489938]  [<c0404d02>] syscall_call+0x7/0xb                                                                                    
[  290.489944]  [<c06e007b>] ? native_cpu_up+0x49d/0x6e0                                                                             
[  290.489949]  =======================                                                                                              
[  290.489966] BUG: unable to handle kernel NULL pointer dereference at 00000004                                                     
[  290.489970] IP: [<c053946e>] __list_add+0xa/0x5c                                                                                  
[  290.489977] *pde = 23d18067 *pte = 00000000                                                                                       
[  290.489993] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC                                                                                   
[  290.490005] Modules linked in: vboxdrv autofs4 fuse sunrpc ipt_REJECT nf_conntrack_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath ppdev snd_intel8x0 snd_intel8x0m video snd_ac97_codec snd_seq_dummy output ac97_bus snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss e1000 snd_pcm iTCO_wdt i2c_i801 iTCO_vendor_support yenta_socket rsrc_nonstatic parport_pc snd_timer pcspkr snd parport soundcore joydev snd_page_alloc pata_acpi ata_generic sha256_generic cbc aes_i586 dm_crypt dm_snapshot dm_zero dm_mirror dm_log radeon drm i2c_algo_bit i2c_core [last unloaded: microcode]                                                                                    
[  290.490017]
[  290.490017] Pid: 3446, comm: glxgears Not tainted (2.6.27-0.347.rc7.git1.fc10.i686 #1)
[  290.490017] EIP: 0060:[<c053946e>] EFLAGS: 00210082 CPU: 0
[  290.490017] EIP is at __list_add+0xa/0x5c
[  290.490017] EAX: e3c44f10 EBX: e3c44f04 ECX: 00000000 EDX: f6a7a108
[  290.490017] ESI: f6a7a108 EDI: e3c44f10 EBP: e3c44ee4 ESP: e3c44ed8
[  290.490017]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  290.490017] Process glxgears (pid: 3446, ti=e3c44000 task=dd4a14b0 task.ti=e3c44000)
[  290.490017] Stack: e3c44f04 f6a7a0e8 00200246 e3c44ef8 c043fabe f6a7a000 f6a7a0e8 ffffe691
[  290.490017]        e3c44f24 f88e79c3 00000002 00000000 dd4a14b0 c0427d09 00000000 00000000
[  290.490017]        fffffff4 f64127b8 40046457 e3c44f48 f88736cd ea3b5000 f6a78000 f88e7934
[  290.490017] Call Trace:
[  290.490017]  [<c043fabe>] ? add_wait_queue+0x27/0x35
[  290.490017]  [<f88e79c3>] ? radeon_irq_wait+0x8f/0x102 [radeon]
[  290.490017]  [<c0427d09>] ? default_wake_function+0x0/0xd
[  290.490017]  [<f88736cd>] ? drm_ioctl+0x1bb/0x230 [drm]
[  290.490017]  [<f88e7934>] ? radeon_irq_wait+0x0/0x102 [radeon]
[  290.490017]  [<c04a8605>] ? vfs_ioctl+0x55/0x6e
[  290.490017]  [<c04a886d>] ? do_vfs_ioctl+0x24f/0x262
[  290.490017]  [<c06e3614>] ? _cond_resched+0x8/0x32
[  290.490017]  [<c04a88c5>] ? sys_ioctl+0x45/0x60
[  290.490017]  [<c0404d02>] ? syscall_call+0x7/0xb
[  290.490017]  [<c06e007b>] ? native_cpu_up+0x49d/0x6e0
[  290.490017]  =======================
[  290.490017] Code: ef ff 83 c4 14 8b 13 8b 43 04 89 42 04 89 10 c7 43 04 00 02 20 00 c7 03 00 01 10 00 8b 5d fc c9 c3 55 89 e5 57 89 c7 56 89 d6 53 <8b> 41 04 89 cb 39 d0 74 17 51 50 52 68 29 54 7c c0 6a 1a 68 de
[  290.490017] EIP: [<c053946e>] __list_add+0xa/0x5c SS:ESP 0068:e3c44ed8
[  290.490017] ---[ end trace 58bb180f76f64b48 ]---
[  296.303067] ACPI: EC: missing confirmations, switch off interrupt mode.
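
For comparing reports across machines, the IRQ number and the handler list can be pulled out of a "nobody cared" message mechanically. The following is a rough sketch that assumes the log layout shown above; `parse_nobody_cared` and `SAMPLE` are hypothetical names used only for illustration.

```python
import re

def parse_nobody_cared(log):
    """Return (irq, handler_names) for the first 'nobody cared' report, or None."""
    irq = None
    handlers = []
    in_handlers = False
    for raw in log.splitlines():
        # Drop an optional "[  290.197429] " timestamp prefix.
        line = re.sub(r"^\[\s*\d+\.\d+\]\s*", "", raw).strip()
        m = re.match(r"irq (\d+): nobody cared", line)
        if m:
            irq = int(m.group(1))
        elif irq is not None and line == "handlers:":
            in_handlers = True
        elif in_handlers:
            h = re.match(r"\[<[0-9a-f]+>\] \((\w+)\+", line)
            if not h:
                break  # first non-handler line ends the list
            handlers.append(h.group(1))
    return (irq, handlers) if irq is not None else None

# Excerpt of the trace above.
SAMPLE = """\
[  290.197429] irq 9: nobody cared (try booting with the "irqpoll" option)
[  290.197455]  [<c046d9ce>] __report_bad_irq+0x33/0x74
[  290.197570] handlers:
[  290.197573] [<c055c6b4>] (acpi_irq+0x0/0x28)
[  290.197579] [<c05ef141>] (usb_hcd_irq+0x0/0xa8)
[  290.197584] [<f894126c>] (yenta_interrupt+0x0/0xc3 [yenta_socket])
[  290.197608] [<f899a01b>] (e1000_intr+0x0/0x13f [e1000])
[  290.197623] Disabling IRQ #9
"""

if __name__ == "__main__":
    print(parse_nobody_cared(SAMPLE))
```

Run against the trace above, the handler list contains acpi, USB, yenta and e1000 entries but no radeon handler.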

Comment 1 Fabrice Bellet 2008-09-30 19:00:10 UTC
Same problem here on a ThinkPad T40 with kernel-2.6.27-0.354.rc7.git3.fc10.i686. The problem occurs when gdm starts; the driver is radeon. With the BIOS default IRQ pin assignment on this laptop, all interrupts are shared on IRQ 11.

Comment 2 Fabrice Bellet 2008-09-30 19:04:18 UTC
Created attachment 318107 [details]
dmesg log

dmesg output up to the point where X starts. The smolt UID for this laptop is pub_8be150b9-f49c-4c5e-968e-41db78fd37e4

Comment 3 Shawn Starr 2008-10-01 18:05:37 UTC
This seems to be related to the PCMCIA yenta socket device. At least blacklisting it stops the interrupt from being shut off.

Comment 5 Fabrice Bellet 2008-10-01 19:04:04 UTC
In my case the problem really does occur when X starts. After shuffling the interrupt assignments a bit in the BIOS (config/pci), the problem now shows up on IRQ 10, which has fewer interrupt handlers, again when X starts:

[drm] writeback test succeeded in 2 usecs
irq 10: nobody cared (try booting with the "irqpoll" option)
Pid: 2660, comm: Xorg Not tainted 2.6.27-0.372.rc8.fc10.i686 #1
 [<c046db32>] __report_bad_irq+0x33/0x74
 [<c046dd42>] note_interrupt+0x1cf/0x221
 [<c044b465>] ? trace_hardirqs_off+0xb/0xd
 [<c046d25b>] ? handle_IRQ_event+0x4c/0x54
 [<c046e3ea>] handle_level_irq+0x85/0xb9
 [<c046e365>] ? handle_level_irq+0x0/0xb9
 [<c0406fad>] do_IRQ+0x9f/0xc9
 [<c0405700>] common_interrupt+0x28/0x30
 [<c044c1ac>] ? trace_hardirqs_on+0xb/0xd
 [<c044007b>] ? process_timer_rebalance+0x1b/0x171
 [<c043270c>] ? __do_softirq+0x6b/0x10f
 [<c04326a1>] ? __do_softirq+0x0/0x10f
 [<c0407053>] do_softirq+0x7c/0xdd
 [<c0432364>] irq_exit+0x49/0x88
 [<c0415861>] smp_apic_timer_interrupt+0x73/0x81
 [<c0405805>] apic_timer_interrupt+0x2d/0x34
 [<c044c1ac>] ? trace_hardirqs_on+0xb/0xd
 [<c044007b>] ? process_timer_rebalance+0x1b/0x171
 [<c049a9df>] ? kfree+0xf2/0x102
 [<f8875700>] ? drm_ioctl+0x1ee/0x230 [drm]
 [<f8875700>] drm_ioctl+0x1ee/0x230 [drm]
 [<f88e2a39>] ? radeon_cp_setparam+0x0/0x193 [radeon]
 [<c04a8769>] vfs_ioctl+0x55/0x6e
 [<c04a89d1>] do_vfs_ioctl+0x24f/0x262
 [<c050a8a0>] ? selinux_file_ioctl+0x3a/0x3d
 [<c04a8a29>] sys_ioctl+0x45/0x60
 [<c0404d0a>] syscall_call+0x7/0xb
 =======================
handlers:
[<c05ee62f>] (usb_hcd_irq+0x0/0xa8)
[<f895801b>] (e1000_intr+0x0/0x13f [e1000])
Disabling IRQ #10

and in /var/log/Xorg.0.log :

(II) RADEON(0): [drm] failure adding irq handler, there is a device already using that irq
[drm] falling back to irq-free operation
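
One possible reading of these two logs, which nothing in this report confirms: if a device keeps asserting a shared level-triggered line while none of the registered handlers claims the interrupt, the kernel's spurious-interrupt check eventually gives up and disables the line. The toy model below illustrates that bookkeeping only. The class `IrqLine` is hypothetical, and the window and threshold values are assumptions recalled from kernel/irq/spurious.c, not taken from this bug.

```python
class IrqLine:
    """Toy model of a shared IRQ line with 'nobody cared' detection."""

    WINDOW = 100_000     # interrupts per evaluation window (assumed value)
    THRESHOLD = 99_900   # unhandled count that trips the check (assumed value)

    def __init__(self, handlers):
        self.handlers = handlers  # callables returning True if they handled it
        self.count = 0
        self.unhandled = 0
        self.disabled = False

    def fire(self):
        if self.disabled:
            return
        # Every handler on a shared line is called; the list avoids short-circuiting.
        handled = any([h() for h in self.handlers])
        self.count += 1
        if not handled:
            self.unhandled += 1
        if self.count < self.WINDOW:
            return
        if self.unhandled > self.THRESHOLD:
            self.disabled = True  # corresponds to "Disabling IRQ #N"
        self.count = 0
        self.unhandled = 0


if __name__ == "__main__":
    # Two handlers that never claim the interrupt, as if the real source
    # had no handler registered on the line.
    line = IrqLine([lambda: False, lambda: False])
    for _ in range(IrqLine.WINDOW):
        line.fire()
    print("disabled:", line.disabled)
```

In this model a single handler that claims the interrupt is enough to keep the line enabled, which is why a missing handler for the interrupting device matters more than how many other devices share the line.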

Comment 6 Shawn Starr 2008-10-02 05:31:29 UTC
I can reproduce it without yenta loaded, but with radeon drm:

[   68.385970] agpgart-intel 0000:00:00.0: AGP 2.0 bridge
[   68.386038] agpgart-intel 0000:00:00.0: putting AGP V2 device into 4x mode
[   68.386104] pci 0000:01:00.0: putting AGP V2 device into 4x mode
[   70.171002] [drm] Setting GART location based on new memory map
[   70.171054] [drm] Loading R300 Microcode
[   70.171118] [drm] Num pipes: 1
[   70.171131] [drm] writeback test succeeded in 1 usecs
[  181.729237] irq 9: nobody cared (try booting with the "irqpoll" option)
[  181.729262] Pid: 2164, comm: X Not tainted 2.6.27-0.377.rc8.git1.fc10.i686 #1
[  181.729280]  [<c0466a26>] __report_bad_irq+0x33/0x74
[  181.729309]  [<c0466c3a>] note_interrupt+0x1d3/0x225
[  181.729322]  [<c0466151>] ? handle_IRQ_event+0x61/0x69
[  181.729337]  [<c04672f6>] handle_level_irq+0x8d/0xc3
[  181.729349]  [<c0467269>] ? handle_level_irq+0x0/0xc3
[  181.729363]  [<c0406fc1>] do_IRQ+0x9f/0xc9
[  181.729377]  [<c0405710>] common_interrupt+0x28/0x30
[  181.729393]  [<c0431e60>] ? __do_softirq+0x6b/0x10f
[  181.729411]  [<c0431df5>] ? __do_softirq+0x0/0x10f
[  181.729424]  [<c0407067>] do_softirq+0x7c/0xdd
[  181.729436]  [<c0467269>] ? handle_level_irq+0x0/0xc3
[  181.729449]  [<c0431ab8>] irq_exit+0x49/0x88
[  181.729460]  [<c0406fd5>] do_IRQ+0xb3/0xc9
[  181.729471]  [<c0405710>] common_interrupt+0x28/0x30
[  181.729487]  =======================
[  181.729491] handlers:
[  181.729496] [<c05511b0>] (acpi_irq+0x0/0x28)
[  181.729509] [<c05e1e06>] (usb_hcd_irq+0x0/0xa8)
[  181.729523] [<f89b0006>] (e1000_intr+0x0/0x13f [e1000])
[  181.729571] Disabling IRQ #9
[  183.440880] BUG: unable to handle kernel NULL pointer dereference at 00000004
[  183.440890] IP: [<c052ee42>] __list_add+0xa/0x5c
[  183.440900] *pde = 268d4067 *pte = 00000000 
[  183.440912] Oops: 0000 [#1] SMP 
[  183.440919] Modules linked in: bridge stp bnep l2cap bluetooth autofs4 fuse sunrpc ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath snd_intel8x0 snd_intel8x0m snd_ac97_codec ppdev e1000 snd_seq_dummy ac97_bus video snd_seq_oss output snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss iTCO_wdt iTCO_vendor_support snd_pcm parport_pc i2c_i801 snd_timer parport snd pcspkr joydev soundcore snd_page_alloc pata_acpi ata_generic sha256_generic cbc aes_i586 dm_crypt dm_snapshot dm_zero dm_mirror dm_log radeon drm i2c_algo_bit i2c_core [last unloaded: microcode]
[  183.440970] 
[  183.440973] Pid: 2746, comm: glxgears Not tainted (2.6.27-0.377.rc8.git1.fc10.i686 #1)
[  183.440977] EIP: 0060:[<c052ee42>] EFLAGS: 00210046 CPU: 0
[  183.440981] EIP is at __list_add+0xa/0x5c
[  183.440983] EAX: e68d1f10 EBX: e68d1f04 ECX: 00000000 EDX: f6d510ec
[  183.440986] ESI: f6d510ec EDI: e68d1f10 EBP: e68d1ee4 ESP: e68d1ed8
[  183.440989]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  183.440992] Process glxgears (pid: 2746, ti=e68d1000 task=ebbb6600 task.ti=e68d1000)
[  183.440995] Stack: e68d1f04 f6d510e8 00200246 e68d1ef8 c043ee13 f6d51000 f6d510e8 fffe4468 
[  183.441003]        e68d1f24 f88e699f 00000003 00000000 ebbb6600 c0427638 00000000 00000000 
[  183.441010]        fffffff4 f6a0d660 40046457 e68d1f48 f88726dc f6b2f540 f6e34800 f88e6910 
[  183.441013] Call Trace:
[  183.441013]  [<c043ee13>] ? add_wait_queue+0x27/0x35
[  183.441013]  [<f88e699f>] ? radeon_irq_wait+0x8f/0x102 [radeon]
[  183.441013]  [<c0427638>] ? default_wake_function+0x0/0xd
[  183.441013]  [<f88726dc>] ? drm_ioctl+0x1b2/0x227 [drm]
[  183.441013]  [<f88e6910>] ? radeon_irq_wait+0x0/0x102 [radeon]
[  183.441013]  [<c04a0735>] ? vfs_ioctl+0x55/0x6e
[  183.441013]  [<c04a099d>] ? do_vfs_ioctl+0x24f/0x262
[  183.441013]  [<c06d3c7a>] ? _cond_resched+0x8/0x32
[  183.441013]  [<c052c122>] ? copy_to_user+0x40/0x110
[  183.441013]  [<c04a09f5>] ? sys_ioctl+0x45/0x60
[  183.441013]  [<c0404d32>] ? syscall_call+0x7/0xb
[  183.441013]  [<c06d007b>] ? init_intel+0x1e0/0x27a
[  183.441013]  =======================
[  183.441013] Code: ef ff 83 c4 14 8b 13 8b 43 04 89 42 04 89 10 c7 43 04 00 02 20 00 c7 03 00 01 10 00 8b 5d fc c9 c3 55 89 e5 57 89 c7 56 89 d6 53 <8b> 41 04 89 cb 39 d0 74 17 51 50 52 68 5c ac 7a c0 6a 1a 68 11 
[  183.441013] EIP: [<c052ee42>] __list_add+0xa/0x5c SS:ESP 0068:e68d1ed8
[  183.441013] ---[ end trace 77f17ea32ebebe89 ]---
[  194.703059] ACPI: EC: missing confirmations, switch off interrupt mode.
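
A tentative reading of the oops above, offered as interpretation rather than as anything the reporters stated: the faulting bytes `<8b> 41 04` decode to a load from ECX+4, ECX is 00000000, and the fault address is 00000004. Assuming ECX holds the `next` argument of __list_add, that is what inserting into a list head that was never initialised would look like. The sketch below models this in Python; `ListHead`, `init_list_head` and `list_add` are simplified stand-ins, not kernel code.

```python
class ListHead:
    """Stand-in for struct list_head: two pointers, next and prev."""

    def __init__(self):
        # None models zeroed memory, i.e. a head that was never initialised.
        self.next = None
        self.prev = None


def init_list_head(head):
    # An empty circular list points at itself in both directions.
    head.next = head
    head.prev = head


def list_add(new, head):
    # Insert new between head and head.next.
    nxt = head.next
    nxt.prev = new   # touching nxt.prev fails when nxt is None (NULL in C)
    new.next = nxt
    new.prev = head
    head.next = new


if __name__ == "__main__":
    good = ListHead()
    init_list_head(good)
    list_add(ListHead(), good)       # fine: the head was initialised

    bad = ListHead()
    try:
        list_add(ListHead(), bad)    # head.next is None
    except AttributeError as exc:
        print("uninitialised head:", exc)
```

The Python AttributeError plays the role of the NULL pointer dereference at offset 4, the position of `prev` in a 32-bit struct list_head.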

Comment 7 Shawn Starr 2008-10-02 05:43:16 UTC
As per airlied, this may be a problem with vblank.

Comment 8 Matěj Cepl 2008-10-02 10:49:26 UTC
Can we get /var/log/Xorg.0.log as well, please?

Comment 9 Fabrice Bellet 2008-10-02 18:14:47 UTC
Created attachment 319274 [details]
/var/log/Xorg.0.log

Here is the X log file

Comment 10 Shawn Starr 2008-10-08 07:53:12 UTC
Associated kernel.org bug: 
http://bugzilla.kernel.org/show_bug.cgi?id=11700

This now really does look like a radeon problem.

Comment 11 Shawn Starr 2008-10-11 15:19:03 UTC
Closing; Dave has fixed this. The new radeon drm kernel module in rawhide now handles the IRQ properly.