From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040913

Description of problem:
I upgraded from FC1 to FC2. The same hardware runs FC1 with no problems. I've run into several problems after the upgrade. This is just one of them.

The hardware is a Dell Dimension 4550, 1GB RAM, 40GB IDE disk, ATI Rage 128 Pro Ultra video.

While trying to log in, the system hangs. I can reproduce this easily as follows...

I try to log in through the GUI login screen to a test account that has no special dot files. The login hangs after putting up the "metacity" icon. I log in from another machine using ssh and kill gnome-session. That kills the hung login but also immediately displays the "Disabling IRQ #11" message in my ssh window. The GUI login screen continues to function and I can use it to reboot the system. The network is dead (it uses IRQ #11).

If I log in as root instead of a normal user, the login succeeds, but then I run into other problems (not described in this bug report).

Many people have suggested turning off acpi. I've tried many things, including booting with this line in grub.conf:

    kernel /vmlinuz-2.6.8-1.521 ro root=LABEL=/ hdd=ide-scsi acpi=off pci=noacpi noapic

It made no difference.

/proc/interrupts says:

               CPU0
      0:     399303          XT-PIC  timer
      1:         40          XT-PIC  i8042
      2:          0          XT-PIC  cascade
      3:          0          XT-PIC  ehci_hcd
      8:          1          XT-PIC  rtc
      9:          0          XT-PIC  acpi, uhci_hcd
     10:          0          XT-PIC  uhci_hcd
     11:      26207          XT-PIC  uhci_hcd, eth0, r128@PCI:1:0:0, Intel 82801DB-ICH4
     12:         84          XT-PIC  i8042
     14:      12068          XT-PIC  ide0
     15:       1182          XT-PIC  ide1
    NMI:          0
    ERR:          0

I've tried disabling both USB and sound in the BIOS. It didn't help.

Version-Release number of selected component (if applicable):
kernel-2.6.8-1.521

How reproducible:
Always

Steps to Reproduce:
1. Log in to a non-root account using the GUI login screen.
2. When the login hangs, use ssh from another machine to log in.
3. Kill gnome-session.

Actual Results:
When it dies it says:

    Sep 16 21:49:06 dell kernel: irq 11: nobody cared! (screaming interrupt?)
    Sep 16 21:49:06 dell kernel: irq 11: Please try booting with acpi=off and report a bug
    Sep 16 21:49:06 dell kernel: Stack pointer is garbage, not printing trace
    Sep 16 21:49:06 dell kernel: handlers:
    Sep 16 21:49:06 dell kernel: [<429b3010>] (e100_intr+0x0/0xe6 [e100])
    Sep 16 21:49:06 dell kernel: [<42a3c9a2>] (snd_intel8x0_interrupt+0x0/0x44f [snd_intel8x0])
    Sep 16 21:49:06 dell kernel: Disabling IRQ #11

In the one case where it printed a stack trace I got:

    Sep 11 15:55:42 dell kernel: irq 11: nobody cared! (screaming interrupt?)
    Sep 11 15:55:42 dell kernel: Call Trace:
    Sep 11 15:55:42 dell kernel: [<021070c9>] __report_bad_irq+0x2b/0x67
    Sep 11 15:55:42 dell kernel: [<02107161>] note_interrupt+0x43/0x66
    Sep 11 15:55:42 dell kernel: [<02107327>] do_IRQ+0x109/0x169
    Sep 11 15:55:42 dell kernel: [<0211af64>] __do_softirq+0x2c/0x73
    Sep 11 15:55:42 dell kernel: [<021078f5>] do_softirq+0x46/0x4d
    Sep 11 15:55:42 dell kernel: =======================
    Sep 11 15:55:42 dell kernel: [<0210737b>] do_IRQ+0x15d/0x169
    Sep 11 15:55:42 dell kernel:
    Sep 11 15:55:42 dell kernel: handlers:
    Sep 11 15:55:42 dell kernel: [<0221522d>] (usb_hcd_irq+0x0/0x4b)
    Sep 11 15:55:42 dell kernel: [<429cbc6e>] (e100_intr+0x0/0xe0 [e100])
    Sep 11 15:55:42 dell kernel: [<44d88501>] (snd_intel8x0_interrupt+0x0/0x17e [snd_intel8x0])
    Sep 11 15:55:42 dell kernel: Disabling IRQ #11

Expected Results:
System doesn't die.

Additional info:
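The /proc/interrupts output in the report shows several devices sharing IRQ 11, which is why disabling that line also takes down the network. As a rough illustration, the sketch below picks out the shared lines from that kind of output. It assumes the single-CPU layout shown in the report (one count column), and the function name `shared_irqs` is made up for this example.

```python
import re

# Sample text copied from the /proc/interrupts output in the report.
SAMPLE = """\
           CPU0
  0:     399303          XT-PIC  timer
  1:         40          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  3:          0          XT-PIC  ehci_hcd
  8:          1          XT-PIC  rtc
  9:          0          XT-PIC  acpi, uhci_hcd
 10:          0          XT-PIC  uhci_hcd
 11:      26207          XT-PIC  uhci_hcd, eth0, r128@PCI:1:0:0, Intel 82801DB-ICH4
 12:         84          XT-PIC  i8042
 14:      12068          XT-PIC  ide0
 15:       1182          XT-PIC  ide1
NMI:          0
ERR:          0
"""

# Assumption: one CPU, so each numeric line is
#   <irq>: <count> <controller> <comma-separated device list>
LINE_RE = re.compile(r"\s*(\d+):\s+(\d+)\s+(\S+)\s+(.*)$")


def shared_irqs(text):
    """Return {irq: (count, [devices])} for lines that list more than one device."""
    shared = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skips the CPU0 header and the NMI/ERR lines
        irq, count, _controller, devices = match.groups()
        names = [name.strip() for name in devices.split(",") if name.strip()]
        if len(names) > 1:
            shared[int(irq)] = (int(count), names)
    return shared


if __name__ == "__main__":
    for irq, (count, names) in sorted(shared_irqs(SAMPLE).items()):
        print(f"IRQ {irq}: {count} interrupts, shared by {', '.join(names)}")
```

On a live system the same function could be fed the contents of /proc/interrupts instead of the sample string, as long as the machine has a single CPU column.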
I've done some more debugging and this is what I've determined. First, I fixed the problem that was preventing me from logging in successfully. That problem was unrelated to this. Now I can log in, and I can reproduce this problem every time I log out. I built a kernel with some additional debugging information, and it appears that the problem occurs just after the r128 driver is told to clean up and remove its IRQ handler. My theory, which I have not yet proven, is that the device continues to generate interrupts even after the IRQ handler for the device has been removed, which of course means there's no one to handle the interrupt. The driver is clearly trying to disable interrupts for the device, but perhaps it's not working. Note also that the r128 driver in FC1 did not use interrupts. That may explain why FC1 did not have this problem.
Ok, I think I've now proven that my ATI Rage 128 device is generating interrupts even after the r128 driver disables interrupts. In r128_driver_irq_uninstall, after it writes to the device to disable interrupts, I set a global variable that says interrupts are disabled. I then use vblank_wait to wait for the next vblank interrupt. If interrupts were really disabled, I would expect vblank_wait to return indicating that the timeout expired. It doesn't; it returns success. In r128_dma_service, if I get an interrupt while the global "interrupts are disabled" flag is set, I increment a counter. After vblank_wait returns in r128_driver_irq_uninstall, I check the count. It indicates that an interrupt was received after interrupts were disabled. To me, that looks like proof that there's something wrong here. Maybe the hardware is broken, or maybe it's just not working the way the driver expects. Or maybe the driver is not doing the right thing to really disable interrupts. Now I need help from someone who actually understands this driver to figure out how to fix this.
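The logic of that test can be modelled in user space. The sketch below is only a toy model in Python, not the actual kernel patch: `FakeR128`, `Probe`, and the `honours_disable` knob are invented for illustration, and the real driver functions are only mirrored by name in the comments.

```python
class FakeR128:
    """Toy stand-in for the video device (hypothetical, not real driver code)."""

    def __init__(self, honours_disable):
        # honours_disable is an invented knob: False models a device that
        # keeps raising interrupts after being told to stop.
        self.honours_disable = honours_disable
        self.irq_enabled = True

    def write_disable(self):
        # Models the register write done in r128_driver_irq_uninstall.
        if self.honours_disable:
            self.irq_enabled = False

    def vblank(self):
        # Returns True if the device raises an interrupt on this vblank.
        return self.irq_enabled


class Probe:
    """Models the debugging flag and counter described in the comment."""

    def __init__(self, device):
        self.device = device
        self.irqs_disabled = False  # the global "interrupts are disabled" flag
        self.late_irqs = 0          # interrupts seen while the flag was set

    def on_interrupt(self):
        # Models the check added to r128_dma_service.
        if self.irqs_disabled:
            self.late_irqs += 1

    def vblank_wait(self, max_ticks=3):
        # Returns True if an interrupt arrived, False if the wait timed out.
        for _ in range(max_ticks):
            if self.device.vblank():
                self.on_interrupt()
                return True
        return False

    def uninstall(self):
        # Same order as in the comment: disable, set the flag, wait, check.
        self.device.write_disable()
        self.irqs_disabled = True
        got_irq = self.vblank_wait()
        return got_irq, self.late_irqs


if __name__ == "__main__":
    print("well-behaved device:", Probe(FakeR128(True)).uninstall())
    print("misbehaving device: ", Probe(FakeR128(False)).uninstall())
```

In this model a well-behaved device makes the wait time out with a zero count, while a device that ignores the disable write produces a successful wait and a non-zero count, which is the pattern reported above.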
This bug is the same as the Xorg bug reported at
https://bugs.freedesktop.org/show_bug.cgi?id=1886
and
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=138822

The workaround described there (comment out Load "dri") solved the problem for me. I'm closing this bug as a duplicate of 138822.

*** This bug has been marked as a duplicate of 138822 ***
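For anyone applying the same workaround, the edit amounts to commenting out one line in the X server configuration. The sketch below shows one way to do that on the file's text. The helper name `comment_out_dri` and the sample Module section are illustrative assumptions, and the file is assumed to be /etc/X11/xorg.conf, which the report does not state.

```python
import re

# Illustrative Module section; the real file's contents will differ.
SAMPLE = '''Section "Module"
    Load  "dbe"
    Load  "dri"
    Load  "glx"
EndSection
'''

# Matches an uncommented Load "dri" line, keeping its leading indentation.
DRI_RE = re.compile(r'^([ \t]*)(Load[ \t]+"dri")', re.IGNORECASE | re.MULTILINE)


def comment_out_dri(conf_text):
    """Return conf_text with any uncommented Load "dri" line prefixed by '#'."""
    return DRI_RE.sub(r"\1#\2", conf_text)


if __name__ == "__main__":
    print(comment_out_dri(SAMPLE))
```

Working on a string rather than editing the file in place keeps the sketch safe to run. Writing the result back, and keeping a backup of the original file first, is left to the reader.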
Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.