Pavel Emelianov from OpenVZ/Virtuozzo linux kernel team has noticed the following issue: "While testing kernel on machine with "irqpoll" option I've caught such a lockup: __do_IRQ() spin_lock(&desc->lock); desc->chip->ack(); /* IRQ is ACKed */ note_interrupt() misrouted_irq() handle_IRQ_event() if (...) local_irq_enable_in_hardirq(); /* interrupts are enabled from now */ ... __do_IRQ() /* same IRQ we've started from */ spin_lock(&desc->lock); /* LOCKUP */ Looking at misrouted_irq() code I've found that a potential deadlock like this can also take place: 1CPU: __do_IRQ() spin_lock(&desc->lock); /* irq = A */ misrouted_irq() for (i = 1; i < NR_IRQS; i++) { spin_lock(&desc->lock); /* irq = B */ if (desc->status & IRQ_INPROGRESS) { 2CPU: __do_IRQ() spin_lock(&desc->lock); /* irq = B */ misrouted_irq() for (i = 1; i < NR_IRQS; i++) { spin_lock(&desc->lock); /* irq = A */ if (desc->status & IRQ_INPROGRESS) { As the second lock on both CPUs is taken before checking that this irq is being handled in another processor this may cause a deadlock. This issue is only theoretical." This issue is fixed in linux mainstream by the following patches: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=f72fa707604c015a6625e80f269506032d5430dc http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b42172fc7b569a0ef2b0fa38d71382969074c0e2 I've reproduced this issue when I used RHEL5 kerenl as kdump-kernel (kdump uses "irqpoll" option by default).
Created attachment 158728 [details] NMI in serial console logs
Vasily, > This issue is fixed in linux mainstream by the following patches: > ... Can you please confirm: (1) The resultant RHEL5 patch should look like this: --- linux-2.6.18.noarch/kernel/irq/handle.c.orig +++ linux-2.6.18.noarch/kernel/irq/handle.c @@ -233,10 +233,10 @@ fastcall unsigned int __do_IRQ(unsigned spin_unlock(&desc->lock); action_ret = handle_IRQ_event(irq, regs, action); - - spin_lock(&desc->lock); if (!noirqdebug) note_interrupt(irq, desc, action_ret, regs); + + spin_lock(&desc->lock); if (likely(!(desc->status & IRQ_PENDING))) break; desc->status &= ~IRQ_PENDING; and: (2) If I provide you a test kernel, that you will be able to reproduce the conditions in order to confirm that the patch works? Thanks, Dave Anderson
Dave, OpenVZ kernels uses such patch long time ago and I can confirm that it should fix this issue. Unfortunately we are unable to reproduce this issue. Of course we can load your testkernel on the failed node, but its work can guarantee nothing. Thank you, Vasily Averin
Ok, I figured that was probably the case... Anyway, I appreciate your attaching the console log of a confirmed case in a RHEL5-based kernel. Thanks, Dave
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
(In reply to comment #3) > ... > Unfortunately we are unable to reproduce this issue. Of course we can load > your testkernel on the failed node, but its work can guarantee nothing. > > Thank you, > Vasily Averin Hi Vasily, I understand that this cannot be reproduced on demand, but for the sake of this bugzilla, can you exercise this test kernel, and verify the patch? An x86_64 binary rpm, the source rpm, and the linux-kernel-test.patch file that is applied to the base kernel can be found here: http://people.redhat.com/anderson/BZ_247379 Thanks, Dave
Patch looks correct. Kernel has been running for 3 days already, it is stable and doesn't crash.
Vasily, Thanks for taking the time to test it and check the patch. Just to be clear on the the specific NMI watchdog timeout scenario indicated in the attachment from comment #1, can you confirm the following (or offer up the correct explanation): 1. The "hald-addon-stor" task was doing a close(), presumably on the ide cd rom device, which caused it to issue a commands via ide_outsl() -- when an interrupt of some other device occurred. (It appears to have been a timer/clock interrupt?) 2. The incoming IRQ took its desc->lock, ACK'd, and then handled, its IRQ, and then called note_interrupt(). 3. note_interrupt() in turn called misrouted_irq(), which saw the ide-cd interrupt, and called handle_IRQ_event() for that device. 4. handle_IRQ_event() re-enabled interrupts via local_irq_enable_in_hardirq(). 5. we then see the trail through ide_intr(), which eventually leads to the printk() of "hda: cdrom_pc_intr: The drive appears confused..." 6. the printk() temporarily disables interrupts, but when complete, re-enables whatever flags status was in play at the time; as soon as that happened, another (timer?) interrupt of the same type as the original kicked in, causing the lock-up. That make sense?
Your explanation is correct. I would add that irq number should be saved in some register. I don't have vmlinux for your kernel but assume that it is RCX. RCX=0 ==> it was irq0, i.e timer interrupt.
Also I would note that it's very likely that first interrupt was irq0 too: note_interrupt() calls misrouted_irq() under following conditions: kernel/irq/spurious.c:note_interrupt():: if ((irqfixup == 2 && irq == 0) || action_ret == IRQ_NONE) We have irqfixup == 2 (irqpoll) and very likely irq=0
I'll see if I can verify the IRQ number by the exception frame contents. I was also looking at the stale remnants on the stack, i.e., the lock_timer_base, _activate_task, enqueue_task... Thanks again for your help.
in 2.6.18-61.el5 You can download this test kernel from http://people.redhat.com/dzickus/el5
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0314.html