247379 – "irqpoll": misrouted interrupts deadlocks

Bug 247379 - "irqpoll": misrouted interrupts deadlocks

Summary: "irqpoll": misrouted interrupts deadlocks

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	5.0
Hardware:	All
OS:	Linux
Priority:	low
Severity:	low
Target Milestone:	---
Target Release:	---
Assignee:	Dave Anderson
QA Contact:	Martin Jenner
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	425461
TreeView+	depends on / blocked

Reported:	2007-07-08 11:47 UTC by Vasily Averin
Modified:	2008-05-21 14:45 UTC (History)
CC List:	2 users (show)
Fixed In Version:	RHBA-2008-0314
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2008-05-21 14:45:44 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
NMI in serial console logs (28.55 KB, text/plain) 2007-07-08 11:49 UTC, Vasily Averin	no flags	Details
View All

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2008:0314	0	normal	SHIPPED_LIVE	Updated kernel packages for Red Hat Enterprise Linux 5.2	2008-05-20 18:43:34 UTC

Description Vasily Averin 2007-07-08 11:47:23 UTC

Pavel Emelianov from OpenVZ/Virtuozzo linux kernel team has noticed the
following issue:

"While testing kernel on machine with "irqpoll" option I've caught such a
lockup:

__do_IRQ()
   spin_lock(&desc->lock);
           desc->chip->ack(); /* IRQ is ACKed */
note_interrupt()
misrouted_irq()
handle_IRQ_event()
           if (...)
      local_irq_enable_in_hardirq();
/* interrupts are enabled from now */
...
__do_IRQ() /* same IRQ we've started from */
   spin_lock(&desc->lock); /* LOCKUP */

Looking at misrouted_irq() code I've found that a potential deadlock like
this can also take place:

1CPU:
__do_IRQ()
   spin_lock(&desc->lock); /* irq = A */
misrouted_irq()
   for (i = 1; i < NR_IRQS; i++) {
      spin_lock(&desc->lock); /* irq = B */
      if (desc->status & IRQ_INPROGRESS) {

2CPU:
__do_IRQ()
   spin_lock(&desc->lock); /* irq = B */
misrouted_irq()
   for (i = 1; i < NR_IRQS; i++) {
      spin_lock(&desc->lock); /* irq = A */
      if (desc->status & IRQ_INPROGRESS) {

As the second lock on both CPUs is taken before checking that this irq is
being handled in another processor this may cause a deadlock.  This issue
is only theoretical."

This issue is fixed in linux mainstream by the following patches:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=f72fa707604c015a6625e80f269506032d5430dc
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b42172fc7b569a0ef2b0fa38d71382969074c0e2

I've reproduced this issue when I used RHEL5 kerenl as kdump-kernel (kdump  uses
"irqpoll" option by default).

Comment 1 Vasily Averin 2007-07-08 11:49:34 UTC

Created attachment 158728 [details]
NMI in serial console logs

Comment 2 Dave Anderson 2007-10-30 20:26:16 UTC

Vasily,

> This issue is fixed in linux mainstream by the following patches:
> ...

Can you please confirm:

(1) The resultant RHEL5 patch should look like this:

--- linux-2.6.18.noarch/kernel/irq/handle.c.orig
+++ linux-2.6.18.noarch/kernel/irq/handle.c
@@ -233,10 +233,10 @@ fastcall unsigned int __do_IRQ(unsigned
                spin_unlock(&desc->lock);

                action_ret = handle_IRQ_event(irq, regs, action);
-
-               spin_lock(&desc->lock);
                if (!noirqdebug)
                        note_interrupt(irq, desc, action_ret, regs);
+
+               spin_lock(&desc->lock);
                if (likely(!(desc->status & IRQ_PENDING)))
                        break;
                desc->status &= ~IRQ_PENDING;

and: 

(2) If I provide you a test kernel, that you will be able to reproduce
    the conditions in order to confirm that the patch works?

Thanks,
  Dave Anderson

Comment 3 Vasily Averin 2007-10-31 08:54:18 UTC

Dave,

OpenVZ kernels uses such patch long time ago and I can confirm that it should
fix this issue.

Unfortunately we are unable to reproduce this issue. Of course we can load your
testkernel on the failed node, but its work can guarantee nothing.

Thank you,
Vasily Averin

Comment 4 Dave Anderson 2007-10-31 13:49:43 UTC

Ok, I figured that was probably the case...

Anyway, I appreciate your attaching the console log
of a confirmed case in a RHEL5-based kernel.

Thanks,
  Dave

Comment 5 RHEL Program Management 2007-11-01 18:25:29 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 6 Dave Anderson 2007-11-05 15:28:13 UTC

(In reply to comment #3)
> ... 
> Unfortunately we are unable to reproduce this issue. Of course we can load
> your testkernel on the failed node, but its work can guarantee nothing.
> 
> Thank you,
> Vasily Averin

Hi Vasily,

I understand that this cannot be reproduced on demand, but for the sake
of this bugzilla, can you exercise this test kernel, and verify the patch?

An x86_64 binary rpm, the source rpm, and the linux-kernel-test.patch file
that is applied to the base kernel can be found here:

  http://people.redhat.com/anderson/BZ_247379

Thanks,
  Dave

Comment 7 Vasily Averin 2007-11-12 13:09:48 UTC

Patch looks correct.
Kernel has been running for 3 days already, it is stable and doesn't crash.

Comment 8 Dave Anderson 2007-11-12 14:42:02 UTC

Vasily, 

Thanks for taking the time to test it and check the patch.

Just to be clear on the the specific NMI watchdog timeout
scenario indicated in the attachment from comment #1, can
you confirm the following (or offer up the correct explanation): 

1. The "hald-addon-stor" task was doing a close(), presumably
on the ide cd rom device, which caused it to issue a commands
via ide_outsl() -- when an interrupt of some other device occurred.

(It appears to have been a timer/clock interrupt?)

2. The incoming IRQ took its desc->lock, ACK'd, and then handled,
its IRQ, and then called note_interrupt().

3. note_interrupt() in turn called misrouted_irq(), which
saw the ide-cd interrupt, and called handle_IRQ_event() for
that device.

4. handle_IRQ_event() re-enabled interrupts via local_irq_enable_in_hardirq().

5. we then see the trail through ide_intr(), which eventually
leads to the printk() of "hda: cdrom_pc_intr: The drive appears confused..."

6. the printk() temporarily disables interrupts, but when complete,
re-enables whatever flags status was in play at the time; as soon as
that happened, another (timer?) interrupt of the same type as the 
original kicked in, causing the lock-up.

That make sense?

Comment 9 Vasily Averin 2007-11-13 07:19:22 UTC

Your explanation is correct.
I would add that irq number should be saved in some register. I don't have
vmlinux  for your kernel but assume that it is RCX. RCX=0 ==> it was irq0, i.e
timer interrupt.

Comment 10 Vasily Averin 2007-11-13 07:35:55 UTC

Also I would note that it's very likely that first interrupt was irq0 too:

note_interrupt() calls misrouted_irq() under following conditions:
kernel/irq/spurious.c:note_interrupt()::
if ((irqfixup == 2 && irq == 0) || action_ret == IRQ_NONE)

We have irqfixup == 2 (irqpoll) and very likely irq=0

Comment 11 Dave Anderson 2007-11-13 13:51:26 UTC

I'll see if I can verify the IRQ number by the exception
frame contents.  I was also looking at the stale remnants
on the stack, i.e., the lock_timer_base, _activate_task, 
enqueue_task...

Thanks again for your help.

Comment 14 Don Zickus 2007-12-17 19:36:16 UTC

in 2.6.18-61.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 17 errata-xmlrpc 2008-05-21 14:45:44 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html

Note You need to log in before you can comment on or make changes to this bug.