Bug 432452 - [RHEL4] Setting IRQ affinity does not work with MSI devices
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Hardware: All Linux
Priority: high Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Flavio Leitner
QA Contact: Martin Jenner
Depends On:
Blocks: 391511 461297
Reported: 2008-02-11 22:34 EST by Flavio Leitner
Modified: 2010-10-22 18:25 EDT

Doc Type: Bug Fix
Last Closed: 2009-01-06 16:15:24 EST

Attachments
Patch fixing smp irq affinity (5.59 KB, patch)
2008-02-11 22:34 EST, Flavio Leitner
Patch based on 1769b46a to fix spurious interrupts (4.93 KB, patch)
2008-04-29 10:08 EDT, Flavio Leitner
patches combined (10.96 KB, patch)
2008-04-30 12:51 EDT, Flavio Leitner

Description Flavio Leitner 2008-02-11 22:34:32 EST
Description of problem:
After echoing a mask to /proc/irq/<irq>/smp_affinity, checking
/proc/interrupts shows that the mask did not take effect.

Version-Release number of selected component (if applicable):
Up to 2.6.9-68.11.EL

How reproducible:

Steps to Reproduce:
1. Select an MSI device that lacks masking and pending-bit support
2. Find the device's IRQ
3. Change the mask by echoing a new value to /proc/irq/<IRQ>/smp_affinity
4. Check the results in /proc/interrupts
Actual results:
The mask is ignored

Expected results:
IRQs moved to another CPU
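
The reproduction steps above can be sketched in C for anyone who wants to script the check. This is only an illustration, not part of the attached patches: the function names (cpu_to_mask, set_irq_affinity, parse_irq_line) are hypothetical, the mask helper assumes fewer than 32 CPUs, and the parser assumes the usual "<irq>: <count per CPU>... <chip> <device>" layout of /proc/interrupts.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build the hex mask string for a single CPU, e.g. CPU 1 -> "2".
 * Assumption: only CPUs 0..31 (one 32-bit mask word) are handled. */
int cpu_to_mask(unsigned int cpu, char *buf, size_t len)
{
    assert(buf != NULL);
    if (cpu >= 32)
        return -1;
    snprintf(buf, len, "%x", 1u << cpu);
    return 0;
}

/* Step 3: write the mask to <proc_root>/irq/<irq>/smp_affinity.
 * proc_root is normally "/proc"; it is a parameter so the helper can
 * be pointed at a scratch directory. Returns 0 on success, -1 on error. */
int set_irq_affinity(const char *proc_root, int irq, unsigned int cpu)
{
    char path[256];
    char mask[16];
    FILE *f;
    int ok;

    if (cpu_to_mask(cpu, mask, sizeof mask) != 0)
        return -1;
    snprintf(path, sizeof path, "%s/irq/%d/smp_affinity", proc_root, irq);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    ok = fprintf(f, "%s\n", mask) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Step 4: parse one line of /proc/interrupts. If the line belongs to
 * `irq`, store up to max_cpus per-CPU counters in counts[] and return
 * how many were read; otherwise return -1. Parsing stops at the first
 * non-numeric field (the interrupt chip name). */
int parse_irq_line(const char *line, int irq,
                   unsigned long *counts, int max_cpus)
{
    char *end;
    long n = strtol(line, &end, 10);
    int i = 0;

    if (end == line || *end != ':' || n != irq)
        return -1;
    end++;
    while (i < max_cpus) {
        char *next;
        unsigned long v = strtoul(end, &next, 10);
        if (next == end)
            break;
        counts[i++] = v;
        end = next;
    }
    return i;
}
```

Sampling the counters with parse_irq_line() before and after calling set_irq_affinity("/proc", irq, cpu) shows whether new interrupts are landing on the requested CPU; on an affected kernel only the original CPU's counter keeps growing.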

Additional info:
On devices without Mask-and-Pending bits, MSI should be disabled
while the IRQ is being moved. This patch, based on upstream code,
merges two IRQ chip drivers and adds msi_set_enable() to
handle this case.

Comment 1 Flavio Leitner 2008-02-11 22:34:32 EST
Created attachment 294622
Patch fixing smp irq affinity
Comment 2 Andy Gospodarek 2008-02-13 09:14:45 EST
Flavio, was this based on a particular upstream commit?  If so can you let me
know what it was so I can take a peek?

Also, let me know if you (or a customer) have tested this.
Comment 3 Flavio Leitner 2008-02-13 14:51:37 EST
Hi Andy,

I've tested this patch on hs21-7995-2.gsslab.rdu.redhat.com and it worked.
Still waiting for customer feedback, though.

Actually, it's not based on just one commit but the main commit is:

commit 58e0543e8f355b32f0778a18858b255adb7402ae
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Mon Mar 5 00:30:11 2007 -0800

    [PATCH] msi: support masking msi irqs without a mask bit

    For devices that do not support msi-x we only support 1 interrupt.
    Therefore we can disable that one interrupt by disabling the msi
    capability itself.  If we leave the intx interrupts disabled while we
    have the msi capability disabled no interrupts should be delivered
    from that device.

    Devices with just the minimal msi support (and thus hitting this code path)
    include things like the intel e1000 nic, so it looks like is going to be a
    fairly common case and thus important to get right.

    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Cc: Michael Ellerman <michael@ellerman.id.au>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Greg KH <greg@kroah.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index c43e7d2..01869b1 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -85,6 +85,8 @@ static void msi_set_mask_bit(unsigned int irq, int flag)
                        mask_bits &= ~(1);
                        mask_bits |= flag;
                        pci_write_config_dword(entry->dev, pos, mask_bits);
+               } else {
+                       msi_set_enable(entry->dev, !flag);
        case PCI_CAP_ID_MSIX:

Comment 4 Andy Gospodarek 2008-02-13 15:23:57 EST
Thanks for the update.  There are several MSI issues in rhel5 that need to be addressed as well, so you may want to check that this isn't broken there too.  It is nice to fix both releases at the same time if we can.
Comment 5 Flavio Leitner 2008-02-14 18:21:09 EST
Sure, the RHEL5 version is bz#43245, it's also tested and ok.
Comment 6 Flavio Leitner 2008-02-14 18:22:18 EST
Actually, it is bz#432451 - https://bugzilla.redhat.com/show_bug.cgi?id=432451
Comment 7 Andy Gospodarek 2008-03-25 16:45:14 EDT

I hate to push this out but there are some other msi problems I've noticed in
rhel4 (that are related to the rhel5 bug 428696).

Can you combine this patch and take a look at those to see if we can put them
all in 4.8?

Comment 10 Flavio Leitner 2008-04-29 10:08:13 EDT
Created attachment 304118
Patch based on 1769b46a to fix spurious interrupts
Comment 11 Flavio Leitner 2008-04-30 12:51:02 EDT
Created attachment 304255
patches combined
Comment 12 RHEL Product and Program Management 2008-09-03 08:55:06 EDT
Updating PM score.
Comment 14 Andy Gospodarek 2008-09-24 17:31:21 EDT
The RHEL5 bug was closed as a WONTFIX due to hardware limitations, so I suspect this bug will have the same fate.
