Bug 432452 - [RHEL4] Setting IRQ affinity does not work with MSI devices
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.6
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Flavio Leitner
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks: 391511 461297
 
Reported: 2008-02-12 03:34 UTC by Flavio Leitner
Modified: 2018-10-20 00:29 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-01-06 21:15:24 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch fixing smp irq affinity (5.59 KB, patch)
2008-02-12 03:34 UTC, Flavio Leitner
Patch based on 1769b46a to fix spurious interrupts (4.93 KB, patch)
2008-04-29 14:08 UTC, Flavio Leitner
patches combined (10.96 KB, patch)
2008-04-30 16:51 UTC, Flavio Leitner

Description Flavio Leitner 2008-02-12 03:34:32 UTC
Description of problem:
Echoing a mask to /proc/irq/<irq>/smp_affinity and then
checking /proc/interrupts shows that the mask did not take effect.

Version-Release number of selected component (if applicable):
Up to 2.6.9-68.11.EL

How reproducible:
Always

Steps to Reproduce:
1. Select a device that uses MSI but lacks masking and pending-bit support
2. Find the device's IRQ
3. Change the mask by echoing a new value to /proc/irq/<IRQ>/smp_affinity
4. Check the results in /proc/interrupts
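
The steps above can be sketched in user space as follows. This is only an illustration and is not part of the attached patch: the helper names (cpu_mask, irq_counts, set_affinity) are hypothetical, Python is used for brevity, and the mask helper assumes CPU ids below 32 so the value fits in a single hex group.

```python
def cpu_mask(cpus):
    """Return the hex bitmask string for the given CPU ids (assumes ids < 32)."""
    mask = 0
    for cpu in cpus:
        if not 0 <= cpu < 32:
            raise ValueError("this sketch only handles CPU ids 0..31")
        mask |= 1 << cpu
    return format(mask, "x")


def irq_counts(interrupts_text, irq):
    """Return the per-CPU counters for one IRQ from /proc/interrupts text."""
    lines = interrupts_text.splitlines()
    ncpus = len(lines[0].split())  # header row lists CPU0, CPU1, ...
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if label.strip() == str(irq):
            return [int(field) for field in rest.split()[:ncpus]]
    raise KeyError(irq)


def set_affinity(irq, cpus):
    """Write the mask to smp_affinity; needs root, so it is not called here."""
    with open(f"/proc/irq/{irq}/smp_affinity", "w") as f:
        f.write(cpu_mask(cpus))
```

To check whether the mask was honored, sample irq_counts() before and after the change: the counter of the newly selected CPU should start to grow, while on an affected kernel only the original CPU's counter keeps increasing.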
  
Actual results:
The mask is ignored

Expected results:
IRQs moved to another CPU

Additional info:
MSI should be disabled while an IRQ is being moved on chips
without Mask-and-Pending bits. This patch, based on upstream
code, merges two IRQ chip drivers and also adds
msi_set_enable() to handle this case.
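
Whether a device falls into this case can be read from its MSI Message Control word. The following is a minimal sketch, assuming the 16-bit control word has already been read from the device's MSI capability in config space. The bit values follow the kernel's PCI_MSI_FLAGS_* definitions; describe_msi() is a hypothetical helper written in Python for illustration only.

```python
# Bit positions in the MSI Message Control register.
PCI_MSI_FLAGS_ENABLE = 0x0001   # MSI delivery enabled
PCI_MSI_FLAGS_64BIT = 0x0080    # 64-bit message address supported
PCI_MSI_FLAGS_MASKBIT = 0x0100  # per-vector mask and pending bits supported


def describe_msi(control):
    """Decode the fields of the control word that matter for this bug."""
    return {
        "enabled": bool(control & PCI_MSI_FLAGS_ENABLE),
        "addr64": bool(control & PCI_MSI_FLAGS_64BIT),
        "maskable": bool(control & PCI_MSI_FLAGS_MASKBIT),
    }
```

A device for which "maskable" is False is the kind described in step 1 of the reproduction: it cannot be masked per vector, so the only way to quiesce it is to clear the enable bit.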

Flavio

Comment 1 Flavio Leitner 2008-02-12 03:34:32 UTC
Created attachment 294622
Patch fixing smp irq affinity

Comment 2 Andy Gospodarek 2008-02-13 14:14:45 UTC
Flavio, was this based on a particular upstream commit?  If so can you let me
know what it was so I can take a peek?

Also, let me know if you (or a customer) have tested this.

Comment 3 Flavio Leitner 2008-02-13 19:51:37 UTC
Hi Andy,

I've tested this patch on hs21-7995-2.gsslab.rdu.redhat.com and it worked.
Still waiting for customer feedback, though.

Actually, it's not based on just one commit but the main commit is:

commit 58e0543e8f355b32f0778a18858b255adb7402ae
Author: Eric W. Biederman <ebiederm>
Date:   Mon Mar 5 00:30:11 2007 -0800

    [PATCH] msi: support masking msi irqs without a mask bit

    For devices that do not support msi-x we only support 1 interrupt.
    Therefore we can disable that one interrupt by disabling the msi
    capability itself.  If we leave the intx interrupts disabled while we
    have the msi capability disabled no interrupts should be delivered from
    that device.

    Devices with just the minimal msi support (and thus hitting this code path)
    include things like the intel e1000 nic, so it looks like is going to be a
    fairly common case and thus important to get right.

    Signed-off-by: Eric W. Biederman <ebiederm>
    Cc: Michael Ellerman <michael.au>
    Cc: Paul Mackerras <paulus>
    Cc: Benjamin Herrenschmidt <benh.org>
    Cc: Greg KH <greg>
    Signed-off-by: Andrew Morton <akpm>
    Signed-off-by: Linus Torvalds <torvalds>

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index c43e7d2..01869b1 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -85,6 +85,8 @@ static void msi_set_mask_bit(unsigned int irq, int flag)
                        mask_bits &= ~(1);
                        mask_bits |= flag;
                        pci_write_config_dword(entry->dev, pos, mask_bits);
+               } else {
+                       msi_set_enable(entry->dev, !flag);
                }
                break;
        case PCI_CAP_ID_MSIX:
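
The effect of the added else branch can be modeled in user space. This is a sketch under stated assumptions, not the kernel code: FakeMsiDevice is a hypothetical stand-in for the device's config space, the check on PCI_MSI_FLAGS_MASKBIT stands in for the kernel's own record of whether the device has a mask bit, and Python is used only so the logic can be run outside the kernel.

```python
PCI_MSI_FLAGS_ENABLE = 0x0001   # MSI enable bit in Message Control
PCI_MSI_FLAGS_MASKBIT = 0x0100  # device implements per-vector masking


class FakeMsiDevice:
    """Hypothetical model of the two registers the hunk touches."""

    def __init__(self, control):
        self.control = control  # Message Control word
        self.mask_bits = 0      # mask register, meaningful only with MASKBIT


def msi_set_enable(dev, enable):
    """Clear the enable bit, then set it again if enable is true."""
    dev.control &= ~PCI_MSI_FLAGS_ENABLE
    if enable:
        dev.control |= PCI_MSI_FLAGS_ENABLE


def msi_set_mask_bit(dev, flag):
    """flag=1 masks the interrupt, flag=0 unmasks it."""
    if dev.control & PCI_MSI_FLAGS_MASKBIT:
        # Device has a mask register: update bit 0 for its single vector.
        dev.mask_bits = (dev.mask_bits & ~1) | flag
    else:
        # No mask register: fall back to toggling the whole MSI capability.
        msi_set_enable(dev, not flag)
```

In this model, masking a device without the mask bit clears its enable bit and unmasking sets it again, which is the behavior the commit message describes for minimal MSI implementations.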

Flavio


Comment 4 Andy Gospodarek 2008-02-13 20:23:57 UTC
Thanks for the update.  There are several MSI issues in rhel5 that need to be addressed as well, so you may want to check to be sure this isn't broken there as well.  It is nice to fix both releases at the same time if we can.

Comment 5 Flavio Leitner 2008-02-14 23:21:09 UTC
Sure, the RHEL5 version is bz#43245, it's also tested and ok.
Flavio

Comment 6 Flavio Leitner 2008-02-14 23:22:18 UTC
Actually, it is bz#432451 - https://bugzilla.redhat.com/show_bug.cgi?id=432451
Flavio

Comment 7 Andy Gospodarek 2008-03-25 20:45:14 UTC
Flavio,

I hate to push this out but there are some other msi problems I've noticed in
rhel4 (that are related to the rhel5 bug 428696).

Can you combine this patch and take a look at those to see if we can put them
all in 4.8?

Thanks.

Comment 10 Flavio Leitner 2008-04-29 14:08:13 UTC
Created attachment 304118
Patch based on 1769b46a to fix spurious interrupts

Comment 11 Flavio Leitner 2008-04-30 16:51:02 UTC
Created attachment 304255
patches combined

Comment 12 RHEL Program Management 2008-09-03 12:55:06 UTC
Updating PM score.

Comment 14 Andy Gospodarek 2008-09-24 21:31:21 UTC
The RHEL5 bug was closed as a WONTFIX due to hardware limitations, so I suspect this bug will have the same fate.

