Description of problem:
Echoing a mask to /proc/irq/<irq>/smp_affinity and then checking /proc/interrupts shows that the mask did not take effect.

Version-Release number of selected component (if applicable):
Up to 2.6.9-68.11.EL

How reproducible:
Always

Steps to Reproduce:
1. Select a device with MSI but without masking and pending bit support
2. Find the device's IRQ
3. Change the mask by echoing to /proc/irq/<IRQ>/smp_affinity
4. Check the results in /proc/interrupts

Actual results:
The mask is ignored

Expected results:
IRQs are moved to another CPU

Additional info:
MSI should be disabled when moving IRQs on chips without Mask-and-Pending bits. This patch, based on upstream code, merges two IRQ chip drivers and also adds msi_set_enable() to handle this case.

Flavio
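Step 3 of the reproduction can also be sketched in C. This is only a minimal illustration and not part of the attached patch: the helper names affinity_mask_for_cpu() and write_affinity() are hypothetical, the CPU number is assumed to fit in a 32-bit mask, and the caller supplies the path (on a real system that would be /proc/irq/<IRQ>/smp_affinity, which requires root).

```c
#include <stdio.h>

/* Hypothetical helper: format the hex bitmask that selects a single CPU.
 * Assumes cpu is in the range 0..31. Returns 0 on success, -1 on error. */
int affinity_mask_for_cpu(unsigned int cpu, char *buf, size_t len)
{
	int n;

	if (cpu > 31)
		return -1;
	n = snprintf(buf, len, "%x", 1u << cpu);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: write the mask to an smp_affinity-style file.
 * Returns 0 on success, -1 if the file cannot be opened or written. */
int write_affinity(const char *path, const char *mask)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%s\n", mask) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

After writing the mask, the per-CPU counters for that IRQ in /proc/interrupts show whether delivery actually moved. On the affected kernels the counters keep increasing on the original CPU.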
Created attachment 294622 [details] Patch fixing smp irq affinity
Flavio, was this based on a particular upstream commit? If so can you let me know what it was so I can take a peek? Also, let me know if you (or a customer) have tested this.
Hi Andy,

I've tested this patch on hs21-7995-2.gsslab.rdu.redhat.com and it worked. Still waiting for customer feedback, though.

Actually, it's not based on just one commit, but the main commit is:

commit 58e0543e8f355b32f0778a18858b255adb7402ae
Author: Eric W. Biederman <ebiederm>
Date:   Mon Mar 5 00:30:11 2007 -0800

    [PATCH] msi: support masking msi irqs without a mask bit

    For devices that do not support msi-x we only support 1 interrupt.
    Therefore we can disable that one interrupt by disabling the msi
    capability itself. If we leave the intx interrupts disabled while we
    have the msi capability disabled no interrupts should be delivered
    from that device.

    Devices with just the minimal msi support (and thus hitting this code
    path) include things like the intel e1000 nic, so it looks like is
    going to be a fairly common case and thus important to get right.

    Signed-off-by: Eric W. Biederman <ebiederm>
    Cc: Michael Ellerman <michael.au>
    Cc: Paul Mackerras <paulus>
    Cc: Benjamin Herrenschmidt <benh.org>
    Cc: Greg KH <greg>
    Signed-off-by: Andrew Morton <akpm>
    Signed-off-by: Linus Torvalds <torvalds>

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index c43e7d2..01869b1 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -85,6 +85,8 @@ static void msi_set_mask_bit(unsigned int irq, int flag)
 			mask_bits &= ~(1);
 			mask_bits |= flag;
 			pci_write_config_dword(entry->dev, pos, mask_bits);
+		} else {
+			msi_set_enable(entry->dev, !flag);
 		}
 		break;
 	case PCI_CAP_ID_MSIX:

Flavio
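To make the effect of that hunk easier to follow, here is a small userspace model of the decision it introduces. It is a sketch under stated assumptions, not the kernel code: struct msi_model and model_set_mask_bit() are hypothetical names, the model picks the branch from the capability flag in the control word, and the real function works through PCI config space accessors. The two flag values are the ones defined in the kernel's pci_regs.h.

```c
#include <stdint.h>

/* Bit values as defined in the Linux pci_regs.h header. */
#define PCI_MSI_FLAGS_ENABLE  0x0001  /* MSI feature enabled */
#define PCI_MSI_FLAGS_MASKBIT 0x0100  /* per-vector mask bit supported */

/* Hypothetical stand-in for the registers the real function touches. */
struct msi_model {
	uint16_t control;    /* MSI message control word */
	uint32_t mask_bits;  /* mask register, meaningful only with MASKBIT */
};

/* flag = 1 masks the interrupt, flag = 0 unmasks it. */
void model_set_mask_bit(struct msi_model *m, int flag)
{
	if (m->control & PCI_MSI_FLAGS_MASKBIT) {
		/* Device has a mask register: set or clear bit 0. */
		m->mask_bits &= ~1u;
		m->mask_bits |= (uint32_t)(flag != 0);
	} else if (flag) {
		/* No mask bit: mask by disabling the MSI capability. */
		m->control = (uint16_t)(m->control & ~PCI_MSI_FLAGS_ENABLE);
	} else {
		/* No mask bit: unmask by enabling the MSI capability again. */
		m->control = (uint16_t)(m->control | PCI_MSI_FLAGS_ENABLE);
	}
}
```

As the commit message notes, the fallback only silences the device if INTx stays disabled while MSI is turned off. The model does not represent that part.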
Thanks for the update. There are several MSI issues in rhel5 that need to be addressed as well, so you may want to check to be sure this isn't broken there as well. It is nice to fix both releases at the same time if we can.
Sure, the RHEL5 version is bz#43245, it's also tested and ok. Flavio
Actually, it is bz#432451 - https://bugzilla.redhat.com/show_bug.cgi?id=432451 Flavio
Flavio, I hate to push this out but there are some other msi problems I've noticed in rhel4 (that are related to the rhel5 bug 428696). Can you combine this patch and take a look at those to see if we can put them all in 4.8? Thanks.
Created attachment 304118 [details] Patch based on 1769b46a to fix spurious interrupts
Created attachment 304255 [details] patches combined
Updating PM score.
The RHEL5 bug was closed as a WONTFIX due to hardware limitations, so I suspect this bug will have the same fate.