Bug 184254 - PCI interrupts on ioapic pins 0-15 always get "legacy" IRQs.
Summary: PCI interrupts on ioapic pins 0-15 always get "legacy" IRQs.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
: ---
Assignee: Brian Maly
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 181409 184261 185624
 
Reported: 2006-03-07 18:44 UTC by Kimball Murray
Modified: 2007-11-30 22:07 UTC

Fixed In Version: RHSA-2006-0575
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-08-10 22:31:31 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch to work-around VIA chipset work-around (3.70 KB, patch)
2006-04-21 00:24 UTC, Kimball Murray
Upstream patch to Linus's git tree (5.65 KB, patch)
2006-05-08 18:38 UTC, Kimball Murray


Links
System: Red Hat Product Errata
ID: RHSA-2006:0575
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 4
Last Updated: 2006-08-10 04:00:00 UTC

Description Kimball Murray 2006-03-07 18:44:39 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050716 Firefox/1.0.6

Description of problem:
This was re-introduced in kernel 2.6.9-34.  The patch from 33 to 34 is as follows:

diff -Naur linux-2.6.9-33.EL/arch/x86_64/kernel/mpparse.c linux-2.6.9-34.EL/arch/x86_64/kernel/mpparse.c
--- linux-2.6.9-33.EL/arch/x86_64/kernel/mpparse.c      2006-02-21 12:42:52.000000000 -0500
+++ linux-2.6.9-34.EL/arch/x86_64/kernel/mpparse.c      2006-02-25 09:54:34.000000000 -0500
@@ -969,7 +969,13 @@
                 */
                int irq = gsi;
                if (gsi < MAX_GSI_NUM) {
-                       gsi = pci_irq++;
+                       if (gsi > 15)
+                               gsi = pci_irq++;
+                       /*
+                        * Don't assign IRQ used by ACPI SCI
+                        */
+                       if (gsi == acpi_fadt.sci_int)
+                               gsi = pci_irq++;
                        gsi_to_irq[irq] = gsi;
                } else {
                        printk(KERN_ERR "GSI %u is too high\n", gsi);

Stratus has a hardware platform with the 8254 timer connected to ioapic pin 1 and a PCI device connected to ioapic pin 0.  An interrupt source override maps pin 1 to IRQ0.  Before the above patch was applied, the PCI device was assigned IRQ153.  With the patch, the PCI device also gets IRQ0, so pins 0 and 1 are both bound to the same IRQ.  That is a problem given the different triggering characteristics of the two sources (level-triggered, active-low for PCI vs. edge-triggered, active-high for the timer).  The result is a non-stop train of IRQ0s that makes the system time leap forward rapidly.  Without the patch above, the two pins are assigned different IRQs.
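To make the collision concrete, here is a small user-space model of the assignment logic in the hunk above. It is only a sketch, not the kernel code: the struct, the function names, the table size, the starting value of 16 for pci_irq and the SCI value passed by the caller are all assumptions made for illustration.

```c
#include <assert.h>

#define MODEL_MAX_GSI  64   /* assumed table size, for this sketch only */
#define FIRST_PCI_IRQ  16   /* assumed first IRQ handed out to remapped GSIs */
#define TIMER_IRQ       0   /* the override maps the timer (pin 1) to IRQ0 */

/* Hypothetical stand-in for the kernel's gsi_to_irq[] and pci_irq state. */
struct irq_model {
    int gsi_to_irq[MODEL_MAX_GSI];
    int pci_irq;   /* next IRQ number to hand out */
    int sci_int;   /* IRQ reserved for the ACPI SCI */
};

static void model_init(struct irq_model *m, int sci_int)
{
    for (int i = 0; i < MODEL_MAX_GSI; i++)
        m->gsi_to_irq[i] = -1;
    m->pci_irq = FIRST_PCI_IRQ;
    m->sci_int = sci_int;
}

/* 2.6.9-33 behaviour: every PCI GSI gets a fresh IRQ number. */
static int register_gsi_33(struct irq_model *m, int gsi)
{
    int pin = gsi;

    assert(gsi >= 0 && gsi < MODEL_MAX_GSI);
    gsi = m->pci_irq++;
    m->gsi_to_irq[pin] = gsi;
    return gsi;
}

/* 2.6.9-34 behaviour: GSIs 0-15 keep their number unless it is the SCI. */
static int register_gsi_34(struct irq_model *m, int gsi)
{
    int pin = gsi;

    assert(gsi >= 0 && gsi < MODEL_MAX_GSI);
    if (gsi > 15)
        gsi = m->pci_irq++;
    if (gsi == m->sci_int)
        gsi = m->pci_irq++;
    m->gsi_to_irq[pin] = gsi;
    return gsi;
}
```

In this model, registering the PCI device on pin 0 with register_gsi_33() returns a number different from TIMER_IRQ, while register_gsi_34() returns 0, the same IRQ the timer already owns through the override.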

It was suggested that we try an upstream kernel, to see if some other change went in along with the above patch that perhaps would avoid this problem.  But unfortunately, that was not the case.  The upstream kernel also causes the timer to leap forward uncontrollably on Stratus hardware.

Version-Release number of selected component (if applicable):
kernel-2.6.9-34

How reproducible:
Always

Steps to Reproduce:
1. Boot the machine.
2. Watch the system time fly!
  

Actual Results:  System clock went at warp speed.

Expected Results:  Normal IRQ0 interrupt rate.

Additional info:

Comment 4 Bob Johnson 2006-04-11 16:38:29 UTC
This issue is on Red Hat Engineering's list of planned work items 
for the upcoming Red Hat Enterprise Linux 4.4 release.  Engineering 
resources have been assigned and barring unforeseen circumstances, Red 
Hat intends to include this item in the 4.4 release.

Comment 5 Kimball Murray 2006-04-21 00:24:12 UTC
Created attachment 128071
Patch to work-around VIA chipset work-around

Comment 6 Kimball Murray 2006-04-21 00:34:16 UTC
Just lost all my comments!  Must "Save Changes" before hitting the "create
attachment" button.  sigh...

Well, here we go again.  I just attached a patch which is based on something I
worked out with Natalie Protasevich, who submitted the original VIA chipset
work-around that broke our platform (2.6.13, git details below).  We came up
with an approach that allows the VIA work-around to run as before, but avoids a
special case collision of ioapic pins on IRQ0.  Once I had Natalie's agreement,
I sent this upstream
(http://marc.theaimsgroup.com/?l=linux-kernel&m=114557490907738&w=2).
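The attached patch itself is not reproduced in this report.  As a rough illustration of what "let the VIA work-around run as before, but avoid the pin collision on IRQ0" could look like, here is a user-space sketch.  The flag name, the helper name and the idea of recording whether the timer sits on pin 0 are assumptions for illustration, not a copy of the attachment.

```c
#include <assert.h>

/*
 * Hypothetical flag: non-zero when the timer really is wired to ioapic
 * pin 0, in which case GSI 0 may keep IRQ0.
 */
static int timer_uses_ioapic_pin_0;

/*
 * Sketch of the IRQ choice for a PCI GSI.
 *  - GSIs above 15 are remapped, as before.
 *  - GSIs 1-15 keep their number (the VIA work-around is untouched).
 *  - GSI 0 is remapped unless the timer owns pin 0, so a PCI device on
 *    pin 0 cannot end up sharing IRQ0 with a timer on another pin.
 *  - The IRQ reserved for the ACPI SCI is always skipped.
 * next_pci_irq points at the caller's "next free IRQ" counter.
 */
static int pick_irq(int gsi, int *next_pci_irq, int sci_int)
{
    assert(gsi >= 0 && next_pci_irq != 0);

    if (gsi > 15 || (gsi == 0 && !timer_uses_ioapic_pin_0))
        gsi = (*next_pci_irq)++;
    if (gsi == sci_int)
        gsi = (*next_pci_irq)++;
    return gsi;
}
```

With the flag clear (timer on pin 1, as on the Stratus platform), pick_irq(0, ...) hands out a fresh IRQ instead of 0, while a GSI such as 5 still maps to IRQ 5.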

The patch affects both x86_64 and i386, though only the x86_64 portion affects
Stratus.  I added the i386 partly at Natalie's suggestion that it makes the
patch more complete (some i386 platforms could have been broken by the VIA
work-around too).

The attached patch was generated and tested in 2.6.9-34.20.  I tried both the
i386 and x86_64 platforms, though admittedly only our x86_64 platform cares
about this patch.  Still, I wanted to make sure the i386 platform didn't suffer
any regressions. 

Finally, here are the details from Natalie's git commit, back in 2.6.13:

[PATCH] x86: avoid wasting IRQs patch update

The patch addresses a problem with ACPI SCI interrupt entry, which gets
re-used, and the IRQ is assigned to another unrelated device.  The patch
corrects the code such that SCI IRQ is skipped and duplicate entry is
avoided.  Second issue came up with VIA chipset, the problem was caused by
original patch assigning IRQs starting 16 and up.  The VIA chipset uses
4-bit IRQ register for internal interrupt routing, and therefore cannot
handle IRQ numbers assigned to its devices.  The patch corrects this
problem by allowing PCI IRQs below 16.

Signed-off-by: Natalie Protasevich <Natalie.Protasevich>

Signed-off-by: Andrew Morton <akpm>
Signed-off-by: Linus Torvalds <torvalds>

---
commit e1afc3f522ed088405fc8932110d338330db82bb
tree 944b79bef5f73bfe1ea7fc5e89cb9e36562d0929
parent 80625942094b114d85811e5ff1fbc9e06dabe0ff
author Natalie.Protasevich <Natalie.Protasevich> Fri, 29 Jul 2005 14:03:32 -0700
committer Linus Torvalds <torvalds.org> Fri, 29 Jul 2005 15:01:13 -0700

 arch/i386/kernel/mpparse.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/arch/i386/kernel/mpparse.c b/arch/i386/kernel/mpparse.c
index af917f6..ce838ab 100644
--- a/arch/i386/kernel/mpparse.c
+++ b/arch/i386/kernel/mpparse.c
@@ -1116,7 +1116,15 @@ int mp_register_gsi (u32 gsi, int edge_l
                 */
                int irq = gsi;
                if (gsi < MAX_GSI_NUM) {
-                       gsi = pci_irq++;
+                       if (gsi > 15)
+                               gsi = pci_irq++;
+#ifdef CONFIG_ACPI_BUS
+                       /*
+                        * Don't assign IRQ used by ACPI SCI
+                        */
+                       if (gsi == acpi_fadt.sci_int)
+                               gsi = pci_irq++;
+#endif
                        gsi_to_irq[irq] = gsi;
                } else {
                        printk(KERN_ERR "GSI %u is too high\n", gsi);



Comment 7 Kimball Murray 2006-04-21 18:09:59 UTC
Just got this back from Andrew Morton:

The patch titled

    x86/x86_64: avoid IRQ0 ioapic pin collision

has been added to the -mm tree.  Its filename is

    x86-x86_64-avoid-irq0-ioapic-pin-collision.patch


Comment 9 Jason Baron 2006-05-03 17:41:17 UTC
committed in stream U4 build 34.28. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 10 Kimball Murray 2006-05-08 18:38:18 UTC
Created attachment 128759
Upstream patch to Linus's git tree

This just got merged into Linus' tree recently.  Not sure when it will appear
in the kernel.org builds.

Comment 13 Red Hat Bugzilla 2006-08-10 22:31:32 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0575.html


