From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050716 Firefox/1.0.6

Description of problem:
This was re-introduced in kernel 2.6.9-34. The patch from 33 to 34 is as follows:

diff -Naur linux-2.6.9-33.EL/arch/x86_64/kernel/mpparse.c linux-2.6.9-34.EL/arch/x86_64/kernel/mpparse.c
--- linux-2.6.9-33.EL/arch/x86_64/kernel/mpparse.c 2006-02-21 12:42:52.000000000 -0500
+++ linux-2.6.9-34.EL/arch/x86_64/kernel/mpparse.c 2006-02-25 09:54:34.000000000 -0500
@@ -969,7 +969,13 @@
 		 */
 		int irq = gsi;
 		if (gsi < MAX_GSI_NUM) {
-			gsi = pci_irq++;
+			if (gsi > 15)
+				gsi = pci_irq++;
+			/*
+			 * Don't assign IRQ used by ACPI SCI
+			 */
+			if (gsi == acpi_fadt.sci_int)
+				gsi = pci_irq++;
 			gsi_to_irq[irq] = gsi;
 		} else {
 			printk(KERN_ERR "GSI %u is too high\n", gsi);

Stratus has a hardware platform with the 8254 timer connected to ioapic pin 1, and a PCI device connected to ioapic pin 0. There is an interrupt source override that maps pin 1 to IRQ0, and before the above patch was applied, the PCI device used to get IRQ153 assigned to it. But now with the patch, the PCI device also gets IRQ0. Not good, considering the differences in triggering characteristics (low-level vs high-edge).

So the result is a non-stop train of IRQ0s making the system time leap forward rapidly. This happens because both pins 0 and 1 are bound together to drive the same IRQ. Without the patch above, these two pins are assigned different IRQs.

It was suggested that we try an upstream kernel, to see if some other change went in along with the above patch that perhaps would avoid this problem. But unfortunately, that was not the case. The upstream kernel also causes the timer to leap forward uncontrollably on Stratus hardware.

Version-Release number of selected component (if applicable):
kernel-2.6.9-34

How reproducible:
Always

Steps to Reproduce:
1. Boot the machine.
2. Watch the system time fly!
3.

Actual Results:
System clock went at warp speed.

Expected Results:
Normal IRQ0 interrupt rate.

Additional info:
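The effect of the 33-to-34 change on a PCI device sitting on ioapic pin 0 can be modeled in a few lines of C. This is only a simplified sketch of the remapping step quoted in the diff, not kernel code: the struct, the function names, the starting counter value of 16, and the SCI value of 9 are all assumptions made for illustration, and the numbers it produces are not expected to match the IRQ153 seen on real hardware.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_GSI_NUM   4096  /* bound used only by this model (assumed value) */
#define FIRST_PCI_IRQ 16    /* assumed starting value of the pci_irq counter */
#define MODEL_SCI_INT 9     /* assumed stand-in for acpi_fadt.sci_int */

/* Running counter, standing in for the kernel's pci_irq variable. */
struct remap_state {
    unsigned int pci_irq;
};

/* Model of 2.6.9-33: every PCI GSI takes the next IRQ in sequence. */
static unsigned int remap_2_6_9_33(struct remap_state *s, unsigned int gsi)
{
    assert(gsi < MAX_GSI_NUM);
    return s->pci_irq++;
}

/* Model of 2.6.9-34: GSIs 0-15 keep their number unless they match the SCI. */
static unsigned int remap_2_6_9_34(struct remap_state *s, unsigned int gsi)
{
    assert(gsi < MAX_GSI_NUM);
    if (gsi > 15)
        gsi = s->pci_irq++;
    if (gsi == MODEL_SCI_INT)
        gsi = s->pci_irq++;
    return gsi;
}

#ifdef IRQ_REMAP_DEMO
int main(void)
{
    struct remap_state old_state = { FIRST_PCI_IRQ };
    struct remap_state new_state = { FIRST_PCI_IRQ };
    const unsigned int timer_irq = 0; /* 8254 on pin 1, overridden to IRQ0 */
    const unsigned int pci_gsi = 0;   /* PCI device on ioapic pin 0 */

    unsigned int old_irq = remap_2_6_9_33(&old_state, pci_gsi);
    unsigned int new_irq = remap_2_6_9_34(&new_state, pci_gsi);

    printf("2.6.9-33 model: PCI device IRQ %u, timer IRQ %u\n",
           old_irq, timer_irq);
    printf("2.6.9-34 model: PCI device IRQ %u, timer IRQ %u%s\n",
           new_irq, timer_irq,
           new_irq == timer_irq ? " (collision)" : "");
    return 0;
}
#endif
```

Building with -DIRQ_REMAP_DEMO enables the small main() above. In the model, the older logic moves the pin 0 device to a fresh IRQ, while the newer logic leaves it on IRQ 0, the same number the override gives the timer.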
This issue is on Red Hat Engineering's list of planned work items for the upcoming Red Hat Enterprise Linux 4.4 release. Engineering resources have been assigned and barring unforeseen circumstances, Red Hat intends to include this item in the 4.4 release.
Created attachment 128071 [details] Patch to work-around VIA chipset work-around
Just lost all my comments! Must "Save Changes" before hitting the "create attachment" button. sigh...

Well, here we go again. I just attached a patch which is based on something I worked out with Natalie Protasevich, who submitted the original VIA chipset work-around that broke our platform (2.6.13, git details below). We came up with an approach that allows the VIA work-around to run as before, but avoids a special case collision of ioapic pins on IRQ0. Once I had Natalie's agreement, I sent this upstream (http://marc.theaimsgroup.com/?l=linux-kernel&m=114557490907738&w=2).

The patch affects both x86_64 and i386, though only the x86_64 portion affects Stratus. I added the i386 portion partly at Natalie's suggestion that it makes the patch more complete (some i386 platforms could have been broken by the VIA work-around too).

The attached patch was generated and tested in 2.6.9-34.20. I tried both the i386 and x86_64 platforms, though admittedly only our x86_64 platform cares about this patch. Still, I wanted to make sure the i386 platform didn't suffer any regressions.

Finally, here are the details from Natalie's git commit, back in 2.6.13:

[PATCH] x86: avoid wasting IRQs patch update

The patch addresses a problem with ACPI SCI interrupt entry, which gets re-used, and the IRQ is assigned to another unrelated device. The patch corrects the code such that SCI IRQ is skipped and duplicate entry is avoided. Second issue came up with VIA chipset, the problem was caused by original patch assigning IRQs starting 16 and up. The VIA chipset uses 4-bit IRQ register for internal interrupt routing, and therefore cannot handle IRQ numbers assigned to its devices. The patch corrects this problem by allowing PCI IRQs below 16.

Signed-off-by: Natalie Protasevich <Natalie.Protasevich>
Signed-off-by: Andrew Morton <akpm>
Signed-off-by: Linus Torvalds <torvalds>
---
commit e1afc3f522ed088405fc8932110d338330db82bb
tree 944b79bef5f73bfe1ea7fc5e89cb9e36562d0929
parent 80625942094b114d85811e5ff1fbc9e06dabe0ff
author Natalie.Protasevich <Natalie.Protasevich> Fri, 29 Jul 2005 14:03:32 -0700
committer Linus Torvalds <torvalds.org> Fri, 29 Jul 2005 15:01:13 -0700

 arch/i386/kernel/mpparse.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/arch/i386/kernel/mpparse.c b/arch/i386/kernel/mpparse.c
index af917f6..ce838ab 100644
--- a/arch/i386/kernel/mpparse.c
+++ b/arch/i386/kernel/mpparse.c
@@ -1116,7 +1116,15 @@ int mp_register_gsi (u32 gsi, int edge_l
 		 */
 		int irq = gsi;
 		if (gsi < MAX_GSI_NUM) {
-			gsi = pci_irq++;
+			if (gsi > 15)
+				gsi = pci_irq++;
+#ifdef CONFIG_ACPI_BUS
+			/*
+			 * Don't assign IRQ used by ACPI SCI
+			 */
+			if (gsi == acpi_fadt.sci_int)
+				gsi = pci_irq++;
+#endif
 			gsi_to_irq[irq] = gsi;
 		} else {
 			printk(KERN_ERR "GSI %u is too high\n", gsi);
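
The attached patch itself is not reproduced in this report, so what follows is only a sketch of the idea described above: keep the VIA rule that GSIs 0-15 retain their numbers, but give a PCI interrupt on pin 0 a fresh IRQ when the timer has been moved off pin 0 by an override. The function name, the struct, the `timer_on_pin_0` flag, and the SCI value of 9 are hypothetical and chosen for illustration; the real patch may express the condition differently.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_SCI_INT 9 /* assumed stand-in for acpi_fadt.sci_int */

/* Running counter, standing in for the kernel's pci_irq variable. */
struct irq_alloc_state {
    unsigned int pci_irq;
};

/*
 * Sketch of a remapping rule that retains the VIA work-around but avoids
 * sharing IRQ0 between the timer and a PCI device on ioapic pin 0.
 * timer_on_pin_0 is a hypothetical flag: true when the 8254 timer really
 * is wired to pin 0, false when an override routes it from another pin.
 */
static unsigned int remap_avoiding_irq0(struct irq_alloc_state *s,
                                        unsigned int gsi,
                                        bool timer_on_pin_0)
{
    /* GSIs above 15 always move; GSI 0 moves only if the timer is elsewhere. */
    if (gsi > 15 || (gsi == 0 && !timer_on_pin_0))
        gsi = s->pci_irq++;

    /* Unchanged from the original change: never hand out the SCI's IRQ. */
    if (gsi == MODEL_SCI_INT)
        gsi = s->pci_irq++;

    return gsi;
}
```

With the flag false, as on the Stratus platform described here, a PCI device on pin 0 would receive the next free IRQ instead of 0, while other GSIs below 16 keep their numbers as the VIA work-around requires.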
Just got this back from Andrew Morton:

The patch titled

     x86/x86_64: avoid IRQ0 ioapic pin collision

has been added to the -mm tree. Its filename is

     x86-x86_64-avoid-irq0-ioapic-pin-collision.patch
Committed in stream U4 build 34.28. A test kernel with this patch is available from http://people.redhat.com/~jbaron/rhel4/
Created attachment 128759 [details]
Upstream patch to Linus's git tree

This just got merged into Linus's tree recently. Not sure when it will appear in the kernel.org builds.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0575.html