Bug 468267

Summary: Interrupts presented through multiple P2P PCI Express bridges to the OS are not processed correctly
Product: Red Hat Enterprise Linux 5 Reporter: ldekay <ldekay>
Component: kernel Assignee: Arnd Bergmann <arnd>
Status: CLOSED WONTFIX QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: low Docs Contact:
Priority: low    
Version: 5.2 CC: arnd, hannsj_uhl, ldekay
Target Milestone: rc   
Target Release: ---   
Hardware: ppc64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-06-02 13:09:23 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Tarball of /proc/interrupts and lspci -vvx from rhel 5.2 and fedora 7 none
Tarball of /proc/interrupts and lspci -vvx from rhel 5.2 with AJA on INTA and INTC none

Description ldekay 2008-10-23 19:59:01 UTC
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.17) Gecko/20080829 Firefox/2.0.0.17

It appears that the Red Hat 5.2 kernel processes only interrupts that are mapped to INTA and does not correctly map the INTx virtual wires across multiple P2P (PCI-to-PCI) bridges to the system interrupt resources. Therefore, interrupts that are mapped to INTB, INTC, or INTD are not serviced.

According to the PCIe Base Specification 1.0a, when INTx interrupts are presented across a switch, "Virtual and actual PCI to PCI Bridges must map the virtual wires tracked on the secondary side of the Bridge according to the Device Number of the Device on the secondary side of the Bridge, as shown in Table 2-13" (PCIe Base Specification Rev 1.0a, page 66).

Device Number for Device      INTx Virtual Wire on           Mapping to INTx Virtual
on Secondary Side of          Secondary Side of Bridge       Wire on Primary Side of
Bridge (Interrupt Source)                                    Bridge
____________________________________________________________________________________

0,4,8,12,16,20,24,28          INTA                           INTA
                              INTB                           INTB
                              INTC                           INTC
                              INTD                           INTD
____________________________________________________________________________________

1,5,9,13,17,21,25,29          INTA                           INTB
                              INTB                           INTC
                              INTC                           INTD
                              INTD                           INTA
____________________________________________________________________________________

2,6,10,14,18,22,26,30         INTA                           INTC
                              INTB                           INTD
                              INTC                           INTA
                              INTD                           INTB
____________________________________________________________________________________

3,7,11,15,19,23,27,31         INTA                           INTD
                              INTB                           INTA
                              INTC                           INTB
                              INTD                           INTC
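
The table above amounts to rotating the wire by the device number modulo 4. A minimal Python sketch of the mapping across a single bridge follows; the names PINS and map_to_primary are made up for illustration and do not come from the specification or from any kernel source:

```python
# INTx virtual wires in rotation order.
PINS = ["INTA", "INTB", "INTC", "INTD"]


def map_to_primary(device_number: int, secondary_pin: str) -> str:
    """Return the INTx wire seen on the primary side of one bridge.

    device_number is the Device Number (0..31) of the interrupt source
    on the secondary side of the bridge.
    """
    return PINS[(PINS.index(secondary_pin) + device_number) % 4]


# Print one group of rows per table section (device numbers 0..3 stand in
# for each residue class modulo 4).
for dev in range(4):
    for pin in PINS:
        print(f"device {dev}: {pin} -> {map_to_primary(dev, pin)}")
```

For example, map_to_primary(1, "INTD") gives INTA, matching the second group of rows in the table.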

We have seen in PCI Express traces that the ASSERT_INTx is generated by the PCI Express endpoint, but the DEASSERT_INTx is returned only when the interrupt is passed up as INTA. The failure is not exhibited by the EXACT same hardware/firmware running Fedora with kernel 2.6.22-5.

Reproducible: Always

Steps to Reproduce:
1. The hardware setup must contain multiple PCI Express P2P bridges, and the interrupt generated must be mapped to INTB, INTC, or INTD. INTA will not fail.
2. RHEL 5.2 for PPC64 running on an IBM QS-21 blade with a NextIO N1400-PCM PCI Express High Speed Switch Module (PCM) and the N2800-I/O Consolidation Appliance (ICA) are useful in creating the failing scenario.
3. Set the kernel parameter pci=nomsi so that the PCI Express endpoint generates legacy interrupts instead of MSIs.
4. Force the endpoint to generate an interrupt.
Actual Results:  
The interrupt is never serviced.

Expected Results:  
The interrupt should be serviced.

If hardware is needed to reproduce this issue and further debug, please contact ldekay
It will also be possible to debug further in NextIO's lab and send results to Red Hat.

Comment 1 Arnd Bergmann 2008-10-24 07:39:24 UTC
The problem is not a bug in RHEL5, but rather a known deficiency in the firmware. The QS21 firmware does not formally support P2P bridges at this point because INTx interrupts are known to be misrouted in the device tree information that is passed to the operating system.

If you wish to add support for INTx in QS21 and/or get formal support for P2P bridges in that firmware, please open a support request with your IBM contact.

Comment 2 ldekay 2008-10-24 14:03:20 UTC
Arnd, how can that explain why we have no issues with Fedora 7?

Comment 3 Arnd Bergmann 2008-10-24 14:52:00 UTC
I suspect that on the Fedora 7 system, you had installed the extension card in one of the working slots. It is really hard to tell now because you are running an obscure kernel version on an outdated distribution.

Try reproducing with Fedora 9, or 2.6.27 or at least the latest Fedora 7 kernel (2.6.23.17-88.fc7), and post the output of 'lspci -vvx' on RHEL5.2 and a working kernel.

Also, if you have no issues with Fedora, why insist on using RHEL with a non-default kernel boot option?

Comment 4 ldekay 2008-10-24 16:37:12 UTC
We captured PCIe traces in Fedora 7 and RHEL 5.2 using the same slots. We saw INTA, INTB, INTC, and INTD asserted/deasserted in Fedora 7, but not in RHEL 5.2. In RHEL 5.2, we only see INTA asserted/deasserted; INTB, INTC, and INTD are asserted but we never see them deasserted.

Comment 5 Arnd Bergmann 2008-10-24 17:38:00 UTC
Please attach the output of 'lspci -vvx' and the contents of /proc/interrupts on both systems.

Comment 6 ldekay 2008-10-24 19:52:24 UTC
Created attachment 321461 [details]
Tarball of /proc/interrupts and lspci -vvx from rhel 5.2 and fedora 7

Comment 7 ldekay 2008-10-24 19:55:30 UTC
Added a tarball of /proc/interrupts and lspci -vvx from rhel 5.2 and fedora 7. The PCI Express endpoint of interest is a SysKonnect NIC (eth2) mapped to INTC.

Comment 8 Arnd Bergmann 2008-10-27 12:25:57 UTC
The two listings show the same results:

The attachment lists device 0002:14:00.0 as having interrupt 106, which is INTA of the PCIe host bridge. The device is connected through the bridges 02:00:00.0, 02:01:00.0, 02:02:0a.9, 02:08:00.0 and finally 02:09:0c.0, which means that according to the PCI IRQ swizzling rules you listed in comment #1, it should be 108 (INTC) of the host controller.
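
The expected routing can be worked through by applying the rotation once per bus crossed on the way up. The following is a rough Python sketch, not kernel code. It assumes the bus numbers in the addresses above reflect the nesting of the bridges and that the endpoint signals INTA; the names INTX, swizzle_to_root and KNOWN_IRQ are made up for illustration:

```python
INTX = ["INTA", "INTB", "INTC", "INTD"]

# Only the two IRQ numbers named in this comment; others are not assumed.
KNOWN_IRQ = {"INTA": 106, "INTC": 108}


def swizzle_to_root(pin: str, device_numbers: list[int]) -> str:
    """Rotate the wire by each device number met between endpoint and root."""
    idx = INTX.index(pin)
    for dev in device_numbers:
        idx = (idx + dev) % 4
    return INTX[idx]


# Device numbers seen on each secondary bus, endpoint first:
# 14:00.0 (endpoint), 09:0c.x, 08:00.x, 02:0a.x, 01:00.x.
# The bridge at 00:00.0 sits on the root bus and adds no further rotation here.
path = [0x00, 0x0C, 0x00, 0x0A, 0x00]

root_pin = swizzle_to_root("INTA", path)
print(root_pin, KNOWN_IRQ[root_pin])
```

Only the bridge at device number 0x0a changes the wire (0x0a modulo 4 is 2), which moves INTA to INTC; 0x0c modulo 4 is 0 and leaves it unchanged.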

Moreover, the device has never received any interrupts, indicating that it is not even initialized, although the device driver has clearly been loaded.
From all I can tell, the device is just as broken in Fedora 7 as it is in RHEL.

Comment 9 ldekay 2008-10-27 15:02:35 UTC
The fact that IRQ 106 shows up in /proc/interrupts means that the device has received an interrupt. It would not be in the table if it had not. The difference is that RHEL 5.2 is not getting the deassert back to the device, whereas Fedora does. This is confirmed in the PCI Express traces showing that the interrupt deassert happens in Fedora, but not in RHEL.

I can send you results from testing with another device if it will help to convince you. I have done testing with another device, an AJA video capture card, that shows the same end result. It is broken in RHEL 5.2 and fully functional in Fedora 7. The results are cleaner though. It will show the interrupt counter increasing in Fedora 7, but not in RHEL. It will also show you the same mapping in both OSes, pin INTA routed to IRQ 106. I will have to repeat the tests to capture the output that you want though. Do you want those results captured?
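
One way to check whether the counter is increasing is to take two snapshots of /proc/interrupts and compare the per-CPU totals for the IRQ. A minimal Python sketch follows; the two sample snapshots are made up for illustration and are not taken from the attachments, and the names irq_count and counter_advanced are hypothetical:

```python
def irq_count(snapshot: str, irq: int) -> int:
    """Sum the per-CPU counts for one IRQ line of /proc/interrupts text."""
    lines = snapshot.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == f"{irq}:":
            return sum(int(n) for n in fields[1:1 + ncpus])
    raise KeyError(irq)


def counter_advanced(before: str, after: str, irq: int) -> bool:
    """True if the IRQ was serviced at least once between the snapshots."""
    return irq_count(after, irq) > irq_count(before, irq)


# Made-up sample data, only to show the expected layout.
SAMPLE_BEFORE = """\
           CPU0       CPU1
106:          5          0   EXAMPLE-PIC   Level     eth2
"""
SAMPLE_AFTER = """\
           CPU0       CPU1
106:          9          3   EXAMPLE-PIC   Level     eth2
"""

print(counter_advanced(SAMPLE_BEFORE, SAMPLE_AFTER, 106))
```

On a live system the two strings would come from reading /proc/interrupts twice, a few seconds apart, while the device is generating traffic.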

Although the device driver for the SysKonnect has issues in both kernels that I have not looked into, it is not just as broken in Fedora 7 as it is in RHEL. In Fedora 7, the device is functional as an Ethernet controller and it is not in RHEL 5.2 (DHCP lease, ping, etc. work in Fedora, not in RHEL).

Comment 10 ldekay 2008-10-27 15:25:35 UTC
I have to correct a statement made in comment #9. I have not tested the AJA card in Fedora 7. We do not have a driver available to test the device in that kernel. What I did test was that device mapped to pin INTA and then again to pin INTC in RHEL5.2 and showed that interrupts were hung on INTC. I will attach those results.

Comment 11 ldekay 2008-10-27 16:27:08 UTC
Created attachment 321627 [details]
Tarball of /proc/interrupts and lspci -vvx from rhel 5.2 with AJA on INTA and INTC

This file contains the results of lspci -vvx and cat /proc/interrupts run in 2 different tests: the AJA card on INTA and the AJA card on INTC, both in rhel 5.2

Comment 12 Arnd Bergmann 2008-10-28 12:52:16 UTC
(In reply to comment #11)
> Created an attachment (id=321627) [details]

Thanks, this confirms what I was saying earlier about the firmware. In your 'INTA' listing, INTA of the AJA card is correctly routed to IRQ 106. In your 'INTC' listing, the firmware also routes the INTA line to IRQ 106, but it should be IRQ 108.

Comment 13 Arnd Bergmann 2008-10-28 12:53:32 UTC
(In reply to comment #9)
> In Fedora 7, the device is functional as an ethernet controller and it
> is not in RHEL 5.2 (DHCP lease, ping, etc work in Fedora, not in RHEL).

If it works in Fedora, that is probably the result of an unrelated bug in
Fedora and pure coincidence.

Comment 14 ldekay 2009-02-11 14:04:26 UTC
IBM has confirmed that this is a bug in the QS-21 firmware as explained in comment #1. A bug fix has been provided and confirmed to work correctly.

Comment 15 RHEL Program Management 2014-03-07 13:41:17 UTC
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.

Comment 16 RHEL Program Management 2014-06-02 13:09:23 UTC
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).

Comment 17 Red Hat Bugzilla 2023-09-14 01:13:56 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days