Description of problem:
A large amount of CPU time is wasted handling MSI interrupts. This degrades performance considerably under high MSI interrupt load. MSI-X cannot be used as a workaround because RHEL5.x supports only 256 interrupt vectors, and MSI-X easily runs out of them. When there are not enough vectors for MSI-X, MSI is used instead.

Analysis of the problem
-----------------------
When an MSI interrupt is generated, the RHEL5.x kernel masks it at the beginning of the interrupt handler (at ack time) if the MSI interrupt is maskable. To mask the interrupt, the kernel writes to PCI configuration space while holding a spin lock. This wastes a large amount of CPU time because:
- PCI config access is very slow.
- PCI config access is serialized among CPUs (it needs the spin lock).

Masking a maskable MSI interrupt is required only when the irq affinity is being changed, so the RHEL5.x behavior (masking every maskable MSI interrupt on every occurrence) is redundant. This was already fixed in the upstream kernel by commit 277bc33bc2479707e88b0b2ae6fe56e8e4aabe81.

Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux Version Number: RHEL5
Release Number: 4
Architecture: x86_64
Kernel Version: 2.6.18-164.9.1.el5
Related Package Version: None
Related Middleware / Application: None

Drivers or hardware or architecture dependency:
System with PCI adapter cards that support maskable MSI. This problem is not processor architecture specific.
The specifically benchmarked system is:
Model: PRIMEQUEST1800E
CPU Info: Xeon X7560 (2.27GHz/8core/24MB L3) * 8
Memory Info: 512GB (DDR3-1066 8GB DIMM * 64)
Hardware Component Information:
- FC: 8GB FC (PCIe/single port) * 28
- LAN: On-board LAN (Intel igb 1000Mbps) * 4
- Storage: 270 Logical Volumes
  * ETERNUS6000(8FC path, 24 RAID groups(RAID0)) * 4
  * ETERNUS3000(2FC path, 1 RAID group(RAID0), 6 RAID groups (RAID5)) * 1

How reproducible:
Every time

Actual Results:
Performance is severely degraded when MSI is used, compared to MSI-X.

Expected results:
Performance with MSI is close to performance with MSI-X.

Additional info:
No sosreport, unfortunately. It is difficult to rebuild the benchmark environment to collect one because the environment was very large (especially the I/O configuration).

About the proposed patch
------------------------
As mentioned above, this issue was fixed in the upstream kernel by commit 277bc33bc2479707e88b0b2ae6fe56e8e4aabe81, which changes the MSI logic to use irq_chip instead of hw_interrupt_type. That change seems too large to backport to RHEL5.x, so the proposed patch instead changes the existing hw_interrupt_type for MSI so that it does not mask maskable interrupts at ack time.

The customer's patch is attached.
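To illustrate the analysis above, here is a minimal user-space sketch in C that only counts config-space writes under the two policies. It is not kernel code and not the attached patch: `struct msi_dev_model`, `model_cfg_write_mask()`, and the handler names are hypothetical, and the unmask at the end of the handler is an assumption (the report only states that the mask happens at ack time).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one MSI-capable device; not a kernel structure. */
struct msi_dev_model {
    bool maskable;        /* device implements the MSI mask bit */
    bool masked;          /* current state of the mask bit */
    uint64_t cfg_writes;  /* count of slow, lock-serialized config writes */
};

/* Stand-in for a PCI config-space write done while holding a spin lock. */
static void model_cfg_write_mask(struct msi_dev_model *d, bool mask)
{
    d->masked = mask;
    d->cfg_writes++;
}

/* Policy described for RHEL5.x: mask at ack time on every interrupt.
 * The matching unmask at the end is an assumption of this sketch. */
static void handle_irq_mask_on_ack(struct msi_dev_model *d)
{
    if (d->maskable)
        model_cfg_write_mask(d, true);
    /* ... the device's interrupt handler would run here ... */
    if (d->maskable)
        model_cfg_write_mask(d, false);
}

/* Proposed policy: no config-space access on the interrupt hot path. */
static void handle_irq_no_mask(struct msi_dev_model *d)
{
    (void)d;
}

/* Masking is still done where it is required: around an affinity change. */
static void set_affinity(struct msi_dev_model *d)
{
    if (d->maskable)
        model_cfg_write_mask(d, true);
    /* ... the MSI address/data would be reprogrammed here ... */
    if (d->maskable)
        model_cfg_write_mask(d, false);
}

/* Deliver n_irqs interrupts, then change affinity once; return the
 * total number of config writes the chosen policy performed. */
static uint64_t count_cfg_writes(void (*handler)(struct msi_dev_model *),
                                 unsigned n_irqs)
{
    struct msi_dev_model d = { .maskable = true, .masked = false,
                               .cfg_writes = 0 };
    for (unsigned i = 0; i < n_irqs; i++)
        handler(&d);
    set_affinity(&d);
    assert(!d.masked);  /* the interrupt must end up unmasked */
    return d.cfg_writes;
}
```

In this model, calling `count_cfg_writes(handle_irq_mask_on_ack, n)` grows linearly with the number of interrupts, while `count_cfg_writes(handle_irq_no_mask, n)` stays constant because only the affinity change touches config space. The real cost per write (config access latency plus lock contention across CPUs) is not modeled.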
Created attachment 419243 [details]
Patch to bring msi performance inline with msi-x

I'm reporting this by proxy on behalf of the SEG engineer with the ticket, in order to keep the ball in motion.
Wade, I think you're right in your analysis. I'll update your patch against RHEL5 latest, do a quick test, and post on RHKL. P.
Created attachment 435098 [details] RHEL5 fix for this issue
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Created attachment 436875 [details] RHEL5 fix for this issue
We probably ought to release-note this so people can discover its existence.
in kernel-2.6.18-211.el5

You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.
FJ has confirmed that the fix works as expected. linux-2.6-pci-msi-add-option-for-lockless-interrupt-mode.patch is correctly applied in kernel 2.6.18-194.14.1.el5.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html