Description of problem:
If you rename an Ethernet device to something like "dtc12", the irqbalance daemon will not balance the IRQs for that device. It fails to detect the device class as Ethernet and instead categorizes it as "other". This is essentially the same bug and problem as bug 682211. That bug refers to the RHEL 6 method of renaming a device.

Version-Release number of selected component (if applicable):
irqbalance-0.55-15.el5

How reproducible:
Always

Steps to Reproduce:
1. system-config-network-tui - Rename an Ethernet device to something that does not start with "eth".
2. Reboot

Actual results:
Watching /proc/interrupts (watch -d cat /proc/interrupts) shows that the interrupts for the NICs stay on a single core:

           CPU0       CPU1       CPU2       CPU3
    ...
    130:       1394          0          0          0   PCI-MSI  dtc12

If you run irqbalance in debug mode (irqbalance --debug), it displays that interrupt as class other:

    Package 0: cpu mask is 0000000f (workload 0)
    Cache domain 2: cpu mask is 0000000c (workload 0)
    CPU number 3 (workload 0)
    CPU number 2 (workload 0)
    Cache domain 0: cpu mask is 00000003 (workload 0)
    CPU number 1 (workload 0)
    CPU number 0 (workload 0)
    ...
    Interrupt 130 (class other) has workload 12

Expected results:
The NICs should be balanced to different CPUs, and may change CPUs depending on load.

cat /proc/interrupts:

           CPU0       CPU1       CPU2       CPU3
    ...
    146:        286          0          0       1317   PCI-MSI  dtc12

irqbalance --debug:

    Package 0: cpu mask is 0000000f (workload 0)
    Cache domain 2: cpu mask is 0000000c (workload 0)
    CPU number 3 (workload 0)
    CPU number 2 (workload 0)
    Cache domain 0: cpu mask is 00000003 (workload 0)
    CPU number 1 (workload 0)
    CPU number 0 (workload 0)
    ...
    Interrupt 146 (class ethernet) has workload 9

Additional info:
The cause of the problem is the irqbalance's function find_class and the struct ethernet_modules in classify.c. This method will only find an Ethernet device if its name contains any of the strings "eth", "e100", "eepro100", "orinico_cs", "wvlan_cs", "3c5", "HiSax".
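To make the failure mode concrete, here is a minimal Python sketch of the name-based matching described above. It is only a model of the behaviour reported for find_class, not the C code from classify.c: the function name classify_by_name is hypothetical, and the substring list is copied from the description.

```python
# Substrings quoted in the report as the contents of ethernet_modules.
ETHERNET_MODULES = ["eth", "e100", "eepro100", "orinico_cs",
                    "wvlan_cs", "3c5", "HiSax"]


def classify_by_name(irq_name):
    """Hypothetical stand-in for find_class: substring match on the IRQ name."""
    for needle in ETHERNET_MODULES:
        if needle in irq_name:
            return "ethernet"
    # No known substring found, so the IRQ falls through to the generic class.
    return "other"


if __name__ == "__main__":
    for name in ("eth0", "dtc12", "vl666-5"):
        print(name, "->", classify_by_name(name))
```

Under this model a renamed NIC such as "dtc12" can never be classified as Ethernet, because the decision depends only on the interface name and not on the device itself.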
If you remove the code commenting out the Ethernet handling in irqbalance's numa.c:pci_numa_scan(), irqbalance will detect most of the NIC IRQs correctly and handle them. This will work for PCI-MSI and APIC-level IRQs, but will not work for the multiple IRQs in some PCI-MSI-X devices (bnx2).
The patch https://bugzilla.redhat.com/attachment.cgi?id=516487 from bug #682211 could probably be applied to this problem as well.
Jeremy, did the patch fix your issue? Thanks! Petr H
Petr, I could not apply the patch you supplied. The changes to irqbalance.h and network.c failed because there are no functions called "dev_to_node" or "dev_to_bus" in the source code I have. I tried applying the patch to the src RPM for irqbalance-0.55-15.el5 on RHEL 5. I also checked the source for the RHEL 6 package and the irqbalance.org .56 source for the dev_to_bus function and didn't see it. Thanks, Jeremy
Apologies, the patch was for the upstream SVN trunk at: http://irqbalance.googlecode.com/svn/trunk/ But if you want to use the RHEL5 RPM, just let me know and I will prepare a test build with the backported patch for you. Thanks! Petr H
I'm not having much luck compiling irqbalance directly from SVN. If you can provide a backported SRC RPM (or the binary RPM), I can test it. I'll continue to work on the compile as I have time over the next few days. Thanks, Jeremy
Created attachment 521038
SRPM with backported netdevs patch
(In reply to comment #7)
> Created attachment 521038
> SRPM with backported netdevs patch

The compilation problem was caused by the older version of numactl-devel in RHEL5. I backported only the bits of code related to your issue, so this SRPM should be fine. Thanks! Petr H
Petr, I was able to compile, install, and test the SRPM you attached. This patch fixes the problem. Before the install, my NIC was detected as 'other' ("Interrupt 90 (class other) has workload 3"). After the install, it is detected as 'ethernet' ("Interrupt 90 (class ethernet) has workload 20"). I debugged it with "sudo irqbalance --debug" to verify the class selection. Thanks, Jeremy
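The before/after check described above can be scripted. The following Python sketch assumes the debug lines have exactly the form quoted in this report ("Interrupt N (class X) has workload W"); the helper name parse_irq_classes is hypothetical, and the debug output is expected to have been captured to a string beforehand.

```python
import re

# Matches lines such as: Interrupt 90 (class ethernet) has workload 20
IRQ_LINE = re.compile(r"Interrupt (\d+) \(class (\w+)\) has workload (\d+)")


def parse_irq_classes(debug_output):
    """Return {irq_number: class_name} for every matching debug line."""
    return {int(m.group(1)): m.group(2)
            for m in IRQ_LINE.finditer(debug_output)}


if __name__ == "__main__":
    before = "Interrupt 90 (class other) has workload 3\n"
    after = "Interrupt 90 (class ethernet) has workload 20\n"
    print(parse_irq_classes(before), parse_irq_classes(after))
```

Comparing the two dictionaries for the NIC's IRQ number shows whether the class changed from "other" to "ethernet" after installing the patched package.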
I confirm this bug for the current RHN RHEL5 provided package irqbalance-0.55-15.el5:

/proc/interrupts
    ...
    52:  1668207        0        0        0        0        0        0        0  PCI-MSI-X  vl666-5
    59:  6678541        0        0        0        0        0        0        0  PCI-MSI-X  iscsi-0
    60:  1843608        0        0        0        0        0        0        0  PCI-MSI-X  vl666-6
    67:  2569910        0        0        0        0        0        0        0  PCI-MSI-X  iscsi-1
    68:  2428199        0        0        0        0        0        0        0  PCI-MSI-X  vl666-7
    75:  2277454        0        0        0        0        0        0        0  PCI-MSI-X  iscsi-2
    83:  2656553        0        0        0        0        0        0        0  PCI-MSI-X  iscsi-3
    ...

The bugfix srpm (irqbalance-0.55-15.netdevs.el5) provided by Petr fixes this issue:

/proc/interrupts
    ...
    52:    28075    32075    28756    97089     6812    15596    16399    49034  PCI-MSI-X  vl666-5
    59:    82219   125452    85254    71544   222417   222170    69186   108966  PCI-MSI-X  iscsi-0
    60:      101    42262    34036    61607    72672     6124    22149    91436  PCI-MSI-X  vl666-6
    67:    92877   124405    93098    15893    80090    30386    31874    31852  PCI-MSI-X  iscsi-1
    68:     7280      380    25081    30239    26496    44053   113230    64850  PCI-MSI-X  vl666-7
    75:    31843    31822   108718   137668        0    15853    15934    60951  PCI-MSI-X  iscsi-2
    83:    15921    30470        0        0   149690    60925    27957    15934  PCI-MSI-X  iscsi-3
    ...

Thx & Kind Regards, Roland
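For anyone checking many interfaces at once, the pattern shown in the listings above (all counts for an IRQ on a single CPU) can be detected automatically. This is a rough Python sketch, assuming the usual /proc/interrupts layout with a header row of CPU names followed by one row per IRQ; the function name single_cpu_irqs is hypothetical.

```python
def single_cpu_irqs(interrupts_text):
    """Return (irq, device) pairs whose counts are non-zero on exactly one CPU."""
    lines = interrupts_text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    stuck = []
    for line in lines[1:]:
        fields = line.split()
        if not fields or not fields[0].endswith(":"):
            continue  # skip elisions and anything that is not an IRQ row
        counts = fields[1:1 + ncpus]
        if len(counts) < ncpus or not all(c.isdigit() for c in counts):
            continue  # rows without a full set of per-CPU counters
        busy = [int(c) for c in counts if int(c) > 0]
        if len(busy) == 1:
            stuck.append((fields[0].rstrip(":"), fields[-1]))
    return stuck


SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
130:       1394          0          0          0   PCI-MSI  dtc12
146:        286          0          0       1317   PCI-MSI  dtc12
"""

if __name__ == "__main__":
    print(single_cpu_irqs(SAMPLE))
```

On a live system the input would be the contents of /proc/interrupts. An IRQ with very few interrupts since boot may be flagged even when balancing works, so a hit is a hint to look at the irqbalance --debug output rather than proof of this bug.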
This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
*** This bug has been marked as a duplicate of bug 798624 ***