Bug 2184735
Summary: irqbalance: silently failing to enforce IRQBALANCE_BANNED_CPULIST
Product: Red Hat Enterprise Linux 9
Reporter: Robin Jarry <rjarry>
Component: irqbalance
Assignee: ltao
Status: CLOSED ERRATA
QA Contact: Jiri Dluhos <jdluhos>
Severity: medium
Priority: unspecified
Version: 9.2
CC: atenart, cfontain, dmarchan, ekuris, hakhande, hewang, jdluhos, jeder, jshortt, ltao, mleitner, ruyang, rvr, vcandapp
Target Milestone: rc
Keywords: TestOnly, Triaged
Hardware: Unspecified
OS: Unspecified
Fixed In Version: irqbalance-1.9.2-2.el9
Last Closed: 2023-11-07 08:56:07 UTC
Type: Bug
Bug Blocks: 2219830
Description
Robin Jarry
2023-04-05 14:23:43 UTC
(In reply to Robin Jarry from comment #0)
> Hi folks,
>
> When there is not enough room in the non-banned CPUs APIC, irqbalance seems
> to silently let irq affinities overspill on banned CPUs.
>
> Here are a few traces to highlight the problem. The tool I am using
> (irqstat) is available here: https://pypi.org/project/linux-tools/
>
> > # rpm -qa irqbalance
> > irqbalance-1.9.0-3.el9.x86_64
> >
> > ~# grep ^IRQBALANCE_BANNED_CPU /etc/sysconfig/irqbalance
> > IRQBALANCE_BANNED_CPULIST=2-19,22-39
> >
> > ~# irqstat -n | grep -v '\<0\>$'
> > CPU  AFFINITY-IRQs  EFFECTIVE-IRQs
> >   0  114            114
> >   1  43             34
> >  20  184            179
> >  21  61             51
> >
> > ~# echo 10 > /sys/class/net/ens2f0np0/device/sriov_numvfs
> > ~# echo 10 > /sys/class/net/ens2f1np1/device/sriov_numvfs
> > ~# echo 10 > /sys/class/net/eno3/device/sriov_numvfs
> > ~# echo 10 > /sys/class/net/eno4/device/sriov_numvfs
> >
> > ~# irqstat -n | grep -v '\<0\>$'
> > CPU  AFFINITY-IRQs  EFFECTIVE-IRQs
> >   0  224            201
> >   1  52             43
> >   2  97             78
> >   4  12             11
> >   6  12             11
> >   8  13             12
> >  10  6              5
> >  12  12             11
> >  14  12             11
> >  16  13             12
> >  18  13             12
> >  20  234            201
> >  21  52             42
> >  22  81             68
> >
> > ~# irqstat -c 2
> > IRQ  AFFINITY   EFFECTIVE-CPU  DESCRIPTION
> >  47  2          2              IR-PCI-MSI 12582920-edge i40e-eno1-TxRx-7
> >  61  2          2              IR-PCI-MSI 12582934-edge i40e-eno1-TxRx-21
> >  62  2          2              IR-PCI-MSI 12582935-edge i40e-eno1-TxRx-22
> >  63  2          2              IR-PCI-MSI 12582936-edge i40e-eno1-TxRx-23
> >  64  2          2              IR-PCI-MSI 12582937-edge i40e-eno1-TxRx-24
> >  66  2          2              IR-PCI-MSI 12582939-edge i40e-eno1-TxRx-26
> >  68  2          2              IR-PCI-MSI 12582941-edge i40e-eno1-TxRx-28
> >  70  2          2              IR-PCI-MSI 12582943-edge i40e-eno1-TxRx-30
> >  77  2          2              IR-PCI-MSI 12582950-edge i40e-eno1-TxRx-37
> >  92  2          2              IR-PCI-MSI 12584960-edge i40e-0000:18:00.1:misc
> >  97  2          2              IR-PCI-MSI 12584965-edge i40e-eno2-TxRx-4
> > 102  2          2              IR-PCI-MSI 12584970-edge i40e-eno2-TxRx-9
> > 128  2          2              IR-PCI-MSI 12584988-edge i40e-eno2-TxRx-27
> > 133  2          2              IR-PCI-MSI 12584993-edge i40e-eno2-TxRx-32
> > 134  2          2              IR-PCI-MSI 12584994-edge i40e-eno2-TxRx-33
> > 139  2          2              IR-PCI-MSI 12584999-edge i40e-eno2-TxRx-38
> > 141  2          2              IR-PCI-MSI 12585001-edge i40e-0000:18:00.1:fdir-TxRx-0
> > 157  2          2              IR-PCI-MSI 49285123-edge ens3f1-rx-3
> > 168  2          2              IR-PCI-MSI 49289220-edge ens3f3-rx-4
> > 251  2          2              IR-PCI-MSI 12587020-edge i40e-eno3-TxRx-11
> > 253  2          2              IR-PCI-MSI 12587022-edge i40e-eno3-TxRx-13
> > 254  2          2              IR-PCI-MSI 12587023-edge i40e-eno3-TxRx-14
> > 256  2          2              IR-PCI-MSI 12587025-edge i40e-eno3-TxRx-16
> > 257  2          2              IR-PCI-MSI 12587026-edge i40e-eno3-TxRx-17
> > 260  2          2              IR-PCI-MSI 12587029-edge i40e-eno3-TxRx-20
> > 269  2          2              IR-PCI-MSI 12587038-edge i40e-eno3-TxRx-29
> > 316  0,2,20,22  2              IR-PCI-MSI 49827840-edge mlx5_comp0@pci:0000:5f:01.2
> > 317  2          2              IR-PCI-MSI 49827841-edge mlx5_comp1@pci:0000:5f:01.2
> > 329  2          2              IR-PCI-MSI 49829889-edge mlx5_comp1@pci:0000:5f:01.3
> > 352  0,2,20,22  2              IR-PCI-MSI 49833984-edge mlx5_comp0@pci:0000:5f:01.5
> > 353  2          2              IR-PCI-MSI 49833985-edge mlx5_comp1@pci:0000:5f:01.5
> > 364  0,2,20,22  2              IR-PCI-MSI 49836032-edge mlx5_comp0@pci:0000:5f:01.6
> > 365  2          2              IR-PCI-MSI 49836033-edge mlx5_comp1@pci:0000:5f:01.6
> > 380  2          2              IR-PCI-MSI 49807363-edge mlx5_comp3@pci:0000:5f:00.0
> > 382  2          2              IR-PCI-MSI 49807365-edge mlx5_comp5@pci:0000:5f:00.0
> > 384  2          2              IR-PCI-MSI 49807367-edge mlx5_comp7@pci:0000:5f:00.0
> > 386  2          2              IR-PCI-MSI 49807369-edge mlx5_comp9@pci:0000:5f:00.0
> > 387  2          2              IR-PCI-MSI 49807370-edge mlx5_comp10@pci:0000:5f:00.0
> > 406  2          2              IR-PCI-MSI 49807389-edge mlx5_comp29@pci:0000:5f:00.0
> > 407  2          2              IR-PCI-MSI 49807390-edge mlx5_comp30@pci:0000:5f:00.0
> > 408  2          2              IR-PCI-MSI 49807391-edge mlx5_comp31@pci:0000:5f:00.0
> > 413  2          2              IR-PCI-MSI 49807396-edge mlx5_comp36@pci:0000:5f:00.0
> > 420  2          2              IR-PCI-MSI 12589058-edge i40e-eno4-TxRx-1
> > 421  2          2              IR-PCI-MSI 12589059-edge i40e-eno4-TxRx-2
> > 423  2          2              IR-PCI-MSI 12589061-edge i40e-eno4-TxRx-4
> > 425  2          2              IR-PCI-MSI 12589063-edge i40e-eno4-TxRx-6
> > 427  2          2              IR-PCI-MSI 12589065-edge i40e-eno4-TxRx-8
> > 436  2          2              IR-PCI-MSI 12589074-edge i40e-eno4-TxRx-17
> > 441  2          2              IR-PCI-MSI 12589079-edge i40e-eno4-TxRx-22
> > 447  2          2              IR-PCI-MSI 12589085-edge i40e-eno4-TxRx-28
> > 450  2          2              IR-PCI-MSI 12589088-edge i40e-eno4-TxRx-31
> > 452  2          2              IR-PCI-MSI 12589090-edge i40e-eno4-TxRx-33
> > 454  2          2              IR-PCI-MSI 12589092-edge i40e-eno4-TxRx-35
> > 457  2          2              IR-PCI-MSI 12589095-edge i40e-eno4-TxRx-38
> > 530  2          2              IR-PCI-MSI 49809418-edge mlx5_comp10@pci:0000:5f:00.1
> > 531  2          2              IR-PCI-MSI 49809419-edge mlx5_comp11@pci:0000:5f:00.1
> > 532  2          2              IR-PCI-MSI 49809420-edge mlx5_comp12@pci:0000:5f:00.1
> > 533  2          2              IR-PCI-MSI 49809421-edge mlx5_comp13@pci:0000:5f:00.1
> > 534  2          2              IR-PCI-MSI 49809422-edge mlx5_comp14@pci:0000:5f:00.1
> > 535  2          2              IR-PCI-MSI 49809423-edge mlx5_comp15@pci:0000:5f:00.1
> > 536  2          2              IR-PCI-MSI 49809424-edge mlx5_comp16@pci:0000:5f:00.1
> > 537  2          2              IR-PCI-MSI 49809425-edge mlx5_comp17@pci:0000:5f:00.1
> > 538  2          2              IR-PCI-MSI 49809426-edge mlx5_comp18@pci:0000:5f:00.1
> > 539  2          2              IR-PCI-MSI 49809427-edge mlx5_comp19@pci:0000:5f:00.1
> > 611  2          2              IR-PCI-MSI 49838081-edge mlx5_comp1@pci:0000:5f:01.7
> > 622  0,2,20,22  2              IR-PCI-MSI 49840128-edge mlx5_comp0@pci:0000:5f:02.0
> > 623  2          2              IR-PCI-MSI 49840129-edge mlx5_comp1@pci:0000:5f:02.0
> > 635  2          2              IR-PCI-MSI 49842177-edge mlx5_comp1@pci:0000:5f:02.1
> > 647  2          2              IR-PCI-MSI 49844225-edge mlx5_comp1@pci:0000:5f:02.2
> > 659  2          2              IR-PCI-MSI 49846273-edge mlx5_comp1@pci:0000:5f:02.3
> > 732  0,2,20,22  2              IR-PCI-MSI 12756995-edge iavf-eno3v5-TxRx-2
> > 734  0,2,20,22  2              IR-PCI-MSI 12759040-edge iavf-0000:18:0a.6:mbx
> > 762  0,2,20,22  2              IR-PCI-MSI 12824579-edge iavf-eno4v6-TxRx-2
> > 767  0,2,20,22  2              IR-PCI-MSI 12814339-edge iavf-eno4v1-TxRx-2
> > 772  0,2,20,22  2              IR-PCI-MSI 12826627-edge iavf-eno4v7-TxRx-2
> > 784  0,2,20,22  2              IR-PCI-MSI 12830720-edge iavf-0000:18:0f.1:mbx
> > 787  0,2,20,22  2              IR-PCI-MSI 12830723-edge iavf-eno4v9-TxRx-2
> > 792  0,2,20,22  2              IR-PCI-MSI 12816387-edge iavf-eno4v2-TxRx-2
>
> The irq affinities are overspilling on banned cpus.
> This probably is because the APIC from cpus 0 and 20 are full which is only
> a hardware limitation.
>
> Reducing the span of banned cpus fixes the issue:
>
> > # grep ^IRQBALANCE_BANNED_CPU /etc/sysconfig/irqbalance
> > IRQBALANCE_BANNED_CPULIST=4-19,24-39
> >
> > ~# irqstat -n | grep -v '\<0\>$'
> > CPU  AFFINITY-IRQs  EFFECTIVE-IRQs
> >   0  160            164
> >   1  31             22
> >   2  162            153
> >   3  30             21
> >  20  164            159
> >  21  31             21
> >  22  162            157
> >  23  30             21
>
> However, it would be nice if irqbalance could at least log a warning or an
> error that it failed to set the affinity for a specific irq.
>
> > ~# echo 0,20 > /proc/irq/47/smp_affinity_list
> > -bash: echo: write error: No space left on device
>
> For the record, here is the platform overview:
>
> > NUMA 0
> > ======
> >
> > Memory: 187GB
> > 2MB hugepages: 0
> > 1GB hugepages: 32
> >
> > CPUs
> > ----
> >
> > Model name: Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
> > Cores IDs:
> > 0,20 2,22 4,24 6,26 8,28 10,30 12,32 14,34 16,36 18,38
> >
> > NICs
> > ----
> >
> > SLOT          DRIVER     IFNAME     MAC                LINK/STATE  SPEED   DEVICE
> > 0000:18:00.0  i40e       eno1       e4:43:4b:48:6a:20  1/up        1Gb/s   Ethernet Controller X710 for 10GbE SFP+
> > 0000:18:00.1  i40e       eno2       e4:43:4b:48:6a:21  1/up        1Gb/s   Ethernet Controller X710 for 10GbE SFP+
> > 0000:18:00.2  i40e       eno3       e4:43:4b:48:6a:22  1/up        10Gb/s  Ethernet Controller X710 for 10GbE SFP+
> > 0000:18:00.3  i40e       eno4       e4:43:4b:48:6a:23  1/up        10Gb/s  Ethernet Controller X710 for 10GbE SFP+
> > 0000:3b:00.0  vfio-pci   -          -                  -/-         -       Ethernet Controller E810-C for QSFP
> > 0000:3b:00.1  vfio-pci   -          -                  -/-         -       Ethernet Controller E810-C for QSFP
> > 0000:5e:00.0  tg3        ens3f0     00:0a:f7:d9:e4:14  0/down      -       NetXtreme BCM5719 Gigabit Ethernet PCIe
> > 0000:5e:00.1  tg3        ens3f1     00:0a:f7:d9:e4:15  0/down      -       NetXtreme BCM5719 Gigabit Ethernet PCIe
> > 0000:5e:00.2  tg3        ens3f2     00:0a:f7:d9:e4:16  0/down      -       NetXtreme BCM5719 Gigabit Ethernet PCIe
> > 0000:5e:00.3  tg3        ens3f3     00:0a:f7:d9:e4:17  0/down      -       NetXtreme BCM5719 Gigabit Ethernet PCIe
> > 0000:5f:00.0  mlx5_core  ens2f0np0  04:3f:72:b8:be:6a  1/up        10Gb/s  MT27800 Family [ConnectX-5]
> > 0000:5f:00.1  mlx5_core  ens2f1np1  04:3f:72:b8:be:6b  1/up        10Gb/s  MT27800 Family [ConnectX-5]
> >
> > NUMA 1
> > ======
> >
> > Memory: 188GB
> > 2MB hugepages: 0
> > 1GB hugepages: 32
> >
> > CPUs
> > ----
> >
> > Model name: Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
> > Cores IDs:
> > 1,21 3,23 5,25 7,27 9,29 11,31 13,33 15,35 17,37 19,39
> >
> > NICs
> > ----
> >
> > SLOT          DRIVER     IFNAME     MAC                LINK/STATE  SPEED   DEVICE
> > 0000:af:00.0  i40e       ens4f0     f8:f2:1e:42:5d:70  1/up        10Gb/s  Ethernet Controller X710 for 10GbE SFP+
> > 0000:af:00.1  i40e       ens4f1     f8:f2:1e:42:5d:70  1/up        10Gb/s  Ethernet Controller X710 for 10GbE SFP+
> > 0000:af:00.2  vfio-pci   -          -                  -/-         -       Ethernet Controller X710 for 10GbE SFP+
> > 0000:af:00.3  vfio-pci   -          -                  -/-         -       Ethernet Controller X710 for 10GbE SFP+

Hi Robin,

Thanks for reporting the issue. I think it is reasonable to have a notification when irqbalance overspills IRQs onto banned CPUs.

Just out of curiosity, which Beaker machine did you use for the testing? I failed to simulate a system that has enough device IRQs to overspill using QEMU.

Thanks,
Tao Liu

Hi there,

any machine with SRIOV capable PCI devices should be enough to produce more than 224 IRQs. I don't think you will be able to test this in QEMU.

Patch[1] posted upstream

[1]: https://github.com/Irqbalance/irqbalance/pull/265

I just realized that this issue actually hides another regression. This patch https://github.com/Irqbalance/irqbalance/commit/55c5c321c73e4c9b54e041ba8c7d542598685bae (included in irqbalance 1.7.0) causes any failure to enforce smp_affinity to ban the IRQ for the whole life of the process. The APIC being out of space is a transient issue, but irqbalance will never try again to move the interrupt to another CPU unless it is restarted.

I have submitted another pull request here https://github.com/Irqbalance/irqbalance/pull/266. Waiting for feedback.
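
The distinction made above (a full APIC is transient, so the IRQ should not be banned for the life of the process) can be illustrated with a small sketch. This is not irqbalance's code, which is written in C; it is a Python illustration under stated assumptions. The names `try_set_affinity`, `format_cpulist` and the injectable `write` callback are hypothetical. The only facts taken from this report are the `/proc/irq/<N>/smp_affinity_list` path and the "No space left on device" (ENOSPC) write error.

```python
import errno
import logging

log = logging.getLogger("irq-affinity-sketch")


def format_cpulist(cpus):
    """Render a set of CPU ids as a comma-separated list, e.g. {20, 0} -> "0,20"."""
    return ",".join(str(cpu) for cpu in sorted(cpus))


def _write_proc(irq, cpulist):
    # Same operation as the shell command shown in the report:
    #   echo 0,20 > /proc/irq/47/smp_affinity_list
    with open(f"/proc/irq/{irq}/smp_affinity_list", "w") as handle:
        handle.write(cpulist)


def try_set_affinity(irq, cpus, write=_write_proc):
    """Try to pin `irq` to `cpus`.

    Returns True on success. Returns False when the kernel answers ENOSPC,
    so the caller can log the failure and retry on a later pass instead of
    banning the IRQ permanently. Any other error is re-raised.
    """
    cpulist = format_cpulist(cpus)
    try:
        write(irq, cpulist)
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            log.warning("irq %d: no room on CPUs %s, will retry later", irq, cpulist)
            return False
        raise
    return True
```

A caller would keep the IRQs for which `try_set_affinity` returned False in a retry set and attempt them again on the next rebalancing pass. Which errors count as transient is a policy choice; only ENOSPC is treated that way here because it is the one error shown in this report.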
The issue should now be resolved by all commits here https://github.com/Irqbalance/irqbalance/pull/269

Hi @ltao

I have added a small fix to my patch series. Can you take this commit along with the others?

https://github.com/Irqbalance/irqbalance/commit/bc7794dc78474c463a26926749537f23abc4c082

Thanks!

(In reply to Robin Jarry from comment #6)
> Hi @ltao I have added a small fix to my patch series. Can you
> take this commit along with the others?
>
> https://github.com/Irqbalance/irqbalance/commit/bc7794dc78474c463a26926749537f23abc4c082
>
> Thanks!

Hi Robin,

Thanks a lot for your work! I will integrate all patches and make a release next Monday.

Thanks,
Tao Liu

Hi Robin,

Could you please check the irqbalance-1.9.2-2.el9 release to see if it works for you? Thanks!

Thanks,
Tao Liu

Hi Tao,

sorry about the delay. Yes irqbalance-1.9.2-2.el9 contains the required fixes. Thanks!

(In reply to Robin Jarry from comment #10)
> Hi Tao,
>
> sorry about the delay. Yes irqbalance-1.9.2-2.el9 contains the required
> fixes. Thanks!

Hi Robin,

Thanks for the confirmation!

Thanks,
Tao Liu

No release+ flag has been set, so the bug cannot be added to the errata. I don't know if this is due to the missing DTM and ITR; I will give it a try.

Hi Jiri,

Could you please help set ITR and then see if the release+ flag can be set?

Thanks,
Tao Liu

OK, thanks for setting the ITR flags, Víctor. However, the errata tool now reports a new error:

Errata Can only add VERIFIED bugs when advisory is in REL PREP state

So maybe a verified flag is needed from QE?

Thanks,
Tao Liu

Thanks Robin for the detailed testing! Setting VERIFIED+OtherQA.

Apologies, not OtherQA; it's developer's unit testing, not OtherQA.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (irqbalance bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6688
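
To check whether a system still shows the overspill described in this report, the banned list can be compared against each IRQ's affinity list. The following is a minimal sketch, not part of irqbalance or irqstat. The helper names `parse_cpulist` and `overspilled` are hypothetical, and the parser assumes the plain `a-b,c` form seen in this report (it does not handle stride or group syntax).

```python
def parse_cpulist(text):
    """Expand a CPU list such as "2-19,22-39" into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def overspilled(affinities, banned):
    """Map each IRQ to the banned CPUs found in its affinity list.

    `affinities` maps an IRQ number to the content of its
    /proc/irq/<N>/smp_affinity_list file. IRQs that stay clear of the
    banned CPUs are left out of the result.
    """
    result = {}
    for irq, cpulist in affinities.items():
        hits = parse_cpulist(cpulist) & banned
        if hits:
            result[irq] = sorted(hits)
    return result


# Banned list and two affinity values taken from the report above;
# IRQ 999 is a hypothetical entry added for contrast.
banned = parse_cpulist("2-19,22-39")
report = overspilled({47: "2", 316: "0,2,20,22", 999: "0,20"}, banned)
```

On a live system the `affinities` mapping would be filled by reading `/proc/irq/*/smp_affinity_list`; an empty result means no IRQ affinity touches a banned CPU.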