Bug 2219830 - irqbalance: silently failing to enforce IRQBALANCE_BANNED_CPULIST
Summary: irqbalance: silently failing to enforce IRQBALANCE_BANNED_CPULIST
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: os-net-config
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: z4
Target Release: 17.1
Assignee: Miguel Angel Nieto
QA Contact: Miguel Angel Nieto
URL:
Whiteboard:
Depends On: 2184735
Blocks: 2034801 2274492
Reported: 2023-07-05 14:52 UTC by Robin Jarry
Modified: 2024-11-21 09:38 UTC

Fixed In Version: os-net-config-14.2.1-17.1.20240917140802.61d7bd7.el9ost
Doc Type: Known Issue
Doc Text:
In RHOSP 17.1, there is a known issue of transient packet loss where hardware interrupt requests (IRQs) are causing non-voluntary context switches on OVS-DPDK PMD threads or in guests running DPDK applications.

This issue is the result of provisioning large numbers of VFs during deployment. VFs need IRQs, each of which must be bound to a physical CPU. When there are not enough housekeeping CPUs to handle the capacity of IRQs, `irqbalance` fails to bind all of them and the IRQs overspill on isolated CPUs.

Workaround: You can try one or more of these actions:
* Reduce the number of provisioned VFs to avoid unused VFs remaining bound to their default Linux driver.
* Increase the number of housekeeping CPUs to handle all IRQs.
* Force unused VF network interfaces down to avoid IRQs from interrupting isolated CPUs.
* Disable multicast and broadcast traffic on unused, down VF network interfaces to avoid IRQs from interrupting isolated CPUs.
Clone Of: 2184735
Environment:
Last Closed: 2024-11-21 09:38:27 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID                                   Private  Priority  Status  Summary                                  Last Updated
Github                  os-net-config os-net-config pull 18  0        None      Merged  Allow disabling sriov_drivers_autoprobe  2024-09-16 10:37:49 UTC
Red Hat Issue Tracker   OSP-26344                            0        None      None    None                                     2023-07-05 15:37:56 UTC
Red Hat Product Errata  RHBA-2024:9974                       0        None      None    None                                     2024-11-21 09:38:40 UTC

Description Robin Jarry 2023-07-05 14:52:06 UTC
+++ This bug was initially created as a clone of Bug #2184735 +++

Hi folks,

When the APICs of the non-banned CPUs have no room left, irqbalance seems to silently let IRQ affinities spill over onto banned CPUs.

Here are a few traces to highlight the problem. The tool I am using (irqstat) is available here: https://pypi.org/project/linux-tools/

> # rpm -qa irqbalance
> irqbalance-1.9.0-3.el9.x86_64
>
> ~# grep ^IRQBALANCE_BANNED_CPU /etc/sysconfig/irqbalance
> IRQBALANCE_BANNED_CPULIST=2-19,22-39
>
> ~# irqstat -n | grep -v '\<0\>$'
> CPU  AFFINITY-IRQs  EFFECTIVE-IRQs
> 0              114             114
> 1               43              34
> 20             184             179
> 21              61              51
>
> ~# echo 10 > /sys/class/net/ens2f0np0/device/sriov_numvfs
> ~# echo 10 > /sys/class/net/ens2f1np1/device/sriov_numvfs
> ~# echo 10 > /sys/class/net/eno3/device/sriov_numvfs
> ~# echo 10 > /sys/class/net/eno4/device/sriov_numvfs
>
> ~# irqstat -n | grep -v '\<0\>$'
> CPU  AFFINITY-IRQs  EFFECTIVE-IRQs
> 0              224             201
> 1               52              43
> 2               97              78
> 4               12              11
> 6               12              11
> 8               13              12
> 10               6               5
> 12              12              11
> 14              12              11
> 16              13              12
> 18              13              12
> 20             234             201
> 21              52              42
> 22              81              68
>
> ~# irqstat -c 2
> IRQ        AFFINITY  EFFECTIVE-CPU  DESCRIPTION
> 47                2              2  IR-PCI-MSI 12582920-edge i40e-eno1-TxRx-7
> 61                2              2  IR-PCI-MSI 12582934-edge i40e-eno1-TxRx-21
> 62                2              2  IR-PCI-MSI 12582935-edge i40e-eno1-TxRx-22
> 63                2              2  IR-PCI-MSI 12582936-edge i40e-eno1-TxRx-23
> 64                2              2  IR-PCI-MSI 12582937-edge i40e-eno1-TxRx-24
> 66                2              2  IR-PCI-MSI 12582939-edge i40e-eno1-TxRx-26
> 68                2              2  IR-PCI-MSI 12582941-edge i40e-eno1-TxRx-28
> 70                2              2  IR-PCI-MSI 12582943-edge i40e-eno1-TxRx-30
> 77                2              2  IR-PCI-MSI 12582950-edge i40e-eno1-TxRx-37
> 92                2              2  IR-PCI-MSI 12584960-edge i40e-0000:18:00.1:misc
> 97                2              2  IR-PCI-MSI 12584965-edge i40e-eno2-TxRx-4
> 102               2              2  IR-PCI-MSI 12584970-edge i40e-eno2-TxRx-9
> 128               2              2  IR-PCI-MSI 12584988-edge i40e-eno2-TxRx-27
> 133               2              2  IR-PCI-MSI 12584993-edge i40e-eno2-TxRx-32
> 134               2              2  IR-PCI-MSI 12584994-edge i40e-eno2-TxRx-33
> 139               2              2  IR-PCI-MSI 12584999-edge i40e-eno2-TxRx-38
> 141               2              2  IR-PCI-MSI 12585001-edge i40e-0000:18:00.1:fdir-TxRx-0
> 157               2              2  IR-PCI-MSI 49285123-edge ens3f1-rx-3
> 168               2              2  IR-PCI-MSI 49289220-edge ens3f3-rx-4
> 251               2              2  IR-PCI-MSI 12587020-edge i40e-eno3-TxRx-11
> 253               2              2  IR-PCI-MSI 12587022-edge i40e-eno3-TxRx-13
> 254               2              2  IR-PCI-MSI 12587023-edge i40e-eno3-TxRx-14
> 256               2              2  IR-PCI-MSI 12587025-edge i40e-eno3-TxRx-16
> 257               2              2  IR-PCI-MSI 12587026-edge i40e-eno3-TxRx-17
> 260               2              2  IR-PCI-MSI 12587029-edge i40e-eno3-TxRx-20
> 269               2              2  IR-PCI-MSI 12587038-edge i40e-eno3-TxRx-29
> 316       0,2,20,22              2  IR-PCI-MSI 49827840-edge mlx5_comp0@pci:0000:5f:01.2
> 317               2              2  IR-PCI-MSI 49827841-edge mlx5_comp1@pci:0000:5f:01.2
> 329               2              2  IR-PCI-MSI 49829889-edge mlx5_comp1@pci:0000:5f:01.3
> 352       0,2,20,22              2  IR-PCI-MSI 49833984-edge mlx5_comp0@pci:0000:5f:01.5
> 353               2              2  IR-PCI-MSI 49833985-edge mlx5_comp1@pci:0000:5f:01.5
> 364       0,2,20,22              2  IR-PCI-MSI 49836032-edge mlx5_comp0@pci:0000:5f:01.6
> 365               2              2  IR-PCI-MSI 49836033-edge mlx5_comp1@pci:0000:5f:01.6
> 380               2              2  IR-PCI-MSI 49807363-edge mlx5_comp3@pci:0000:5f:00.0
> 382               2              2  IR-PCI-MSI 49807365-edge mlx5_comp5@pci:0000:5f:00.0
> 384               2              2  IR-PCI-MSI 49807367-edge mlx5_comp7@pci:0000:5f:00.0
> 386               2              2  IR-PCI-MSI 49807369-edge mlx5_comp9@pci:0000:5f:00.0
> 387               2              2  IR-PCI-MSI 49807370-edge mlx5_comp10@pci:0000:5f:00.0
> 406               2              2  IR-PCI-MSI 49807389-edge mlx5_comp29@pci:0000:5f:00.0
> 407               2              2  IR-PCI-MSI 49807390-edge mlx5_comp30@pci:0000:5f:00.0
> 408               2              2  IR-PCI-MSI 49807391-edge mlx5_comp31@pci:0000:5f:00.0
> 413               2              2  IR-PCI-MSI 49807396-edge mlx5_comp36@pci:0000:5f:00.0
> 420               2              2  IR-PCI-MSI 12589058-edge i40e-eno4-TxRx-1
> 421               2              2  IR-PCI-MSI 12589059-edge i40e-eno4-TxRx-2
> 423               2              2  IR-PCI-MSI 12589061-edge i40e-eno4-TxRx-4
> 425               2              2  IR-PCI-MSI 12589063-edge i40e-eno4-TxRx-6
> 427               2              2  IR-PCI-MSI 12589065-edge i40e-eno4-TxRx-8
> 436               2              2  IR-PCI-MSI 12589074-edge i40e-eno4-TxRx-17
> 441               2              2  IR-PCI-MSI 12589079-edge i40e-eno4-TxRx-22
> 447               2              2  IR-PCI-MSI 12589085-edge i40e-eno4-TxRx-28
> 450               2              2  IR-PCI-MSI 12589088-edge i40e-eno4-TxRx-31
> 452               2              2  IR-PCI-MSI 12589090-edge i40e-eno4-TxRx-33
> 454               2              2  IR-PCI-MSI 12589092-edge i40e-eno4-TxRx-35
> 457               2              2  IR-PCI-MSI 12589095-edge i40e-eno4-TxRx-38
> 530               2              2  IR-PCI-MSI 49809418-edge mlx5_comp10@pci:0000:5f:00.1
> 531               2              2  IR-PCI-MSI 49809419-edge mlx5_comp11@pci:0000:5f:00.1
> 532               2              2  IR-PCI-MSI 49809420-edge mlx5_comp12@pci:0000:5f:00.1
> 533               2              2  IR-PCI-MSI 49809421-edge mlx5_comp13@pci:0000:5f:00.1
> 534               2              2  IR-PCI-MSI 49809422-edge mlx5_comp14@pci:0000:5f:00.1
> 535               2              2  IR-PCI-MSI 49809423-edge mlx5_comp15@pci:0000:5f:00.1
> 536               2              2  IR-PCI-MSI 49809424-edge mlx5_comp16@pci:0000:5f:00.1
> 537               2              2  IR-PCI-MSI 49809425-edge mlx5_comp17@pci:0000:5f:00.1
> 538               2              2  IR-PCI-MSI 49809426-edge mlx5_comp18@pci:0000:5f:00.1
> 539               2              2  IR-PCI-MSI 49809427-edge mlx5_comp19@pci:0000:5f:00.1
> 611               2              2  IR-PCI-MSI 49838081-edge mlx5_comp1@pci:0000:5f:01.7
> 622       0,2,20,22              2  IR-PCI-MSI 49840128-edge mlx5_comp0@pci:0000:5f:02.0
> 623               2              2  IR-PCI-MSI 49840129-edge mlx5_comp1@pci:0000:5f:02.0
> 635               2              2  IR-PCI-MSI 49842177-edge mlx5_comp1@pci:0000:5f:02.1
> 647               2              2  IR-PCI-MSI 49844225-edge mlx5_comp1@pci:0000:5f:02.2
> 659               2              2  IR-PCI-MSI 49846273-edge mlx5_comp1@pci:0000:5f:02.3
> 732       0,2,20,22              2  IR-PCI-MSI 12756995-edge iavf-eno3v5-TxRx-2
> 734       0,2,20,22              2  IR-PCI-MSI 12759040-edge iavf-0000:18:0a.6:mbx
> 762       0,2,20,22              2  IR-PCI-MSI 12824579-edge iavf-eno4v6-TxRx-2
> 767       0,2,20,22              2  IR-PCI-MSI 12814339-edge iavf-eno4v1-TxRx-2
> 772       0,2,20,22              2  IR-PCI-MSI 12826627-edge iavf-eno4v7-TxRx-2
> 784       0,2,20,22              2  IR-PCI-MSI 12830720-edge iavf-0000:18:0f.1:mbx
> 787       0,2,20,22              2  IR-PCI-MSI 12830723-edge iavf-eno4v9-TxRx-2
> 792       0,2,20,22              2  IR-PCI-MSI 12816387-edge iavf-eno4v2-TxRx-2

The IRQ affinities are spilling over onto banned CPUs. This is probably because the APICs of CPUs 0 and 20 are full, which is purely a hardware limitation.
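
For anyone without irqstat at hand, the same overspill can be checked from /proc. This is only a minimal sketch, assuming the running kernel exposes /proc/irq/<N>/effective_affinity_list; the helper names parse_cpulist and find_overspill are made up for illustration and are not part of irqbalance or irqstat.

```python
import glob
import os


def parse_cpulist(text):
    """Expand a kernel-style CPU list such as '2-19,22-39' into a set of ints."""
    cpus = set()
    for chunk in text.strip().split(","):
        if not chunk:
            continue  # empty list, e.g. an IRQ with no effective CPU yet
        if "-" in chunk:
            first, last = chunk.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(chunk))
    return cpus


def find_overspill(banned, proc_irq="/proc/irq"):
    """Return {irq: [cpus]} for IRQs whose effective affinity hits a banned CPU."""
    spilled = {}
    pattern = os.path.join(proc_irq, "[0-9]*", "effective_affinity_list")
    for path in glob.glob(pattern):
        irq = int(os.path.basename(os.path.dirname(path)))
        try:
            with open(path) as f:
                effective = parse_cpulist(f.read())
        except OSError:
            continue  # the IRQ vanished or is unreadable; skip it
        hit = effective & banned
        if hit:
            spilled[irq] = sorted(hit)
    return spilled


if __name__ == "__main__":
    # Banned list taken from /etc/sysconfig/irqbalance in the trace above.
    banned = parse_cpulist("2-19,22-39")
    for irq, cpus in sorted(find_overspill(banned).items()):
        print(f"IRQ {irq} is effectively routed to banned CPU(s) {cpus}")
```

Any line printed by this script points at an IRQ that the banned list did not keep off the isolated CPUs.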

Reducing the span of banned cpus fixes the issue:

> # grep ^IRQBALANCE_BANNED_CPU /etc/sysconfig/irqbalance
> IRQBALANCE_BANNED_CPULIST=4-19,24-39
>
> ~# irqstat -n | grep -v '\<0\>$'
> CPU  AFFINITY-IRQs  EFFECTIVE-IRQs
> 0              160             164
> 1               31              22
> 2              162             153
> 3               30              21
> 20             164             159
> 21              31              21
> 22             162             157
> 23              30              21

However, it would be nice if irqbalance could at least log a warning or an error that it failed to set the affinity for a specific irq.

> ~# echo 0,20 > /proc/irq/47/smp_affinity_list
> -bash: echo: write error: No space left on device
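
To illustrate the kind of warning I have in mind: "No space left on device" is ENOSPC, so a caller can tell this case apart from other failures. The snippet below is a hypothetical Python sketch, not irqbalance code (irqbalance is written in C); the function names, the logger name and the message wording are all assumptions.

```python
import errno
import logging

log = logging.getLogger("irq-affinity")  # hypothetical logger name


def write_affinity(irq, cpulist, proc_irq="/proc/irq"):
    """Write cpulist (e.g. '0,20') to the IRQ's smp_affinity_list file."""
    with open(f"{proc_irq}/{irq}/smp_affinity_list", "w") as f:
        f.write(cpulist)


def set_affinity(irq, cpulist, writer=write_affinity):
    """Try to move an IRQ. Return True on success, False after logging a warning."""
    try:
        writer(irq, cpulist)
        return True
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            # The target CPUs have no free interrupt vector left.
            log.warning("IRQ %s: no free vector on CPU(s) %s, affinity unchanged",
                        irq, cpulist)
        else:
            log.warning("IRQ %s: cannot set affinity to %s: %s", irq, cpulist, exc)
        return False
```

The writer argument only exists so that the failure path can be exercised without root privileges or a saturated machine.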

For the record, here is the platform overview:

> NUMA 0
> ======
> 
> Memory: 187GB
> 2MB hugepages: 0
> 1GB hugepages: 32
> 
> CPUs
> ----
> 
> Model name:                      Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
> Cores IDs:
> 0,20    2,22    4,24    6,26    8,28    10,30   12,32   14,34   16,36   18,38
> 
> NICs
> ----
> 
> SLOT          DRIVER     IFNAME     MAC                LINK/STATE  SPEED   DEVICE
> 0000:18:00.0  i40e       eno1       e4:43:4b:48:6a:20  1/up        1Gb/s   Ethernet Controller X710 for 10GbE SFP+
> 0000:18:00.1  i40e       eno2       e4:43:4b:48:6a:21  1/up        1Gb/s   Ethernet Controller X710 for 10GbE SFP+
> 0000:18:00.2  i40e       eno3       e4:43:4b:48:6a:22  1/up        10Gb/s  Ethernet Controller X710 for 10GbE SFP+
> 0000:18:00.3  i40e       eno4       e4:43:4b:48:6a:23  1/up        10Gb/s  Ethernet Controller X710 for 10GbE SFP+
> 0000:3b:00.0  vfio-pci   -          -                  -/-         -       Ethernet Controller E810-C for QSFP
> 0000:3b:00.1  vfio-pci   -          -                  -/-         -       Ethernet Controller E810-C for QSFP
> 0000:5e:00.0  tg3        ens3f0     00:0a:f7:d9:e4:14  0/down      -       NetXtreme BCM5719 Gigabit Ethernet PCIe
> 0000:5e:00.1  tg3        ens3f1     00:0a:f7:d9:e4:15  0/down      -       NetXtreme BCM5719 Gigabit Ethernet PCIe
> 0000:5e:00.2  tg3        ens3f2     00:0a:f7:d9:e4:16  0/down      -       NetXtreme BCM5719 Gigabit Ethernet PCIe
> 0000:5e:00.3  tg3        ens3f3     00:0a:f7:d9:e4:17  0/down      -       NetXtreme BCM5719 Gigabit Ethernet PCIe
> 0000:5f:00.0  mlx5_core  ens2f0np0  04:3f:72:b8:be:6a  1/up        10Gb/s  MT27800 Family [ConnectX-5]
> 0000:5f:00.1  mlx5_core  ens2f1np1  04:3f:72:b8:be:6b  1/up        10Gb/s  MT27800 Family [ConnectX-5]
> 
> NUMA 1
> ======
> 
> Memory: 188GB
> 2MB hugepages: 0
> 1GB hugepages: 32
> 
> CPUs
> ----
> 
> Model name:                      Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
> Cores IDs:
> 1,21    3,23    5,25    7,27    9,29    11,31   13,33   15,35   17,37   19,39
> 
> NICs
> ----
> 
> SLOT          DRIVER    IFNAME  MAC                LINK/STATE  SPEED   DEVICE
> 0000:af:00.0  i40e      ens4f0  f8:f2:1e:42:5d:70  1/up        10Gb/s  Ethernet Controller X710 for 10GbE SFP+
> 0000:af:00.1  i40e      ens4f1  f8:f2:1e:42:5d:70  1/up        10Gb/s  Ethernet Controller X710 for 10GbE SFP+
> 0000:af:00.2  vfio-pci  -       -                  -/-         -       Ethernet Controller X710 for 10GbE SFP+
> 0000:af:00.3  vfio-pci  -       -                  -/-         -       Ethernet Controller X710 for 10GbE SFP+

--- Additional comment from Tao Liu on 2023-06-19 07:10:43 UTC ---

(In reply to Robin Jarry from comment #0)

Hi Robin,

Thanks for reporting the issue. I think it is reasonable to have a notification when irqbalance lets IRQs spill over onto banned CPUs. Just out of curiosity, which Beaker machine did you use for the testing? Using QEMU, I failed to simulate a system that has enough device IRQs to overspill.

Thanks,
Tao Liu

--- Additional comment from Robin Jarry on 2023-06-19 07:37:08 UTC ---

Hi there. Any machine with SR-IOV capable PCI devices should be enough to produce more than 224 IRQs. I don't think you will be able to test this in QEMU.

--- Additional comment from  on 2023-07-04 02:34:39 UTC ---

Patch[1] posted upstream

[1]: https://github.com/Irqbalance/irqbalance/pull/265

--- Additional comment from Robin Jarry on 2023-07-05 14:47:01 UTC ---

I just realized that this issue actually hides another regression.

This patch https://github.com/Irqbalance/irqbalance/commit/55c5c321c73e4c9b54e041ba8c7d542598685bae (included in irqbalance 1.7.0) causes any failure to enforce smp_affinity to ban the IRQ for the whole life of the process. APIC being out of space is a transient issue but irqbalance will never try again to move the interrupt to another CPU unless it is restarted.
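
To make the difference concrete, here is a rough Python sketch of the two behaviours. It is not the irqbalance code and not the content of any pull request; the class, the field names and the choice of which errno values count as transient are assumptions for illustration only.

```python
import errno

# Assumption: errors treated as transient. Which errno values belong here
# is a policy choice, ENOSPC is simply the one observed in this report.
TRANSIENT_ERRNOS = {errno.ENOSPC}


class IrqState:
    """Hypothetical per-IRQ bookkeeping kept by a balancer."""

    def __init__(self, irq):
        self.irq = irq
        self.banned = False  # excluded from balancing until restart
        self.retry = False   # try again on the next balancing pass


def record_failure(state, err):
    """Update an IRQ's state after a failed affinity write."""
    if err in TRANSIENT_ERRNOS:
        state.retry = True   # vector space may free up later
    else:
        state.banned = True  # treat other errors as "this IRQ cannot be moved"
    return state
```

With the behaviour described above, every failure ends up in the banned branch, so an IRQ that hit ENOSPC once is never reconsidered.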

I have submitted another pull request here https://github.com/Irqbalance/irqbalance/pull/266.

Waiting for feedback.

Comment 16 Miguel Angel Nieto 2024-10-25 08:12:48 UTC
I configured 96 VFs (64 on Intel NICs and 32 on Mellanox NICs) and compared results with drivers_autoprobe set to true and to false. I obtained good performance results for DPDK/SR-IOV in both cases, and all test cases in the basic regression ran successfully.
Somehow, with the latest puddle I am not seeing performance degradation when I greatly increase the number of VFs with drivers_autoprobe set to true. There are other improvements in this build that may be helping (nothing writes to the serial/VGA console anymore).
In any case, I have verified that setting "drivers_autoprobe: false" does not break anything in a DPDK/SR-IOV environment.

(undercloud) [stack@undercloud-0 ~]$ cat core_puddle_version 
RHOS-17.1-RHEL-9-20241014.n.1
[root@compute-0 tripleo-admin]# cat /etc/redhat-release 
Red Hat Enterprise Linux release 9.2 (Plow)
[root@compute-0 tripleo-admin]# uname -a
Linux compute-0 5.14.0-284.86.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Sep 23 12:42:39 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux


- type: sriov_pf
  name: nic9
  mtu: 9000
  numvfs: 32
  use_dhcp: false
  defroute: false
  nm_controlled: true
  hotplug: true
  promisc: false
  drivers_autoprobe: false

- type: sriov_pf
  name: nic10
  mtu: 9000
  numvfs: 32
  use_dhcp: false
  defroute: false
  nm_controlled: true
  hotplug: true
  promisc: false
  drivers_autoprobe: false

- type: sriov_pf
  name: nic11
  mtu: 9000
  numvfs: 16
  use_dhcp: false
  defroute: false
  nm_controlled: true
  hotplug: true
  promisc: false
  drivers_autoprobe: false

- type: sriov_pf
  name: nic12
  mtu: 9000
  numvfs: 16
  use_dhcp: false
  defroute: false
  nm_controlled: true
  hotplug: true
  promisc: false
  drivers_autoprobe: false
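
For reference, the kernel exposes a per-PF sysfs attribute named sriov_drivers_autoprobe next to sriov_numvfs. The sketch below shows the order of sysfs writes that a setting like "drivers_autoprobe: false" presumably translates to. It is an assumption about the mechanism, not a copy of the os-net-config implementation; the function names are hypothetical and the interface name in the usage note is only an example.

```python
import os


def vf_creation_writes(ifname, numvfs, autoprobe, sysfs_net="/sys/class/net"):
    """Return the ordered (path, value) sysfs writes used to create VFs on a PF."""
    dev = os.path.join(sysfs_net, ifname, "device")
    return [
        # Written first: the setting only applies to VFs enabled afterwards.
        (os.path.join(dev, "sriov_drivers_autoprobe"), "1" if autoprobe else "0"),
        (os.path.join(dev, "sriov_numvfs"), str(numvfs)),
    ]


def apply_writes(writes):
    """Perform the writes in order (needs root on a real system)."""
    for path, value in writes:
        with open(path, "w") as f:
            f.write(value)
```

For example, vf_creation_writes("eno3", 32, False) yields a write of "0" to .../eno3/device/sriov_drivers_autoprobe followed by a write of "32" to .../eno3/device/sriov_numvfs. With autoprobe disabled, the new VFs are not bound to their default Linux driver when they are created.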

Comment 21 errata-xmlrpc 2024-11-21 09:38:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHOSP 17.1.4 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:9974

