Bug 1844520 - Incorrect pinning of IRQ threads on isolated CPUs by drivers that use cpumask_local_spread()
Summary: Incorrect pinning of IRQ threads on isolated CPUs by drivers that use cpumask_local_spread()
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: kernel
Version: 8.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 8.3
Assignee: Nitesh Narayan Lal
QA Contact: Pei Zhang
Docs Contact: Jaroslav Klech
URL:
Whiteboard:
Depends On:
Blocks: 1807069 1817732 1867174 1868433
 
Reported: 2020-06-05 15:20 UTC by Nitesh Narayan Lal
Modified: 2020-11-04 01:22 UTC
CC List: 17 users

Fixed In Version: kernel-4.18.0-229.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1867174 (view as bug list)
Environment:
Last Closed: 2020-11-04 01:20:59 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links:
Red Hat Product Errata RHSA-2020:4431 (last updated 2020-11-04 01:21:34 UTC)

Description Nitesh Narayan Lal 2020-06-05 15:20:10 UTC
Description of problem:
Loading any kernel module at runtime that uses cpumask_local_spread() ends up pinning IRQ threads on CPUs, including those that are isolated.

Version-Release number of selected component (if applicable):
kernel-4.18.0-193.7.1.rt13.58.el8_2
tuned-2.13.0-6.el8.noarch

How reproducible:
Always.

Steps to Reproduce:
1. Setup an RT host
2. Isolate CPUs (including low-numbered CPUs such as 0, 1, 2) using the realtime-virtual-host tuned profile.
3. Load a kernel module that uses cpumask_local_spread(), e.g. iavf, and create VFs.
4. List the threads that are pinned to isolated CPUs (one possible helper is sketched below).
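One way to do step 4, sketched here only as an illustration (the isolated set is assumed from the kernel cmdline in "Additional info" below; this is not necessarily the tool that produced the listing that follows):

#!/usr/bin/env bash
# Hedged sketch: list irq/* kernel threads whose CPU affinity includes an
# isolated CPU. The isolated set is assumed from the isolcpus= parameter
# shown in "Additional info".
isolated="2-23,26-47"

expand() {                      # expand a list like "2-4,7" into "2 3 4 7"
    local part
    for part in ${1//,/ }; do
        case $part in
            *-*) seq "${part%-*}" "${part#*-}" ;;
            *)   echo "$part" ;;
        esac
    done
}

iso_set=" $(expand "$isolated" | tr '\n' ' ') "

for pid in $(pgrep '^irq/'); do
    aff=$(awk '/^Cpus_allowed_list:/ {print $2}' "/proc/$pid/status" 2>/dev/null)
    [ -n "$aff" ] || continue
    for cpu in $(expand "$aff"); do
        if [[ $iso_set == *" $cpu "* ]]; then
            printf '%-7s %-16s affinity=%s\n' "$pid" "$(cat /proc/"$pid"/comm)" "$aff"
            break
        fi
    done
done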

Actual results:
CPU2:    466 FF 99 [watchdogd]
CPU2:     30 FF 99 [posixcputmr/2]*
CPU2:     29 FF 99 [migration/2]*
CPU2:     28 FF 99 [watchdog/2]*
CPU2:   7583 FF 50 [irq/503-iavf-et]*
CPU2:   7578 FF 50 [irq/498-iavf-et]*
CPU2:   7573 FF 50 [irq/493-iavf-et]*
CPU2:   7568 FF 50 [irq/488-iavf-et]*
CPU2:   7563 FF 50 [irq/483-iavf-et]*
...

CPU2:   6924 FF 50 [irq/247-i40e-et]*
CPU2:   6858 FF 50 [irq/110-i40e-et]*
CPU2:   6798 FF 50 [irq/179-i40e-et]*
CPU2:   2087 FF 50 [irq/243-eth4-Tx]*
CPU2:   1904 FF 50 [irq/64-eth0-TxR]*
CPU2:   1089 FF 50 [irq/84-megasas]*
CPU2:    462 FF 50 [irq/9-acpi]*
CPU2:     31 FF  1 [rcuc/2]*
CPU2:     12 FF  1 [rcub/1]
CPU2:     11 FF  1 [rcu_preempt]
CPU3:    466 FF 99 [watchdogd]
CPU3:     38 FF 99 [posixcputmr/3]*
CPU3:     37 FF 99 [migration/3]*
CPU3:     36 FF 99 [watchdog/3]*
CPU3:   7589 FF 50 [irq/509-iavf-et]*
CPU3:   7584 FF 50 [irq/504-iavf-et]*
CPU3:   7579 FF 50 [irq/499-iavf-et]*
CPU3:   7574 FF 50 [irq/494-iavf-et]*
...
CPU3:   7463 FF 50 [irq/384-iavf-et]*
CPU3:   7458 FF 50 [irq/379-iavf-et]*
CPU3:   7453 FF 50 [irq/374-iavf-et]*
CPU3:   7035 FF 50 [irq/316-i40e-et]*
CPU3:   6925 FF 50 [irq/248-i40e-et]*
CPU3:   6859 FF 50 [irq/111-i40e-et]*
CPU3:   6799 FF 50 [irq/180-i40e-et]*
CPU3:   1065 FF 50 [irq/47-megasas]*
CPU3:    593 FF 50 [irq/41-PCIe]*
CPU3:     39 FF  1 [rcuc/3]*
CPU3:     12 FF  1 [rcub/1]
CPU3:     11 FF  1 [rcu_preempt]


Expected results:
We don't expect iavf IRQ threads to be pinned to the isolated CPUs, as that may cause latency overhead when interrupts handled by those threads are triggered.
CPU2:     30 FF 99 [posixcputmr/2]*
CPU2:     29 FF 99 [migration/2]*
CPU2:     31 FF  4 [rcuc/2]*
CPU2:     12 FF  4 [rcub/1]
CPU2:     32 FF  2 [ksoftirqd/2]*
CPU3:     39 FF 99 [posixcputmr/3]*
CPU3:     38 FF 99 [migration/3]*
CPU3:     40 FF  4 [rcuc/3]*
CPU3:     12 FF  4 [rcub/1]
CPU3:     41 FF  2 [ksoftirqd/3]*

Additional info:
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-193.7.1.rt13.58.el8_2.x86_64 root=/dev/mapper/rhel_hab--19-root ro biosdevname=0 net.ifnames=0 noibrs noibpb nopti nospectre_v2 nospectre_v1 l1tf=off nospec_store_bypass_disable no_stf_barrier mds=off mitigations=off crashkernel=auto resume=/dev/mapper/rhel_hab--19-swap rd.lvm.lv=rhel_hab-19/root rd.lvm.lv=rhel_hab-19/swap intel_iommu=on iommu=pt pci=realloc =1 isolcpus=managed_irq,domain,2-23,26-47 intel_pstate=disable nosoftlockup tsc=nowatchdog nohz=on nohz_full=2-23,26-47 rcu_nocbs=2-23,26-47
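As a hedged aside (these are standard sysfs nodes, not output captured for this report), the effective isolated and nohz_full CPU sets resulting from this cmdline can be confirmed at runtime; both should report 2-23,26-47 here:

# cat /sys/devices/system/cpu/isolated
# cat /sys/devices/system/cpu/nohz_full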

These threads can be removed by re-applying the tuned realtime-virtual-host profile, which moves the IRQ threads away from the isolated CPUs (see the sketch after the example output below). Ideally, though, there should be no need to re-apply the profile, since it has already been applied once. Moreover, even after re-applying it, affinity_hint is not corrected and still includes isolated CPUs.
example:
cat /proc/irq/37*/affinity_hint
0000,00000000
0000,00000001
0000,00000002
0000,00000004
0000,00000008
0000,00000000
0000,00000001
0000,00000002
0000,00000004
0000,00000008
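
A hedged sketch of that workaround, using standard tuned and procfs commands (not captured from this host): re-apply the profile, then re-check both the effective affinity and the hint.

# tuned-adm profile realtime-virtual-host
# tuned-adm active
# grep . /proc/irq/37*/smp_affinity_list
# grep . /proc/irq/37*/affinity_hint

The IRQ threads should move off the isolated CPUs after this, while affinity_hint may still include them, as described above.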


Issue:
This behavior occurs because cpumask_local_spread() does not respect CPU isolation when returning CPUs, which eventually leads to IRQ threads being pinned to CPUs that are isolated.

Comment 1 Nitesh Narayan Lal 2020-06-05 15:22:08 UTC
I am assigning the bug to myself as I am already working on a fix that will ensure that cpumask_local_spread() only uses housekeeping CPUs. This fix is derived from one of the patches in the task isolation patch series [1] that is currently under discussion. I will post the fix upstream with some changes and share a link here.

[1] https://lkml.org/lkml/2020/4/9/530

Comment 2 Nitesh Narayan Lal 2020-06-15 19:41:27 UTC
Patches have been posted upstream: https://lore.kernel.org/lkml/20200610161226.424337-1-nitesh@redhat.com/

Comment 13 Frantisek Hrbata 2020-07-31 06:36:10 UTC
Patch(es) available on kernel-4.18.0-229.el8

Comment 19 Pei Zhang 2020-08-11 11:45:45 UTC
This issue seems to be hardware related. Following Nitesh's instructions, I cannot reproduce it on the two servers below with XXV710 NICs and XL710 NICs. Next, I'll try to reproduce and verify on Nitesh's server.

(1)dell-per730-27.lab.eng.pek2.redhat.com

# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  1
Core(s) per socket:  16
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz
Stepping:            2
CPU MHz:             2299.763
BogoMIPS:            4599.62
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            40960K
NUMA node0 CPU(s):   0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
NUMA node1 CPU(s):   1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts md_clear flush_l1d

# lspci | grep Eth
...
82:00.0 Ethernet controller: Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02)
82:00.1 Ethernet controller: Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02)

(2) dell-per430-11.lab.eng.pek2.redhat.com
# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              20
On-line CPU(s) list: 0-19
Thread(s) per core:  1
Core(s) per socket:  10
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz
Stepping:            2
CPU MHz:             2297.239
BogoMIPS:            4594.46
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            25600K
NUMA node0 CPU(s):   0,2,4,6,8,10,12,14,16,18
NUMA node1 CPU(s):   1,3,5,7,9,11,13,15,17,19
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts md_clear flush_l1d


# lspci | grep Eth
...
04:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
04:00.1 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
06:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
06:00.1 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)

Comment 20 Pei Zhang 2020-08-11 11:55:11 UTC
Updating my testing here:

Version: 4.18.0-193.rt13.51.el8.x86_64:

Failed to reproduce this issue.

1. Create 2 VFs
echo 2 > /sys/bus/pci/devices/0000\:82\:00.0/sriov_numvfs

2. Find iavf irqs.

# find /proc/irq/ -name "*iav*"
/proc/irq/361/iavf-0000:82:02.0:mbx
/proc/irq/362/iavf-enp130s0f0v0-TxRx-0
/proc/irq/363/iavf-enp130s0f0v0-TxRx-1
/proc/irq/364/iavf-enp130s0f0v0-TxRx-2
/proc/irq/365/iavf-enp130s0f0v0-TxRx-3
/proc/irq/366/iavf-0000:82:02.1:mbx
/proc/irq/367/iavf-enp130s0f0v1-TxRx-0
/proc/irq/368/iavf-enp130s0f0v1-TxRx-1
/proc/irq/369/iavf-enp130s0f0v1-TxRx-2
/proc/irq/370/iavf-enp130s0f0v1-TxRx-3

3. Check the CPU affinity of the above IRQs. They are pinned to housekeeping cores.
[root@dell-per730-27 ~]# cat /proc/irq/36*/smp_affinity_list
22
28
2
6
0
4
30
14
20
22
22

[root@dell-per730-27 ~]# cat /proc/irq/370/smp_affinity_list
12


Reference:

Host info:

# cat /proc/cmdline 
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-193.rt13.51.el8.x86_64 root=/dev/mapper/rhel_dell--per730--27-root ro crashkernel=auto resume=/dev/mapper/rhel_dell--per730--27-swap rd.lvm.lv=rhel_dell-per730-27/root rd.lvm.lv=rhel_dell-per730-27/swap console=ttyS0,115200n81 skew_tick=1 isolcpus=managed_irq,domain,1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31 intel_pstate=disable nosoftlockup tsc=nowatchdog nohz=on nohz_full=1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31 rcu_nocbs=1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31

[root@dell-per730-27 ~]# lspci | grep Eth
...
82:00.0 Ethernet controller: Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02)
82:00.1 Ethernet controller: Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02)
82:02.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 02)
82:02.1 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 02)

# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  1
Core(s) per socket:  16
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz
Stepping:            2
CPU MHz:             2300.009
BogoMIPS:            4599.62
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            40960K
NUMA node0 CPU(s):   0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
NUMA node1 CPU(s):   1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts md_clear flush_l1d

Comment 21 broskos 2020-08-11 12:05:17 UTC
Pei,

A couple of suggestions to help see if you can get a reproducer:
- raise the # of VFs per port (go to 8 or maybe even higher, the card supports it)
- create VFs for every port on every NIC.  I see 2 nics with 2 ports each in one of your tests so that should result in 32 or more VFs for testing
- reduce the number of house keeping threads - a typical NFV server would only allocate core 0 of each numa to housekeeping.
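
A hedged sketch of those suggestions (the PCI addresses are the XL710 ports from comment 19; the VF count of 8 is illustrative):

# for pf in 0000:04:00.0 0000:04:00.1 0000:06:00.0 0000:06:00.1; do echo 8 > /sys/bus/pci/devices/$pf/sriov_numvfs; done

The housekeeping reduction is typically done by shrinking the isolated_cores list used by the tuned profile so that only core 0 of each NUMA node remains for housekeeping.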

Comment 22 Pei Zhang 2020-08-12 02:16:40 UTC
(In reply to broskos from comment #21)
> Pei,
> 
> A couple of suggestions to help see if you can get a reproducer:
> - raise the # of VFs per port (go to 8 or maybe even higher, the card
> supports it)
> - create VFs for every port on every NIC.  I see 2 nics with 2 ports each in
> one of your tests so that should result in 32 or more VFs for testing
> - reduce the number of house keeping threads - a typical NFV server would
> only allocate core 0 of each numa to housekeeping.

Thank you, Brent, for the suggestions. This issue can now be reproduced on my setup after creating 32 VFs on both XL710 NICs and isolating CPUs 2-19 (leaving only 0 and 1 as housekeeping cores).

Best regards,

Pei

Comment 23 Pei Zhang 2020-08-12 03:31:08 UTC
Steps:

Following Nitesh's reproducer in Description.

1. Set up the RT host, leaving CPU 0 of each NUMA node as a housekeeping core and isolating the rest. In this setup, we isolate CPUs 2-19 (0 and 1 are the housekeeping cores).

# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              20
On-line CPU(s) list: 0-19
Thread(s) per core:  1
Core(s) per socket:  10
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz
Stepping:            2
CPU MHz:             2297.583
BogoMIPS:            4594.85
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            25600K
NUMA node0 CPU(s):   0,2,4,6,8,10,12,14,16,18
NUMA node1 CPU(s):   1,3,5,7,9,11,13,15,17,19
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts md_clear flush_l1d

# cat /proc/cmdline 
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-... isolcpus=managed_irq,domain,2-19 intel_pstate=disable nosoftlockup tsc=nowatchdog nohz=on nohz_full=2-19 rcu_nocbs=2-19

2. Create 32 VFs per PF. The NIC is an XL710.

# lspci | grep Eth
04:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
04:00.1 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
...

# echo 32 > /sys/bus/pci/devices/0000\:04\:00.0/sriov_numvfs
# echo 32 > /sys/bus/pci/devices/0000\:04\:00.1/sriov_numvfs

3. Check the iavf IRQs.

# find /proc/irq/ -name "*iav*"
/proc/irq/202/iavf-0000:04:02.0:mbx
/proc/irq/203/iavf-enp4s0f0v0-TxRx-0
/proc/irq/204/iavf-enp4s0f0v0-TxRx-1
/proc/irq/205/iavf-enp4s0f0v0-TxRx-2
/proc/irq/206/iavf-enp4s0f0v0-TxRx-3
/proc/irq/207/iavf-0000:04:02.1:mbx
/proc/irq/212/iavf-0000:04:02.4:mbx
/proc/irq/213/iavf-enp4s0f0v4-TxRx-0
/proc/irq/214/iavf-enp4s0f0v4-TxRx-1
/proc/irq/215/iavf-enp4s0f0v4-TxRx-2
/proc/irq/216/iavf-enp4s0f0v4-TxRx-3
/proc/irq/217/iavf-0000:04:02.5:mbx
/proc/irq/222/iavf-0000:04:03.2:mbx
/proc/irq/227/iavf-0000:04:02.7:mbx
/proc/irq/232/iavf-0000:04:03.1:mbx
/proc/irq/237/iavf-0000:04:03.3:mbx
/proc/irq/242/iavf-0000:04:04.2:mbx
/proc/irq/247/iavf-0000:04:05.2:mbx
/proc/irq/252/iavf-0000:04:04.3:mbx
/proc/irq/257/iavf-0000:04:04.1:mbx
/proc/irq/262/iavf-0000:04:05.1:mbx
/proc/irq/267/iavf-0000:04:03.4:mbx
/proc/irq/272/iavf-0000:04:04.0:mbx
/proc/irq/277/iavf-0000:04:05.0:mbx
/proc/irq/282/iavf-0000:04:03.6:mbx
/proc/irq/287/iavf-0000:04:05.5:mbx
/proc/irq/292/iavf-0000:04:04.6:mbx
/proc/irq/297/iavf-0000:04:03.7:mbx
/proc/irq/302/iavf-0000:04:05.6:mbx
/proc/irq/307/iavf-0000:04:05.3:mbx
/proc/irq/312/iavf-0000:04:04.4:mbx
/proc/irq/317/iavf-0000:04:03.5:mbx
/proc/irq/322/iavf-0000:04:05.4:mbx
/proc/irq/327/iavf-0000:04:04.5:mbx
/proc/irq/332/iavf-0000:04:04.7:mbx
/proc/irq/337/iavf-0000:04:05.7:mbx
/proc/irq/342/iavf-0000:04:0b.5:mbx
/proc/irq/347/iavf-0000:04:0a.3:mbx
/proc/irq/352/iavf-0000:04:0c.4:mbx
/proc/irq/357/iavf-0000:04:0c.5:mbx
/proc/irq/362/iavf-0000:04:0c.6:mbx
/proc/irq/367/iavf-0000:04:0a.0:mbx
/proc/irq/372/iavf-0000:04:0c.1:mbx
/proc/irq/377/iavf-0000:04:0a.1:mbx
/proc/irq/382/iavf-0000:04:0c.2:mbx
/proc/irq/387/iavf-0000:04:0a.2:mbx
/proc/irq/392/iavf-0000:04:0c.3:mbx
/proc/irq/397/iavf-0000:04:0b.6:mbx
/proc/irq/402/iavf-0000:04:0b.7:mbx
/proc/irq/407/iavf-0000:04:0c.0:mbx
/proc/irq/412/iavf-0000:04:0d.0:mbx
/proc/irq/417/iavf-0000:04:0d.1:mbx
/proc/irq/422/iavf-0000:04:0d.2:mbx
/proc/irq/427/iavf-0000:04:0c.7:mbx
/proc/irq/432/iavf-0000:04:0a.4:mbx
/proc/irq/437/iavf-0000:04:0d.3:mbx
/proc/irq/442/iavf-0000:04:0d.4:mbx
/proc/irq/447/iavf-0000:04:0a.5:mbx
/proc/irq/452/iavf-0000:04:0a.6:mbx
/proc/irq/457/iavf-0000:04:0d.5:mbx
/proc/irq/462/iavf-0000:04:0a.7:mbx
/proc/irq/467/iavf-0000:04:0d.6:mbx
/proc/irq/472/iavf-0000:04:0b.0:mbx
/proc/irq/477/iavf-0000:04:0d.7:mbx
/proc/irq/482/iavf-0000:04:0b.1:mbx
/proc/irq/487/iavf-0000:04:0b.2:mbx
/proc/irq/492/iavf-0000:04:0b.4:mbx
/proc/irq/497/iavf-0000:04:0b.3:mbx

4. Check the CPU affinity of each iavf IRQ (a filtering one-liner follows the raw command below).
# for i in `seq 202 497`;do cat /proc/irq/$i/smp_affinity_list; done
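
To avoid scanning the full dump by eye, a hedged filtering variant of the same loop (assuming, as in this setup, that the housekeeping cores are 0 and 1) prints only the IRQs whose affinity falls outside them:

# for i in $(seq 202 497); do aff=$(cat /proc/irq/$i/smp_affinity_list 2>/dev/null) || continue; case $aff in 0|1|0-1) ;; *) echo "irq $i -> $aff";; esac; done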

Reproduced with 4.18.0-228.rt7.40.el8.x86_64:

After step 4, some iavf IRQs are pinned to isolated cores such as 2 and 3, so the issue is reproduced.

# for i in `seq 202 497`;do cat /proc/irq/$i/smp_affinity_list; done
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0-1
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3

Verified with 4.18.0-232.rt7.44.el8.x86_64:

After step 4, all iavf IRQs are pinned to housekeeping cores (in this example, the housekeeping cores are 0 and 1).
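
As a quick hedged cross-check over the same IRQ range, tallying the affinity lists should show only housekeeping values (0, 1, 0-1) after the fix:

# for i in $(seq 202 497); do cat /proc/irq/$i/smp_affinity_list 2>/dev/null; done | sort | uniq -c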

# for i in `seq 202 497`;do cat /proc/irq/$i/smp_affinity_list; done
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
0
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0
0-1
1
0-1
1
0


So this issue has been fixed.


Move to 'VERIFIED'.

Comment 24 Jaroslav Klech 2020-08-19 11:01:08 UTC
I am migrating this bz's doc text to bz#1867174, as the problem needs to be published for 8.2 and, IMHO, 1867174 suits that purpose better.

Regards,
Jaroslav

Comment 27 errata-xmlrpc 2020-11-04 01:20:59 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: kernel security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4431

