Bug 1687320

Summary: [openvswitch2.11] EAL control threads do not comply with cpu affinity
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: David Marchand <dmarchan>
Component: openvswitch2.11
Assignee: David Marchand <dmarchan>
Status: CLOSED ERRATA
QA Contact: Jean-Tsung Hsiao <jhsiao>
Severity: medium
Priority: unspecified
Version: FDP 19.A
CC: ctrautma, jhsiao, ovs-qe, pvauter, qding, ralongi, tli
Target Milestone: ---
Target Release: FDP 19.C
Hardware: All
OS: Unspecified
Doc Type: If docs needed, set a value
Clone Of: 1687316
Last Closed: 2019-06-05 14:57:35 UTC
Bug Depends On: 1687316, 1713702

Description David Marchand 2019-03-11 09:12:44 UTC
+++ This bug was initially created as a clone of Bug #1687316 +++

Description of problem:

The problem affects both Linux and FreeBSD implementations.
When a DPDK control thread is created (interrupt handler thread, multi-process thread, vhost reconnect thread, etc.), EAL builds the CPU affinity of the new thread by taking all available CPUs and removing the CPUs listed in the startup corelist/coremask parameter.
The CPU affinity the process was started with is not taken into account, which can affect other performance-critical processes running on the system.
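
The set arithmetic behind this can be sketched in a few lines of Python. This is only a model of the behaviour described above, not the EAL code; the function names are hypothetical, and the 28-CPU host is an assumption inferred from the 2-27 range in the output below. The other numbers come from the testpmd reproducer in this report.

```python
def coremask_to_cpus(mask):
    """Return the set of CPU ids whose bit is set in an EAL-style coremask."""
    return {bit for bit in range(mask.bit_length()) if (mask >> bit) & 1}


def ctrl_affinity_reported(online_cpus, lcore_cpus):
    # Behaviour described in this report: all online CPUs minus the lcore CPUs.
    return set(online_cpus) - set(lcore_cpus)


def ctrl_affinity_expected(startup_cpus, lcore_cpus):
    # Expected behaviour: stay inside the affinity the process was started with.
    return set(startup_cpus) - set(lcore_cpus)


online = range(28)               # assumption: 28-CPU host
startup = {0, 1, 2}              # taskset -c 0,1,2
lcores = coremask_to_cpus(0x3)   # testpmd -c 0x3

print(sorted(ctrl_affinity_reported(online, lcores)))
print(sorted(ctrl_affinity_expected(startup, lcores)))
```

With these inputs the first set is CPUs 2 to 27, which is what the control threads show in the actual results, while the second is CPU 2 only.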

How reproducible:
100%

Steps to Reproduce:
1. Start a DPDK application with a restricted CPU affinity list. Example:
taskset -c 0,1,2 testpmd -c 0x3 --no-pci --no-huge -m 512 -- -i --total-num-mbufs 2048


Actual results:
EAL started control threads outside the CPU affinity list:

$ grep -E '(Name|Cpus_allowed_list):' /proc/$(pidof testpmd)/task/*/status
/proc/87677/task/87677/status:Name:	testpmd
/proc/87677/task/87677/status:Cpus_allowed_list:	0
/proc/87677/task/87678/status:Name:	eal-intr-thread
/proc/87677/task/87678/status:Cpus_allowed_list:	2-27
/proc/87677/task/87679/status:Name:	rte_mp_handle
/proc/87677/task/87679/status:Cpus_allowed_list:	2-27
/proc/87677/task/87680/status:Name:	lcore-slave-1
/proc/87677/task/87680/status:Cpus_allowed_list:	1

Expected results:
Control threads should stay within the initial CPU affinity list and should not collide with the "datapath" threads specified via the coremask/corelist EAL parameters.
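
One way to check this automatically is to parse the grep output shown under "Actual results" and flag every thread whose Cpus_allowed_list reaches outside the initial affinity. The Python sketch below does that; the helper names are hypothetical and the sample lines are copied from the output above.

```python
def parse_cpu_list(text):
    """Parse a kernel-style CPU list such as '0,2,4,12-15' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def threads_outside(status_lines, allowed):
    """Return (thread name, stray CPUs) pairs for threads allowed outside `allowed`."""
    report = []
    name = None
    for line in status_lines:
        # Each grep line has the form '<path>:<field>:<value>'.
        _, field, value = line.split(":", 2)
        if field == "Name":
            name = value.strip()
        elif field == "Cpus_allowed_list":
            stray = parse_cpu_list(value) - allowed
            if stray:
                report.append((name, sorted(stray)))
    return report


sample = [
    "/proc/87677/task/87677/status:Name:\ttestpmd",
    "/proc/87677/task/87677/status:Cpus_allowed_list:\t0",
    "/proc/87677/task/87678/status:Name:\teal-intr-thread",
    "/proc/87677/task/87678/status:Cpus_allowed_list:\t2-27",
]
for name, stray in threads_outside(sample, parse_cpu_list("0,1,2")):
    print(name, "may run on CPUs outside the initial affinity:", stray)
```

Against the sample lines only eal-intr-thread is reported, since testpmd itself stays on CPU 0.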


Additional info:

Bug reported upstream and fixed in commit:
https://git.dpdk.org/dpdk/commit/?id=c3568ea376700df061abcbeabc40ddaed7841e1a

Comment 7 Jean-Tsung Hsiao 2019-05-08 09:29:21 UTC
Hi David,
I have verified the fix.
Please check the logs attached below --- openvswitch2.11-2.11.0-9.el7fdp.x86_64 against openvswitch2.11-2.11.0-5.el7fdp.x86_64.
Thanks!
Jean

==============================================

NUMA node0 CPU(s):     0,2,4,6,8,10,12,14
NUMA node1 CPU(s):     1,3,5,7,9,11,13,15

[root@netqe7 tuned]# cat cpu-partitioning-variables.conf
# Examples:
# isolated_cores=2,4-7
# isolated_cores=2-23
isolated_cores=1,3,5,7,9,11
#
# To disable the kernel load balancing in certain isolated CPUs:
# no_balance_cores=5-10
no_balance_cores=1,3,5,7,9,11

[root@netqe7 jhsiao]# cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-3.10.0-957.12.1.el7.x86_64 root=/dev/mapper/rhel_netqe7-root ro intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=32 crashkernel=auto rd.lvm.lv=rhel_netqe7/root rd.lvm.lv=rhel_netqe7/swap console=ttyS1,115200 skew_tick=1 nohz=on nohz_full=1,3,5,7,9,11 rcu_nocbs=1,3,5,7,9,11 tuned.non_isolcpus=0000f555 intel_pstate=disable nosoftlockup
[root@netqe7 jhsiao]#

other_config        : {dpdk-init="true", dpdk-socket-mem="4096,4096", pmd-cpu-mask="0x0a0a"}
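
For readers comparing the masks above with the per-thread CPU lists below, here is a small Python sketch that decodes the two hexadecimal masks from this setup (pmd-cpu-mask and tuned.non_isolcpus) and checks them against isolated_cores. The helper name is hypothetical; the values are the ones shown in this comment.

```python
def hex_mask_to_cpus(mask_text):
    """Decode a hexadecimal CPU mask string (with or without '0x') into CPU ids."""
    mask = int(mask_text, 16)
    return {bit for bit in range(mask.bit_length()) if (mask >> bit) & 1}


pmd = hex_mask_to_cpus("0x0a0a")             # pmd-cpu-mask
housekeeping = hex_mask_to_cpus("0000f555")  # tuned.non_isolcpus
isolated = {1, 3, 5, 7, 9, 11}               # isolated_cores

print(sorted(pmd))              # [1, 3, 9, 11]
print(sorted(housekeeping))     # [0, 2, 4, 6, 8, 10, 12, 13, 14, 15]
print(pmd <= isolated)          # True: every PMD CPU is an isolated core
print(housekeeping & isolated)  # set(): no overlap with isolated cores
```

The decoded PMD set matches the four pmd threads in the logs (CPUs 11, 3, 9 and 1), and the non-isolated set matches the 0,2,4,6,8,10,12-15 list reported for the main ovs-vswitchd thread.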

=============================================
openvswitch2.11-2.11.0-5.el7fdp.x86_64

grep -E '(Name|Cpus_allowed_list):' /proc/$(pidof ovs-vswitchd)/task/*/status
/proc/16682/task/16682/status:Name:    ovs-vswitchd
/proc/16682/task/16682/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16683/status:Name:    ovs-vswitchd
/proc/16682/task/16683/status:Cpus_allowed_list:    1-15
/proc/16682/task/16684/status:Name:    ovs-vswitchd
/proc/16682/task/16684/status:Cpus_allowed_list:    1-15
/proc/16682/task/16685/status:Name:    dpdk_watchdog1
/proc/16682/task/16685/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16688/status:Name:    urcu2
/proc/16682/task/16688/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16693/status:Name:    ct_clean3
/proc/16682/task/16693/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16710/status:Name:    pmd20
/proc/16682/task/16710/status:Cpus_allowed_list:    11
/proc/16682/task/16711/status:Name:    pmd21
/proc/16682/task/16711/status:Cpus_allowed_list:    3
/proc/16682/task/16712/status:Name:    pmd22
/proc/16682/task/16712/status:Cpus_allowed_list:    9
/proc/16682/task/16713/status:Name:    pmd23
/proc/16682/task/16713/status:Cpus_allowed_list:    1
/proc/16682/task/16714/status:Name:    handler24
/proc/16682/task/16714/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16715/status:Name:    handler25
/proc/16682/task/16715/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16716/status:Name:    handler26
/proc/16682/task/16716/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16717/status:Name:    handler27
/proc/16682/task/16717/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16718/status:Name:    handler28
/proc/16682/task/16718/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16719/status:Name:    handler29
/proc/16682/task/16719/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16720/status:Name:    handler30
/proc/16682/task/16720/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16721/status:Name:    handler31
/proc/16682/task/16721/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16722/status:Name:    handler32
/proc/16682/task/16722/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16723/status:Name:    handler33
/proc/16682/task/16723/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16724/status:Name:    handler34
/proc/16682/task/16724/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16725/status:Name:    revalidator35
/proc/16682/task/16725/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16726/status:Name:    revalidator36
/proc/16682/task/16726/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16727/status:Name:    revalidator37
/proc/16682/task/16727/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16728/status:Name:    revalidator38
/proc/16682/task/16728/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16682/task/16729/status:Name:    revalidator39
/proc/16682/task/16729/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15

============================================

openvswitch2.11-2.11.0-9.el7fdp.x86_64

grep -E '(Name|Cpus_allowed_list):' /proc/$(pidof ovs-vswitchd)/task/*/status
/proc/16421/task/16421/status:Name:    ovs-vswitchd
/proc/16421/task/16421/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16422/status:Name:    ovs-vswitchd
/proc/16421/task/16422/status:Cpus_allowed_list:    2,4,6,8,10,12-15
/proc/16421/task/16423/status:Name:    ovs-vswitchd
/proc/16421/task/16423/status:Cpus_allowed_list:    2,4,6,8,10,12-15
/proc/16421/task/16424/status:Name:    dpdk_watchdog1
/proc/16421/task/16424/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16427/status:Name:    urcu2
/proc/16421/task/16427/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16432/status:Name:    ct_clean3
/proc/16421/task/16432/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16449/status:Name:    pmd20
/proc/16421/task/16449/status:Cpus_allowed_list:    11
/proc/16421/task/16450/status:Name:    pmd21
/proc/16421/task/16450/status:Cpus_allowed_list:    3
/proc/16421/task/16451/status:Name:    pmd22
/proc/16421/task/16451/status:Cpus_allowed_list:    9
/proc/16421/task/16452/status:Name:    pmd23
/proc/16421/task/16452/status:Cpus_allowed_list:    1
/proc/16421/task/16453/status:Name:    handler24
/proc/16421/task/16453/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16454/status:Name:    handler25
/proc/16421/task/16454/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16455/status:Name:    handler26
/proc/16421/task/16455/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16456/status:Name:    handler27
/proc/16421/task/16456/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16457/status:Name:    handler28
/proc/16421/task/16457/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16458/status:Name:    handler29
/proc/16421/task/16458/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16459/status:Name:    handler30
/proc/16421/task/16459/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16460/status:Name:    handler31
/proc/16421/task/16460/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16461/status:Name:    handler32
/proc/16421/task/16461/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16462/status:Name:    handler33
/proc/16421/task/16462/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16463/status:Name:    handler34
/proc/16421/task/16463/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16464/status:Name:    revalidator35
/proc/16421/task/16464/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16465/status:Name:    revalidator36
/proc/16421/task/16465/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16466/status:Name:    revalidator37
/proc/16421/task/16466/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16467/status:Name:    revalidator38
/proc/16421/task/16467/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15
/proc/16421/task/16468/status:Name:    revalidator39
/proc/16421/task/16468/status:Cpus_allowed_list:    0,2,4,6,8,10,12-15

Comment 8 David Marchand 2019-05-08 14:22:58 UTC
Hello Jean,

We can see the issue is present in your first output:

/proc/16682/task/16683/status:Name:    ovs-vswitchd
/proc/16682/task/16683/status:Cpus_allowed_list:    1-15
/proc/16682/task/16684/status:Name:    ovs-vswitchd
/proc/16682/task/16684/status:Cpus_allowed_list:    1-15


And with the new build, the same OVS threads skip the isolated CPUs:

/proc/16421/task/16422/status:Name:    ovs-vswitchd
/proc/16421/task/16422/status:Cpus_allowed_list:    2,4,6,8,10,12-15
/proc/16421/task/16423/status:Name:    ovs-vswitchd
/proc/16421/task/16423/status:Cpus_allowed_list:    2,4,6,8,10,12-15


So I agree.
Your test is ok and the issue is fixed.
Thanks!

Comment 10 errata-xmlrpc 2019-06-05 14:57:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1384