Description of problem:
Some OVS-DPDK PVP cases in fdp22.J show lower performance than in fdp22.I.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
Run the ovs dpdk pvp performance tests on fdp22.J.

Actual results:

>>>>>>>>>>> mlx5_core card <<<<<<<<<<<<<<<<

fdp22.I rhel8.6 ovs2.15:
https://beaker.engineering.redhat.com/jobs/7129749
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71297/7129749/12786323/151620471/mlx5_100.html

fdp22.J rhel8.6 ovs2.15:
https://beaker.engineering.redhat.com/jobs/7036119
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/09/70361/7036119/12646514/150570581/mlx5_100.html

For example, the 64-byte viommu cases:
fdp22.I 1q2pmd viommu case: 5.1mpps
fdp22.I 1q4pmd viommu case: 10.5mpps
fdp22.I 2q4pmd viommu case: 9.9mpps
fdp22.I 4q8pmd viommu case: 12.1mpps
fdp22.J 1q2pmd viommu case: 2.9mpps
fdp22.J 1q4pmd viommu case: 6.1mpps
fdp22.J 2q4pmd viommu case: 5.9mpps
fdp22.J 4q8pmd viommu case: 11.4mpps

fdp22.J rhel8.4 ovs2.16:
https://beaker.engineering.redhat.com/jobs/7133513
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71335/7133513/12791340/151660830/mlx5_100.html

fdp22.I rhel8.4 ovs2.16:
https://beaker.engineering.redhat.com/jobs/7028534
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/09/70285/7028534/12634828/150483696/mlx5_100.html

For example, the 64-byte viommu cases:
fdp22.I 1q2pmd viommu case: 5mpps
fdp22.I 1q4pmd viommu case: 10.4mpps
fdp22.I 2q4pmd viommu case: 9.6mpps
fdp22.I 4q8pmd viommu case: 10.9mpps
fdp22.J 1q2pmd viommu case: 2.9mpps
fdp22.J 1q4pmd viommu case: 5.9mpps
fdp22.J 2q4pmd viommu case: 5.8mpps
fdp22.J 4q8pmd viommu case: 11.4mpps

fdp22.J rhel9 ovs2.17:
https://beaker.engineering.redhat.com/jobs/7149681
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71496/7149681/12812959/151814353/mlx5_100.html

fdp22.I rhel9 ovs2.17:
https://beaker.engineering.redhat.com/jobs/7090550
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/70905/7090550/12729773/151183868/mlx5_100.html

For example, the 64-byte viommu cases:
fdp22.I 1q2pmd viommu case: 5.7mpps
fdp22.I 1q4pmd viommu case: 10.5mpps
fdp22.I 2q4pmd viommu case: 12.1mpps
fdp22.I 4q8pmd viommu case: 15mpps
fdp22.J 1q2pmd viommu case: 2.3mpps
fdp22.J 1q4pmd viommu case: 5.7mpps
fdp22.J 2q4pmd viommu case: 7.7mpps
fdp22.J 4q8pmd viommu case: 12.7mpps

>>>>>>>>>>> ice card <<<<<<<<<<<<<<<<

fdp22.J rhel9 ovs2.17:
https://beaker.engineering.redhat.com/jobs/7138162
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71381/7138162/12798096/151711192/ice_25.html

fdp22.I rhel9 ovs2.17:
https://beaker.engineering.redhat.com/jobs/7033799
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/09/70337/7033799/12642949/150542075/ice_25.html

For example, the 64-byte viommu cases:
fdp22.I 1q2pmd viommu case: 4.6mpps
fdp22.I 1q4pmd viommu case: 10.3mpps
fdp22.I 2q4pmd viommu case: 14.7mpps
fdp22.I 4q8pmd viommu case: 16.1mpps
fdp22.J 1q2pmd viommu case: 4.6mpps
fdp22.J 1q4pmd viommu case: 8.5mpps
fdp22.J 2q4pmd viommu case: 9.6mpps
fdp22.J 4q8pmd viommu case: 13.8mpps

fdp22.I rhel8.4 ovs2.16:
https://beaker.engineering.redhat.com/jobs/7065221
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/70652/7065221/12690545/150890276/ice_25.html

fdp22.J rhel8.4 ovs2.16:
https://beaker.engineering.redhat.com/jobs/7123818
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71238/7123818/12777095/151554035/ice_25.html

For example, the 64-byte viommu cases:
fdp22.I 1q2pmd viommu case: 6.2mpps
fdp22.I 1q4pmd viommu case: 12mpps
fdp22.I 2q4pmd viommu case: 9.2mpps
fdp22.I 4q8pmd viommu case: 14.7mpps
fdp22.J 1q2pmd viommu case: 3.6mpps
fdp22.J 1q4pmd viommu case: 7.4mpps
fdp22.J 2q4pmd viommu case: 7.1mpps
fdp22.J 4q8pmd viommu case: 14.4mpps

fdp22.I rhel8.6 ovs2.15:
https://beaker.engineering.redhat.com/jobs/7120026
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71200/7120026/12771290/151509639/ice_25.html

fdp22.J rhel8.6 ovs2.15:
https://beaker.engineering.redhat.com/jobs/7037370
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/09/70373/7037370/12648815/150584723/ice_25.html

For example, the 64-byte viommu cases:
fdp22.I 1q2pmd viommu case: 5.9mpps
fdp22.I 1q4pmd viommu case: 13.1mpps
fdp22.I 2q4pmd viommu case: 12mpps
fdp22.I 4q8pmd viommu case: 13.1mpps
fdp22.J 1q2pmd viommu case: 3.7mpps
fdp22.J 1q4pmd viommu case: 7.5mpps
fdp22.J 2q4pmd viommu case: 7.4mpps
fdp22.J 4q8pmd viommu case: 14.8mpps

Expected results:
The performance of fdp22.J should not be lower than that of fdp22.I.

Additional info:
More test runs on the mlx5_core card:

fdp22.I:

fdp22.I dpdk-21.11-1.el9_0:
https://beaker.engineering.redhat.com/jobs/7125487
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71254/7125487/12780457/151576028/mlx5_100.html
64-byte viommu cases:
1q2pmd viommu case: 4.1mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 6.9mpps
4q8pmd viommu case: 14.8mpps

fdp22.I dpdk-21.11-1.el9_0:
https://beaker.engineering.redhat.com/jobs/7124434
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71244/7124434/12778223/151561666/mlx5_100.html
64-byte viommu cases:
1q2pmd viommu case: 4.1mpps
1q4pmd viommu case: 6.7mpps
2q4pmd viommu case: 7.7mpps
4q8pmd viommu case: 14.7mpps

fdp22.I dpdk-21.11-1.el9_0:
https://beaker.engineering.redhat.com/jobs/7033793
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/09/70337/7033793/12642941/150542058/mlx5_100.html
64-byte viommu cases:
1q2pmd viommu case: 7.0mpps
1q4pmd viommu case: 10.6mpps
2q4pmd viommu case: 14mpps
4q8pmd viommu case: 15mpps

fdp22.I dpdk-21.11.2-1.el9_1:
https://beaker.engineering.redhat.com/jobs/7090550
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/70905/7090550/12729773/151183868/mlx5_100.html
64-byte viommu cases:
1q2pmd viommu case: 5.7mpps
1q4pmd viommu case: 10.5mpps
2q4pmd viommu case: 12.1mpps
4q8pmd viommu case: 15mpps

fdp22.J:

fdp22.J dpdk-21.11-1.el9_0:
https://beaker.engineering.redhat.com/jobs/7119988
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71199/7119988/12771248/151509376/mlx5_100.html
64-byte viommu cases:
1q2pmd viommu case: 3.4mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 7.7mpps
4q8pmd viommu case: 9.2mpps

fdp22.J dpdk-21.11-1.el9_0:
https://beaker.engineering.redhat.com/jobs/7125483
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71254/7125483/12780452/151576016/mlx5_100.html
1q2pmd viommu case: 2.3mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 7.7mpps
4q8pmd viommu case: 14.7mpps

fdp22.J dpdk-21.11-1.el9_0:
https://beaker.engineering.redhat.com/jobs/7123721
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71237/7123721/12776960/151552834/mlx5_100.html
1q2pmd viommu case: 4.0mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 3.4mpps
4q8pmd viommu case: 12.7mpps

fdp22.J dpdk-21.11.2-1.el9_1:
https://beaker.engineering.redhat.com/jobs/7149681
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71496/7149681/12812959/151814353/mlx5_100.html
1q2pmd viommu case: 2.3mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 7.7mpps
4q8pmd viommu case: 12.7mpps
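For reference, here is a minimal sketch of the kind of OVS-DPDK vhost-user bridge a "1q 2pmd viommu" case exercises. The bridge/port names, PCI address, and PMD core mask below are illustrative assumptions, not the exact values used by the test scripts; the other_config knobs are the same ones that appear in the setup_ovs_dpdk_vhostuser.sh diff quoted later in this ticket.

    # Enable DPDK and vhost IOMMU support in OVS.
    ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
    ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
    # "2pmd": two PMD threads, e.g. cores 2 and 4 (hypothetical mask).
    ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x14
    # Userspace bridge with one physical DPDK port ("1q": one Rx queue)
    # and one vhost-user client port for the guest.
    ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
    ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk \
        options:dpdk-devargs=0000:3b:00.0 options:n_rxq=1
    ovs-vsctl add-port ovsbr0 vhost0 -- set Interface vhost0 \
        type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser/vhost0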
Hi Li Ting,

FYI, I have run two sets of manual tests using OVS 2.15.0-119 and -124 respectively, and I don't see a significant difference between the two sets. Actually, -124 is a little bit better. Please see the test results below.

Thanks!
Jean

*** ovs-dpdk 1Q/4PMD, guest 1Q/2PMD ***
2.15.0-119 --- 8.0 Mpps
2.15.0-124 --- 8.2 Mpps

*** ovs-dpdk 2Q/8PMD, guest 2Q/4PMD ***
2.15.0-119 --- 16.0 Mpps
2.15.0-124 --- 16.4 Mpps

NIC connection:
anl151/CX-5 100 Gb <-> anl152/CX-6 100 Gb

Host info:
[root@wsfd-advnetlab151 jhsiao]# uname -r
4.18.0-372.32.1.el8_6.x86_64
[root@wsfd-advnetlab151 jhsiao]#

Trex binary search:
[root@wsfd-advnetlab152 trafficgen]# cat search-host-30-60-002.sh
./binary-search.py \
    --traffic-generator=trex-txrx --frame-size=64 --traffic-direction=bidirectional \
    --search-granularity=5 \
    --search-runtime=30 --validation-runtime=60 --rate=10 \
    --use-device-stats \
    --num-flows=1024 \
    --max-loss-pct=0.002 \
    --measure-latency=1 --latency-rate=100000 \
    --rate-tolerance=20
[root@wsfd-advnetlab152 trafficgen]#
I am also seeing a decline in performance in FDP 22.J versus FDP 22.I with an mlx5_core ConnectX-6 Dx card:

# openvswitch2.16-2.16.0-103.el8fdp.x86_64.rpm, RHEL-8.4.0-updates-20211026.0, dpdk-20.11-3.el8.x86_64.rpm:
Beaker job link: https://beaker.engineering.redhat.com/jobs/7114553
Performance results link: https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71145/7114553/12763059/151444245/mlx5_100_cx6.html

# openvswitch2.15-2.15.0-124.el8fdp.x86_64.rpm, RHEL-8.6.0-updates-20221014.0, dpdk-21.11-1.el8.x86_64.rpm:
Beaker job link: https://beaker.engineering.redhat.com/jobs/7114570
Performance results link: https://beaker.engineering.redhat.com/recipes/12763088/tasks/151444393/logs/mlx5_100_cx6.html
Beaker job link: https://beaker.engineering.redhat.com/jobs/7114571
Performance results link: https://beaker.engineering.redhat.com/recipes/12763090/tasks/151444397/logs/mlx5_100_cx6.html

# openvswitch2.17-2.17.0-49.el9fdp.x86_64.rpm, RHEL-9.0.0-updates-20221014.0, dpdk-21.11-1.el9_0.x86_64.rpm:
Beaker job link: https://beaker.engineering.redhat.com/jobs/7114594
Performance results link: https://beaker.engineering.redhat.com/recipes/12763128/tasks/151444838/logs/mlx5_100_cx6.html
Those results are strange.

Focusing on the OVS 2.15 package, the changes between .122 (22.I) and .124 (current report for 22.J) are really trivial. I would not expect a performance impact.

Were there other changes as part of those tests?
- For the DUT: is OVS the only package that changes in this comparison? For example, do we have different kernel versions? If so, please compare with the same kernel version.
- For the traffic generator / tester system: are the same hardware, software, and test scripts being used in the comparison?
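One way to pin down the DUT side of that comparison is to record the kernel, OVS, and DPDK versions next to each Beaker run. A sketch (the openvswitch package name varies by stream, e.g. openvswitch2.15/2.16/2.17):

    # On the DUT, before each run:
    uname -r                                  # kernel version
    rpm -q openvswitch2.17 dpdk               # installed OVS/DPDK packages
    grep PRETTY_NAME /etc/os-release          # RHEL compose in use
    ovs-vsctl get Open_vSwitch . ovs_version dpdk_version   # what OVS itself reports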
(In reply to David Marchand from comment #3)
> Those results are strange.
>
> Focusing on the OVS 2.15 package, the changes between .122 (22.I) and .124
> (current report for 22.J) are really trivial. I would not expect a
> performance impact.
>
> Were there other changes as part of those tests?
> - For the DUT: is OVS the only package that changes in this comparison? For
>   example, do we have different kernel versions? If so, please compare with
>   the same kernel version.
> - For the traffic generator / tester system: are the same hardware, software,
>   and test scripts being used in the comparison?

I reviewed the following results with RHEL 9 on mlx5; the degradation is mainly in the 1q4pmd and 2q4pmd cases, so I only debugged the 1q 4pmd and 2q 4pmd viommu cases on mlx5 with RHEL 9.

https://beaker.engineering.redhat.com/jobs/7090550
fdp22.I 1q4pmd viommu case: 10.5mpps
fdp22.I 2q4pmd viommu case: 12.1mpps

https://beaker.engineering.redhat.com/jobs/7149681
fdp22.J 1q4pmd viommu case: 5.7mpps
fdp22.J 2q4pmd viommu case: 7.7mpps

For the above results, the OS and OVS versions are:
fdp22.I: RHEL-9.0.0-updates-20221006.0, openvswitch2.17-2.17.0-44.el9fdp, dpdk-21.11.2-1.el9_1
fdp22.J: RHEL-9.1.0-updates-20221019.1, openvswitch2.17-2.17.0-49.el9fdp, dpdk-21.11.2-1.el9_1

I then ran the 64-byte cases again with the same OS version (RHEL-9.0.0-updates-20221014.0). The performance of the two is similar, and both results are lower than the previous job (job id 7090550).

https://beaker.engineering.redhat.com/jobs/7168059
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71680/7168059/12838932/151978209/mlx5_100.html
fdp22.I 1q4pmd viommu case: 5.7mpps
fdp22.I 2q4pmd viommu case: 6.9mpps

https://beaker.engineering.redhat.com/jobs/7168056
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71680/7168056/12838928/151978199/mlx5_100.html
fdp22.J 1q4pmd viommu case: 5.7mpps
fdp22.J 2q4pmd viommu case: 5.7mpps

Because RHEL-9.0.0-updates-20221006.0 is no longer available, I ran fdp22.I with RHEL-9.0.0-updates-20220829.0. This result is similar to the fdp22.J result and is also lower than the previous fdp22.I job (job id 7090550), which is strange.

https://beaker.engineering.redhat.com/jobs/7171899
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71718/7171899/12843483/152004942/mlx5_100.html
fdp22.I 1q4pmd viommu case: 5.7mpps
fdp22.I 2q4pmd viommu case: 7.5mpps

For the traffic generator, T-Rex is used for all tests without change.

For the test scripts: because the job on October 8 got good performance results, I checked the test-script changes from October 8 until now. There is only the following commit, "remove unnecessary restart of openvswitch, and wait 5s after init dpdk", which I think should not affect performance.

[tli@localhost tools]$ git show 490a46c6aa38b7d2807d7f00a7471737e5075304
commit 490a46c6aa38b7d2807d7f00a7471737e5075304 (tag: kernel-networking-common-1_1-981)
Author: Qijun DING <qding>
Date:   Fri Oct 14 15:23:59 2022 +0800

    common/tools/setup_ovs_dpdk_vhostuser.sh: remove unnecessary restart of openvswitch, and wait 5s after init dpdk

diff --git a/networking/common/tools/setup_ovs_dpdk_vhostuser.sh b/networking/common/tools/setup_ovs_dpdk_vhostuser.sh
index 39bb87b9bc..63534c5ca5 100755
--- a/networking/common/tools/setup_ovs_dpdk_vhostuser.sh
+++ b/networking/common/tools/setup_ovs_dpdk_vhostuser.sh
@@ -197,7 +197,7 @@ ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=${dpdk_lcore_mask}
 ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=${vhost_iommu_support}
 ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=${userspace_tso_enable}
 ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
-
+sleep 5
 systemctl restart openvswitch || \
 { echo "FAIL to start openvswitch!"; exit 1; }
@@ -249,9 +249,9 @@ chown -R qemu:hugetlbfs /tmp/vhostuser/
 chcon -R -u system_u -t qemu_var_run_t /tmp/vhostuser/
 #semanage fcontext -a -t qemu_var_run_t '/tmp/vhostuser(/.*)?'
 restorecon -Rv /tmp/vhostuser
-systemctl restart openvswitch || \
-{ echo "FAIL to start openvswitch!"; exit 1; }
-sleep 5
+#systemctl restart openvswitch || \
+#{ echo "FAIL to start openvswitch!"; exit 1; }
+#sleep 5
I found the difference between fdp22.J and fdp22.I: it is caused by the following commit. For example, in the 2q 4pmd case:

fdp22.J guest CPU setting [1]:
  <vcpu placement='static'>5</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='24'/>
    <vcpupin vcpu='1' cpuset='2'/>
    <vcpupin vcpu='2' cpuset='26'/>
    <vcpupin vcpu='3' cpuset='4'/>
    <vcpupin vcpu='4' cpuset='28'/>
    <emulatorpin cpuset='0'/>
  </cputune>

fdp22.I guest CPU setting [2]:
  <vcpu placement='static'>5</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='2'/>
    <vcpupin vcpu='1' cpuset='4'/>
    <vcpupin vcpu='2' cpuset='6'/>
    <vcpupin vcpu='3' cpuset='8'/>
    <vcpupin vcpu='4' cpuset='10'/>
    <emulatorpin cpuset='0,24'/>
  </cputune>

[tli@localhost perf]$ git show d279ae6e5d9224b7ae290b3ca60528d031c51b34
commit d279ae6e5d9224b7ae290b3ca60528d031c51b34
Author: Minxi Hou <mhou>
Date:   Thu Oct 13 18:59:59 2022 +0800

    common/tools/get_info.sh fix when the hyper thread starts, allocate siblings under the same core

diff --git a/networking/common/tools/get_info.sh b/networking/common/tools/get_info.sh
index ec354a8777..703157b7da 100755
--- a/networking/common/tools/get_info.sh
+++ b/networking/common/tools/get_info.sh
@@ -90,7 +90,7 @@ cpus_on_numa() #4-7
         local thread_siblings_list=""
         thread_siblings_list=$(eval echo $(cat /sys/devices/system/cpu/cpu$i/topology/thread_siblings_list | sed -e 's/\([0-9]\+-[0-9]\+\)/{\1}/g' -e 's/,/ /g' -e 's/-/../g'))
-        if [[ -n "$s" ]] && [[ "$SIBLING_ORDER" == "yes" ]]
+        if [[ -n "$thread_siblings_list" ]] && [[ "$SIBLING_ORDER" == "yes" ]]
         then
             cpu="${thread_siblings_list}"
         else

After I changed the guest XML to use the fdp22.I setting, both fdp22.I and fdp22.J got good performance again.

I have a question about this: is the following performance difference expected?

With the above CPU setting [2]:
1q2pmd viommu case: 5.7mpps
1q4pmd viommu case: 10.5mpps
2q4pmd viommu case: 12.8mpps
4q8pmd viommu case: 15mpps

With the above CPU setting [1]:
1q2pmd viommu case: 4mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 7.5mpps
4q8pmd viommu case: 13.4mpps
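The practical difference between the two settings appears to be whether the pinned vCPUs land on hyper-thread siblings of the same physical core (setting [1], where on a typical topology CPUs like 2/26 and 4/28 would be sibling pairs) or on distinct physical cores (setting [2]). A small sketch to verify this on the host, using the same sysfs topology file get_info.sh reads; the CPU list is the one from setting [1] and the sibling pairing is an assumption about this particular machine:

    for c in 24 2 26 4 28; do
        core=$(cat /sys/devices/system/cpu/cpu$c/topology/core_id)
        sibs=$(cat /sys/devices/system/cpu/cpu$c/topology/thread_siblings_list)
        echo "cpu$c: core_id=$core siblings=$sibs"
    done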
I am going to close this ticket as NotABug because the testing scripts changed the CPU allocation, which caused the performance regression. Please re-open if you think OVS needs fixing.

Thanks,
fbl