Bug 2137242 - Some OVS-DPDK PVP cases in FDP 22.J show lower performance than in FDP 22.I
Summary: Some OVS-DPDK PVP cases in FDP 22.J show lower performance than in FDP 22.I
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch
Version: FDP 22.J
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Timothy Redaelli
QA Contact: qding
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-10-24 08:30 UTC by liting
Modified: 2022-11-01 13:18 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-01 13:18:58 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-2396 0 None None None 2022-10-24 08:47:55 UTC

Description liting 2022-10-24 08:30:11 UTC
Description of problem:
Some OVS-DPDK PVP cases in FDP 22.J show lower performance than in FDP 22.I.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
Run the OVS-DPDK PVP performance tests with the FDP 22.J packages.
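
For reference, a minimal sketch of the kind of OVS-DPDK vhost-user configuration this PVP test exercises (the PCI address, CPU masks, port names, and paths below are placeholders for illustration, not the exact values used by the test automation):

# Enable DPDK in OVS; masks and socket memory are host-specific placeholders.
ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem="1024,1024"
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x1e           # e.g. a 4-PMD case
ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true    # the "viommu" cases

# Userspace bridge with one physical DPDK port and one vhost-user client port.
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk \
    options:dpdk-devargs=0000:5e:00.0 options:n_rxq=2                 # e.g. a "2q" case
ovs-vsctl add-port ovsbr0 vhost0 -- set Interface vhost0 type=dpdkvhostuserclient \
    options:vhost-server-path=/tmp/vhostuser/vhost0

# A guest pinned to host CPUs forwards traffic between its virtio queues, and an
# external TRex generator searches for the bidirectional zero-loss rate.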

Actual results:
>>>>>>>>>>>mlx5_core card<<<<<<<<<<<<<<<<:
fdp22.I rhel8.6 ovs2.15:
https://beaker.engineering.redhat.com/jobs/7129749
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71297/7129749/12786323/151620471/mlx5_100.html

fdp22.J rhel8.6 ovs2.15
https://beaker.engineering.redhat.com/jobs/7036119
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/09/70361/7036119/12646514/150570581/mlx5_100.html

For example, the 64-byte vIOMMU cases:
fdp22.I 1q2pmd viommu case: 5.1mpps
fdp22.I 1q4pmd viommu case: 10.5mpps
fdp22.I 2q4pmd viommu case: 9.9mpps
fdp22.I 4q8pmd viommu case: 12.1mpps

fdp22.J 1q2pmd viommu case: 2.9mpps
fdp22.J 1q4pmd viommu case: 6.1mpps
fdp22.J 2q4pmd viommu case: 5.9mpps
fdp22.J 4q8pmd viommu case: 11.4mpps


fdp22.J rhel8.4 ovs2.16
https://beaker.engineering.redhat.com/jobs/7133513
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71335/7133513/12791340/151660830/mlx5_100.html

fdp22.I rhel8.4 ovs2.16
https://beaker.engineering.redhat.com/jobs/7028534
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/09/70285/7028534/12634828/150483696/mlx5_100.html

For example, the 64-byte vIOMMU cases:
fdp22.I 1q2pmd viommu case: 5mpps
fdp22.I 1q4pmd viommu case: 10.4mpps
fdp22.I 2q4pmd viommu case: 9.6mpps
fdp22.I 4q8pmd viommu case: 10.9mpps

fdp22.J 1q2pmd viommu case: 2.9mpps
fdp22.J 1q4pmd viommu case: 5.9mpps
fdp22.J 2q4pmd viommu case: 5.8mpps
fdp22.J 4q8pmd viommu case: 11.4mpps

fdp22.J rhel9 ovs2.17
https://beaker.engineering.redhat.com/jobs/7149681
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71496/7149681/12812959/151814353/mlx5_100.html

fdp22.I rhel9 ovs2.17
https://beaker.engineering.redhat.com/jobs/7090550
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/70905/7090550/12729773/151183868/mlx5_100.html

For example, the 64-byte vIOMMU cases:
fdp22.I 1q2pmd viommu case: 5.7mpps
fdp22.I 1q4pmd viommu case: 10.5mpps
fdp22.I 2q4pmd viommu case: 12.1mpps
fdp22.I 4q8pmd viommu case: 15mpps

fdp22.J 1q2pmd viommu case: 2.3mpps
fdp22.J 1q4pmd viommu case: 5.7mpps
fdp22.J 2q4pmd viommu case: 7.7mpps
fdp22.J 4q8pmd viommu case: 12.7mpps

>>>>>>>>>>>ice card<<<<<<<<<<<<<<<<:
fdp22.J rhel9 ovs2.17
https://beaker.engineering.redhat.com/jobs/7138162
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71381/7138162/12798096/151711192/ice_25.html

fdp22.I rhel9 ovs2.17
https://beaker.engineering.redhat.com/jobs/7033799
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/09/70337/7033799/12642949/150542075/ice_25.html

For example, the 64-byte vIOMMU cases:
fdp22.I 1q2pmd viommu case: 4.6mpps
fdp22.I 1q4pmd viommu case: 10.3mpps
fdp22.I 2q4pmd viommu case: 14.7mpps
fdp22.I 4q8pmd viommu case: 16.1mpps

fdp22.J 1q2pmd viommu case: 4.6mpps
fdp22.J 1q4pmd viommu case: 8.5mpps
fdp22.J 2q4pmd viommu case: 9.6mpps
fdp22.J 4q8pmd viommu case: 13.8mpps

fdp22.I rhel8.4 ovs2.16
https://beaker.engineering.redhat.com/jobs/7065221
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/70652/7065221/12690545/150890276/ice_25.html

fdp22.J rhel8.4 ovs2.16
https://beaker.engineering.redhat.com/jobs/7123818
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71238/7123818/12777095/151554035/ice_25.html

For example, the 64-byte vIOMMU cases:
fdp22.I 1q2pmd viommu case: 6.2mpps
fdp22.I 1q4pmd viommu case: 12mpps
fdp22.I 2q4pmd viommu case: 9.2mpps
fdp22.I 4q8pmd viommu case: 14.7mpps

fdp22.J 1q2pmd viommu case: 3.6mpps
fdp22.J 1q4pmd viommu case: 7.4mpps
fdp22.J 2q4pmd viommu case: 7.1mpps
fdp22.J 4q8pmd viommu case: 14.4mpps

fdp22.I rhel8.6 ovs2.15
https://beaker.engineering.redhat.com/jobs/7120026
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71200/7120026/12771290/151509639/ice_25.html

fdp22.J rhel8.6 ovs2.15
https://beaker.engineering.redhat.com/jobs/7037370
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/09/70373/7037370/12648815/150584723/ice_25.html

For example, the 64-byte vIOMMU cases:
fdp22.I 1q2pmd viommu case: 5.9mpps
fdp22.I 1q4pmd viommu case: 13.1mpps
fdp22.I 2q4pmd viommu case: 12mpps
fdp22.I 4q8pmd viommu case: 13.1mpps

fdp22.J 1q2pmd viommu case: 3.7mpps
fdp22.J 1q4pmd viommu case: 7.5mpps
fdp22.J 2q4pmd viommu case: 7.4mpps
fdp22.J 4q8pmd viommu case: 14.8mpps

Expected results:
The performance of fdp22.J should not be lower than that of fdp22.I.

Additional info:
More test runs on the mlx5_core card:
fdp22.I:
fdp22.I dpdk-21.11-1.el9_0,
https://beaker.engineering.redhat.com/jobs/7125487
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71254/7125487/12780457/151576028/mlx5_100.html
For example, the 64-byte vIOMMU cases:
1q2pmd viommu case: 4.1mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 6.9mpps
4q8pmd viommu case: 14.8mpps

fdp22.I dpdk-21.11-1.el9_0,
https://beaker.engineering.redhat.com/jobs/7124434
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71244/7124434/12778223/151561666/mlx5_100.html
For example, the 64-byte vIOMMU cases:
1q2pmd viommu case: 4.1mpps
1q4pmd viommu case: 6.7mpps
2q4pmd viommu case: 7.7mpps
4q8pmd viommu case: 14.7mpps


fdp22.I dpdk-21.11-1.el9_0, 
https://beaker.engineering.redhat.com/jobs/7033793
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/09/70337/7033793/12642941/150542058/mlx5_100.html
For example, the 64-byte vIOMMU cases:
1q2pmd viommu case: 7.0mpps
1q4pmd viommu case: 10.6mpps
2q4pmd viommu case: 14mpps
4q8pmd viommu case: 15mpps

fdp22.I dpdk-21.11.2-1.el9_1, 
https://beaker.engineering.redhat.com/jobs/7090550
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/70905/7090550/12729773/151183868/mlx5_100.html
For example, the 64-byte vIOMMU cases:
1q2pmd viommu case: 5.7mpps
1q4pmd viommu case: 10.5mpps
2q4pmd viommu case: 12.1mpps
4q8pmd viommu case: 15mpps

fdp22.J:
fdp22.J dpdk-21.11-1.el9_0,
https://beaker.engineering.redhat.com/jobs/7119988
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71199/7119988/12771248/151509376/mlx5_100.html
For example, the 64-byte vIOMMU cases:
1q2pmd viommu case: 3.4mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 7.7mpps
4q8pmd viommu case: 9.2mpps

fdp22.J dpdk-21.11-1.el9_0,
https://beaker.engineering.redhat.com/jobs/7125483
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71254/7125483/12780452/151576016/mlx5_100.html
1q2pmd viommu case: 2.3mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 7.7mpps
4q8pmd viommu case: 14.7mpps

fdp22.J dpdk-21.11-1.el9_0,
https://beaker.engineering.redhat.com/jobs/7123721
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71237/7123721/12776960/151552834/mlx5_100.html
1q2pmd viommu case: 4.0mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 3.4mpps
4q8pmd viommu case: 12.7mpps

fdp22.J dpdk-21.11-1.el9_0,
https://beaker.engineering.redhat.com/jobs/7125483
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71254/7125483/12780452/151576016/mlx5_100.html
1q2pmd viommu case: 2.3mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 7.7mpps
4q8pmd viommu case: 14.7mpps

fdp22.J dpdk-21.11.2-1.el9_1,
https://beaker.engineering.redhat.com/jobs/7149681
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71496/7149681/12812959/151814353/mlx5_100.html
1q2pmd viommu case: 2.3mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 7.7mpps
4q8pmd viommu case: 12.7mpps

Comment 1 Jean-Tsung Hsiao 2022-10-25 14:17:57 UTC
Hi Li Ting,
FYI
I have run two sets of manual tests using OVS 2.15.0-119 and -124, respectively, and I don't see a significant difference between the two sets; -124 is actually a little better.
Please see test results below.
Thanks!
Jean

*** ovs-dpdk 1Q/4PMD, guest 1Q/2PMD ***

2.15.0-119 --- 8.0 Mpps
2.15.0-124 --- 8.2 Mpps

*** ovs-dpdk 2Q/8PMD, guest 2Q/4PMD ***

2.15.0-119 --- 16.0 Mpps
2.15.0-124 --- 16.4 Mpps

NIC connection

anl151/CX-5 100 Gb <-> anl152/CX-6 100 Gb

Host info

[root@wsfd-advnetlab151 jhsiao]# uname -r
4.18.0-372.32.1.el8_6.x86_64
[root@wsfd-advnetlab151 jhsiao]#

TRex binary search

[root@wsfd-advnetlab152 trafficgen]# cat search-host-30-60-002.sh
./binary-search.py \
--traffic-generator=trex-txrx --frame-size=64 --traffic-direction=bidirectional \
--search-granularity=5 \
--search-runtime=30 --validation-runtime=60 --rate=10 \
--use-device-stats \
--num-flows=1024 \
--max-loss-pct=0.002 \
--measure-latency=1 --latency-rate=100000 \
--rate-tolerance=20
[root@wsfd-advnetlab152 trafficgen]#

Comment 2 Rick Alongi 2022-10-25 18:23:44 UTC
I am also seeing a decline in performance in FDP 22.J versus FDP 22.I with an mlx5_core ConnectX-6 Dx card:

# openvswitch2.16-2.16.0-103.el8fdp.x86_64.rpm, RHEL-8.4.0-updates-20211026.0, dpdk-20.11-3.el8.x86_64.rpm:
Beaker job link: https://beaker.engineering.redhat.com/jobs/7114553
Performance results link: https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71145/7114553/12763059/151444245/mlx5_100_cx6.html

# openvswitch2.15-2.15.0-124.el8fdp.x86_64.rpm, RHEL-8.6.0-updates-20221014.0, dpdk-21.11-1.el8.x86_64.rpm:
Beaker job link: https://beaker.engineering.redhat.com/jobs/7114570
Performance results link: https://beaker.engineering.redhat.com/recipes/12763088/tasks/151444393/logs/mlx5_100_cx6.html

Beaker job link: https://beaker.engineering.redhat.com/jobs/7114571
Performance results link: https://beaker.engineering.redhat.com/recipes/12763090/tasks/151444397/logs/mlx5_100_cx6.html

# openvswitch2.17-2.17.0-49.el9fdp.x86_64.rpm, RHEL-9.0.0-updates-20221014.0, dpdk-21.11-1.el9_0.x86_64.rpm:
Beaker job link: https://beaker.engineering.redhat.com/jobs/7114594
Performance results link: https://beaker.engineering.redhat.com/recipes/12763128/tasks/151444838/logs/mlx5_100_cx6.html

Comment 3 David Marchand 2022-10-26 13:03:38 UTC
Those results are strange.

Focusing on OVS 2.15 package, the changes between .122 (22.I) and .124 (current report for 22.J) are really trivial.
I would not expect a performance impact.

Were there other changes as part of those tests?
- For the DUT: is OVS the only package that changed in this comparison? For example, do we have different kernel versions? If so, please compare with the same kernel version.
- For the traffic generator / tester system: are the same hw, sw, and test scripts being used in the comparison?

Comment 4 liting 2022-10-27 06:56:08 UTC
(In reply to David Marchand from comment #3)
> Those results are strange.
> 
> Focusing on OVS 2.15 package, the changes between .122 (22.I) and .124
> (current report for 22.J) are really trivial.
> I would not expect a performance impact.
> 
> Were there other changes as part of those tests?
> - For the DUT: is OVS the only package that changed in this comparison? For
> example, do we have different kernel versions? If so, please compare with
> the same kernel version.
> - For the traffic generator / tester system: are the same hw, sw, and test
> scripts being used in the comparison?

I reviewed the following results with RHEL 9 on mlx5; the degradation is mainly in the 1q4pmd and 2q4pmd cases, so I debugged only the 1q4pmd and 2q4pmd vIOMMU cases on mlx5 with RHEL 9.
https://beaker.engineering.redhat.com/jobs/7090550
fdp22.I 1q4pmd viommu case: 10.5mpps
fdp22.I 2q4pmd viommu case: 12.1mpps
https://beaker.engineering.redhat.com/jobs/7149681
fdp22.J 1q4pmd viommu case: 5.7mpps
fdp22.J 2q4pmd viommu case: 7.7mpps

For the above results, the OS and OVS versions are:
fdp22.I: RHEL-9.0.0-updates-20221006.0, openvswitch2.17-2.17.0-44.el9fdp, dpdk-21.11.2-1.el9_1
fdp22.J: RHEL-9.1.0-updates-20221019.1, openvswitch2.17-2.17.0-49.el9fdp, dpdk-21.11.2-1.el9_1

I then ran the 64-byte cases again with the same OS version (RHEL-9.0.0-updates-20221014.0). The performance is similar, and both results are lower than the previous job (job id 7090550).
https://beaker.engineering.redhat.com/jobs/7168059
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71680/7168059/12838932/151978209/mlx5_100.html
fdp22.I 1q4pmd viommu case: 5.7mpps
fdp22.I 2q4pmd viommu case: 6.9mpps

https://beaker.engineering.redhat.com/jobs/7168056
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71680/7168056/12838928/151978199/mlx5_100.html
fdp22.J 1q4pmd viommu case: 5.7mpps
fdp22.J 2q4pmd viommu case: 5.7mpps

Because RHEL-9.0.0-updates-20221006.0 is no longer available, I ran fdp22.I with RHEL-9.0.0-updates-20220829.0. This result is similar to the fdp22.J result, and it is also lower than the earlier fdp22.I job (job id 7090550), which is strange.
https://beaker.engineering.redhat.com/jobs/7171899
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71718/7171899/12843483/152004942/mlx5_100.html
fdp22.I 1q4pmd viommu case: 5.7mpps
fdp22.I 2q4pmd viommu case: 7.5mpps

For the traffic generator, TRex is used for all tests without change.

For the test scripts, because the job on October 8 got good performance results, I checked the test-script changes from October 8 until now; there is only the one commit below, "remove unnecessary restart of openvswitch, and wait 5s after init dpdk", which I think should not affect performance.

[tli@localhost tools]$ git show 490a46c6aa38b7d2807d7f00a7471737e5075304
commit 490a46c6aa38b7d2807d7f00a7471737e5075304 (tag: kernel-networking-common-1_1-981)
Author: Qijun DING <qding>
Date:   Fri Oct 14 15:23:59 2022 +0800

    common/tools/setup_ovs_dpdk_vhostuser.sh: remove unnecessary restart of openvswitch, and wait 5s after init dpdk

diff --git a/networking/common/tools/setup_ovs_dpdk_vhostuser.sh b/networking/common/tools/setup_ovs_dpdk_vhostuser.sh
index 39bb87b9bc..63534c5ca5 100755
--- a/networking/common/tools/setup_ovs_dpdk_vhostuser.sh
+++ b/networking/common/tools/setup_ovs_dpdk_vhostuser.sh
@@ -197,7 +197,7 @@ ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=${dpdk_lcore_mask}
 ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=${vhost_iommu_support}
 ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=${userspace_tso_enable}
 ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
-
+sleep 5
 systemctl restart openvswitch || \
 { echo "FAIL to start openvswitch!"; exit 1; }
 
@@ -249,9 +249,9 @@ chown -R qemu:hugetlbfs /tmp/vhostuser/
 chcon -R -u system_u -t qemu_var_run_t /tmp/vhostuser/
 #semanage fcontext -a -t qemu_var_run_t '/tmp/vhostuser(/.*)?'
 restorecon -Rv /tmp/vhostuser
-systemctl restart openvswitch || \
-{ echo "FAIL to start openvswitch!"; exit 1; }
-sleep 5
+#systemctl restart openvswitch || \
+#{ echo "FAIL to start openvswitch!"; exit 1; }
+#sleep 5
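
For completeness, one way such a check of the script history between the two runs can be done (a sketch, assuming the test repo is checked out locally; the path is the one shown in the diff above):

git log --oneline --since=2022-10-08 -- networking/common/tools/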

Comment 5 liting 2022-10-27 11:04:24 UTC
I found the difference between fdp22.J and fdp22.I: it is caused by the following commit.
For example, the 2q4pmd case:
fdp22.j guest cpu setting[1]:
  <vcpu placement='static'>5</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='24'/>
    <vcpupin vcpu='1' cpuset='2'/>
    <vcpupin vcpu='2' cpuset='26'/>
    <vcpupin vcpu='3' cpuset='4'/>
    <vcpupin vcpu='4' cpuset='28'/>
    <emulatorpin cpuset='0'/>
  </cputune>

fdp22.I guest cpu setting[2]:
  <vcpu placement='static'>5</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='2'/>
    <vcpupin vcpu='1' cpuset='4'/>
    <vcpupin vcpu='2' cpuset='6'/>
    <vcpupin vcpu='3' cpuset='8'/>
    <vcpupin vcpu='4' cpuset='10'/>
    <emulatorpin cpuset='0,24'/>
  </cputune>

[tli@localhost perf]$ git show d279ae6e5d9224b7ae290b3ca60528d031c51b34
commit d279ae6e5d9224b7ae290b3ca60528d031c51b34
Author: Minxi Hou <mhou>
Date:   Thu Oct 13 18:59:59 2022 +0800

    common/tools/get_info.sh fix when the hyper thread starts, allocate siblings under the same core

diff --git a/networking/common/tools/get_info.sh b/networking/common/tools/get_info.sh
index ec354a8777..703157b7da 100755
--- a/networking/common/tools/get_info.sh
+++ b/networking/common/tools/get_info.sh
@@ -90,7 +90,7 @@ cpus_on_numa()
         #4-7
                local thread_siblings_list=""
                thread_siblings_list=$(eval echo $(cat /sys/devices/system/cpu/cpu$i/topology/thread_siblings_list | sed -e 's/\([0-9]\+-[0-9]\+\)/{\1}/g' -e 's/,/ /g' -e 's/-/../g'))
-               if [[ -n "$s" ]] && [[ "$SIBLING_ORDER" == "yes" ]]
+               if [[ -n "$thread_siblings_list" ]] && [[ "$SIBLING_ORDER" == "yes" ]]
                then
                        cpu="${thread_siblings_list}"
                else
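
To illustrate the effect (a sketch; the sibling pairs are inferred from the pinning above rather than captured from the actual DUT): before this fix the '[[ -n "$s" ]]' test presumably never matched, so the sibling-ordered branch was effectively dead; with the fix and SIBLING_ORDER=yes, cpus_on_numa() returns the CPU list grouped by hyper-thread siblings, so the first vCPUs handed to the guest land on sibling threads of the same physical cores instead of on distinct cores.

# Check which host CPUs are hyper-thread siblings (illustrative; values differ per host):
for c in 2 4 26 28; do
    printf 'cpu%s siblings: ' "$c"
    cat /sys/devices/system/cpu/cpu$c/topology/thread_siblings_list
done
# Assumed output matching the pinning above (an inference, not measured):
#   cpu2 siblings: 2,26
#   cpu4 siblings: 4,28
# So the fdp22.J cpuset 24,2,26,4,28 pins pairs of guest PMD vCPUs onto sibling
# threads of the same physical core, while the fdp22.I cpuset 2,4,6,8,10 uses
# five distinct physical cores, which matches the roughly halved throughput.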


After I changed the guest XML to use the fdp22.I setting, both fdp22.I and fdp22.J got good performance. I have a question about this: is the following the expected performance difference?
With the above CPU setting [2]:
 1q2pmd viommu case: 5.7mpps
 1q4pmd viommu case: 10.5mpps
 2q4pmd viommu case: 12.8mpps
 4q8pmd viommu case: 15mpps

With the above CPU setting [1]:
 1q2pmd viommu case: 4mpps
 1q4pmd viommu case: 5.7mpps
 2q4pmd viommu case: 7.5mpps
 4q8pmd viommu case: 13.4mpps

Comment 6 Flavio Leitner 2022-11-01 13:18:58 UTC
I am going to close this ticket as NOTABUG because the testing scripts changed the CPU allocation, which caused the performance regression.
Please re-open if you think OVS needs fixing.
Thanks,
fbl

