Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

The FDP team is no longer accepting new bugs in Bugzilla. Please report your issues under FDP project in Jira. Thanks.

Bug 2137242

Summary:	Some ovs dpdk pvp case of fdp22.J got lower performance than fdp 22.I
Product:	Red Hat Enterprise Linux Fast Datapath	Reporter:	liting <tli>
Component:	openvswitch	Assignee:	Timothy Redaelli <tredaelli>
openvswitch sub component:	ovs-dpdk	QA Contact:	qding
Status:	CLOSED NOTABUG	Docs Contact:
Severity:	unspecified
Priority:	unspecified	CC:	ctrautma, dmarchan, fleitner, jhsiao, ktraynor, ralongi
Version:	FDP 22.J
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2022-11-01 13:18:58 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description liting 2022-10-24 08:30:11 UTC

Description of problem:
Some ovs dpdk pvp case of fdp22.J got lower performance than fdp 22.I 

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
Run ovs dpdk pvp performance of fdp22.J 

Actual results:
>>>>>>>>>>>mlx5_core card<<<<<<<<<<<<<<<<:
fdp22.I rhel8.6 ovs2.15:
https://beaker.engineering.redhat.com/jobs/7129749
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71297/7129749/12786323/151620471/mlx5_100.html

fdp22.J rhel8.6 ovs2.15
https://beaker.engineering.redhat.com/jobs/7036119
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/09/70361/7036119/12646514/150570581/mlx5_100.html

such as 64bytes viommu case:
fdp22.I 1q2pmd viommu case: 5.1mpps
fdp22.I 1q4pmd viommu case: 10.5mpps
fdp22.I 2q4pmd viommu case: 9.9mpps
fdp22.I 4q8pmd viommu case: 12.1mpps

fdp22.J 1q2pmd viommu case: 2.9mpps
fdp22.J 1q4pmd viommu case: 6.1mpps
fdp22.J 2q4pmd viommu case: 5.9mpps
fdp22.J 4q8pmd viommu case: 11.4mpps


fdp22.J rhel8.4 ovs2.16
https://beaker.engineering.redhat.com/jobs/7133513
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71335/7133513/12791340/151660830/mlx5_100.html

fdp22.I rhel8.4 ovs2.16
https://beaker.engineering.redhat.com/jobs/7028534
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/09/70285/7028534/12634828/150483696/mlx5_100.html

such as 64bytes viommu case:
fdp22.I 1q2pmd viommu case: 5mpps
fdp22.I 1q4pmd viommu case: 10.4mpps
fdp22.I 2q4pmd viommu case: 9.6mpps
fdp22.I 4q8pmd viommu case: 10.9mpps

fdp22.J 1q2pmd viommu case: 2.9mpps
fdp22.J 1q4pmd viommu case: 5.9mpps
fdp22.J 2q4pmd viommu case: 5.8mpps
fdp22.J 4q8pmd viommu case: 11.4mpps

fdp22.J rhel9 ovs2.17
https://beaker.engineering.redhat.com/jobs/7149681
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71496/7149681/12812959/151814353/mlx5_100.html

fdp22.I rhel9 ovs2.17
https://beaker.engineering.redhat.com/jobs/7090550
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/70905/7090550/12729773/151183868/mlx5_100.html

such as 64bytes viommu case:
fdp22.I 1q2pmd viommu case: 5.7mpps
fdp22.I 1q4pmd viommu case: 10.5mpps
fdp22.I 2q4pmd viommu case: 12.1mpps
fdp22.I 4q8pmd viommu case: 15mpps

fdp22.J 1q2pmd viommu case: 2.3mpps
fdp22.J 1q4pmd viommu case: 5.7mpps
fdp22.J 2q4pmd viommu case: 7.7mpps
fdp22.J 4q8pmd viommu case: 12.7mpps

>>>>>>>>>>>ice card<<<<<<<<<<<<<<<<:
fdp22.J rhel9 ovs2.17
https://beaker.engineering.redhat.com/jobs/7138162
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71381/7138162/12798096/151711192/ice_25.html

fdp22.I rhel9 ovs2.17
https://beaker.engineering.redhat.com/jobs/7033799
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/09/70337/7033799/12642949/150542075/ice_25.html

such as 64bytes viommu case:
fdp22.I 1q2pmd viommu case: 4.6mpps
fdp22.I 1q4pmd viommu case: 10.3mpps
fdp22.I 2q4pmd viommu case: 14.7mpps
fdp22.I 4q8pmd viommu case: 16.1mpps

fdp22.J 1q2pmd viommu case: 4.6mpps
fdp22.J 1q4pmd viommu case: 8.5mpps
fdp22.J 2q4pmd viommu case: 9.6mpps
fdp22.J 4q8pmd viommu case: 13.8mpps

fdp22.I rhel8.4 ovs2.16
https://beaker.engineering.redhat.com/jobs/7065221
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/70652/7065221/12690545/150890276/ice_25.html

fdp22.J rhel8.4 ovs2.16
https://beaker.engineering.redhat.com/jobs/7123818
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71238/7123818/12777095/151554035/ice_25.html

such as 64bytes viommu case:
fdp22.I 1q2pmd viommu case: 6.2mpps
fdp22.I 1q4pmd viommu case: 12mpps
fdp22.I 2q4pmd viommu case: 9.2mpps
fdp22.I 4q8pmd viommu case: 14.7mpps

fdp22.J 1q2pmd viommu case: 3.6mpps
fdp22.J 1q4pmd viommu case: 7.4mpps
fdp22.J 2q4pmd viommu case: 7.1mpps
fdp22.J 4q8pmd viommu case: 14.4mpps

fdp22.I rhel8.6 ovs2.15
https://beaker.engineering.redhat.com/jobs/7120026
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71200/7120026/12771290/151509639/ice_25.html

fdp22.J rhel8.6 ovs2.15
https://beaker.engineering.redhat.com/jobs/7037370
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/09/70373/7037370/12648815/150584723/ice_25.html

such as 64bytes viommu case:
fdp22.I 1q2pmd viommu case: 5.9mpps
fdp22.I 1q4pmd viommu case: 13.1mpps
fdp22.I 2q4pmd viommu case: 12mpps
fdp22.I 4q8pmd viommu case: 13.1mpps

fdp22.J 1q2pmd viommu case: 3.7mpps
fdp22.J 1q4pmd viommu case: 7.5mpps
fdp22.J 2q4pmd viommu case: 7.4mpps
fdp22.J 4q8pmd viommu case: 14.8mpps

Expected results:
The performance of fdp22.J should not lower than fdp22.I.

Additional info:
Run more tests on mlx5_core card:
fdp22.I:
fdp22.I dpdk-21.11-1.el9_0,
https://beaker.engineering.redhat.com/jobs/7125487
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71254/7125487/12780457/151576028/mlx5_100.html
such as 64bytes viommu case:
1q2pmd viommu case: 4.1mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 6.9mpps
4q8pmd viommu case: 14.8mpps

fdp22.I dpdk-21.11-1.el9_0,
https://beaker.engineering.redhat.com/jobs/7124434
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71244/7124434/12778223/151561666/mlx5_100.html
such as 64bytes viommu case:
1q2pmd viommu case: 4.1mpps
1q4pmd viommu case: 6.7mpps
2q4pmd viommu case: 7.7mpps
4q8pmd viommu case: 14.7mpps


fdp22.I dpdk-21.11-1.el9_0, 
https://beaker.engineering.redhat.com/jobs/7033793
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/09/70337/7033793/12642941/150542058/mlx5_100.html
such as 64bytes viommu case:
1q2pmd viommu case: 7.0mpps
1q4pmd viommu case: 10.6mpps
2q4pmd viommu case: 14mpps
4q8pmd viommu case: 15mpps

fdp22.I dpdk-21.11.2-1.el9_1, 
https://beaker.engineering.redhat.com/jobs/7090550
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/70905/7090550/12729773/151183868/mlx5_100.html
such as 64bytes viommu case:
1q2pmd viommu case: 5.7mpps
1q4pmd viommu case: 10.5mpps
2q4pmd viommu case: 12.1mpps
4q8pmd viommu case: 15mpps

fdp22.J:
fdp22.J dpdk-21.11-1.el9_0,
https://beaker.engineering.redhat.com/jobs/7119988
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71199/7119988/12771248/151509376/mlx5_100.html
such as 64bytes viommu case:
1q2pmd viommu case: 3.4mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 7.7mpps
4q8pmd viommu case: 9.2mpps

fdp22.J dpdk-21.11-1.el9_0,
https://beaker.engineering.redhat.com/jobs/7125483
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71254/7125483/12780452/151576016/mlx5_100.html
1q2pmd viommu case: 2.3mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 7.7mpps
4q8pmd viommu case: 14.7mpps

fdp22.J dpdk-21.11-1.el9_0,
https://beaker.engineering.redhat.com/jobs/7123721
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71237/7123721/12776960/151552834/mlx5_100.html
1q2pmd viommu case: 4.0mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 3.4mpps
4q8pmd viommu case: 12.7mpps

fdp22.J dpdk-21.11-1.el9_0,
https://beaker.engineering.redhat.com/jobs/7125483
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71254/7125483/12780452/151576016/mlx5_100.html
1q2pmd viommu case: 2.3mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 7.7mpps
4q8pmd viommu case: 14.7mpps

fdp22.J dpdk-21.11.2-1.el9_1,
https://beaker.engineering.redhat.com/jobs/7149681
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71496/7149681/12812959/151814353/mlx5_100.html
1q2pmd viommu case: 2.3mpps
1q4pmd viommu case: 5.7mpps
2q4pmd viommu case: 7.7mpps
4q8pmd viommu case: 12.7mpps

Comment 1 Jean-Tsung Hsiao 2022-10-25 14:17:57 UTC

HI Li Ting,
FYI
I have run two sets of manual tests using OVS 2.15.0-119 and -124 respectively. And, I don't see significant difference between the two sets. Actually, 124 is a little bit better.
Please see test results below.
Thanks!
Jean

*** ovs-dpdk 1Q/4PMD, guest 1Q/2PMD ***

2.15.0-119 --- 8.0 Mpps
2.15.0-124 --- 8.2 Mpps

*** ovs-dpdk 2Q/8PMD, guest 2Q/4PMD ***

2.15.0-119 --- 16.0 Mpps
2.15.0-124 --- 16.4 Mpps

NIC connection

anl151/CX-5 100 Gb <-> anl152/CX-6 100 Gb

Host info

[root@wsfd-advnetlab151 jhsiao]# uname -r
4.18.0-372.32.1.el8_6.x86_64
[root@wsfd-advnetlab151 jhsiao]#

Trex binary search

[root@wsfd-advnetlab152 trafficgen]# cat search-host-30-60-002.sh
./binary-search.py \
--traffic-generator=trex-txrx --frame-size=64 --traffic-direction=bidirectional \
--search-granularity=5 \
--search-runtime=30 --validation-runtime=60 --rate=10 \
--use-device-stats \
--num-flows=1024 \
--max-loss-pct=0.002 \
--measure-latency=1 --latency-rate=100000 \
--rate-tolerance=20
[root@wsfd-advnetlab152 trafficgen]#

Comment 2 Rick Alongi 2022-10-25 18:23:44 UTC

I am also seeing decline in performance in FDP 22.J versus FDP 22.I with mlx5_core ConnectX-6 Dx card:

# openvswitch2.16-2.16.0-103.el8fdp.x86_64.rpm, RHEL-8.4.0-updates-20211026.0, dpdk-20.11-3.el8.x86_64.rpm:
Beaker job link: https://beaker.engineering.redhat.com/jobs/7114553
Performance results link: https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71145/7114553/12763059/151444245/mlx5_100_cx6.html

# openvswitch2.15-2.15.0-124.el8fdp.x86_64.rpm, RHEL-8.6.0-updates-20221014.0, dpdk-21.11-1.el8.x86_64.rpm:
Beaker job link: https://beaker.engineering.redhat.com/jobs/7114570
Performance results link: https://beaker.engineering.redhat.com/recipes/12763088/tasks/151444393/logs/mlx5_100_cx6.html

Beaker job link: https://beaker.engineering.redhat.com/jobs/7114571
Performance results link: https://beaker.engineering.redhat.com/recipes/12763090/tasks/151444397/logs/mlx5_100_cx6.html

# openvswitch2.17-2.17.0-49.el9fdp.x86_64.rpm, RHEL-9.0.0-updates-20221014.0, dpdk-21.11-1.el9_0.x86_64.rpm:
Beaker job link: https://beaker.engineering.redhat.com/jobs/7114594
Performance results link: https://beaker.engineering.redhat.com/recipes/12763128/tasks/151444838/logs/mlx5_100_cx6.html

Comment 3 David Marchand 2022-10-26 13:03:38 UTC

Those results are strange.

Focusing on OVS 2.15 package, the changes between .122 (22.I) and .124 (current report for 22.J) are really trivial.
I would not expect a performance impact.

Was there other changes as part of those tests?
- for the dut: is OVS the only package that changes in this comparison? For example, do we have a different kernel versions? If so, please compare with the same kernel version.
- for the traffic generator / tester system: is it the same hw, sw, and test scripts being used in the comparison?

Comment 4 liting 2022-10-27 06:56:08 UTC

(In reply to David Marchand from comment #3)
> Those results are strange.
> 
> Focusing on OVS 2.15 package, the changes between .122 (22.I) and .124
> (current report for 22.J) are really trivial.
> I would not expect a performance impact.
> 
> Was there other changes as part of those tests?
> - for the dut: is OVS the only package that changes in this comparison? For
> example, do we have a different kernel versions? If so, please compare with
> the same kernel version.
> - for the traffic generator / tester system: is it the same hw, sw, and test
> scripts being used in the comparison?

I reviewed followings results with rhel9 on mlx5, it is mainly the performance degradation of 1q4pmd and 2q4pmd. So I just debug the 1q 4pmd and 2q 4pmd viommu cases on mlx5 with rhel9.
https://beaker.engineering.redhat.com/jobs/7090550
fdp22.I 1q4pmd viommu case: 10.5mpps
fdp22.I 2q4pmd viommu case: 12.1mpps
https://beaker.engineering.redhat.com/jobs/7149681
fdp22.J 1q4pmd viommu case: 5.7mpps
fdp22.J 2q4pmd viommu case: 7.7mpps

For above results, following is the os version and ovs version info.
fdp22.I: RHEL-9.0.0-updates-20221006.0, openvswitch2.17-2.17.0-44.el9fdp, dpdk-21.11.2-1.el9_1
fdp22.J: RHEL-9.1.0-updates-20221019.1, openvswitch2.17-2.17.0-49.el9fdp, dpdk-21.11.2-1.el9_1

And I run the 64bytes cases again with the same os version(RHEL-9.0.0-updates-20221014.0). The performance is similar, and both results are lower than previous job(jobid 7090550)
https://beaker.engineering.redhat.com/jobs/7168059
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71680/7168059/12838932/151978209/mlx5_100.html
fdp22.I 1q4pmd viommu case: 5.7mpps
fdp22.I 2q4pmd viommu case: 6.9mpps

https://beaker.engineering.redhat.com/jobs/7168056
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71680/7168056/12838928/151978199/mlx5_100.html
fdp22.J 1q4pmd viommu case: 5.7mpps
fdp22.J 2q4pmd viommu case: 5.7mpps

Because there are no RHEL-9.0.0-updates-20221006.0 anymore. and then I run fdp22.I with RHEL-9.0.0-updates-20220829.0. And this result is similar to fdp22.J result. It is also lower than the results of fdp22.I previous job(jobid 7090550). It is strange.
https://beaker.engineering.redhat.com/jobs/7171899
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71718/7171899/12843483/152004942/mlx5_100.html
fdp22.I 1q4pmd viommu case: 5.7mpps
fdp22.I 2q4pmd viommu case: 7.5mpps

For the traffic generator, T-rex is used for all tests without change.

For test scripts, because the job on October 8 can get good performance results, I check that the test script from October 8 to now, there is only one following commit "remove unnecessary restart of openvswitch, and wait 5s after init dpdk". I think it should not affect the performance.

[tli@localhost tools]$ git show 490a46c6aa38b7d2807d7f00a7471737e5075304
commit 490a46c6aa38b7d2807d7f00a7471737e5075304 (tag: kernel-networking-common-1_1-981)
Author: Qijun DING <qding>
Date:   Fri Oct 14 15:23:59 2022 +0800

    common/tools/setup_ovs_dpdk_vhostuser.sh: remove unnecessary restart of openvswitch, and wait 5s after init dpdk

diff --git a/networking/common/tools/setup_ovs_dpdk_vhostuser.sh b/networking/common/tools/setup_ovs_dpdk_vhostuser.sh
index 39bb87b9bc..63534c5ca5 100755
--- a/networking/common/tools/setup_ovs_dpdk_vhostuser.sh
+++ b/networking/common/tools/setup_ovs_dpdk_vhostuser.sh
@@ -197,7 +197,7 @@ ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=${dpdk_lcore_mask}
 ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=${vhost_iommu_support}
 ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=${userspace_tso_enable}
 ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
-
+sleep 5
 systemctl restart openvswitch || \
 { echo "FAIL to start openvswitch!"; exit 1; }
 
@@ -249,9 +249,9 @@ chown -R qemu:hugetlbfs /tmp/vhostuser/
 chcon -R -u system_u -t qemu_var_run_t /tmp/vhostuser/
 #semanage fcontext -a -t qemu_var_run_t '/tmp/vhostuser(/.*)?'
 restorecon -Rv /tmp/vhostuser
-systemctl restart openvswitch || \
-{ echo "FAIL to start openvswitch!"; exit 1; }
-sleep 5
+#systemctl restart openvswitch || \
+#{ echo "FAIL to start openvswitch!"; exit 1; }
+#sleep 5

Comment 5 liting 2022-10-27 11:04:24 UTC

I found the difference between fdp22.j and fdp22.i. It is caused by the following commit.
such as 2q 4pmd case:
fdp22.j guest cpu setting[1]:
  <vcpu placement='static'>5</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='24'/>
    <vcpupin vcpu='1' cpuset='2'/>
    <vcpupin vcpu='2' cpuset='26'/>
    <vcpupin vcpu='3' cpuset='4'/>
    <vcpupin vcpu='4' cpuset='28'/>
    <emulatorpin cpuset='0'/>
  </cputune>

fdp22.I guest cpu setting[2]:
  <vcpu placement='static'>5</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='2'/>
    <vcpupin vcpu='1' cpuset='4'/>
    <vcpupin vcpu='2' cpuset='6'/>
    <vcpupin vcpu='3' cpuset='8'/>
    <vcpupin vcpu='4' cpuset='10'/>
    <emulatorpin cpuset='0,24'/>
  </cputune>

[tli@localhost perf]$ git show d279ae6e5d9224b7ae290b3ca60528d031c51b34
commit d279ae6e5d9224b7ae290b3ca60528d031c51b34
Author: Minxi Hou <mhou>
Date:   Thu Oct 13 18:59:59 2022 +0800

    common/tools/get_info.sh fix when the hyper thread starts, allocate siblings under the same core

diff --git a/networking/common/tools/get_info.sh b/networking/common/tools/get_info.sh
index ec354a8777..703157b7da 100755
--- a/networking/common/tools/get_info.sh
+++ b/networking/common/tools/get_info.sh
@@ -90,7 +90,7 @@ cpus_on_numa()
         #4-7
                local thread_siblings_list=""
                thread_siblings_list=$(eval echo $(cat /sys/devices/system/cpu/cpu$i/topology/thread_siblings_list | sed -e 's/\([0-9]\+-[0-9]\+\)/{\1}/g' -e 's/,/ /g' -e 's/-/../g'))
-               if [[ -n "$s" ]] && [[ "$SIBLING_ORDER" == "yes" ]]
+               if [[ -n "$thread_siblings_list" ]] && [[ "$SIBLING_ORDER" == "yes" ]]
                then
                        cpu="${thread_siblings_list}"
                else


After I changed the guest xml to use fdp22.i setting. Both fdp22.i and fdp22.j can got the good performance. Here, I have a question about above it. Are following results the expected performance difference? 
with above cpu setting[2],
 1q2pmd viommu case: 5.7mpps
 1q4pmd viommu case: 10.5mpps
 2q4pmd viommu case: 12.8pps
 4q8pmd viommu case: 15mpps

with above cpu setting[1], 
 1q2pmd viommu case: 4mpps
 1q4pmd viommu case: 5.7mpps
 2q4pmd viommu case: 7.5mpps
 4q8pmd viommu case: 13.4mpps

Comment 6 Flavio Leitner 2022-11-01 13:18:58 UTC

I am going to close this ticket as NotABug because the testing scripts have changed the CPU allocation causing the performance regression.
Please re-open if you think ovs needs fixing.
Thanks,
fbl