Bug 2137242
| Summary: | Some OVS-DPDK PVP cases of FDP 22.J got lower performance than FDP 22.I | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | liting <tli> |
| Component: | openvswitch | Assignee: | Timothy Redaelli <tredaelli> |
| openvswitch sub component: | ovs-dpdk | QA Contact: | qding |
| Status: | CLOSED NOTABUG | Docs Contact: | |
| Severity: | unspecified | | |
| Priority: | unspecified | CC: | ctrautma, dmarchan, fleitner, jhsiao, ktraynor, ralongi |
| Version: | FDP 22.J | | |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-11-01 13:18:58 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
|
Description
liting
2022-10-24 08:30:11 UTC
Hi Li Ting, FYI I have run two sets of manual tests using OVS 2.15.0-119 and -124 respectively, and I don't see a significant difference between the two sets. Actually, 124 is a little bit better. Please see the test results below. Thanks! Jean

*** ovs-dpdk 1Q/4PMD, guest 1Q/2PMD ***
2.15.0-119 --- 8.0 Mpps
2.15.0-124 --- 8.2 Mpps

*** ovs-dpdk 2Q/8PMD, guest 2Q/4PMD ***
2.15.0-119 --- 16.0 Mpps
2.15.0-124 --- 16.4 Mpps

NIC connection: anl151/CX-5 100 Gb <-> anl152/CX-6 100 Gb

Host info:

```
[root@wsfd-advnetlab151 jhsiao]# uname -r
4.18.0-372.32.1.el8_6.x86_64
```

Trex binary search:

```
[root@wsfd-advnetlab152 trafficgen]# cat search-host-30-60-002.sh
./binary-search.py \
    --traffic-generator=trex-txrx --frame-size=64 --traffic-direction=bidirectional \
    --search-granularity=5 \
    --search-runtime=30 --validation-runtime=60 --rate=10 \
    --use-device-stats \
    --num-flows=1024 \
    --max-loss-pct=0.002 \
    --measure-latency=1 --latency-rate=100000 \
    --rate-tolerance=20
```

I am also seeing a decline in performance in FDP 22.J versus FDP 22.I with an mlx5_core ConnectX-6 Dx card:

# openvswitch2.16-2.16.0-103.el8fdp.x86_64.rpm, RHEL-8.4.0-updates-20211026.0, dpdk-20.11-3.el8.x86_64.rpm:
Beaker job link: https://beaker.engineering.redhat.com/jobs/7114553
Performance results link: https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71145/7114553/12763059/151444245/mlx5_100_cx6.html

# openvswitch2.15-2.15.0-124.el8fdp.x86_64.rpm, RHEL-8.6.0-updates-20221014.0, dpdk-21.11-1.el8.x86_64.rpm:
Beaker job link: https://beaker.engineering.redhat.com/jobs/7114570
Performance results link: https://beaker.engineering.redhat.com/recipes/12763088/tasks/151444393/logs/mlx5_100_cx6.html
Beaker job link: https://beaker.engineering.redhat.com/jobs/7114571
Performance results link: https://beaker.engineering.redhat.com/recipes/12763090/tasks/151444397/logs/mlx5_100_cx6.html

# openvswitch2.17-2.17.0-49.el9fdp.x86_64.rpm,
RHEL-9.0.0-updates-20221014.0, dpdk-21.11-1.el9_0.x86_64.rpm:
Beaker job link: https://beaker.engineering.redhat.com/jobs/7114594
Performance results link: https://beaker.engineering.redhat.com/recipes/12763128/tasks/151444838/logs/mlx5_100_cx6.html

Those results are strange.

Focusing on the OVS 2.15 package, the changes between .122 (22.I) and .124 (the current report for 22.J) are really trivial. I would not expect a performance impact.

Were there other changes as part of those tests?
- For the DUT: is OVS the only package that changed in this comparison? For example, do we have different kernel versions? If so, please compare with the same kernel version.
- For the traffic generator / tester system: are the same hardware, software, and test scripts being used in the comparison?

(In reply to David Marchand from comment #3)
> Those results are strange.
>
> Focusing on the OVS 2.15 package, the changes between .122 (22.I) and .124
> (the current report for 22.J) are really trivial.
> I would not expect a performance impact.
>
> Were there other changes as part of those tests?
> - For the DUT: is OVS the only package that changed in this comparison? For
> example, do we have different kernel versions? If so, please compare with
> the same kernel version.
> - For the traffic generator / tester system: are the same hardware, software,
> and test scripts being used in the comparison?

I reviewed the following results with RHEL 9 on mlx5; the degradation is mainly in the 1q4pmd and 2q4pmd cases, so I debugged the 1q4pmd and 2q4pmd viommu cases on mlx5 with RHEL 9.

https://beaker.engineering.redhat.com/jobs/7090550
fdp22.I 1q4pmd viommu case: 10.5 Mpps
fdp22.I 2q4pmd viommu case: 12.1 Mpps

https://beaker.engineering.redhat.com/jobs/7149681
fdp22.J 1q4pmd viommu case: 5.7 Mpps
fdp22.J 2q4pmd viommu case: 7.7 Mpps

For the above results, the OS and OVS version info is as follows.
fdp22.I: RHEL-9.0.0-updates-20221006.0, openvswitch2.17-2.17.0-44.el9fdp, dpdk-21.11.2-1.el9_1
fdp22.J: RHEL-9.1.0-updates-20221019.1, openvswitch2.17-2.17.0-49.el9fdp, dpdk-21.11.2-1.el9_1

I ran the 64-byte cases again with the same OS version (RHEL-9.0.0-updates-20221014.0). The performance is similar, and both results are lower than the previous job (job id 7090550).

https://beaker.engineering.redhat.com/jobs/7168059
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71680/7168059/12838932/151978209/mlx5_100.html
fdp22.I 1q4pmd viommu case: 5.7 Mpps
fdp22.I 2q4pmd viommu case: 6.9 Mpps

https://beaker.engineering.redhat.com/jobs/7168056
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71680/7168056/12838928/151978199/mlx5_100.html
fdp22.J 1q4pmd viommu case: 5.7 Mpps
fdp22.J 2q4pmd viommu case: 5.7 Mpps

Because RHEL-9.0.0-updates-20221006.0 is no longer available, I ran fdp22.I with RHEL-9.0.0-updates-20220829.0. This result is similar to the fdp22.J result, and it is also lower than the result of the previous fdp22.I job (job id 7090550), which is strange.

https://beaker.engineering.redhat.com/jobs/7171899
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/10/71718/7171899/12843483/152004942/mlx5_100.html
fdp22.I 1q4pmd viommu case: 5.7 Mpps
fdp22.I 2q4pmd viommu case: 7.5 Mpps

For the traffic generator, T-Rex is used for all tests without change. For the test scripts, because the job on October 8 got good performance results, I checked the test-script changes from October 8 to now; there is only the following commit, "remove unnecessary restart of openvswitch, and wait 5s after init dpdk". I think it should not affect the performance.
```
[tli@localhost tools]$ git show 490a46c6aa38b7d2807d7f00a7471737e5075304
commit 490a46c6aa38b7d2807d7f00a7471737e5075304 (tag: kernel-networking-common-1_1-981)
Author: Qijun DING <qding>
Date:   Fri Oct 14 15:23:59 2022 +0800

    common/tools/setup_ovs_dpdk_vhostuser.sh: remove unnecessary restart of openvswitch, and wait 5s after init dpdk

diff --git a/networking/common/tools/setup_ovs_dpdk_vhostuser.sh b/networking/common/tools/setup_ovs_dpdk_vhostuser.sh
index 39bb87b9bc..63534c5ca5 100755
--- a/networking/common/tools/setup_ovs_dpdk_vhostuser.sh
+++ b/networking/common/tools/setup_ovs_dpdk_vhostuser.sh
@@ -197,7 +197,7 @@
 ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=${dpdk_lcore_mask}
 ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=${vhost_iommu_support}
 ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=${userspace_tso_enable}
 ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
-
+sleep 5
 systemctl restart openvswitch || \
 { echo "FAIL to start openvswitch!"; exit 1; }
@@ -249,9 +249,9 @@
 chown -R qemu:hugetlbfs /tmp/vhostuser/
 chcon -R -u system_u -t qemu_var_run_t /tmp/vhostuser/
 #semanage fcontext -a -t qemu_var_run_t '/tmp/vhostuser(/.*)?'
 restorecon -Rv /tmp/vhostuser
-systemctl restart openvswitch || \
-{ echo "FAIL to start openvswitch!"; exit 1; }
-sleep 5
+#systemctl restart openvswitch || \
+#{ echo "FAIL to start openvswitch!"; exit 1; }
+#sleep 5
```

I found the difference between fdp22.J and fdp22.I. It is caused by the following commit.
For example, in the 2q 4pmd case:
fdp22.J guest CPU setting [1]:

```xml
<vcpu placement='static'>5</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='24'/>
  <vcpupin vcpu='1' cpuset='2'/>
  <vcpupin vcpu='2' cpuset='26'/>
  <vcpupin vcpu='3' cpuset='4'/>
  <vcpupin vcpu='4' cpuset='28'/>
  <emulatorpin cpuset='0'/>
</cputune>
```
fdp22.I guest CPU setting [2]:

```xml
<vcpu placement='static'>5</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='2'/>
  <vcpupin vcpu='1' cpuset='4'/>
  <vcpupin vcpu='2' cpuset='6'/>
  <vcpupin vcpu='3' cpuset='8'/>
  <vcpupin vcpu='4' cpuset='10'/>
  <emulatorpin cpuset='0,24'/>
</cputune>
```
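The practical difference between the two pinnings is how many distinct physical cores the vCPUs land on. Here is a minimal sketch, assuming hyperthread siblings are paired as (N, N+24) on a 24-core/48-thread host; that pairing is an assumption inferred from the CPU ids in the two cputune blocks above, not something stated in the ticket:

```shell
# Sketch only: assumes sibling pairs (N, N+24), i.e. 24 cores / 48 threads.
core_of() { echo $(( $1 % 24 )); }   # physical core of a cpu id

count_cores() {                      # count distinct physical cores in a list
    for c in "$@"; do core_of "$c"; done | sort -n | uniq | wc -l
}

# setting [1] pins vcpus to 24,2,26,4,28: under the assumed pairing,
# 2/26 and 4/28 are siblings, so 5 vcpus share only 3 physical cores.
echo "setting[1] cores: $(count_cores 24 2 26 4 28)"

# setting [2] pins vcpus to 2,4,6,8,10: one physical core each.
echo "setting[2] cores: $(count_cores 2 4 6 8 10)"
```

Under that assumption, setting [1] also puts vCPU 0 (cpuset 24) on the same physical core as the emulator thread (cpuset 0), while setting [2] gives every vCPU its own core, which would be consistent with the roughly halved Mpps numbers reported later in this ticket.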
```
[tli@localhost perf]$ git show d279ae6e5d9224b7ae290b3ca60528d031c51b34
commit d279ae6e5d9224b7ae290b3ca60528d031c51b34
Author: Minxi Hou <mhou>
Date:   Thu Oct 13 18:59:59 2022 +0800

    common/tools/get_info.sh fix when the hyper thread starts, allocate siblings under the same core

diff --git a/networking/common/tools/get_info.sh b/networking/common/tools/get_info.sh
index ec354a8777..703157b7da 100755
--- a/networking/common/tools/get_info.sh
+++ b/networking/common/tools/get_info.sh
@@ -90,7 +90,7 @@ cpus_on_numa()
 #4-7
 local thread_siblings_list=""
 thread_siblings_list=$(eval echo $(cat /sys/devices/system/cpu/cpu$i/topology/thread_siblings_list | sed -e 's/\([0-9]\+-[0-9]\+\)/{\1}/g' -e 's/,/ /g' -e 's/-/../g'))
-if [[ -n "$s" ]] && [[ "$SIBLING_ORDER" == "yes" ]]
+if [[ -n "$thread_siblings_list" ]] && [[ "$SIBLING_ORDER" == "yes" ]]
 then
     cpu="${thread_siblings_list}"
 else
```
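The effect of that one-character fix is easier to see in isolation: the old test checked `$s`, which is not the variable the script populates, so the sibling-ordered branch could apparently never run. A standalone sketch of the same expansion logic follows; the sample sibling list "2-3,26-27" is invented for illustration, whereas the real script reads it from /sys/devices/system/cpu/cpu$i/topology/thread_siblings_list:

```shell
# Repro sketch of the sibling-list expansion used in get_info.sh.
expand_siblings() {
    # "2-3,26-27" -> "{2..3} {26..27}" -> brace-expanded "2 3 26 27"
    eval echo $(echo "$1" | sed -e 's/\([0-9]\+-[0-9]\+\)/{\1}/g' \
                                -e 's/,/ /g' -e 's/-/../g')
}

SIBLING_ORDER=yes
thread_siblings_list=$(expand_siblings "2-3,26-27")

# Old check: $s is unset here, so this branch never fires and siblings
# are never grouped, regardless of SIBLING_ORDER.
if [[ -n "$s" ]] && [[ "$SIBLING_ORDER" == "yes" ]]; then
    echo "old check: sibling order used"
else
    echo "old check: sibling order skipped"
fi

# Fixed check tests the variable that was actually populated.
if [[ -n "$thread_siblings_list" ]] && [[ "$SIBLING_ORDER" == "yes" ]]; then
    echo "new check: $thread_siblings_list"
fi
```

With the old check, CPUs were handed out in plain numeric order, which is what produced the fdp22.J pinning [1] above.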
After I changed the guest XML to use the fdp22.I setting, both fdp22.I and fdp22.J got good performance. I have a question about the above: are the following results the expected performance difference?
With the above CPU setting [2]:
1q2pmd viommu case: 5.7 Mpps
1q4pmd viommu case: 10.5 Mpps
2q4pmd viommu case: 12.8 Mpps
4q8pmd viommu case: 15 Mpps

With the above CPU setting [1]:
1q2pmd viommu case: 4 Mpps
1q4pmd viommu case: 5.7 Mpps
2q4pmd viommu case: 7.5 Mpps
4q8pmd viommu case: 13.4 Mpps
I am going to close this ticket as NOTABUG because the testing scripts changed the CPU allocation, causing the performance regression. Please re-open if you think OVS needs fixing. Thanks, fbl