Bug 2196789
| Summary: | mlx5_core/ice driver: ovs dpdk pvp cross numa case gets lower performance than ovs dpdk same numa case | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | liting <tli> |
| Component: | openvswitch3.1 | Assignee: | Kevin Traynor <ktraynor> |
| Status: | NEW | QA Contact: | ovs-qe |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | FDP 23.C | CC: | ctrautma, fleitner, jhsiao, ralongi |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
This issue also exists with the ice driver. The 4q 8pmd same numa cases got 14.2mpps and 17mpps, while the cross numa cases got 10.8mpps and 11.8mpps.

same numa case:
64byte 1q 2pmd noviommu vlan case: 3.1mpps
64byte 1q 4pmd noviommu vlan case: 6.8mpps
64byte 2q 4pmd noviommu vlan case: 7.8mpps
64byte 4q 8pmd noviommu vlan case: 14.2mpps
64byte 1q 2pmd viommu novlan case: 4mpps
64byte 1q 4pmd viommu novlan case: 8.2mpps
64byte 2q 4pmd viommu novlan case: 7.8mpps
64byte 4q 8pmd viommu novlan case: 17mpps
https://beaker.engineering.redhat.com/jobs/7851028
https://beaker-archive.hosts.prod.psi.bos.redhat.com/beaker-logs/2023/05/78510/7851028/13903404/160162049/ice_25.html

cross numa case:
64byte 1q 2pmd noviommu vlan case: 3.7mpps
64byte 1q 4pmd noviommu vlan case: 6.5mpps
64byte 2q 4pmd noviommu vlan case: 6.9mpps
64byte 4q 8pmd noviommu vlan case: 10.8mpps
64byte 1q 2pmd viommu novlan case: 4.2mpps
64byte 1q 4pmd viommu novlan case: 7.8mpps
64byte 2q 4pmd viommu novlan case: 7.9mpps
64byte 4q 8pmd viommu novlan case: 11.8mpps
https://beaker.engineering.redhat.com/jobs/7856225
https://beaker-archive.hosts.prod.psi.bos.redhat.com/beaker-logs/2023/05/78562/7856225/13910888/160210038/ice_25.html
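Not part of the original report: when comparing these runs, the NUMA placement behind the numbers can be confirmed on the host with standard commands; the PCI address below is the one from the reproduction steps in the next comment.

```
# NUMA node the NIC under test is attached to (0000:07:00.0 from the steps below)
cat /sys/bus/pci/devices/0000:07:00.0/numa_node

# Which PMD thread (and NUMA node) polls each rx queue
ovs-appctl dpif-netdev/pmd-rxq-show

# Per-PMD cycle and packet statistics, useful for spotting the busier NUMA side
ovs-appctl dpif-netdev/pmd-stats-show
```

In the cross numa runs the PMDs poll vhost queues whose backing memory sits on the other socket, so some penalty is expected; the question this bug raises is whether the observed gap (e.g. 14.2mpps → 10.8mpps at 4q 8pmd) is larger than that overhead should account for.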
Description of problem:

Version-Release number of selected component (if applicable):
kernel 5.14.0-284.11.1.el9_2.x86_64
openvswitch3.1-3.1.0-14.el9fdp.x86_64

How reproducible:

Steps to Reproduce:
Run the ovs dpdk pvp cross numa case and same numa case (1q2pmd, 1q4pmd, 2q4pmd, 4q8pmd), such as the 4q 8pmd case below.

/proc/cmdline:
```
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-5.14.0-284.11.1.el9_2.x86_64 root=/dev/mapper/rhel_dell--per730--56-root ro crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M resume=/dev/mapper/rhel_dell--per730--56-swap rd.lvm.lv=rhel_dell-per730-56/root rd.lvm.lv=rhel_dell-per730-56/swap console=ttyS0,115200n81 skew_tick=1 nohz=on nohz_full=2,26,4,28,6,30,8,32,10,34,12,36,14,38,16,40,18,42,20,44,22,46 rcu_nocbs=2,26,4,28,6,30,8,32,10,34,12,36,14,38,16,40,18,42,20,44,22,46 tuned.non_isolcpus=0000aaaa,abaaaaab intel_pstate=disable nosoftlockup default_hugepagesz=1G hugepagesz=1G hugepages=24 isolcpus=2,26,4,28,6,30,8,32,10,34,12,36,14,38,16,40,18,42,20,44,22,46 intel_iommu=on iommu=pt intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable pci=realloc
```

1. Build the ovs dpdk pvp topo (4q 8pmd), with the PMDs using CPUs on numa0 (a sketch of the equivalent ovs-vsctl commands is under Additional info below):
```
b80b000f-20dc-4917-bce1-73bd607afe39
    Bridge ovsbr0
        datapath_type: netdev
        Port dpdk0
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:07:00.0", n_rxq="4", n_rxq_desc="1024", n_txq_desc="1024"}
        Port vhost0
            Interface vhost0
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser/vhost0"}
        Port dpdk1
            Interface dpdk1
                type: dpdk
                options: {dpdk-devargs="0000:07:00.1", n_rxq="4", n_rxq_desc="1024", n_txq_desc="1024"}
        Port ovsbr0
            Interface ovsbr0
                type: internal
        Port vhost1
            Interface vhost1
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser/vhost1"}
    ovs_version: "3.1.1"
```

ovs config:
```
{dpdk-init="true", dpdk-lcore-mask="0x1", dpdk-socket-mem="4096", pmd-cpu-mask="550000550000", userspace-tso-enable="false", vhost-iommu-support="true"}
```

For guest cpu on the same numa:
```
<cputune>
  <vcpupin vcpu='0' cpuset='2'/>
  <vcpupin vcpu='1' cpuset='26'/>
  <vcpupin vcpu='2' cpuset='4'/>
  <vcpupin vcpu='3' cpuset='28'/>
  <vcpupin vcpu='4' cpuset='6'/>
  <vcpupin vcpu='5' cpuset='30'/>
  <vcpupin vcpu='6' cpuset='8'/>
  <vcpupin vcpu='7' cpuset='32'/>
  <vcpupin vcpu='8' cpuset='10'/>
  <emulatorpin cpuset='0,24'/>
</cputune>
```

For guest cpu on cross numa:
```
<vcpu placement='static'>9</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='3'/>
  <vcpupin vcpu='1' cpuset='5'/>
  <vcpupin vcpu='2' cpuset='7'/>
  <vcpupin vcpu='3' cpuset='9'/>
  <vcpupin vcpu='4' cpuset='11'/>
  <vcpupin vcpu='5' cpuset='13'/>
  <vcpupin vcpu='6' cpuset='15'/>
  <vcpupin vcpu='7' cpuset='17'/>
  <vcpupin vcpu='8' cpuset='19'/>
  <emulatorpin cpuset='0,24'/>
</cputune>
```

2. Start testpmd inside the guest:
```
dpdk-testpmd -l 0-8 -n 1 --socket-mem 1024 -- -i --forward-mode=io --burst=32 --rxd=8192 --txd=8192 --max-pkt-len=9600 --mbuf-size=9728 --nb-cores=8 --rxq=4 --txq=4 --mbcache=512 --auto-start
```
3. Send traffic with the T-Rex sender:
```
./binary-search.py --traffic-generator=trex-txrx --frame-size=64 --num-flows=1024 --max-loss-pct=0 --search-runtime=10 --validation-runtime=60 --rate-tolerance=10 --runtime-tolerance=10 --rate=25 --rate-unit=% --duplicate-packet-failure=retry-to-fail --negative-packet-loss=retry-to-fail --warmup-trial --warmup-trial-runtime=10 --rate=25 --rate-unit=% --one-shot=0 --use-src-ip-flows=1 --use-dst-ip-flows=1 --use-src-mac-flows=1 --use-dst-mac-flows=1 --send-teaching-measurement --send-teaching-warmup --teaching-warmup-packet-type=generic --teaching-warmup-packet-rate=10000 --use-src-ip-flows=1 --use-dst-ip-flows=1 --use-src-mac-flows=1 --use-dst-mac-flows=0 --use-device-stats
```

Actual results:

25g cx6 dx 4q 64byte cross numa and same numa:
cross numa viommu case: 12.5mpps
cross numa noviommu case: 12.8mpps
same numa viommu case: 14.8mpps
same numa noviommu case: 14.2mpps
https://beaker.engineering.redhat.com/jobs/7831950
https://beaker-archive.hosts.prod.psi.bos.redhat.com/beaker-logs/2023/05/78319/7831950/13873005/159923234/mlx5_25.html

cx6 dx 4q 64byte cross numa and same numa:
cross numa viommu case: 11.5mpps
cross numa noviommu case: 10.5mpps
same numa viommu case: 13mpps
same numa noviommu case: 16.2mpps
https://beaker.engineering.redhat.com/jobs/7829103
https://beaker-archive.hosts.prod.psi.bos.redhat.com/beaker-logs/2023/05/78291/7829103/13868832/159892570/mlx5_25.html

cx6 dx 4q viommu cross numa and same numa case:
cross numa: 12.5mpps
same numa: 14.2mpps
https://beaker.engineering.redhat.com/jobs/7828758
https://beaker-archive.hosts.prod.psi.bos.redhat.com/beaker-logs/2023/05/78287/7828758/13868369/159889510/mlx5_25.html

bf2 4q viommu cross numa and same numa case:
cross numa: 13mpps
same numa: 16mpps
https://beaker.engineering.redhat.com/jobs/7828713
https://beaker-archive.hosts.prod.psi.bos.redhat.com/beaker-logs/2023/05/78287/7828713/13868295/159888908/mlx5_25.html

bf2 64byte noviommu cross numa compared with same numa:
cross numa:
1q 2pmd: 3.4mpps
1q 4pmd: 7.5mpps
2q 4pmd: 8.9mpps
4q 8pmd: 15.1mpps
https://beaker.engineering.redhat.com/jobs/7827978
https://beaker-archive.hosts.prod.psi.bos.redhat.com/beaker-logs/2023/05/78279/7827978/13867467/159885708/mlx5_25.html
same numa:
1q 2pmd: 3.6mpps
1q 4pmd: 7.5mpps
2q 4pmd: 9.9mpps
4q 8pmd: 16.6mpps
https://beaker.engineering.redhat.com/jobs/7827979
https://beaker-archive.hosts.prod.psi.bos.redhat.com/beaker-logs/2023/05/78279/7827979/13867469/159885712/mlx5_25.html

Expected results:
The 4q 8pmd cross numa cases should be only slightly lower than the same numa cases, but they are significantly lower.

Additional info:
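As referenced in step 1 above, here is a minimal sketch of ovs-vsctl commands that would produce the quoted bridge and configuration. It is reconstructed from the `ovs-vsctl show` output and ovs config in this report, not the exact script used by the test:

```
# Global DPDK settings (values copied from the "ovs config" above)
ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true \
    other_config:dpdk-lcore-mask=0x1 \
    other_config:dpdk-socket-mem=4096 \
    other_config:pmd-cpu-mask=550000550000 \
    other_config:userspace-tso-enable=false \
    other_config:vhost-iommu-support=true

# Userspace (netdev) bridge
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev

# Physical DPDK ports, 4 rx queues each
ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk \
    options:dpdk-devargs=0000:07:00.0 options:n_rxq=4 \
    options:n_rxq_desc=1024 options:n_txq_desc=1024
ovs-vsctl add-port ovsbr0 dpdk1 -- set Interface dpdk1 type=dpdk \
    options:dpdk-devargs=0000:07:00.1 options:n_rxq=4 \
    options:n_rxq_desc=1024 options:n_txq_desc=1024

# vhost-user client ports served to the guest
ovs-vsctl add-port ovsbr0 vhost0 -- set Interface vhost0 \
    type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser/vhost0
ovs-vsctl add-port ovsbr0 vhost1 -- set Interface vhost1 \
    type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser/vhost1
```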