Bug 2188899
| Summary: | [nfv virt][pvp][cross numa] The vm's vhostuser interface throughput drops significantly after adding emulatorpin cfg | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Yanghang Liu <yanghliu> |
| Component: | qemu-kvm | Assignee: | Virtualization Maintenance <virt-maint> |
| qemu-kvm sub component: | Networking | QA Contact: | Yanghang Liu <yanghliu> |
| Status: | NEW --- | Docs Contact: | |
| Severity: | medium | | |
| Priority: | medium | CC: | chayang, coli, jinzhao, juzhang, lvivier, maxime.coquelin, mprivozn, virt-maint, yama, yanghliu |
| Version: | 9.3 | Keywords: | CustomerScenariosInitiative, Triaged |
| Target Milestone: | rc | Flags: | yanghliu: needinfo? (lvivier) |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Michal, as you worked on a related bug, is the configuration used in this BZ valid? Is the performance drop expected?

I don't think it is expected. The linked bug(s) are about ThreadContext, i.e. how QEMU allocates the memory. The emulator thread isn't affected.

Yanghang, can you please share the QEMU cmd line in both cases? Also, what is the CPU topology? I'm wondering whether those CPU ids from <emulatorpin/> aren't just another CPU thread to those in <vcpupin/>, in which case the emulator thread can't really run if a vCPU is running. And maybe without <emulatorpin/> the kernel is free to schedule the emulator thread onto a different core.

(In reply to Michal Privoznik from comment #2)
> I don't think it is expected. The linked bug(s) are about ThreadContext,
> i.e. how QEMU allocates the memory. The emulator thread isn't affected.
>
> Yanghang, can you please share the QEMU cmd line in both cases? Also, what
> is the CPU topology? I'm wondering whether those CPU ids from <emulatorpin/>
> aren't just another CPU thread to those in <vcpupin/>, in which case the
> emulator thread can't really run if a vCPU is running. And maybe without
> <emulatorpin/> the kernel is free to schedule the emulator thread onto a
> different core.

Hi Michal,

Thanks for the confirmation. I have listed the related info in Comment 0; please let me know if I need to provide more info.

We can get the detailed test log as well as the full domain xml from:

(1) The detailed test log with emulatorpin cfg:
http://10.73.72.41/log/2023-04-22_23:53/nfv_pvp_2q_cross_numa_with_emulatorpin

(2) The detailed test log without emulatorpin cfg:
http://10.73.72.41/log/2023-04-22_23:53/nfv_pvp_2q_cross_numa_without_emulatorpin

And the domain's CPU pinning topology is:

<cputune>
  <vcpupin vcpu='0' cpuset='30'/>
  <vcpupin vcpu='1' cpuset='28'/>
  <vcpupin vcpu='2' cpuset='26'/>
  <vcpupin vcpu='3' cpuset='24'/>
  <vcpupin vcpu='4' cpuset='22'/>
  <vcpupin vcpu='5' cpuset='20'/>
  <emulatorpin cpuset='25,27,29,31'/>   <--- I run my tests with/without this cfg.
</cputune>

The list of host cores which dpdk-testpmd is running on is 15,31,29,27,25,23,21,19,17. The related cmd line is:

# dpdk-testpmd -l 15,31,29,27,25,23,21,19,17 --socket-mem 1024,1024 -n 4 --vdev 'net_vhost0,iface=/tmp/vhost-user1,queues=2,client=1,iommu-support=1' --vdev 'net_vhost1,iface=/tmp/vhost-user2,queues=2,client=1,iommu-support=1' -b 0000:3b:00.0 -b 0000:3b:00.1 -d /usr/lib64/librte_net_vhost.so -- --portmask=f -i --rxd=512 --txd=512 --rxq=2 --txq=2 --nb-cores=8 --forward-mode=io

(A CPU-topology cross-check sketch is included after the check point below.)

This issue can still be reproduced in:
qemu-kvm-8.0.0-4.el9.x86_64
libvirt-9.3.0-2.el9.x86_64
5.14.0-319.el9.x86_64
seabios-bin-1.16.1-1.el9.noarch
Check point:
Test *with* <emulatorpin cpuset='25,27,29,31'/> cfg:
Throughput(Mpps) : 3.132936
Test *without* <emulatorpin cpuset='25,27,29,31'/> cfg:
Throughput(Mpps) : 21.127461
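
As a cross-check of the sibling-thread question above (not part of the original test run), the following is a minimal sketch of how the CPU ids used in <vcpupin/>, <emulatorpin/> and the host dpdk-testpmd -l list could be mapped to physical cores and hyperthread siblings. Note that 25,27,29,31 appear both in the emulatorpin cpuset and in the testpmd core list quoted above, which may be worth confirming on the host.

```bash
# Sketch only, to be run on the host under test.
# Map every CPU id to its physical core, socket and NUMA node:
lscpu -e=CPU,CORE,SOCKET,NODE

# Hyperthread siblings of the emulatorpin CPUs (25,27,29,31):
for c in 25 27 29 31; do
    printf 'CPU %s siblings: ' "$c"
    cat "/sys/devices/system/cpu/cpu$c/topology/thread_siblings_list"
done
```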
This issue can still be reproduced in:
host:
qemu-kvm-8.0.0-9.el9.x86_64
tuned-2.20.0-1.el9.noarch
libvirt-9.5.0-5.el9.x86_64
openvswitch3.1-3.1.0-42.el9fdp.x86_64
dpdk-22.11-3.el9_2.x86_64
edk2-ovmf-20230524-2.el9.noarch
guest:
5.14.0-346.el9.x86_64
Test log: http://10.73.72.41/log/2023-08-07_20:17/nfv_pvp_1q_cross_numa
Check point:
[1] The statistics of dpdk-testpmd in the VM:
+++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
RX-packets: 12822137264 RX-dropped: 494648264 RX-total: 13316785528
TX-packets: 12548106232 TX-dropped: 274031032 TX-total: 12822137264
[2] The VM Throughput(Mpps) : 2.240211
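
For reference, the loss rates implied by the counters above can be derived directly (a worked calculation, not taken from the original log):

```bash
# Drop percentages implied by the accumulated forward statistics above.
awk 'BEGIN {
    printf "RX loss: %.2f%%\n", 494648264 / 13316785528 * 100   # ~3.71%
    printf "TX loss: %.2f%%\n", 274031032 / 12822137264 * 100   # ~2.14%
}'
```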
Hi Laurent,

May I ask if you could help cc some developers to look at this bug? From a QE point of view, we expect this BZ to be handled with priority, because it has customer impact.
Description of problem:
The vm's vhostuser interface throughput drops significantly after adding emulatorpin cfg.

Version-Release number of selected component (if applicable):
5.14.0-301.el9.x86_64
qemu-kvm-7.2.0-14.el9_2.x86_64
libvirt-9.2.0-1.el9.x86_64

How reproducible:
100%

Steps to Reproduce:

1. Set up the host kernel options (CPU isolation, huge pages, IOMMU):

# grubby --args="iommu=pt intel_iommu=on default_hugepagesz=1G" --update-kernel=`grubby --default-kernel`
# echo "isolated_cores=2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,31,29,27,25,23,21,19,17,15,13,11" >> /etc/tuned/cpu-partitioning-variables.conf
# tuned-adm profile cpu-partitioning
# reboot

2. Start a dpdk-testpmd on the host:

# echo 20 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
# echo 20 > /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
# modprobe vfio
# modprobe vfio-pci
# dpdk-devbind.py --bind=vfio-pci 0000:5e:00.0
# dpdk-devbind.py --bind=vfio-pci 0000:5e:00.1
# dpdk-devbind.py --bind=vfio-pci 0000:60:00.0
# dpdk-testpmd -l 15,31,29,27,25,23,21,19,17 --socket-mem 1024,1024 -n 4 --vdev 'net_vhost0,iface=/tmp/vhost-user1,queues=2,client=1,iommu-support=1' --vdev 'net_vhost1,iface=/tmp/vhost-user2,queues=2,client=1,iommu-support=1' -b 0000:3b:00.0 -b 0000:3b:00.1 -d /usr/lib64/librte_net_vhost.so -- --portmask=f -i --rxd=512 --txd=512 --rxq=2 --txq=2 --nb-cores=8 --forward-mode=io
testpmd> set portlist 0,2,1,3
testpmd> start

3. Start a domain with vhost-user interfaces and <emulatorpin cpuset='25,27,29,31'/>:

<cputune>
  <vcpupin vcpu='0' cpuset='30'/>
  <vcpupin vcpu='1' cpuset='28'/>
  <vcpupin vcpu='2' cpuset='26'/>
  <vcpupin vcpu='3' cpuset='24'/>
  <vcpupin vcpu='4' cpuset='22'/>
  <vcpupin vcpu='5' cpuset='20'/>
  <emulatorpin cpuset='25,27,29,31'/>
</cputune>
...
<interface type='vhostuser'>
  <mac address='88:66:da:5f:dd:12'/>
  <source type='unix' path='/tmp/vhost-user1' mode='server'/>
  <model type='virtio'/>
  <driver name='vhost' queues='2' rx_queue_size='1024' iommu='on' ats='on'/>
</interface>
<interface type='vhostuser'>
  <mac address='88:66:da:5f:dd:13'/>
  <source type='unix' path='/tmp/vhost-user2' mode='server'/>
  <model type='virtio'/>
  <driver name='vhost' queues='2' rx_queue_size='1024' iommu='on' ats='on'/>
</interface>

Note: the full domain xml is in the test log.

4. Set up the kernel options in the domain:

# grubby --args="iommu=pt intel_iommu=on default_hugepagesz=1G" --update-kernel=`grubby --default-kernel`
# echo "isolated_cores=1,2,3,4,5" >> /etc/tuned/cpu-partitioning-variables.conf
# tuned-adm profile cpu-partitioning
# reboot

5. Start a dpdk-testpmd in the domain:

# echo 2 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
# modprobe vfio
# modprobe vfio-pci
# dpdk-devbind.py --bind=vfio-pci 0000:06:00.0
# dpdk-devbind.py --bind=vfio-pci 0000:07:00.0
# dpdk-testpmd -l 1,2,3,4,5 -n 4 -d /usr/lib64/librte_net_virtio.so -- --nb-cores=4 -i --disable-rss --rxd=512 --txd=512 --rxq=2 --txq=2
testpmd> start

6. Run the MoonGen tests:

# ./build/MoonGen examples/opnfv-vsperf.lua > /tmp/throughput.log

7. Check the throughput:

****************************************************
Packets_loss  Frame_Size(Byte)  Run_No  Throughput(Mpps)
0             64                0       3.034078
****************************************************
8. Repeat steps 1-7 above, but without <emulatorpin cpuset='25,27,29,31'/>:

****************************************************
Packets_loss  Frame_Size(Byte)  Run_No  Throughput(Mpps)
0             64                0       21.127439
****************************************************

Actual results:
The vm's vhostuser interface throughput drops around 85% after adding the emulatorpin cfg.

Expected results:
No significant throughput drop.

Additional info:

(1) The detailed test log with emulatorpin cfg:
http://10.73.72.41/log/2023-04-22_23:53/nfv_pvp_2q_cross_numa_with_emulatorpin

(2) The detailed test log without emulatorpin cfg:
http://10.73.72.41/log/2023-04-22_23:53/nfv_pvp_2q_cross_numa_without_emulatorpin

(3) Related bugs about the emulatorpin xml:
Bug 2154750 - [numatune][cputune] qemu-kvm: Setting CPU affinity failed: Invalid argument
Bug 2185039 - [numatune][cputune] qemu-kvm: Setting CPU affinity failed: Invalid argument [rhel-9.2.0.z]
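
Not part of the original report: a minimal sketch, assuming a placeholder domain name rhel9, of how one could confirm where libvirt actually placed the emulator and vCPU threads in the two runs before comparing throughput.

```bash
# Sketch only; "rhel9" is a placeholder for the real domain name.
virsh vcpupin rhel9        # effective vCPU -> host CPU pinning
virsh emulatorpin rhel9    # effective emulator-thread pinning

# Per-thread CPU affinity of the running QEMU process:
QEMU_PID=$(pgrep -f 'qemu-kvm.*rhel9' | head -n1)
for tid in $(ls "/proc/$QEMU_PID/task"); do
    printf '%s: %s\n' "$tid" "$(taskset -pc "$tid" 2>/dev/null)"
done
```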