Bug 1405036
| Summary: | A vhost port is being added to numa node inconsistently | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Jean-Tsung Hsiao <jhsiao> |
| Component: | openvswitch | Assignee: | Kevin Traynor <ktraynor> |
| Status: | CLOSED NOTABUG | QA Contact: | ovs-qe |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 7.3 | CC: | aconole, ailan, atelang, atragler, berrange, ctrautma, fbaudin, fherrman, fleitner, jhsiao, ktraynor, kzhang, osabart, pagupta, rcain, rkhan |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-03-07 20:44:06 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Jean-Tsung Hsiao
2016-12-15 12:36:22 UTC
### Please note that without this issue the Mpps rate from Xena to vhostuser 4Q testpmd is a perfect 14.88 Mpps.

[root@netqe5 XenaScripts]# python multiple_streams 1000000 64 32 60
rate_fration = 1000000
packet_length 64
num_of_streams = 32
test_duration = 60
INFO:root:XenaSocket: Connected
INFO:root:XenaManager: Logged succefully
INFO:root:XenaPort: 1/0 starting traffic
INFO:root:XenaPort: 1/0 stopping traffic
Average: 14880322.00 pps
[root@netqe5 XenaScripts]#

### And, the queue/core alignment is perfect.

[root@netqe5 dpdk-multique-scripts]# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 1 core_id 21:
isolated : false
port: vhost0 queue-id: 0
port: vhost1 queue-id: 0
port: dpdk0 queue-id: 0
port: dpdk1 queue-id: 0
pmd thread numa_id 1 core_id 17:
isolated : false
port: vhost0 queue-id: 1
port: vhost1 queue-id: 1
port: dpdk0 queue-id: 1
port: dpdk1 queue-id: 1
pmd thread numa_id 1 core_id 19:
isolated : false
port: vhost0 queue-id: 2
port: vhost1 queue-id: 2
port: dpdk0 queue-id: 2
port: dpdk1 queue-id: 2
pmd thread numa_id 1 core_id 23:
isolated : false
port: vhost0 queue-id: 3
port: vhost1 queue-id: 3
port: dpdk0 queue-id: 3
port: dpdk1 queue-id: 3

Kevin Traynor (comment #13):
Your libvirt config seems to pin the VM to cores across different NUMA nodes.
<cputune>
<vcpupin vcpu='0' cpuset='0'/>
<vcpupin vcpu='1' cpuset='1'/>
<vcpupin vcpu='2' cpuset='3'/>
<vcpupin vcpu='3' cpuset='13'/>
<vcpupin vcpu='4' cpuset='15'/>
</cputune>
I suspect the reason for the inconsistent NUMA placement of the vhost ports is that, depending on which cores testpmd uses in the VM, the vhost ports will end up on different NUMA nodes.
Please change your libvirt config to pin the VM to cores on NUMA 1 only and let me know if the vhost ports are always on NUMA 1 then. thanks.
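For reference, one way to double-check which cores sit on which NUMA node before picking the cpusets (a minimal check added here for context; it assumes the standard util-linux and numactl tools are installed):

# list each logical CPU with its NUMA node and socket
lscpu -p=CPU,NODE,SOCKET
# or show the cpus grouped per NUMA node
numactl --hardware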
(In reply to Kevin Traynor from comment #13)
> Your libvirt config seems to pin the VM to cores across different NUMA
> nodes.
>
> <cputune>
> <vcpupin vcpu='0' cpuset='0'/>
> <vcpupin vcpu='1' cpuset='1'/>
> <vcpupin vcpu='2' cpuset='3'/>
> <vcpupin vcpu='3' cpuset='13'/>
> <vcpupin vcpu='4' cpuset='15'/>
> </cputune>
>
> I suspect the reason for inconsistent NUMA placement of vhost ports is that
> depending on which cores testpmd uses in the VM, it means the vhost ports
> will be on different NUMA nodes.

I don't think that's the case. Once I started the guest, after a few seconds, the issue happened as reported by the daemon log.

> Please change your libvirt config to pin the VM to cores on NUMA 1 only and
> let me know if the vhost ports are always on NUMA 1 then. thanks.

Hi Kevin,
I tried your suggestion, but the same issue still exists.
As you can see from below, both vhost0 and vhost1 were being added on numa node 0.
Thanks!
Jean
=========================================
<cputune>
<vcpupin vcpu='0' cpuset='5'/>
<vcpupin vcpu='1' cpuset='1'/>
<vcpupin vcpu='2' cpuset='3'/>
<vcpupin vcpu='3' cpuset='13'/>
<vcpupin vcpu='4' cpuset='15'/>
</cputune>
2016-12-16T14:47:02.044Z|00011|dpdk(vhost_thread1)|INFO|vHost Device '/var/run/openvswitch/vhost0' has been added on numa node 0
2016-12-16T14:47:02.045Z|00012|dpdk(vhost_thread1)|INFO|vHost Device '/var/run/openvswitch/vhost1' has been added on numa node 0
(In reply to Jean-Tsung Hsiao from comment #15)
> Hi Kevin,
>
> I tried your suggestion, but the same issue still exists.

Thanks for trying, comment about config below. fyi, the kernel binds to the ports on boot up, which would explain why you see any messages before testpmd is run.

> As you can see from below, both vhost0 and vhost1 were being added on numa
> node 0.
>
> Thanks!
>
> Jean
> =========================================
> <cputune>
> <vcpupin vcpu='0' cpuset='5'/>
> <vcpupin vcpu='1' cpuset='1'/>
> <vcpupin vcpu='2' cpuset='3'/>
> <vcpupin vcpu='3' cpuset='13'/>
> <vcpupin vcpu='4' cpuset='15'/>
> </cputune>

Typically, cores on a 12-core, 2-socket system with HT are laid out like this:
0-11: NUMA 0
12-23: NUMA 1
24-35: NUMA 0
36-47: NUMA 1

So I think the config is still using NUMA 0 cores. If you change to only use the 12-23 or 36-47 range, they should all be NUMA 1.

Kevin.

> 2016-12-16T14:47:02.044Z|00011|dpdk(vhost_thread1)|INFO|vHost Device
> '/var/run/openvswitch/vhost0' has been added on numa node 0
> 2016-12-16T14:47:02.045Z|00012|dpdk(vhost_thread1)|INFO|vHost Device
> '/var/run/openvswitch/vhost1' has been added on numa node 0

(In reply to Kevin Traynor from comment #16)
> Typically, cores on a 12-core, 2-socket system with HT are laid out like
> this:
> 0-11: NUMA 0
> 12-23: NUMA 1
> 24-35: NUMA 0
> 36-47: NUMA 1

No, this is not the layout of my test-bed. Each socket has only 6 cores/12 HTs.

Socket 0
0-12 2-14 4-16 6-18 8-20 10-22

Socket 1
1-13 3-15 5-17 7-19 9-21 11-23

> So I think the config is still using NUMA 0 cores. If you change to only
> use the 12-23 or 36-47 range, they should all be NUMA 1.
>
> Kevin.

To prevent confusion, change "-" to "," from comment #17:

Socket 0
0,12 2,14 4,16 6,18 8,20 10,22

Socket 1
1,13 3,15 5,17 7,19 9,21 11,23

While running the OVS-dpdk bonding test I saw the same behavior: I need to allocate some even cores for vhost ports when the NICs sit on numa node 1.
More interestingly, the same config file produced different core-queue alignments on the two hosts below.
*** Host netqe9 ***
[root@netqe9 dpdk-bond-ovs.2.6.1-dpdk-16.11]# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 1 core_id 19:
isolated : false
port: vhost0 queue-id: 0
port: dpdk0 queue-id: 0
port: dpdk1 queue-id: 0
pmd thread numa_id 1 core_id 23:
isolated : false
port: vhost0 queue-id: 1
port: dpdk0 queue-id: 1
port: dpdk1 queue-id: 1
pmd thread numa_id 1 core_id 17:
isolated : false
port: vhost0 queue-id: 2
port: dpdk0 queue-id: 2
port: dpdk1 queue-id: 2
pmd thread numa_id 1 core_id 21:
isolated : false
port: vhost0 queue-id: 3
port: dpdk0 queue-id: 3
port: dpdk1 queue-id: 3
[root@netqe9 dpdk-bond-ovs.2.6.1-dpdk-16.11]# cat ovs_config_add_bond_dpdk0_dpdk1_vhost0_balance_tcp.sh
ovs-vsctl set Open_vSwitch . other_config={}
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0xaa0000
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="4096,1"
sleep 5
systemctl restart openvswitch
sleep 5
ovs-vsctl --if-exists del-br ovsbr0
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xaa0154
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
ovs-vsctl add-bond ovsbr0 dpdkbond dpdk0 dpdk1 "lacp=active" "bond-mode=balance-tcp" -- set Interface dpdk0 type=dpdk ofport_request=10 -- set Interface dpdk1 type=dpdk ofport_request=11
ovs-vsctl add-port ovsbr0 vhost0 \
-- set interface vhost0 type=dpdkvhostuser ofport_request=20
ovs-vsctl --timeout 10 set Interface dpdk0 options:n_rxq=4
ovs-vsctl --timeout 10 set Interface dpdk1 options:n_rxq=4
chown qemu /var/run/openvswitch/vhost0
ll /var/run/openvswitch/vhost*
#ovs-ofctl del-flows ovsbr0
#ovs-ofctl add-flow ovsbr0 in_port=10,actions=output:20
#ovs-ofctl add-flow ovsbr0 in_port=20,actions=output:10
ovs-ofctl dump-flows ovsbr0
*** Host netqe10 ***
[root@netqe10 dpdk-bond-ovs.2.6.1-dpdk-16.11]# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 0 core_id 6:
isolated : false
port: vhost0 queue-id: 0
pmd thread numa_id 1 core_id 17:
isolated : false
port: dpdk1 queue-id: 0
port: dpdk0 queue-id: 0
pmd thread numa_id 1 core_id 21:
isolated : false
port: dpdk1 queue-id: 1
port: dpdk0 queue-id: 1
pmd thread numa_id 0 core_id 4:
isolated : false
port: vhost0 queue-id: 1
pmd thread numa_id 0 core_id 8:
isolated : false
port: vhost0 queue-id: 2
pmd thread numa_id 0 core_id 2:
isolated : false
port: vhost0 queue-id: 3
pmd thread numa_id 1 core_id 19:
isolated : false
port: dpdk1 queue-id: 2
port: dpdk0 queue-id: 2
pmd thread numa_id 1 core_id 23:
isolated : false
port: dpdk1 queue-id: 3
port: dpdk0 queue-id: 3
[root@netqe10 dpdk-bond-ovs.2.6.1-dpdk-16.11]#
[root@netqe10 dpdk-bond-ovs.2.6.1-dpdk-16.11]# cat ovs_config_add_bond_dpdk0_dpdk1_vhost0_balance_tcp.sh
ovs-vsctl set Open_vSwitch . other_config={}
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0xaa0000
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="4096,1"
sleep 5
systemctl restart openvswitch
sleep 5
ovs-vsctl --if-exists del-br ovsbr0
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xaa0154
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
ovs-vsctl add-bond ovsbr0 dpdkbond dpdk0 dpdk1 "lacp=active" "bond-mode=balance-tcp" -- set Interface dpdk0 type=dpdk ofport_request=10 -- set Interface dpdk1 type=dpdk ofport_request=11
ovs-vsctl add-port ovsbr0 vhost0 \
-- set interface vhost0 type=dpdkvhostuser ofport_request=20
ovs-vsctl --timeout 10 set Interface dpdk0 options:n_rxq=4
ovs-vsctl --timeout 10 set Interface dpdk1 options:n_rxq=4
chown qemu /var/run/openvswitch/vhost0
ll /var/run/openvswitch/vhost*
#ovs-ofctl del-flows ovsbr0
#ovs-ofctl add-flow ovsbr0 in_port=10,actions=output:20
#ovs-ofctl add-flow ovsbr0 in_port=20,actions=output:10
ovs-ofctl dump-flows ovsbr0
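As an aside (this decode is an annotation, not part of the original comments), the pmd-cpu-mask=0xaa0154 used by both scripts enables PMD threads on cores from both NUMA nodes, which is why vhost0's rx queues can land on either node while dpdk0/dpdk1 stay with the NUMA 1 PMDs:

# decode the pmd-cpu-mask: prints cores 2 4 6 8 17 19 21 23
# (on these hosts the even cores are socket/NUMA 0, the odd cores socket/NUMA 1)
for i in $(seq 0 31); do
    if (( (0xaa0154 >> i) & 1 )); then echo "core $i"; fi
done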
Kevin Traynor (comment #20):
I have run a similar test using the qemu cmd line. I find that if I do not pin the qemu threads to numa 1, then indeed vhost0 and vhost1 may appear on numa 0 or numa 1, as reported.

When I taskset qemu (*at start time*) to numa 1, vhost0 and vhost1 always appear on numa 1. I tested this 10 times, i.e.

taskset -c 5,7,9,11 ./qemu-system-x86_64 <qemu_cmd_line_args>

taskset after qemu is run is not sufficient, as the vhost devices are registered in OVS during VM boot up.

I have 2 suggestions to continue progress:

- Although it looks ok to me, maybe the libvirt/qemu config in the test is not sufficiently pinning all the threads to numa 1, or at least not early enough. The libvirt commands around this are hard to follow; it would be good to get it checked by a libvirt expert.

- One check that won't rule out libvirt config, but may confirm it, would be to run the test when the vhost ports land on the wrong numa node, then check the last scheduled cpu for the qemu threads, i.e. top -H -p <qemu_pid> and show the last used cpu field for the active threads. Jean, is this something you can run?

One small item I noticed was that the socket-mem in some of the configs is 4096,1. This should be something like 4096,4096 as in some of the other configs. I tried with 4096,1 and it didn't seem to have any impact on the numa location of the vhost devices, but it is best to keep it consistently 4096,4096, to rule out any side effects.

I can confirm Kevin's findings. My testbed does not use libvirt but was using the QEMU command line. The test scripts would use taskset after the guest was booted to bind the qemu cpus to the correct cpus. I could reproduce this issue every time, as my NIC was on numa 0 but my vhostuser ports were ending up on numa 1 CPUs, even with a PMD mask set to use only numa 0 cpus. I modified the script to do the taskset in the qemu cmdline startup and I no longer have the issue. The vhost user ports are correctly binding to numa 0 cpus.

(In reply to Kevin Traynor from comment #20)
> I have run a similar test using qemu cmd line. I find if I do not pin qemu
> threads to numa 1, indeed vhost0 and vhost1 may appear on numa0 or numa1 as
> reported.
>
> When I taskset qemu (*at start time*) to numa 1, vhost0 and vhost1 always
> appear on numa 1. I tested this 10x times.
> i.e. taskset -c 5,7,9,11 ./qemu-system-x86_64 <qemu_cmd_lines>

So, this is a workaround if you run qemu manually. But my case uses a guest xml, so this workaround does NOT apply to it.

> taskset after qemu is run is not sufficient as the vhost devices are
> registered in OVS during vm boot up.
>
> I have 2 suggestions to continue progress:
>
> - Although it looks ok to me, maybe the libvirt/qemu config in the test is
> not sufficiently pinning all the threads to numa 1. Or at least not early
> enough. The libvirt commands around this are hard to follow, it would be
> good to get it checked from a libvirt expert.
>
> - One check that won't rule out libvirt config, but may confirm it would be
> to run the test when vhost ports land on wrong numa node, then check last
> scheduled cpu for qemu threads.
> i.e. top -H -p<qemu_pid> and show the last used cpu field for active
> threads. Jean, is this something you can run?

Ok, I'll try today.

> one small item I noticed was that the socket-mem in some of the configs is
> 4096,1. This should be something like 4096,4096 as in some of other the
> configs. I tried with 4096,1 and it didn't seem to have any impact on numa
> location of vhost devices, but best to keep it consistently 4096,4096, to
> rule out any side effects.
NOTE: I have been using "4096,1" for all OVS-dpdk testing since my ixgbe NIC is on numa #1. So, this is NOT an issue.
> > - One check that won't rule out libvirt config, but may confirm it would be
> > to run the test when vhost ports land on wrong numa node, then check last
> > scheduled cpu for qemu threads.
> > i.e. top -H -p<qemu_pid> and show the last used cpu field for active
> > threads. Jean, is this something you can run?
> >
>
> Ok, I'll try today.
>
Jean and I tested this today and we saw that the vcpus were being pinned to the correct cores but the emulator threads were not being pinned.
Later on, I edited the libvirt xml to pin the emulator as well. I've tested this and the numa node info for vhost ports is consistent with the emulator pinning.
<cputune>
<vcpupin vcpu='0' cpuset='5'/>
<vcpupin vcpu='1' cpuset='7'/>
<vcpupin vcpu='2' cpuset='9'/>
<vcpupin vcpu='3' cpuset='11'/>
<emulatorpin cpuset='13'/>
</cputune>
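A quick way to confirm that the vcpu and emulator pinning actually took effect (commands assumed available; the domain name mq-vhu-4 is the guest dumped later in this bz, adapt as needed):

# show the current vcpu and emulator pinning for the guest
virsh vcpupin mq-vhu-4
virsh emulatorpin mq-vhu-4
# or watch the last-used CPU ("P" column) of each qemu thread
top -H -p "$(pgrep -f qemu-kvm | head -1)"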
OVS-DPDK will only poll for rx pkts from a DPDK port with a PMD thread
that is on the same numa node. This is true for both physical and virtual NICs and is done to avoid cross-numa performance issues. If the user has not permitted any PMDs to run on the matching numa node, then OVS will report an error and not poll that DPDK port.
For OVS 2.5, the dpdk vhost ports are associated with the numa node that the dpdk master lcore runs on (from the -c vswitchd cmd line args).
For OVS 2.6, the dpdk vhost ports are associated with the numa node the virtqueue memory has been allocated on by the emulator.
In this bz, what was observed was that OVS 2.6 vhost ports were being associated with different numa nodes on different trials. This was due to Linux scheduling the emulator across 2 numa nodes. What was then observed was that if there are no PMDs associated with the selected numa node, the vhost ports would not be polled (which is expected).
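A simple way to see which node OVS associated a vhost port with after a guest boot (the log path below is the usual RHEL default and may differ on other setups):

# the vswitchd log records the numa node chosen for each vhost device
grep "has been added on numa node" /var/log/openvswitch/ovs-vswitchd.log
# and pmd-rxq-show confirms which PMD cores (if any) are polling it
ovs-appctl dpif-netdev/pmd-rxq-show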
To solve this, taskset can be used for qemu, or emulatorpin for libvirt, e.g.
taskset -c 3,5,7,9,11,13 qemu-kvm <qemu_args>
or
<cputune>
<vcpupin vcpu='0' cpuset='5'/>
<vcpupin vcpu='1' cpuset='7'/>
<vcpupin vcpu='2' cpuset='9'/>
<vcpupin vcpu='3' cpuset='11'/>
<emulatorpin cpuset='13'/>
</cputune>
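For an already-defined libvirt guest, the emulator pinning can also be set persistently from the command line. A sketch (domain name taken from this bz; note the pinning has to be in place before the guest boots, since the vhost devices register with OVS during VM boot up):

# persist the emulator pinning in the domain config, then restart the guest
virsh emulatorpin mq-vhu-4 13 --config
virsh shutdown mq-vhu-4
virsh start mq-vhu-4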
(In reply to Kevin Traynor from comment #24)
> or
> <cputune>
> <vcpupin vcpu='0' cpuset='5'/>
> <vcpupin vcpu='1' cpuset='7'/>
> <vcpupin vcpu='2' cpuset='9'/>
> <vcpupin vcpu='3' cpuset='11'/>
> <emulatorpin cpuset='13'/>
> </cputune>

With this change I got up to 14.88 Mpps one way from Xena to testpmd at vhostuser.

*** Guest xml ***

[root@netqe5 dpdk-multique-scripts]# virsh dumpxml mq-vhu-4
<domain type='kvm' id='14'>
  <name>mq-vhu-4</name>
  <uuid>e6ddf28c-3af9-43ee-a9ac-13ee5c2cf39d</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>5</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='1'/>
    <vcpupin vcpu='2' cpuset='3'/>
    <vcpupin vcpu='3' cpuset='13'/>
    <vcpupin vcpu='4' cpuset='15'/>
    <emulatorpin cpuset='13'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.2.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>Haswell-noTSX</model>
    <numa>
      <cell id='0' cpus='0' memory='4194304' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/mnt/test/vhostuser/mq-vhu.img'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <alias name='usb'/>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <alias name='usb'/>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <alias name='usb'/>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='vhostuser'>
      <mac address='52:54:00:7e:c4:1c'/>
      <source type='unix' path='/var/run/openvswitch/vhost0' mode='client'/>
      <model type='virtio'/>
      <driver name='vhost' queues='4'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='52:54:00:83:fd:6b'/>
      <source type='unix' path='/var/run/openvswitch/vhost1' mode='client'/>
      <model type='virtio'/>
      <driver name='vhost' queues='4'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='52:54:00:3b:d1:3a'/>
      <source bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/2'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/2'>
      <source path='/dev/pts/2'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-14-mq-vhu-4/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <graphics type='vnc' port='5900' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c479,c694</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c479,c694</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+107:+107</label>
    <imagelabel>+107:+107</imagelabel>
  </seclabel>
</domain>

*** OVS-dpdk config file ***

ovs-vsctl set Open_vSwitch . other_config={}
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0xaa0000
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="4096,1"
sleep 5
systemctl restart openvswitch
sleep 5
# config ovs-dpdk bridge with dpdk0, dpdk1, vhost0 and vhost1
ovs-vsctl --if-exists del-br ovsbr0
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xaa0000
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
ovs-vsctl add-port ovsbr0 dpdk0 \
    -- set interface dpdk0 type=dpdk ofport_request=10
ovs-vsctl add-port ovsbr0 dpdk1 \
    -- set interface dpdk1 type=dpdk ofport_request=11
ovs-vsctl add-port ovsbr0 vhost0 \
    -- set interface vhost0 type=dpdkvhostuser ofport_request=20
ovs-vsctl add-port ovsbr0 vhost1 \
    -- set interface vhost1 type=dpdkvhostuser ofport_request=21
ovs-vsctl --timeout 10 set Interface dpdk0 options:n_rxq=4
ovs-vsctl --timeout 10 set Interface dpdk1 options:n_rxq=4
chown qemu /var/run/openvswitch/vhost0
chown qemu /var/run/openvswitch/vhost1
ls -l /var/run/openvswitch/vhost*
ovs-ofctl del-flows ovsbr0
ovs-ofctl add-flow ovsbr0 in_port=10,actions=output:20
ovs-ofctl add-flow ovsbr0 in_port=21,actions=output:11
ovs-ofctl dump-flows ovsbr0
[root@netqe5 dpdk-multique-scripts]#

*** queue-core alignment ***

[root@netqe5 dpdk-multique-scripts]# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 1 core_id 17:
isolated : false
port: vhost0 queue-id: 0
port: vhost1 queue-id: 0
port: dpdk0 queue-id: 0
port: dpdk1 queue-id: 0
pmd thread numa_id 1 core_id 19:
isolated : false
port: vhost0 queue-id: 1
port: vhost1 queue-id: 1
port: dpdk0 queue-id: 1
port: dpdk1 queue-id: 1
pmd thread numa_id 1 core_id 21:
isolated : false
port: vhost0 queue-id: 2
port: vhost1 queue-id: 2
port: dpdk0 queue-id: 2
port: dpdk1 queue-id: 2
pmd thread numa_id 1 core_id 23:
isolated : false
port: vhost0 queue-id: 3
port: vhost1 queue-id: 3
port: dpdk0 queue-id: 3
port: dpdk1 queue-id: 3
[root@netqe5 dpdk-multique-scripts]#

*** One way Xena to testpmd/vhostuser Mpps throughput ***

Test 1
rate_fration = 1000000
packet_length 64
num_of_streams = 32
test_duration = 60
INFO:root:XenaSocket: Connected
INFO:root:XenaManager: Logged succefully
INFO:root:XenaPort: 1/0 starting traffic
INFO:root:XenaPort: 1/0 stopping traffic
Average: 14773626.00 pps

Test 2
rate_fration = 1000000
packet_length 64
num_of_streams = 32
test_duration = 60
INFO:root:XenaSocket: Connected
INFO:root:XenaManager: Logged succefully
INFO:root:XenaPort: 1/0 starting traffic
INFO:root:XenaPort: 1/0 stopping traffic
Average: 14738007.00 pps

Test 3
rate_fration = 1000000
packet_length 64
num_of_streams = 32
test_duration = 60
INFO:root:XenaSocket: Connected
INFO:root:XenaManager: Logged succefully
INFO:root:XenaPort: 1/0 starting traffic
INFO:root:XenaPort: 1/0 stopping traffic
Average: 14880375.00 pps

Test 4
rate_fration = 1000000
packet_length 64
num_of_streams = 32
test_duration = 60
INFO:root:XenaSocket: Connected
INFO:root:XenaManager: Logged succefully
INFO:root:XenaPort: 1/0 starting traffic
INFO:root:XenaPort: 1/0 stopping traffic
Average: 14880254.00 pps

Test 5
rate_fration = 1000000
packet_length 64
num_of_streams = 32
test_duration = 60
INFO:root:XenaSocket: Connected
INFO:root:XenaManager: Logged succefully
INFO:root:XenaPort: 1/0 starting traffic
INFO:root:XenaPort: 1/0 stopping traffic
Average: 14843679.00 pps

(In reply to Kevin Traynor from comment #24)
> <cputune>
> <vcpupin vcpu='0' cpuset='5'/>
> <vcpupin vcpu='1' cpuset='7'/>
> <vcpupin vcpu='2' cpuset='9'/>
> <vcpupin vcpu='3' cpuset='11'/>
> <emulatorpin cpuset='13'/>
> </cputune>

Should an upper layer bz (libvirt/Nova) be opened for that?

Hi Amnon, I'm not sure what they currently do wrt NUMA. I assume they would have some general purpose cores on each NUMA node, so it would be a case of using emulatorpin/taskset/numactl to ensure all the qemu threads are on the same NUMA node, if they don't already do that. Feel free to share comment 24, or contact me if someone needs a summary of the findings.

Kevin.