Bug 1450680

Summary: Migrating a guest with 2 vhost-user queues while packets flow over dpdk+openvswitch fails: guest hangs, and qemu hangs or crashes
Product: Red Hat Enterprise Linux 7 Reporter: Pei Zhang <pezhang>
Component: openvswitch Assignee: Open vSwitch development team <ovs-team>
Status: CLOSED CURRENTRELEASE QA Contact: Pei Zhang <pezhang>
Severity: high Docs Contact:
Priority: high    
Version: 7.4 CC: aconole, aglotov, aguetta, ailan, atragler, chayang, dgilbert, drjones, ealcaniz, eglynn, fbaudin, fherrman, fleitner, jmaxwell, jraju, jsuchane, juzhang, knoel, ktraynor, marcandre.lureau, marjones, maxime.coquelin, mleitner, mschuppe, pablo.iranzo, pezhang, rlondhe, sdubroca, skramaja, smykhail, sputhenp, tredaelli, victork, virt-maint, zshi
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: openvswitch-2.6.1-28.git20180130.el7ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-07-15 17:27:09 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1553812    
Bug Blocks: 1473046    
Attachments:
  XML of VM (flags: none)

Description Pei Zhang 2017-05-14 15:19:33 UTC
Description of problem:
When migrating with 2 vhost-user queues while packets are being received/sent in the guest, migration always fails. QE hit 2 separate issues:
- guest hangs, and qemu on the destination host becomes unresponsive
- guest hangs, and qemu crashes on both the source and destination hosts


Version-Release number of selected component (if applicable):
3.10.0-666.el7.x86_64
qemu-kvm-rhev-2.9.0-4.el7.x86_64
dpdk-16.11-4.el7fdp.x86_64
openvswitch-2.6.1-18.git20161206.el7fdp.x86_64


How reproducible:
100%


Steps to Reproduce:
1. On the source and destination hosts, start openvswitch with 2 queues per port; refer to [1]

2. On the source host, boot the guest with 2 vhost-user queues; refer to [2]

3. On the destination host, run the same qemu command as on the source host, but with '-incoming'; refer to [3]

4. In the guest, start testpmd; refer to [4]

5. On a third host, start MoonGen; refer to [5]

6. While packets are being forwarded in the guest, start migration; it fails with the 2 issues below.
(qemu) migrate -d tcp:10.73.72.152:5555
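
For reference, migration progress can be watched from the source QEMU monitor with the standard status commands (shown here only as a usage note):
(qemu) info migrate
(qemu) info status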

==Issue 1: guest hangs, and qemu on the destination host becomes unresponsive
On the source host:
(qemu) migrate -d tcp:10.73.72.152:5555
(qemu) 2017-05-14T14:56:45.151252Z qemu-kvm: Failed to set msg fds.
2017-05-14T14:56:45.151309Z qemu-kvm: vhost VQ 0 ring restore failed: -1: Invalid argument (22)
2017-05-14T14:56:45.151319Z qemu-kvm: Failed to set msg fds.
2017-05-14T14:56:45.151325Z qemu-kvm: vhost VQ 1 ring restore failed: -1: Invalid argument (22)
2017-05-14T14:56:45.161328Z qemu-kvm: Failed to set msg fds.
2017-05-14T14:56:45.161344Z qemu-kvm: vhost VQ 2 ring restore failed: -1: Invalid argument (22)
2017-05-14T14:56:45.161351Z qemu-kvm: Failed to set msg fds.
2017-05-14T14:56:45.161357Z qemu-kvm: vhost VQ 3 ring restore failed: -1: Invalid argument (22)
2017-05-14T14:56:45.161846Z qemu-kvm: Failed to set msg fds.
2017-05-14T14:56:45.161854Z qemu-kvm: vhost VQ 0 ring restore failed: -1: Resource temporarily unavailable (11)
2017-05-14T14:56:45.161861Z qemu-kvm: Failed to set msg fds.
2017-05-14T14:56:45.161867Z qemu-kvm: vhost VQ 1 ring restore failed: -1: Resource temporarily unavailable (11)
2017-05-14T14:56:45.168900Z qemu-kvm: Failed to set msg fds.
2017-05-14T14:56:45.168913Z qemu-kvm: vhost VQ 2 ring restore failed: -1: Resource temporarily unavailable (11)
2017-05-14T14:56:45.168920Z qemu-kvm: Failed to set msg fds.
2017-05-14T14:56:45.168926Z qemu-kvm: vhost VQ 3 ring restore failed: -1: Resource temporarily unavailable (11)
(qemu) info status
VM status: paused (postmigrate)

On the destination host (hang):
(qemu) 2017-05-14T14:56:54.782603Z qemu-kvm: VQ 0 size 0x100 Guest index 0xd3ad inconsistent with Host index 0x0: delta 0xd3ad
2017-05-14T14:56:54.782673Z qemu-kvm: Failed to load virtio-net:virtio
2017-05-14T14:56:54.782681Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-net'
2017-05-14T14:56:54.782701Z qemu-kvm: warning: TSC frequency mismatch between VM (2299997 kHz) and host (2299998 kHz), and TSC scaling unavailable
2017-05-14T14:56:54.783436Z qemu-kvm: warning: TSC frequency mismatch between VM (2299997 kHz) and host (2299998 kHz), and TSC scaling unavailable
2017-05-14T14:56:54.783505Z qemu-kvm: warning: TSC frequency mismatch between VM (2299997 kHz) and host (2299998 kHz), and TSC scaling unavailable
2017-05-14T14:56:54.783562Z qemu-kvm: warning: TSC frequency mismatch between VM (2299997 kHz) and host (2299998 kHz), and TSC scaling unavailable
2017-05-14T14:56:54.783610Z qemu-kvm: warning: TSC frequency mismatch between VM (2299997 kHz) and host (2299998 kHz), and TSC scaling unavailable
2017-05-14T14:56:54.783663Z qemu-kvm: warning: TSC frequency mismatch between VM (2299997 kHz) and host (2299998 kHz), and TSC scaling unavailable
2017-05-14T14:56:54.783768Z qemu-kvm: load of migration failed: Operation not permitted

==Issue 2: guest hangs, and qemu crashes on both the source and destination hosts
On the source host (qemu crash):
(qemu) migrate -d tcp:10.73.72.154:5555
(qemu) 2017-05-14T15:14:00.101923Z qemu-kvm: Failed to read msg header. Read -1 instead of 12. Original request 6.
2017-05-14T15:14:00.101973Z qemu-kvm: vhost_set_log_base failed: Input/output error (5)
2017-05-14T15:14:00.102013Z qemu-kvm: Failed to set msg fds.
2017-05-14T15:14:00.102021Z qemu-kvm: vhost_set_vring_addr failed: Invalid argument (22)
2017-05-14T15:14:00.102028Z qemu-kvm: Failed to set msg fds.
2017-05-14T15:14:00.102034Z qemu-kvm: vhost_set_vring_addr failed: Invalid argument (22)
2017-05-14T15:14:00.102040Z qemu-kvm: Failed to set msg fds.
2017-05-14T15:14:00.102046Z qemu-kvm: vhost_set_features failed: Invalid argument (22)
Aborted

On the destination host (qemu crash):
(qemu) 2017-05-14T15:14:00.111140Z qemu-kvm: Not a migration stream
2017-05-14T15:14:00.111209Z qemu-kvm: load of migration failed: Invalid argument


Actual results:
The guest hangs, and qemu hangs or crashes.


Expected results:
Migration should succeed and qemu should keep working.


Additional info:
1. Without packets being received/sent in the guest, migration works well.
2. With a single vhost-user queue, migration works well.


Reference:
[1] Start openvswitch and set up 2 queues per port
#!/bin/bash
set -e
echo "killing old ovs process"
pkill -f ovs- || true
pkill -f ovsdb || true
echo "probing ovs kernel module"
modprobe -r openvswitch || true
modprobe openvswitch

rm -rf /var/run/openvswitch
mkdir /var/run/openvswitch

echo "clean env"
DB_FILE=/etc/openvswitch/conf.db
rm -rf /var/run/openvswitch
rm -f $DB_FILE
mkdir /var/run/openvswitch

echo "init ovs db and boot db server"
export DB_SOCK=/var/run/openvswitch/db.sock
ovsdb-tool create /etc/openvswitch/conf.db /usr/share/openvswitch/vswitch.ovsschema
ovsdb-server --remote=punix:$DB_SOCK --remote=db:Open_vSwitch,Open_vSwitch,manager_options --pidfile --detach --log-file
ovs-vsctl --no-wait init

echo "start ovs vswitch daemon"
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,0"
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask="0x1"
ovs-vswitchd unix:$DB_SOCK --pidfile --detach --log-file=/var/log/openvswitch/ovs-vswitchd.log

echo "creating bridge and ports"
ovs-vsctl --if-exists del-br ovsbr0
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl add-port ovsbr0 vhost-user0 -- set Interface vhost-user0 type=dpdkvhostuser
ovs-ofctl del-flows ovsbr0
ovs-ofctl add-flow ovsbr0 "in_port=1,idle_timeout=0 actions=output:2"
ovs-ofctl add-flow ovsbr0 "in_port=2,idle_timeout=0 actions=output:1"

ovs-vsctl --if-exists del-br ovsbr1
ovs-vsctl add-br ovsbr1 -- set bridge ovsbr1 datapath_type=netdev
ovs-vsctl add-port ovsbr1 dpdk1 -- set Interface dpdk1 type=dpdk
ovs-vsctl add-port ovsbr1 vhost-user1 -- set Interface vhost-user1 type=dpdkvhostuser
ovs-ofctl del-flows ovsbr1
ovs-ofctl add-flow ovsbr1 "in_port=1,idle_timeout=0 actions=output:2"
ovs-ofctl add-flow ovsbr1 "in_port=2,idle_timeout=0 actions=output:1"

echo "all done"


(The commented commands below are presumably run after the bridges are created to enable 2 Rx queues per dpdk port and to set the lcore/PMD CPU masks; kept here as reference notes.)
# ovs-vsctl set Open_vSwitch . other_config={}
# ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x1
# ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x15554
# ovs-vsctl set Interface dpdk0 options:n_rxq=2
# ovs-vsctl set Interface dpdk1 options:n_rxq=2


[2] qemu command line on the source host
/usr/libexec/qemu-kvm \
-name guest=rhel7.4_nonrt \
-cpu host \
-m 8G \
-smp 6,sockets=1,cores=6,threads=1 \
-object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages,share=yes,size=8G,host-nodes=0,policy=bind \
-numa node,nodeid=0,cpus=0-5,memdev=ram-node0 \
-drive file=/mnt/nfv/rhel7.4_nonrt.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=threads \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0 \
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=18:66:da:5f:dd:01 \
-chardev socket,id=charnet1,path=/var/run/openvswitch/vhost-user0 \
-netdev vhost-user,chardev=charnet1,id=hostnet1,queues=2 \
-device virtio-net-pci,netdev=hostnet1,id=net1,mac=18:66:da:5f:dd:02,mq=on \
-chardev socket,id=charnet2,path=/var/run/openvswitch/vhost-user1 \
-netdev vhost-user,chardev=charnet2,id=hostnet2,queues=2 \
-device virtio-net-pci,netdev=hostnet2,id=net2,mac=18:66:da:5f:dd:03,mq=on \
-msg timestamp=on \
-monitor stdio \
-vnc :2 \


[3] qemu command line on the destination host
...
-incoming tcp:0:5555 \
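
For reference, before the incoming migration starts, the destination monitor reports the VM as paused waiting for the stream (standard QEMU monitor output; exact wording may differ across versions):
(qemu) info status
VM status: paused (inmigrate)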

[4] Start testpmd in the guest
# echo 10 >  /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages

# modprobe vfio enable_unsafe_noiommu_mode=Y
# modprobe vfio-pci
# cat /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
Y

# dpdk-devbind --bind=vfio-pci 0000:00:04.0
# dpdk-devbind --bind=vfio-pci 0000:00:05.0

# /usr/bin/testpmd \
-l 1,2,3,4,5 \
-n 4 \
-d /usr/lib64/librte_pmd_virtio.so.1 \
-w 0000:00:04.0 -w 0000:00:05.0 \
-- \
--nb-cores=4 \
--disable-hw-vlan \
-i \
--disable-rss \
--rxq=2 --txq=2 \
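
Once testpmd drops to its interactive prompt, forwarding is started manually; a minimal sketch of the interactive commands (standard testpmd commands, the stats call only confirms traffic is flowing):
testpmd> start
testpmd> show port stats all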

[5] Start MoonGen
./build/MoonGen examples/l2-load-latency.lua 0 1 5000

Comment 4 Pei Zhang 2017-05-16 10:36:37 UTC
When testing with PVP, we hit the same issue.


Steps:
1. On the source and destination hosts, start testpmd
testpmd -l 0,2,4,6,8 \
--socket-mem=1024 -n 4 \
--vdev 'net_vhost0,iface=/tmp/vhost-user0' \
--vdev 'net_vhost1,iface=/tmp/vhost-user1' -- \
--portmask=3F --disable-hw-vlan -i --rxq=1 --txq=1 \
--nb-cores=4 --forward-mode=io

testpmd> set portlist 0,2,1,3
testpmd> start


2. On the source host, boot the VM
/usr/libexec/qemu-kvm \
-name guest=rhel7.4_nonrt \
-cpu host \
-m 8G \
-smp 6,sockets=1,cores=6,threads=1 \
-object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages,share=yes,size=8G,host-nodes=0,policy=bind \
-numa node,nodeid=0,cpus=0-5,memdev=ram-node0 \
-drive file=/mnt/nfv/rhel7.4_nonrt.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=threads \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0 \
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=18:66:da:5f:dd:01 \
-chardev socket,id=charnet1,path=/tmp/vhost-user0 \
-netdev vhost-user,chardev=charnet1,id=hostnet1,queues=2 \
-device virtio-net-pci,netdev=hostnet1,id=net1,mac=18:66:da:5f:dd:02,mq=on \
-chardev socket,id=charnet2,path=/tmp/vhost-user1 \
-netdev vhost-user,chardev=charnet2,id=hostnet2,queues=2 \
-device virtio-net-pci,netdev=hostnet2,id=net2,mac=18:66:da:5f:dd:03,mq=on \
-msg timestamp=on \
-monitor stdio \
-vnc :2 \

Steps 3-6 are the same as in the Description.


Best Regards,
Pei

Comment 5 Victor Kaplansky 2017-06-06 12:21:32 UTC
Hi Pei,

I'll take a look into this.

Comment 30 Victor Kaplansky 2017-12-14 11:40:25 UTC
Sent the fix to DPDK upstream - http://dpdk.org/ml/archives/dev/2017-December/083900.html

No changes in QEMU required.

Comment 31 Sahid Ferdjaoui 2018-01-10 16:16:49 UTC
*** Bug 1527532 has been marked as a duplicate of this bug. ***

Comment 34 Kevin Traynor 2018-02-12 11:31:30 UTC
This looks to be a duplicate of Bug 1528229. Can you confirm?

Comment 35 Kevin Traynor 2018-02-12 12:03:21 UTC
Just checked upstream: this patch was accepted onto the DPDK 17.11 stable branch but is *not* on the DPDK 16.11 stable branch. Can that be rectified?

I would prefer it to be upstreamed to the DPDK 16.11 stable branch before we backport it to our OVS 2.6 (DPDK 16.11).

Comment 38 Sanjay Upadhyay 2018-02-27 12:13:02 UTC
*** Bug 1543740 has been marked as a duplicate of this bug. ***

Comment 42 Pei Zhang 2018-03-16 07:59:56 UTC
Created attachment 1408673 [details]
XML of VM

==update==

Summary: Live migration with 2 vhost-user queues now works well with both ovs 2.6 and ovs 2.9, so this bug has been fixed.

(1) With openvswitch-2.6.1-22.git20180130.el7ost.x86_64
===========Stream Rate: 1Mpps===========
No Stream_Rate Downtime Totaltime Ping_Loss moongen_Loss
 0       1Mpps      158     19090        18     11778215
 1       1Mpps      141     20118        15      9109284
 2       1Mpps      185     20272        18     11893682
 3       1Mpps      155     15177        18       554391
 4       1Mpps      155     20650        15     14811809
 5       1Mpps      142     23900        18     20734173
 6       1Mpps      157     21933        15     20217899
 7       1Mpps      153     20842        15     11342950
 8       1Mpps      150     22799        15     13016815
 9       1Mpps      155     22830        15     20582389

(2) With openvswitch-2.9.0-1.el7fdb.x86_64
===========Stream Rate: 1Mpps===========
No Stream_Rate Downtime Totaltime Ping_Loss moongen_Loss
 0       1Mpps      148     24328        17     27680612
 1       1Mpps      155     19165        15     13763045
 2       1Mpps      140     17258        17      4130227
 3       1Mpps      157     19193        16     12237228
 4       1Mpps      155     19613        16     10173339
 5       1Mpps      143     20357        15     17095979
 6       1Mpps      159     20391        18     10285979
 7       1Mpps      152     20559        18      9290060
 8       1Mpps      146     29477        17     16083082
 9       1Mpps      154     19353        16      8995992


More details:
1. Versions besides the 2 openvswitch builds above:
kernel-3.10.0-855.el7.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.1.x86_64
libvirt-3.9.0-14.el7.x86_64
tuned-2.9.0-1.el7.noarch

2. During the above 2 test runs, we hit a high packet loss issue, which bug [1] is now tracking.

[1] Bug 1552465 - High TRex packets loss during live migration over ovs+dpdk+vhost-user

3. In the above 2 test runs, ovs acts in vhost-user client mode (testing with dpdkvhostuserclient ports, as shown below; a sketch of how such a port is created follows after this list).
# ovs-vsctl show
2dfd9e80-233e-44d3-9e39-e9288b1d63f5
    Bridge "ovsbr1"
        Port "ovsbr1"
            Interface "ovsbr1"
                type: internal
        Port "dpdk2"
            Interface "dpdk2"
                type: dpdk
                options: {dpdk-devargs="0000:06:00.0", n_rxq="2", n_txq="2"}
        Port "vhost-user2"
            Interface "vhost-user2"
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser2.sock"}
    Bridge "ovsbr0"
        Port "vhost-user1"
            Interface "vhost-user1"
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser1.sock"}
        Port "vhost-user0"
            Interface "vhost-user0"
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser0.sock"}
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
                options: {dpdk-devargs="0000:04:00.0", n_rxq="2", n_txq="2"}
        Port "dpdk1"
            Interface "dpdk1"
                type: dpdk
                options: {dpdk-devargs="0000:04:00.1", n_rxq="2", n_txq="2"}

4. We tested without vIOMMU in these runs. The VM XML is attached to this comment.
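
For reference, a minimal sketch of how a dpdkvhostuserclient port like the ones above is typically created, and the matching QEMU chardev. The socket path below is an example; note that with OVS acting as the vhost-user client, QEMU must create the socket, i.e. its chardev runs in server mode:
# ovs-vsctl add-port ovsbr0 vhost-user0 -- set Interface vhost-user0 \
    type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser0.sock

-chardev socket,id=charnet1,path=/tmp/vhostuser0.sock,server \
-netdev vhost-user,chardev=charnet1,id=hostnet1,queues=2 \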

Comment 54 Pei Zhang 2018-04-18 06:51:43 UTC
==Update==
Versions:
3.10.0-862.el7.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.2.x86_64
libvirt-3.9.0-14.el7.x86_64
tuned-2.9.0-1.el7.noarch
openvswitch-2.6.1-28.git20180130.el7ost.x86_64


Live migration with 2 vhost-user queues works well, as shown below:

=======================Stream Rate: 1Mpps=========================
No Stream_Rate Downtime Totaltime Ping_Loss trex_Loss
 0       1Mpps      113     64270        18   40950642.0
 1       1Mpps      149     20399        17    8149062.0
 2       1Mpps      154     20149        15   10759365.0
 3       1Mpps      145     23425        15   13075563.0
 4       1Mpps      146     15720        15     831627.0
 5       1Mpps      155     20596        15   18343823.0
 6       1Mpps      135     64234        14   54869708.0
 7       1Mpps      155     20359        16    6423787.0
 8       1Mpps      159     17722        15    3210414.0
 9       1Mpps      151     16915        15     599545.0
<------------------------Summary------------------------>
   Max   1Mpps      159     64270        18     54869708
   Min   1Mpps      113     15720        14       599545
  Mean   1Mpps      146     28378        15     15721353
Median   1Mpps      150     20379        15      9454213
 Stdev       0     13.5  19031.67      1.18   18130056.5

Comment 75 Marcelo Ricardo Leitner 2021-07-15 17:27:09 UTC
(In reply to Pei Zhang from comment #71)
> (In reply to Aaron Conole from comment #70)
> > Given comment #68 can we close this?
> 
> Hi Aaron, QE has closed this bug as 'VERIFIED'.

Closed for good now.
(this was showing up on RHEL queries for "zstream?" flags)