Bug 1798996 - [RHEL7] Both qemu and guest hang after migrating guest in which vhost-user NIC is using virtio-pci [ovs2.11]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch2.11
Version: FDP 20.A
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Maxime Coquelin
QA Contact: Pei Zhang
URL:
Whiteboard:
Depends On:
Blocks: 1799017 1806599
Reported: 2020-02-06 12:56 UTC by Pei Zhang
Modified: 2020-03-10 09:35 UTC
CC: 9 users

Fixed In Version: 2.11.0-48
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1799017 1807131
Environment:
Last Closed: 2020-03-10 09:35:32 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:0743 0 None None None 2020-03-10 09:35:43 UTC

Description Pei Zhang 2020-02-06 12:56:23 UTC
Description of problem:
Boot a guest over OVS with vhost-user ports. In the guest, keep the vhost-user NIC on the default virtio-pci driver. Then migrate the guest from the source host to the destination host. Both QEMU and the guest hang on the source host.

Version-Release number of selected component (if applicable):
4.18.0-176.el8.x86_64
qemu-kvm-4.2.0-8.module+el8.2.0+5607+dc756904.x86_64
openvswitch2.11-2.11.0-47.el8fdp.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot OVS with one vhost-user NIC on both the source and destination hosts. Refer to [1]

2. Boot QEMU with the vhost-user NIC. Refer to [2]

3. Check the vhost-user NIC driver in the guest. Keep the default virtio-pci binding (see the snippet after these steps).

4. Migrate the guest from the source to the destination host. Both the source QEMU and the guest hang.

(qemu) migrate -d tcp:10.73.72.196:5555
(qemu) 
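
For step 3, a minimal sketch of how to confirm the NIC is still on virtio-pci (the PCI address 0000:03:00.0 is an assumption and will differ per guest):

# In the guest: find the virtio-net device, then check which driver it is bound to.
lspci -nn | grep -i virtio
readlink /sys/bus/pci/devices/0000:03:00.0/driver    # expect a path ending in .../virtio-pci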

Actual results:
Both QEMU and the guest hang during migration.

Expected results:
Both QEMU and the guest should keep working, and the migration should complete successfully.

Additional info:
1. This is a regression: openvswitch2.11-2.11.0-35.el8fdp.x86_64 works well.

2. If the vhost-user NIC is rebound from virtio-pci to vfio-pci in the guest, the issue is gone (see the rebinding sketch after this list).

3. openvswitch2.13-2.13.0-0.20200121git2a4f006.el8fdp.x86_64 works well.
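
A sketch of the virtio-pci to vfio-pci rebinding mentioned in item 2, using DPDK's dpdk-devbind.py (the PCI address is an assumption; depending on the guest IOMMU setup, vfio-pci may need to be loaded in no-IOMMU mode):

# In the guest: load vfio-pci and rebind the NIC to it.
modprobe vfio-pci
dpdk-devbind.py --bind=vfio-pci 0000:03:00.0
dpdk-devbind.py --status | grep vfio-pci    # confirm the new binding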

Reference:
[1]
#!/bin/bash

set -e

echo "killing old ovs process"
pkill -f ovs-vswitchd || true
sleep 5
pkill -f ovsdb-server || true

echo "probing ovs kernel module"
modprobe -r openvswitch || true
modprobe openvswitch

echo "clean env"
DB_FILE=/etc/openvswitch/conf.db
rm -rf /var/run/openvswitch
mkdir /var/run/openvswitch
rm -f $DB_FILE

echo "init ovs db and boot db server"
export DB_SOCK=/var/run/openvswitch/db.sock
ovsdb-tool create /etc/openvswitch/conf.db /usr/share/openvswitch/vswitch.ovsschema
ovsdb-server --remote=punix:$DB_SOCK --remote=db:Open_vSwitch,Open_vSwitch,manager_options --pidfile --detach --log-file
ovs-vsctl --no-wait init

echo "start ovs vswitch daemon"
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,1024"
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask="0x1"
ovs-vsctl --no-wait set Open_vSwitch . other_config:vhost-iommu-support=true
ovs-vswitchd unix:$DB_SOCK --pidfile --detach --log-file=/var/log/openvswitch/ovs-vswitchd.log

echo "creating bridge and ports"

ovs-vsctl --if-exists del-br ovsbr0
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:5e:00.0 
ovs-vsctl add-port ovsbr0 vhost-user0 -- set Interface vhost-user0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser0.sock
ovs-ofctl del-flows ovsbr0
ovs-ofctl add-flow ovsbr0 "in_port=1,idle_timeout=0 actions=output:2"
ovs-ofctl add-flow ovsbr0 "in_port=2,idle_timeout=0 actions=output:1"

ovs-vsctl set Open_vSwitch . other_config={}
ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x1
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x14
ovs-vsctl set Interface dpdk0 options:n_rxq=1


echo "all done"

[2]
/usr/libexec/qemu-kvm \
-name guest=rhel8.2 \
-machine pc-q35-rhel8.2.0,kernel_irqchip=split \
-cpu host \
-m 8192 \
-overcommit mem-lock=on \
-smp 6,sockets=6,cores=1,threads=1 \
-object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/1-rhel8.2,share=yes,size=8589934592,host-nodes=0,policy=bind \
-numa node,nodeid=0,cpus=0-5,memdev=ram-node0 \
-device intel-iommu,intremap=on,caching-mode=on,device-iotlb=on \
-device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 \
-device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 \
-device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 \
-blockdev driver=file,cache.direct=off,cache.no-flush=on,filename=/mnt/nfv/rhel8.2.qcow2,node-name=my_file \
-blockdev driver=qcow2,node-name=my,file=my_file \
-device virtio-blk-pci,scsi=off,iommu_platform=on,ats=on,bus=pci.2,addr=0x0,drive=my,id=virtio-disk0,bootindex=1,write-cache=on \
-chardev socket,id=charnet1,path=/tmp/vhostuser0.sock,server \
-netdev vhost-user,chardev=charnet1,id=hostnet1 \
-device virtio-net-pci,rx_queue_size=1024,netdev=hostnet1,id=net1,mac=18:66:da:5f:dd:02,bus=pci.3,addr=0x0,iommu_platform=on,ats=on \
-monitor stdio \
-vnc 0:1 \

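For the migration in step 4, the destination host runs the same QEMU command plus an -incoming option listening on the port used by the migrate command (a sketch; port 5555 matches the monitor command in step 4):

# Appended to the destination-side command line:
-incoming tcp:0:5555 \
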
Comment 1 Pei Zhang 2020-02-06 13:03:54 UTC
This was tested with FDP 20.B. As there is no "FDP 20.B" version in Bugzilla yet, I chose 20.A. To be clear: the actual FDP 20.A version works well.

Comment 2 Pei Zhang 2020-02-06 13:07:44 UTC
Though QEMU hangs, I don't think it's a QEMU issue, as combinations (1) and (2) below work well:

(1) qemu-kvm-4.2.0-8.module+el8.2.0+5607+dc756904.x86_64 & openvswitch2.13-2.13.0-0.20200121git2a4f006.el8fdp.x86_64: works well

(2) qemu-kvm-4.2.0-8.module+el8.2.0+5607+dc756904.x86_64 & openvswitch2.11-2.11.0-35.el8fdp.x86_64: works well

(3) qemu-kvm-4.2.0-8.module+el8.2.0+5607+dc756904.x86_64 & openvswitch2.11-2.11.0-47.el8fdp.x86_64 (bug version): fails

Comment 3 Maxime Coquelin 2020-02-11 09:57:18 UTC
Thanks to Pei, I managed to reproduce on her testbed.

It seems there is a deadlock in the VHOST_USER_SET_VRING_ADDR handling:

(gdb) info threads
  Id   Target Id                                          Frame
* 1    Thread 0x7fca28a1fbc0 (LWP 9156) "ovs-vswitchd"    0x00007fca26acff21 in poll () from /lib64/libc.so.6
  2    Thread 0x7fca24f03700 (LWP 9157) "eal-intr-thread" 0x00007fca26adb1b7 in epoll_wait () from /lib64/libc.so.6
  3    Thread 0x7fca24702700 (LWP 9158) "rte_mp_handle"   0x00007fca27672a67 in recvmsg () from /lib64/libpthread.so.0
  4    Thread 0x7fca23f01700 (LWP 9159) "dpdk_watchdog1"  0x00007fca26aa7238 in nanosleep () from /lib64/libc.so.6
  5    Thread 0x7fca23700700 (LWP 9161) "urcu2"           0x00007fca26acff21 in poll () from /lib64/libc.so.6
  6    Thread 0x7fca22eff700 (LWP 9165) "ct_clean8"       0x00007fca26acff21 in poll () from /lib64/libc.so.6
  7    Thread 0x7fca226fe700 (LWP 9166) "ipf_clean5"      0x00007fca26acff21 in poll () from /lib64/libc.so.6
  8    Thread 0x7fca01e6b700 (LWP 9175) "vhost_reconn"    0x00007fca26aa7238 in nanosleep () from /lib64/libc.so.6
  9    Thread 0x7fca0166a700 (LWP 9176) "vhost-events"    rte_rwlock_read_lock (rwl=<optimized out>) at /usr/src/debug/openvswitch2.12-2.12.0-21.el8fdp.x86_64/dpdk-stable-18.11.5/x86_64-native-linuxapp-gcc/include/generic/rte_rwlock.h:71
  10   Thread 0x7fca21634700 (LWP 9182) "handler12"       0x00007fca26acff21 in poll () from /lib64/libc.so.6
  11   Thread 0x7fca20e33700 (LWP 9183) "revalidator11"   0x00007fca26acff21 in poll () from /lib64/libc.so.6
  12   Thread 0x7fca0266c700 (LWP 9192) "pmd13"           rte_rdtsc () at /usr/src/debug/openvswitch2.12-2.12.0-21.el8fdp.x86_64/build-static/../dpdk-stable-18.11.5/x86_64-native-linuxapp-gcc/include/rte_cycles.h:49
  13   Thread 0x7fca03736700 (LWP 9193) "pmd14"           0x00007ffc78f6b9c9 in ?? ()
  14   Thread 0x7fca03fff700 (LWP 9194) "pmd15"           0x000055cd83c3fc3e in histogram_add_sample (val=0, hist=0x55cd874dcf50) at ../lib/dpif-netdev-perf.h:326
  15   Thread 0x7fca00e69700 (LWP 9195) "pmd16"           rte_rdtsc () at /usr/src/debug/openvswitch2.12-2.12.0-21.el8fdp.x86_64/build-static/../dpdk-stable-18.11.5/x86_64-native-linuxapp-gcc/include/rte_cycles.h:49
(gdb) t 9
[Switching to thread 9 (Thread 0x7fca0166a700 (LWP 9176))]
#0  rte_rwlock_read_lock (rwl=<optimized out>) at /usr/src/debug/openvswitch2.12-2.12.0-21.el8fdp.x86_64/dpdk-stable-18.11.5/x86_64-native-linuxapp-gcc/include/generic/rte_rwlock.h:71
71	/usr/src/debug/openvswitch2.12-2.12.0-21.el8fdp.x86_64/dpdk-stable-18.11.5/x86_64-native-linuxapp-gcc/include/generic/rte_rwlock.h: No such file or directory.
(gdb) bt
#0  rte_rwlock_read_lock (rwl=<optimized out>) at /usr/src/debug/openvswitch2.12-2.12.0-21.el8fdp.x86_64/dpdk-stable-18.11.5/x86_64-native-linuxapp-gcc/include/generic/rte_rwlock.h:71
#1  vhost_user_iotlb_rd_lock (vq=<optimized out>) at /usr/src/debug/openvswitch2.12-2.12.0-21.el8fdp.x86_64/dpdk-stable-18.11.5/lib/librte_vhost/iotlb.h:42
#2  __vhost_iova_to_vva (dev=0x15024de40, vq=vq@entry=0x15024db00, iova=10489253888, iova@entry=10489251904, size=size@entry=0x7fca01669630, perm=perm@entry=3 '\003') at /usr/src/debug/openvswitch2.12-2.12.0-21.el8fdp.x86_64/dpdk-stable-18.11.5/lib/librte_vhost/vhost.c:66
#3  0x000055cd83ba3033 in vhost_iova_to_vva (perm=3 '\003', len=0x7fca01669630, iova=10489251904, vq=0x15024db00, dev=0x15024de40) at /usr/src/debug/openvswitch2.12-2.12.0-21.el8fdp.x86_64/dpdk-stable-18.11.5/lib/librte_vhost/vhost.h:557
#4  translate_log_addr (log_addr=10489251904, vq=0x15024db00, dev=0x15024de40) at /usr/src/debug/openvswitch2.12-2.12.0-21.el8fdp.x86_64/dpdk-stable-18.11.5/lib/librte_vhost/vhost_user.c:643
#5  translate_ring_addresses (dev=0x15024de40, vq_index=0) at /usr/src/debug/openvswitch2.12-2.12.0-21.el8fdp.x86_64/dpdk-stable-18.11.5/lib/librte_vhost/vhost_user.c:670
#6  0x000055cd83ba340e in vhost_user_set_vring_addr (pdev=pdev@entry=0x7fca016696c8, msg=msg@entry=0x7fca016696d0, main_fd=main_fd@entry=63) at /usr/src/debug/openvswitch2.12-2.12.0-21.el8fdp.x86_64/dpdk-stable-18.11.5/lib/librte_vhost/vhost_user.c:827
#7  0x000055cd839bba00 in vhost_user_msg_handler (vid=<optimized out>, fd=fd@entry=63) at /usr/src/debug/openvswitch2.12-2.12.0-21.el8fdp.x86_64/dpdk-stable-18.11.5/lib/librte_vhost/vhost_user.c:2189
#8  0x000055cd83b9ed73 in vhost_user_read_cb (connfd=63, dat=0x7fc9f8001200, remove=0x7fca01669a30) at /usr/src/debug/openvswitch2.12-2.12.0-21.el8fdp.x86_64/dpdk-stable-18.11.5/lib/librte_vhost/socket.c:298
#9  0x000055cd83b9dab7 in fdset_event_dispatch (arg=0x55cd8413d5a0 <vhost_user+8192>) at /usr/src/debug/openvswitch2.12-2.12.0-21.el8fdp.x86_64/dpdk-stable-18.11.5/lib/librte_vhost/fd_man.c:286
#10 0x00007fca276682de in start_thread () from /lib64/libpthread.so.0
#11 0x00007fca26adae83 in clone () from /lib64/libc.so.6

However, looking at their backtraces, no other thread seems to be holding the lock. It is likely that a patch introduced a regression by not releasing the lock on some path.

The next step is to check the vhost patches introduced between this version and the known-working one.
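
A sketch of how the lock state can be inspected in the hung ovs-vswitchd (the field names iotlb_lock and cnt are assumptions based on the DPDK 18.11 vhost sources; the thread and frame numbers match the session above):

gdb -p $(pidof ovs-vswitchd)
(gdb) thread 9               # the vhost-events thread stuck in rte_rwlock_read_lock
(gdb) frame 2                # __vhost_iova_to_vva
(gdb) print vq->iotlb_lock   # rte_rwlock_t: cnt == -1 means a writer holds it, cnt > 0 counts readers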

Comment 4 Maxime Coquelin 2020-02-24 15:30:32 UTC
Fix posted upstream and merged in master:

commit 4f37df14c405b754b5e971c75f4f67f4bb5bfdde
Author: Adrian Moreno <amorenoz>
Date:   Thu Feb 13 11:04:58 2020 +0100

    vhost: protect log address translation in IOTLB update

    Currently, the log address translation only  happens in the vhost-user's
    translate_ring_addresses(). However, the IOTLB update handler is not
    checking if it was mapped to re-trigger that translation.

    Since the log address mapping could fail, check it on iotlb updates.
    Also, check it on vring_translate() so we do not dirty pages if the
    logging address is not yet ready.

    Additionally, properly protect the accesses to the iotlb structures.

    Fixes: fbda9f145927 ("vhost: translate incoming log address to GPA")
    Cc: stable

    Signed-off-by: Adrian Moreno <amorenoz>
    Reviewed-by: Maxime Coquelin <maxime.coquelin>

Comment 5 Maxime Coquelin 2020-02-25 12:33:38 UTC
Backported two patches:
 - vhost: fix vring memory partially mapped
 - vhost: protect log address translation in IOTLB update
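
A quick heuristic to check whether the backports landed in an installed build (standard rpm usage; relies on the patches being mentioned in the package changelog):

rpm -q --changelog openvswitch2.11 | grep -i -e iotlb -e 'partially mapped'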

Comment 8 Pei Zhang 2020-02-29 04:13:33 UTC
Verified with openvswitch2.11-2.11.0-48.el7fdp.x86_64:

All migration test cases PASS, and all OVS-related cases from Virt PASS.

==Results==

Testcase: live_migration_nonrt_server_2Q_1G_ovs
=======================Stream Rate: 1Mpps=========================
No Stream_Rate Downtime Totaltime Ping_Loss moongen_Loss
0 1Mpps 126 17495 0 401551
1 1Mpps 202 17019 0 1142800
2 1Mpps 225 17025 0 586946
3 1Mpps 131 16277 0 411604
Max 1Mpps 225 17495 0 1142800
Min 1Mpps 126 16277 0 401551
Mean 1Mpps 171 16954 0 635725
Median 1Mpps 166 17022 0 499275
Stdev 0 50.0 503.41 0.0 348602.99

Testcase: live_migration_nonrt_server_1Q_2M_ovs
=======================Stream Rate: 1Mpps=========================
No Stream_Rate Downtime Totaltime Ping_Loss moongen_Loss
0 1Mpps 149 13887 0 491829
1 1Mpps 133 13443 0 493933
2 1Mpps 154 13461 0 529340
3 1Mpps 204 13098 0 590595
Max 1Mpps 204 13887 0 590595
Min 1Mpps 133 13098 0 491829
Mean 1Mpps 160 13472 0 526424
Median 1Mpps 151 13452 0 511636
Stdev 0 30.66 323.04 0.0 46111.82

Testcase: live_migration_nonrt_server_1Q_1G_ovs
=======================Stream Rate: 1Mpps=========================
No Stream_Rate Downtime Totaltime Ping_Loss moongen_Loss
0 1Mpps 90 16178 0 326702
1 1Mpps 78 16050 0 302509
2 1Mpps 85 15957 0 312703
3 1Mpps 76 16094 0 300406
Max 1Mpps 90 16178 0 326702
Min 1Mpps 76 15957 0 300406
Mean 1Mpps 82 16069 0 310580
Median 1Mpps 81 16072 0 307606
Stdev 0 6.4 92.03 0.0 12014.95

Testcase: nfv_acceptance_nonrt_server_2Q_1G
Packets_loss Frame_Size Run_No Throughput Avg_Throughput
0 64 0 21.307379 21.307379

Testcase: vhostuser_reconnect_nonrt_ovs
Packets_loss Frame_Size Run_No Throughput Avg_Throughput
0 64 0 21.614404 21.614404
0 64 0 21.739714 21.739714
0 64 0 21.614388 21.614388

Testcase: vhostuser_hotplug_nonrt_server
Packets_loss Frame_Size Run_No Throughput Avg_Throughput
0 64 0 21.614407 21.614407

Testcase: vhostuser_reconnect_nonrt_qemu
Packets_loss Frame_Size Run_No Throughput Avg_Throughput
0 64 0 21.307390 21.30739
0 64 0 21.307424 21.307424
0 64 0 21.307412 21.307412

 

Versions:

3.10.0-1127.el7.x86_64
dpdk-18.11.2-1.el7.x86_64
qemu-kvm-rhev-2.12.0-44.el7.x86_64
openvswitch-selinux-extra-policy-1.0-15.el7fdp.noarch
tuned-2.11.0-8.el7.noarch
openvswitch2.11-2.11.0-48.el7fdp.x86_64
libvirt-4.5.0-33.el7.x86_64

So this bug has been fixed. Moving to 'VERIFIED'.

Comment 11 errata-xmlrpc 2020-03-10 09:35:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0743

