Bug 1541881 - When doing live migration over dpdk/vhost-user, dpdk errors can cause qemu hang at recvmsg ()
Summary: When doing live migration over dpdk/vhost-user, dpdk errors can cause qemu ha...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: openvswitch
Version: 7.5
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Matteo Croce
QA Contact: Pei Zhang
URL:
Whiteboard:
Duplicates: 1533408 (view as bug list)
Depends On:
Blocks: 1538953 1560628
 
Reported: 2018-02-05 06:30 UTC by Pei Zhang
Modified: 2018-05-03 15:02 UTC (History)
CC List: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-03 14:23:57 UTC
Target Upstream Version:
Embargoed:



Description Pei Zhang 2018-02-05 06:30:33 UTC
Description of problem:
This is a PVP live migration scenario. Boot DPDK's testpmd on the source and destination hosts, then start the guest and run testpmd inside the guest, while MoonGen generates traffic from a third host.

When migrating the guest from the source to the destination host, DPDK sometimes reports errors and qemu hangs.
== The DPDK errors shown below are related to bug:
Bug 1538953 - IOTLB entry size mismatch before/after migration during DPDK PVP testing

Error messages:
..
VHOST_CONFIG: IOTLB pool empty, clear pending misses
VHOST_CONFIG: IOTLB pool still empty, failure
VHOST_CONFIG: IOTLB pool empty, clear pending misses
VHOST_CONFIG: IOTLB pool still empty, failure
VHOST_CONFIG: IOTLB pool empty, clear pending misses
VHOST_CONFIG: IOTLB pool still empty, failure
VHOST_CONFIG: IOTLB pool empty, clear pending misses
VHOST_CONFIG: IOTLB pool still empty, failure
VHOST_CONFIG: IOTLB pool empty, clear pending misses
VHOST_CONFIG: IOTLB pool still empty, failure
VHOST_CONFIG: IOTLB pool empty, clear pending misses
VHOST_CONFIG: IOTLB pool still empty, failure
VHOST_CONFIG: IOTLB pool empty, clear pending misses
VHOST_CONFIG: IOTLB pool empty, clear pending misses
VHOST_CONFIG: read message VHOST_USER_GET_VRING_BASE
PMD: Connection closed

== The qemu hang should be a separate issue. qemu appears to hang because its main thread is waiting for DPDK's ack (recvmsg), but DPDK has hit errors itself and cannot ack.
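
For illustration only, here is a minimal, self-contained C sketch (not QEMU's actual code; the message layout and names are hypothetical) of the pattern involved: a synchronous reply wait over a Unix socket with no receive timeout. If the peer never answers, recvmsg() never returns, which matches thread 1 in the backtrace below.

/* sketch_reply_wait.c -- illustration only, not QEMU code.
 * A synchronous reply wait over a Unix socket with no receive timeout:
 * if the peer (here, the vhost-user backend) never answers, recvmsg()
 * blocks the calling thread indefinitely. */
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical fixed-size reply, standing in for a vhost-user message. */
struct reply {
    unsigned int request;
    unsigned int size;
    unsigned char payload[64];
};

/* Wait synchronously for the peer's reply on fd.  With no SO_RCVTIMEO set,
 * this only returns once the peer writes something or closes the socket. */
static int wait_for_reply(int fd, struct reply *r)
{
    struct iovec iov = { .iov_base = r, .iov_len = sizeof(*r) };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
    ssize_t n;

    do {
        n = recvmsg(fd, &msg, 0);   /* thread 1 in the backtrace sits here */
    } while (n < 0 && errno == EINTR);

    if (n <= 0) {
        perror("recvmsg");
        return -1;
    }
    return 0;
}

int main(void)
{
    int sv[2];
    struct reply r;

    /* The peer end of this socketpair never writes, so the call below
     * blocks forever, analogous to the hung destination qemu. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        perror("socketpair");
        return 1;
    }
    return wait_for_reply(sv[0], &r);
}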


Version-Release number of selected component (if applicable):
3.10.0-841.el7.x86_64
qemu-kvm-rhev-2.10.0-19.el7.x86_64
dpdk-17.11-7.el7.x86_64

How reproducible:
2/4

Steps to Reproduce:
1. Boot testpmd on the source and destination hosts, refer to [1]

2. Boot the guest on the source host, refer to [2]

3. Boot the guest on the destination host, using the same command line as [2] but adding "-incoming tcp:0:5555"

4. Start testpmd in the guest, refer to [3]

5. Start MoonGen on another host. The guest's testpmd can receive packets.

6. Migrate the guest from source to destination:
# migrate -d tcp:10.73.72.154:5555

7. Sometimes qemu on the destination host and the guest both hang. Below is the gdb backtrace of the destination qemu:

(gdb) thread apply all bt

Thread 7 (Thread 0x7fc1ecee8700 (LWP 45744)):
#0  0x00007fc1f4044e09 in syscall () at /lib64/libc.so.6
#1  0x0000558c4d3f4670 in qemu_event_wait ()
#2  0x0000558c4d40456e in call_rcu_thread ()
#3  0x00007fc1f4320dd5 in start_thread () at /lib64/libpthread.so.0
#4  0x00007fc1f404aaed in clone () at /lib64/libc.so.6

Thread 6 (Thread 0x7fc1ebee6700 (LWP 45812)):
#0  0x00007fc1f43274cd in __lll_lock_wait () at /lib64/libpthread.so.0
#1  0x00007fc1f4322dcb in _L_lock_812 () at /lib64/libpthread.so.0
#2  0x00007fc1f4322c98 in pthread_mutex_lock () at /lib64/libpthread.so.0
#3  0x0000558c4d3f3faf in qemu_mutex_lock ()
#4  0x0000558c4d10b9dc in qemu_mutex_lock_iothread ()
#5  0x0000558c4d0d525d in prepare_mmio_access.isra.23 ()
#6  0x0000558c4d0d61e0 in flatview_write ()
#7  0x0000558c4d0d9955 in address_space_rw ()
#8  0x0000558c4d12c978 in kvm_cpu_exec ()
#9  0x0000558c4d10bbe2 in qemu_kvm_cpu_thread_fn ()
#10 0x00007fc1f4320dd5 in start_thread () at /lib64/libpthread.so.0
#11 0x00007fc1f404aaed in clone () at /lib64/libc.so.6

Thread 5 (Thread 0x7fc1eb6e5700 (LWP 45813)):
#0  0x00007fc1f4041517 in ioctl () at /lib64/libc.so.6
#1  0x0000558c4d12c7f5 in kvm_vcpu_ioctl ()
#2  0x0000558c4d12c8c3 in kvm_cpu_exec ()
#3  0x0000558c4d10bbe2 in qemu_kvm_cpu_thread_fn ()
#4  0x00007fc1f4320dd5 in start_thread () at /lib64/libpthread.so.0
#5  0x00007fc1f404aaed in clone () at /lib64/libc.so.6

Thread 4 (Thread 0x7fc1eaee4700 (LWP 45814)):
#0  0x00007fc1f4041517 in ioctl () at /lib64/libc.so.6
#1  0x0000558c4d12c7f5 in kvm_vcpu_ioctl ()
#2  0x0000558c4d12c8c3 in kvm_cpu_exec ()
#3  0x0000558c4d10bbe2 in qemu_kvm_cpu_thread_fn ()
#4  0x00007fc1f4320dd5 in start_thread () at /lib64/libpthread.so.0
#5  0x00007fc1f404aaed in clone () at /lib64/libc.so.6

Thread 3 (Thread 0x7fc1ea2df700 (LWP 45815)):
#0  0x00007fc1f4041517 in ioctl () at /lib64/libc.so.6
#1  0x0000558c4d12c7f5 in kvm_vcpu_ioctl ()
#2  0x0000558c4d12c8c3 in kvm_cpu_exec ()
#3  0x0000558c4d10bbe2 in qemu_kvm_cpu_thread_fn ()
#4  0x00007fc1f4320dd5 in start_thread () at /lib64/libpthread.so.0
#5  0x00007fc1f404aaed in clone () at /lib64/libc.so.6

Thread 2 (Thread 0x7fc1e7dff700 (LWP 45816)):
#0  0x00007fc1f4324945 in pthread_cond_wait@@GLIBC_2.3.2 ()
    at /lib64/libpthread.so.0
#1  0x0000558c4d3f4240 in qemu_cond_wait ()
#2  0x0000558c4d32593b in vnc_worker_thread_loop ()
#3  0x0000558c4d325e78 in vnc_worker_thread ()
#4  0x00007fc1f4320dd5 in start_thread () at /lib64/libpthread.so.0
#5  0x00007fc1f404aaed in clone () at /lib64/libc.so.6

Thread 1 (Thread 0x7fc1fe6bbd00 (LWP 45743)):
#0  0x00007fc1f4327bfd in recvmsg () at /lib64/libpthread.so.0
#1  0x0000558c4d3ad7cc in qio_channel_socket_readv ()
#2  0x0000558c4d39ef12 in tcp_chr_recv ()
#3  0x0000558c4d3a0a31 in tcp_chr_sync_read ()
#4  0x0000558c4d39b171 in qemu_chr_fe_read_all ()
#5  0x0000558c4d167577 in vhost_user_read.isra.1 ()
#6  0x0000558c4d1679ad in process_message_reply.part.2 ()
#7  0x0000558c4d167acb in vhost_user_send_device_iotlb_msg ()
#8  0x0000558c4d166b8a in vhost_backend_update_device_iotlb ()
#9  0x0000558c4d164e33 in vhost_device_iotlb_miss ()
#10 0x0000558c4d166ed5 in slave_read ()
#11 0x0000558c4d3f1a78 in aio_dispatch_handlers ()
#12 0x0000558c4d3f2318 in aio_dispatch ()
#13 0x0000558c4d3ef47e in aio_ctx_dispatch ()
#14 0x00007fc1f5bc28f9 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#15 0x0000558c4d3f15ac in main_loop_wait ()
#16 0x0000558c4d0d18da in main ()


Actual results:
qemu and the guest hang.


Expected results:
qemu and the guest should not hang.


Additional info:
1. Peter suggested that QE file this bug with a low priority, as it is caused by DPDK's errors and this situation may not be very common.

Hi Peter, 

Please feel free to add comments if needed. Thanks for your analysis, which helped QE report this bug.


Reference:
[1]
/usr/bin/testpmd \
-l 2,4,6,8,10 \
--socket-mem 1024,1024 \
-n 4 \
--vdev net_vhost0,iface=/tmp/vhost-user1,client=0,iommu-support=1 \
--vdev net_vhost1,iface=/tmp/vhost-user2,client=0,iommu-support=1 \
-- \
--portmask=f \
--disable-hw-vlan \
-i \
--rxq=1 --txq=1 \
--nb-cores=4 \
--forward-mode=io

testpmd> set portlist 0,2,1,3
testpmd> start 

[2]
/usr/libexec/qemu-kvm \
-name rhel7.5_nonrt \
-M q35,kernel-irqchip=split \
-cpu host -m 8G \
-device intel-iommu,intremap=on,caching-mode=on,device-iotlb=on \
-object memory-backend-file,id=mem,size=8G,mem-path=/dev/hugepages,share=on \
-numa node,memdev=mem -mem-prealloc \
-smp 4,sockets=1,cores=4,threads=1 \
-device pcie-root-port,id=root.1,chassis=1 \
-device pcie-root-port,id=root.2,chassis=2 \
-device pcie-root-port,id=root.3,chassis=3 \
-device pcie-root-port,id=root.4,chassis=4 \
-drive file=/mnt/nfv/rhel7.5_nonrt.qcow2,format=qcow2,if=none,id=drive-virtio-blk0,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-blk0,id=virtio-blk0,bus=root.1,iommu_platform=on,ats=on \
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=18:66:da:5f:dd:01,bus=root.2 \
-chardev socket,id=charnet1,path=/tmp/vhost-user1 \
-netdev vhost-user,chardev=charnet1,id=hostnet1 \
-device virtio-net-pci,netdev=hostnet1,id=net1,mac=18:66:da:5f:dd:02,iommu_platform=on,ats=on,bus=root.3 \
-chardev socket,id=charnet2,path=/tmp/vhost-user2 \
-netdev vhost-user,chardev=charnet2,id=hostnet2 \
-device virtio-net-pci,netdev=hostnet2,id=net2,mac=18:66:da:5f:dd:03,iommu_platform=on,ats=on,bus=root.4 \
-vnc :2 \
-monitor stdio \


[3]
# modprobe vfio
# modprobe vfio-pci
# dpdk-devbind --bind=vfio-pci 0000:03:00.0
# dpdk-devbind --bind=vfio-pci 0000:04:00.0

# /usr/bin/testpmd \
-l 1,2,3 \
-n 4 \
-d /usr/lib64/librte_pmd_virtio.so.1 \
-w 0000:03:00.0 -w 0000:04:00.0 \
-- \
--nb-cores=2 \
--disable-hw-vlan \
-i \
--disable-rss \
--rxq=1 --txq=1

Comment 2 Maxime Coquelin 2018-02-14 15:54:36 UTC
Patches have been posted upstream and merged into DPDK master; they will be in the upcoming v18.02 release and are queued for the next v17.11 LTS release:

commit 82b9c1540348b6be7996203065e10421e953cea9
Author: Maxime Coquelin <maxime.coquelin>
Date:   Mon Feb 5 16:04:57 2018 +0100

    vhost: remove pending IOTLB entry if miss request failed
    
    In case vhost_user_iotlb_miss returns an error, the pending IOTLB
    entry has to be removed from the list as no IOTLB update will be
    received.
    
    Fixes: fed67a20ac94 ("vhost: introduce guest IOVA to backend VA helper")
    Cc: stable
    
    Suggested-by: Tiwei Bie <tiwei.bie>
    Signed-off-by: Maxime Coquelin <maxime.coquelin>
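
For illustration, a minimal C sketch of the error path this commit describes; the helper names and types below are hypothetical stand-ins, not the exact DPDK symbols.

/* Illustration only; not the actual DPDK code. */
#include <stdint.h>

struct vq;  /* hypothetical stand-in for the virtqueue owning the IOTLB lists */

/* Hypothetical stand-ins for the pending-list helpers and the IOTLB miss
 * request sent to the frontend (returns non-zero on failure). */
void iotlb_pending_insert(struct vq *vq, uint64_t iova, uint8_t perm);
void iotlb_pending_remove(struct vq *vq, uint64_t iova, uint8_t perm);
int  iotlb_miss_request(struct vq *vq, uint64_t iova, uint8_t perm);

/* On an IOTLB cache miss: record the address as pending, then ask the
 * frontend for a translation.  If sending that request fails, no IOTLB
 * update will ever arrive, so the pending entry must be removed again;
 * before the fix it was left behind on the pending list. */
void handle_iotlb_miss(struct vq *vq, uint64_t iova, uint8_t perm)
{
    iotlb_pending_insert(vq, iova, perm);
    if (iotlb_miss_request(vq, iova, perm) != 0)
        iotlb_pending_remove(vq, iova, perm);
}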

commit 37771844a05c7b0a7b039dcae1b4b0a69b4acced
Author: Maxime Coquelin <maxime.coquelin>
Date:   Mon Feb 5 16:04:56 2018 +0100

    vhost: fix IOTLB pool out-of-memory handling
    
    In the unlikely case the IOTLB memory pool runs out of memory,
    an issue may happen if all entries are used by the IOTLB cache,
    and an IOTLB miss happen. If the iotlb pending list is empty,
    then no memory is freed and allocation fails a second time.
    
    This patch fixes this by doing an IOTLB cache random evict if
    the IOTLB pending list is empty, ensuring the second allocation
    try will succeed.
    
    In the same spirit, the opposite is done when inserting an
    IOTLB entry in the IOTLB cache fails due to out of memory. In
    this case, the IOTLB pending is flushed if the IOTLB cache is
    empty to ensure the new entry can be inserted.
    
    Fixes: d012d1f293f4 ("vhost: add IOTLB helper functions")
    Fixes: f72c2ad63aeb ("vhost: add pending IOTLB miss request list and helpers")
    Cc: stable
    
    Signed-off-by: Maxime Coquelin <maxime.coquelin>
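
Again for illustration, a sketch of the out-of-memory handling this commit describes (only the allocation path is shown); the names below are hypothetical stand-ins, not the exact DPDK symbols.

/* Illustration only; not the actual DPDK code. */
#include <stddef.h>

struct vq;           /* hypothetical stand-in virtqueue */
struct iotlb_entry;  /* hypothetical stand-in entry type */

/* Hypothetical stand-ins for the fixed-size entry pool and the two lists
 * (pending misses and the translation cache) that hold its entries. */
struct iotlb_entry *pool_get(struct vq *vq);
int  pending_list_is_empty(struct vq *vq);
void pending_list_flush(struct vq *vq);
void cache_random_evict(struct vq *vq);

/* Allocate an IOTLB entry.  If the pool is exhausted, reclaim entries from
 * whichever list is actually holding them: flush stale pending misses as
 * before, or (new with this fix) evict a random cache entry when the
 * pending list is empty, so the second allocation attempt can succeed
 * instead of logging "IOTLB pool still empty, failure" repeatedly. */
struct iotlb_entry *iotlb_entry_alloc(struct vq *vq)
{
    struct iotlb_entry *e = pool_get(vq);

    if (e == NULL) {
        if (!pending_list_is_empty(vq))
            pending_list_flush(vq);
        else
            cache_random_evict(vq);
        e = pool_get(vq);  /* retry now has entries available to reuse */
    }
    return e;
}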

Comment 3 Maxime Coquelin 2018-02-15 09:37:26 UTC
Assigning to Matteo for backport.

Thanks,
Maxime

Comment 4 Matteo Croce 2018-02-16 23:08:16 UTC
Got it!

Regards,

Comment 8 Pei Zhang 2018-04-18 05:09:26 UTC
This issue has been verified as fixed.

Versions:
3.10.0-862.el7.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.2.x86_64
libvirt-3.9.0-14.el7.x86_64
dpdk-17.11-9.el7fdb.x86_64
openvswitch-2.9.0-15.el7fdp.x86_64
tuned-2.9.0-1.el7.noarch

(1) PVP live migration: PASS
=========================Stream Rate: 1Mpps=====================
No Stream_Rate Downtime Totaltime Ping_Loss trex_Loss
 0       1Mpps      132     17985        17   13905081.0
 1       1Mpps      125     19738        15   12693592.0
 2       1Mpps      128     18629        14    6022531.0
 3       1Mpps      125     18623        15   12730592.0
 4       1Mpps      129     19396        15   14879368.0
 5       1Mpps      126     18717        16    6628282.0
 6       1Mpps      120     19164        14   12151658.0
 7       1Mpps      132     18893        15   12075355.0
 8       1Mpps      124     19046        14   12115482.0
 9       1Mpps      130     19258        16   12263311.0
<------------------------Summary------------------------>
   Max   1Mpps      132     19738        17     14879368
   Min   1Mpps      120     17985        14      6022531
  Mean   1Mpps      127     18944        15     11546525
Median   1Mpps      127     18969        15     12207484
 Stdev       0     3.81    490.83      0.99   2897803.22



(2) Live migration over Open vSwitch: PASS
=======================Stream Rate: 1Mpps=========================
No Stream_Rate Downtime Totaltime Ping_Loss trex_Loss
 0       1Mpps      132     16140        15     789681.0
 1       1Mpps      121     14862        13    5363237.0
 2       1Mpps      121     17878       114   10558667.0
 3       1Mpps      119     18357       114   11016673.0
 4       1Mpps      121     15874        14    8146823.0
 5       1Mpps      119     17038       112    6472610.0
 6       1Mpps      114     14882        13    6177523.0
 7       1Mpps      120     17951       114    7082521.0
 8       1Mpps      123     17391       114    3009150.0
 9       1Mpps      117     16279       112    2434152.0
<------------------------Summary------------------------>
   Max   1Mpps      132     18357       114     11016673
   Min   1Mpps      114     14862        13       789681
  Mean   1Mpps      120     16665        73      6105103
Median   1Mpps      120     16658       112      6325066
 Stdev       0     4.69   1253.19     51.43   3351399.49

(Beaker job: https://beaker.engineering.redhat.com/recipes/5045063#tasks)

Comment 9 Timothy Redaelli 2018-05-03 14:23:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1267

Comment 10 Timothy Redaelli 2018-05-03 15:02:21 UTC
*** Bug 1533408 has been marked as a duplicate of this bug. ***

