Bug 1852906

Summary: packed=on: booting guest with vhost-user and vIOMMU, rebooting the guest causes a qemu crash when there is packet flow
Product: Red Hat Enterprise Linux 8
Reporter: Pei Zhang <pezhang>
Component: qemu-kvm
qemu-kvm sub component: Networking
Assignee: Eugenio Pérez Martín <eperezma>
QA Contact: Pei Zhang <pezhang>
Docs Contact:
Status: CLOSED WONTFIX
Severity: high
Priority: high
CC: aadam, chayang, eperezma, jinzhao, juzhang, jwboyer, virt-maint
Version: 8.3
Keywords: Triaged
Target Milestone: rc
Target Release: 8.0
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1844468
Environment:
Last Closed: 2021-02-01 02:26:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1844468
Bug Blocks: 1897024

Description Pei Zhang 2020-07-01 14:56:15 UTC
+++ This bug was initially created as a clone of Bug #1844468 +++

Description of problem:
Boot a guest with vhost-user, vIOMMU and packed=on, then send MoonGen packets from another server; the guest receives these packets. Rebooting the guest causes qemu to crash.

Version-Release number of selected component (if applicable):
4.18.0-221.el8.x86_64
qemu-kvm-4.2.0-29.module+el8.3.0+7212+401047e6.x86_64
libvirt-6.3.0-1.module+el8.3.0+6478+69f490bb.x86_64
openvswitch2.11-2.11.0-56.20200327gita4efc59.el8fdp.x86_64

How reproducible:
100%

Steps to Reproduce:

1. Start OVS with dpdkvhostuserclient ports. (A sketch of the setup commands follows the ovs-vsctl output below.)
# ovs-vsctl show
39b1fab4-13cf-4f16-ad64-b2292c79adcb
    Bridge ovsbr1
        datapath_type: netdev
        Port dpdk1
            Interface dpdk1
                type: dpdk
                options: {dpdk-devargs="0000:5e:00.1", n_rxq="1"}
        Port ovsbr1
            Interface ovsbr1
                type: internal
        Port vhost-user1
            Interface vhost-user1
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser1.sock"}
    Bridge ovsbr0
        datapath_type: netdev
        Port ovsbr0
            Interface ovsbr0
                type: internal
        Port dpdk0
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:5e:00.0", n_rxq="1"}
        Port vhost-user0
            Interface vhost-user0
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhostuser0.sock"}
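
For reference, a minimal sketch of the ovs-vsctl commands that could produce the ovsbr0 half of the topology above (the PCI address, n_rxq value and socket path are taken from the output; DPDK initialization such as other_config:dpdk-init=true is assumed to be done separately, and ovsbr1/dpdk1/vhost-user1 would be created the same way):

# ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
# ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:5e:00.0 options:n_rxq=1
# ovs-vsctl add-port ovsbr0 vhost-user0 -- set Interface vhost-user0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhostuser0.sock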

2. Boot the guest with vhost-user, packed=on enabled, and vIOMMU. The full XML is attached. (A note on verifying the packed ring layout inside the guest follows the qemu command line below.)
  <devices>
    <interface type='bridge'>
      <mac address='88:66:da:5f:dd:01'/>
      <source bridge='switch'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='18:66:da:5f:dd:22'/>
      <source type='unix' path='/tmp/vhostuser0.sock' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost' rx_queue_size='1024' iommu='on' ats='on'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='18:66:da:5f:dd:23'/>
      <source type='unix' path='/tmp/vhostuser1.sock' mode='server'/>
      <model type='virtio'/>
      <driver name='vhost' rx_queue_size='1024' iommu='on' ats='on'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </interface>
    <iommu model='intel'>
      <driver intremap='on' caching_mode='on' iotlb='on'/>
    </iommu>
  </devices>
  <qemu:commandline>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.net1.packed=on'/>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.net2.packed=on'/>
  </qemu:commandline>


qemu cmd line:

/usr/libexec/qemu-kvm \
-name guest=rhel8.3 \
-machine q35,kernel_irqchip=split \
-cpu host \
-m 8192 \
-smp 6,sockets=6,cores=1,threads=1 \
-object memory-backend-file,id=mem,size=8G,mem-path=/dev/hugepages,share=on \
-numa node,memdev=mem -mem-prealloc \
-device intel-iommu,intremap=on,caching-mode=on,device-iotlb=on \
-device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 \
-device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 \
-device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 \
-device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 \
-device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 \
-device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 \
-device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 \
-blockdev driver=file,cache.direct=off,cache.no-flush=on,filename=/mnt/nfv/rhel8.3.qcow2,node-name=my_file \
-blockdev driver=qcow2,node-name=my,file=my_file \
-device virtio-blk-pci,scsi=off,iommu_platform=on,ats=on,bus=pci.2,addr=0x0,drive=my,id=virtio-disk0,bootindex=1,write-cache=on \
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=28:66:da:5f:dd:01,bus=pci.1,addr=0x0 \
-chardev socket,id=charnet1,path=/tmp/vhostuser0.sock,server \
-netdev vhost-user,chardev=charnet1,id=hostnet1 \
-device virtio-net-pci,rx_queue_size=1024,netdev=hostnet1,id=net1,mac=18:66:da:5f:dd:22,bus=pci.6,addr=0x0,iommu_platform=on,ats=on,packed=on \
-chardev socket,id=charnet2,path=/tmp/vhostuser1.sock,server \
-netdev vhost-user,chardev=charnet2,id=hostnet2 \
-device virtio-net-pci,rx_queue_size=1024,netdev=hostnet2,id=net2,mac=18:66:da:5f:dd:23,bus=pci.7,addr=0x0,iommu_platform=on,ats=on,packed=on \
-monitor stdio \
-vnc :2
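
As a sanity check (not part of the original report) that the guest really negotiated the packed ring layout for net1/net2: the guest kernel exposes the negotiated virtio feature bits in sysfs as a bit string, and VIRTIO_F_RING_PACKED is feature bit 34, i.e. the 35th character. Assuming a RHEL 8 guest:

# cut -c35 /sys/bus/virtio/devices/virtio*/features

A value of 1 for the vhost-user devices would confirm the packed layout is in use.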


3. On another host, start MoonGen to generate a packet flow towards the guest:

# ./build/MoonGen examples/l2-load-latency.lua 0 1 640

4. In the guest, we can observe the RX packet counters of the vhost-user NICs increasing (an alternative counter check is shown after the two snapshots below):

== results at timestamp 1:

# ifconfig
...
enp6s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::2a56:a65d:9906:c56f  prefixlen 64  scopeid 0x20<link>
        ether 28:66:da:5f:dd:22  txqueuelen 1000  (Ethernet)
        RX packets 1763373  bytes 105803028 (100.9 MiB)
        RX errors 0  dropped 1763373  overruns 0  frame 3
        TX packets 26  bytes 3788 (3.6 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp7s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::876a:9327:63bb:812f  prefixlen 64  scopeid 0x20<link>
        ether 28:66:da:5f:dd:23  txqueuelen 1000  (Ethernet)
        RX packets 1721804  bytes 103311192 (98.5 MiB)
        RX errors 0  dropped 1721804  overruns 0  frame 2
        TX packets 26  bytes 3788 (3.6 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

== results at timestamp 2:

# ifconfig
...
enp6s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::2a56:a65d:9906:c56f  prefixlen 64  scopeid 0x20<link>
        ether 28:66:da:5f:dd:22  txqueuelen 1000  (Ethernet)
        RX packets 6071299  bytes 364278588 (347.4 MiB)
        RX errors 0  dropped 6071299  overruns 0  frame 3
        TX packets 28  bytes 4168 (4.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp7s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::876a:9327:63bb:812f  prefixlen 64  scopeid 0x20<link>
        ether 28:66:da:5f:dd:23  txqueuelen 1000  (Ethernet)
        RX packets 5929404  bytes 355767192 (339.2 MiB)
        RX errors 0  dropped 5929404  overruns 0  frame 2
        TX packets 28  bytes 4168 (4.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
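
As an alternative to diffing two ifconfig snapshots (not part of the original steps), the same counters can be watched continuously:

# watch -n 1 "ip -s link show enp6s0; ip -s link show enp7s0"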


5. Reboot the guest; qemu crashes:

(qemu) qemu.sh: line 30:  3792 Segmentation fault      (core dumped) /usr/libexec/qemu-kvm -name guest=rhel8.3 -machine q35,kernel_irqchip=split -cpu host -m 8192 -smp 6,sockets=6,cores=1,threads=1 -object memory-backend-file,id=mem,size=8G,mem-path=/dev/hugepages,share=on -numa node,memdev=mem -mem-prealloc -device intel-iommu,intremap=on,caching-mode=on,device-iotlb=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -blockdev driver=file,cache.direct=off,cache.no-flush=on,filename=/mnt/nfv/rhel8.3.qcow2,node-name=my_file -blockdev driver=qcow2,node-name=my,file=my_file -device virtio-blk-pci,scsi=off,iommu_platform=on,ats=on,bus=pci.2,addr=0x0,drive=my,id=virtio-disk0,bootindex=1,write-cache=on -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=28:66:da:5f:dd:01,bus=pci.1,addr=0x0 -chardev socket,id=charnet1,path=/tmp/vhostuser0.sock,server -netdev vhost-user,chardev=charnet1,id=hostnet1 -device virtio-net-pci,rx_queue_size=1024,netdev=hostnet1,id=net1,mac=18:66:da:5f:dd:22,bus=pci.6,addr=0x0,iommu_platform=on,ats=on,packed=on -chardev socket,id=charnet2,path=/tmp/vhostuser1.sock,server -netdev vhost-user,chardev=charnet2,id=hostnet2 -device virtio-net-pci,rx_queue_size=1024,netdev=hostnet2,id=net2,mac=18:66:da:5f:dd:23,bus=pci.7,addr=0x0,iommu_platform=on,ats=on,packed=on -monitor stdio -vnc :2


Actual results:
qemu crashes.

Expected results:
qemu should not crash.

Additional info:
1. Without packed=on, this issue is gone. Everything works well.

vhost-user + vIOMMU, no packed=on    Works well

2. Without vIOMMU, this issue is gone. Everything works well.

vhost-user + packed=on, no vIOMMU    Works well

3. I understand that vhost-user + vIOMMU + guest virtio-net kernel driver is not a recommended configuration for customers (Maxime explained this situation in bug 1572879#c13), because it has lower performance. However, a guest hang or qemu crash should still be treated as a problem.

--- Additional comment from Pei Zhang on 2020-06-05 21:46:24 HKT ---

Additional info(continued):

4. When there is no packet flow, the issue is gone. Everything works well.

vhost-user + vIOMMU + guest virtio-net kernel driver, no packet flow in the guest    Works well

Comment 1 Pei Zhang 2020-07-01 15:20:08 UTC
(gdb) bt
#0  0x0000555555914000 in vhost_device_iotlb_miss
    (dev=dev@entry=0x5555565d7530, iova=10535211008, write=<optimized out>)
    at /usr/src/debug/qemu-kvm-4.2.0-29.module+el8.3.0+7212+401047e6.x86_64/hw/virtio/vhost.c:944
#1  0x0000555555916201 in vhost_backend_handle_iotlb_msg (imsg=0x7fffffffd110, dev=0x5555565d7530)
    at /usr/src/debug/qemu-kvm-4.2.0-29.module+el8.3.0+7212+401047e6.x86_64/hw/virtio/vhost-backend.c:351
#2  0x0000555555916201 in vhost_backend_handle_iotlb_msg
    (dev=dev@entry=0x5555565d7530, imsg=imsg@entry=0x7fffffffd110)
    at /usr/src/debug/qemu-kvm-4.2.0-29.module+el8.3.0+7212+401047e6.x86_64/hw/virtio/vhost-backend.c:344
#3  0x0000555555916dab in slave_read (opaque=0x5555565d7530)
    at /usr/src/debug/qemu-kvm-4.2.0-29.module+el8.3.0+7212+401047e6.x86_64/hw/virtio/vhost-user.c:1048
#4  0x0000555555bd0a32 in aio_dispatch_handlers (ctx=ctx@entry=0x5555564f2bc0) at util/aio-posix.c:429
#5  0x0000555555bd13dc in aio_dispatch (ctx=0x5555564f2bc0) at util/aio-posix.c:460
#6  0x0000555555bcdec2 in aio_ctx_dispatch
    (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
#7  0x00007ffff76a767d in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#8  0x0000555555bd0488 in glib_pollfds_poll () at util/main-loop.c:219
#9  0x0000555555bd0488 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
#10 0x0000555555bd0488 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
#11 0x00005555559afcc1 in main_loop () at vl.c:1828
#12 0x000055555585cf92 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4504
(gdb) c
Continuing.

[1]+  Stopped                 gdb /usr/libexec/qemu-kvm
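
Frame #0 is vhost_device_iotlb_miss() at vhost.c:944, reached from the vhost-user slave channel (slave_read -> vhost_backend_handle_iotlb_msg). One plausible but unverified reading is that the backend sends an IOTLB miss while the virtio device is being reset during the guest reboot, so the vhost_dev state dereferenced there is already torn down. Assuming that hypothesis, it could be checked from the same gdb session, e.g.:

(gdb) frame 0
(gdb) print dev->started
(gdb) print dev->vdev

If dev->vdev is NULL (or dev->started is false) at the time of the miss, that would match the segfault at this line; this is only a hypothesis to be confirmed against the qemu-kvm-4.2.0 sources, not a verified root cause.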