Bug 2027208

Summary: [virtual network][vDPA] qemu crash after hot unplug vdpa device
Product: Red Hat Enterprise Linux 8
Reporter: Lei Yang <leiyang>
Component: qemu-kvm
Assignee: Laurent Vivier <lvivier>
qemu-kvm sub component: Networking
QA Contact: Lei Yang <leiyang>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: high
CC: aadam, chayang, jasowang, jinzhao, jmaloy, juzhang, lmiksik, lulu, lvivier, mrezanin, pezhang, virt-maint, wquan, yfu
Version: 8.6
Keywords: Triaged
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qemu-kvm-6.2.0-9.module+el8.6.0+14480+c0a3aa0f
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2059786 2060843
Environment:
Last Closed: 2022-05-10 13:24:19 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2059786, 2060843

Description Lei Yang 2021-11-29 06:22:42 UTC
Description of problem:
After hot-plugging a vdpa device via QMP and then running stop and cont on the guest, qemu crashes when the vdpa device is hot-unplugged.

Note: If stop/cont is not executed and the vdpa device is only hot-unplugged, the guest works well.

Version-Release number of selected component (if applicable):
qemu-kvm-6.1.0-5.module+el8.6.0+13430+8fdd5f85.x86_64
kernel-4.18.0-352.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1.Boot a guest without vdpa device
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-machine q35,memory-backend=mem-machine_mem \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-m 28672 \
-object memory-backend-ram,size=28672M,id=mem-machine_mem  \
-smp 32,maxcpus=32,cores=16,threads=1,dies=1,sockets=2  \
-cpu 'Cascadelake-Server',ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,kvm_pv_unhalt=on \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel860-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on  \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=off,order=cdn,once=c,strict=off \
-net none \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=4 \
-monitor stdio \
-qmp tcp:0:5555,server,nowait



2.hot plug vdpa device
$ telnet 10.73.178.85 5555
Trying 10.73.178.85...
Connected to 10.73.178.85.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 1, "major": 6}, "package": "qemu-kvm-6.1.0-5.module+el8.6.0+13430+8fdd5f85"}, "capabilities": ["oob"]}}
{"execute":"qmp_capabilities"}
{"return": {}}
{'execute': 'netdev_add', 'arguments': {'type': 'vhost-vdpa', 'id': 'idKcVRcM', 'vhostdev': '/dev/vhost-vdpa-7'}}
{"return": {}}
{"execute": "device_add", "arguments": {"driver":"virtio-net-pci","netdev":"idKcVRcM","mac":"00:1a:4a:42:0b:01","id":"id7ARCys","bus":"pcie_extra_root_port_0","addr":"0x0"}}
{"return": {}}
{"timestamp": {"seconds": 1638161987, "microseconds": 181742}, "event": "NIC_RX_FILTER_CHANGED", "data": {"name": "id7ARCys", "path": "/machine/peripheral/id7ARCys/virtio-backend"}}

3.stop guest via qmp
{'execute': 'stop'}
{"timestamp": {"seconds": 1638165586, "microseconds": 984624}, "event": "STOP"}
{"return": {}}

4.continue guest via qmp
{'execute': 'cont'}
{"timestamp": {"seconds": 1638165588, "microseconds": 628139}, "event": "RESUME"}
{"return": {}}

5. hot unplug vdpa device
{'execute': 'device_del', 'arguments': {'id': 'id7ARCys'}}
{"return": {}}
Connection closed by foreign host.

6. qemu crash

(gdb) bt full
#0  0x00007f91b6900a4f in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f91b68d3db5 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x00007f91b796e123 in g_assertion_message.cold () from /lib64/libglib-2.0.so.0
No symbol table info available.
#3  0x00007f91b79c720e in g_assertion_message_expr () from /lib64/libglib-2.0.so.0
No symbol table info available.
#4  0x0000563e931991c1 in object_unref (objptr=0x563e960d7e20) at ../qom/object.c:1183
        obj = <optimized out>
        obj = <optimized out>
        __func__ = "object_unref"
        _g_boolean_var_ = <optimized out>
#5  object_unref (objptr=0x563e960d7e20) at ../qom/object.c:1177
        obj = 0x563e960d7e20
        __func__ = "object_unref"
        _g_boolean_var_ = <optimized out>
#6  0x0000563e9319922f in object_property_del_all (obj=0x563e95157740) at ../qom/object.c:626
        done = <optimized out>
        prop = 0x7f8a0801e4c0
        iter = {nextclass = 0x563e94f1cd90, iter = {dummy1 = 0x563e96174c00, dummy2 = 0x563e93b3fd00 <io_mem_unassigned>, dummy3 = 0x85c68c00, dummy4 = 3, dummy5 = 0, 
            dummy6 = 0xdc2678e900000003}}
        released = false
        done = <optimized out>
        prop = <optimized out>
        iter = <optimized out>
        released = <optimized out>
#7  object_finalize (data=0x563e95157740) at ../qom/object.c:687
        obj = 0x563e95157740
        ti = 0x563e94d9d210
        obj = <optimized out>
        ti = <optimized out>
        __func__ = "object_finalize"
        _g_boolean_var_ = <optimized out>
        _g_boolean_var_ = <optimized out>
#8  object_unref (objptr=0x563e95157740) at ../qom/object.c:1187
        obj = 0x563e95157740
        __func__ = "object_unref"
        _g_boolean_var_ = <optimized out>
#9  0x0000563e93193f21 in bus_free_bus_child (kid=0x563e956d0310) at ../hw/core/qdev.c:55
No locals.
#10 0x0000563e932b6a5b in call_rcu_thread (opaque=<optimized out>) at ../util/rcu.c:281
        tries = <optimized out>
        n = 1
        node = 0x563e956d0310
#11 0x0000563e932ad794 in qemu_thread_start (args=0x563e94d34200) at ../util/qemu-thread-posix.c:541
        __clframe = {__cancel_routine = <optimized out>, __cancel_arg = 0x0, __do_it = 1, __cancel_type = <optimized out>}
        qemu_thread_args = 0x563e94d34200
        start_routine = 0x563e932b6990 <call_rcu_thread>
        arg = 0x0
        r = <optimized out>
#12 0x00007f91b6c7f17a in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#13 0x00007f91b68ebd83 in clone () from /lib64/libc.so.6
No symbol table info available.


Actual results:
qemu crash

Expected results:
guest works well

Additional info:
1. With the same steps, a tap device works well.
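
The QMP sequence from steps 2 to 5 can also be scripted instead of typed into telnet. The following is a minimal sketch, not part of the original report: the function names build_repro_commands() and send_commands() are hypothetical, and it assumes QEMU was started with "-qmp tcp:0:5555,server,nowait" as in step 1 and answers with one JSON object per line.

```python
import json
import socket


def build_repro_commands(vhostdev, netdev_id="idKcVRcM", device_id="id7ARCys",
                         bus="pcie_extra_root_port_0", mac="00:1a:4a:42:0b:01"):
    """Return the QMP commands from the report, in the order they were issued."""
    return [
        {"execute": "qmp_capabilities"},
        {"execute": "netdev_add",
         "arguments": {"type": "vhost-vdpa", "id": netdev_id,
                       "vhostdev": vhostdev}},
        {"execute": "device_add",
         "arguments": {"driver": "virtio-net-pci", "netdev": netdev_id,
                       "mac": mac, "id": device_id, "bus": bus,
                       "addr": "0x0"}},
        {"execute": "stop"},
        {"execute": "cont"},
        {"execute": "device_del", "arguments": {"id": device_id}},
    ]


def send_commands(host, port, commands, timeout=10.0):
    """Send each command and collect its reply, skipping asynchronous events."""
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        json.loads(stream.readline())  # QMP greeting banner
        for cmd in commands:
            stream.write(json.dumps(cmd) + "\n")
            stream.flush()
            while True:
                line = stream.readline()
                if not line:
                    # An empty read means the peer closed the socket.
                    raise ConnectionError("QMP connection closed by QEMU")
                msg = json.loads(line)
                if "return" in msg or "error" in msg:
                    replies.append(msg)
                    break
    return replies
```

For the setup in this report it would be called as send_commands("10.73.178.85", 5555, build_repro_commands("/dev/vhost-vdpa-7")). In the transcript above, device_del still returns {"return": {}} and the connection is closed afterwards, so a crash would show up after the last reply rather than as an error reply.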

Comment 1 Laurent Vivier 2021-12-10 16:56:27 UTC
I'm not able to reproduce the problem with the vdpa simulator.

Do you use a real vDPA device?

What is the version of the kernel running in the guest?

Thanks

Comment 2 Lei Yang 2021-12-13 01:39:42 UTC
(In reply to Laurent Vivier from comment #1)
> I'm not able to reproduce the problem with the vdpa simulator.
> 
> Do you use a real vDPA device?
> 
> What is the version of the kernel running in the guest?
> 
> Thanks

Hi Laurent

Yes I use real vdpa:
# lspci |grep ConnectX-6
3b:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
3b:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]

Guest kernel version:
kernel-4.18.0-353.el8.x86_64

Please feel free to tell me if you need me to provide an environment so that you can reproduce the problem.

Best Regards
Lei

Comment 3 Laurent Vivier 2021-12-13 13:05:30 UTC
(In reply to Lei Yang from comment #2)

Could you try to reproduce the problem with the simulator?

You can use the following steps to create the device:

 # modprobe vhost-vdpa
 # modprobe vdpa_sim_net
 # vdpa dev add mgmtdev vdpasim_net name vdpasim0
 # ls /sys/devices/vdpasim0
   driver  power  subsystem  uevent  vhost-vdpa-0

and you can use /dev/vhost-vdpa-0 with "vhostdev" in netdev_add
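
The lookup of the device node can be scripted as well. This is a small sketch under the assumption, taken from the listing above, that the simulator directory contains an entry named vhost-vdpa-N and that the matching character device is /dev/vhost-vdpa-N; the helper name vhost_vdpa_nodes() is hypothetical.

```python
from pathlib import Path


def vhost_vdpa_nodes(sysfs_dir="/sys/devices/vdpasim0", dev_dir="/dev"):
    """List the /dev paths matching vhost-vdpa-* entries under sysfs_dir."""
    root = Path(sysfs_dir)
    if not root.is_dir():
        return []  # simulator device not created (or modules not loaded)
    names = sorted(p.name for p in root.iterdir()
                   if p.name.startswith("vhost-vdpa-"))
    return [f"{dev_dir}/{name}" for name in names]
```

The first element of the returned list is the value to pass as "vhostdev" in netdev_add.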

Comment 4 Lei Yang 2021-12-14 12:58:31 UTC
(In reply to Laurent Vivier from comment #3)

Hi Laurent

I tested this scenario with the simulator and could not reproduce the problem.

Test Steps:
1. Create simulator vdpa device
[root@dell-per440-23 ~]# modprobe vhost-vdpa
[root@dell-per440-23 ~]# modprobe vdpa_sim_net
[root@dell-per440-23 ~]# vdpa dev add mgmtdev vdpasim_net name vdpasim0
[root@dell-per440-23 ~]# ls /sys/devices/vdpasim0
driver  power  subsystem  uevent  vhost-vdpa-0

2. Boot a guest without vdpa device

3.hotplug this device via qmp
$ telnet 10.73.178.83 5555
Trying 10.73.178.83...
Connected to 10.73.178.83.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 1, "major": 6}, "package": "qemu-kvm-6.1.0-5.module+el8.6.0+13430+8fdd5f85"}, "capabilities": ["oob"]}}
{"execute":"qmp_capabilities"}
{"return": {}}
{'execute': 'netdev_add', 'arguments': {'type': 'vhost-vdpa', 'id': 'idKcVRcM', 'vhostdev': '/dev/vhost-vdpa-0'}}
{"return": {}}
{"execute": "device_add", "arguments": {"driver":"virtio-net-pci","netdev":"idKcVRcM","mac":"00:1a:4a:42:0b:01","id":"id7ARCys","bus":"pcie_extra_root_port_0","addr":"0x0"}}
{"return": {}}
{"timestamp": {"seconds": 1639486083, "microseconds": 575719}, "event": "NIC_RX_FILTER_CHANGED", "data": {"name": "id7ARCys", "path": "/machine/peripheral/id7ARCys/virtio-backend"}}

4.stop guest via qmp
{'execute': 'stop'}
{"timestamp": {"seconds": 1638165586, "microseconds": 984624}, "event": "STOP"}
{"return": {}}

5.continue guest via qmp
{'execute': 'cont'}
{"timestamp": {"seconds": 1638165588, "microseconds": 628139}, "event": "RESUME"}
{"return": {}}

6.hot unplug vdpa device
{'execute': 'device_del', 'arguments': {'id': 'id7ARCys'}}
{"return": {}}
{"timestamp": {"seconds": 1639486122, "microseconds": 330863}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/id7ARCys/virtio-backend"}}
{"timestamp": {"seconds": 1639486122, "microseconds": 382079}, "event": "DEVICE_DELETED", "data": {"device": "id7ARCys", "path": "/machine/peripheral/id7ARCys"}}
{'execute': 'netdev_del', 'arguments': {'id': 'idKcVRcM'}}
{"return": {}}

Guest works well.
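
The visible difference between this transcript and the crashing one is the DEVICE_DELETED event that names the device. A minimal sketch of that check over captured QMP output follows; the helper name unplug_completed() is hypothetical, and lines that are not strict JSON (telnet banner text, commands typed with single quotes) are simply skipped.

```python
import json


def unplug_completed(qmp_lines, device_id):
    """Return True if a DEVICE_DELETED event for device_id appears in qmp_lines."""
    for line in qmp_lines:
        line = line.strip()
        if not line:
            continue
        try:
            msg = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a JSON message emitted by QEMU
        if not isinstance(msg, dict):
            continue
        data = msg.get("data") or {}
        if msg.get("event") == "DEVICE_DELETED" and data.get("device") == device_id:
            return True
    return False
```

The event that carries only a "path" for the virtio-backend is ignored on purpose; only the event with "device": "id7ARCys" counts as a completed unplug.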

Best Regards
Lei

Comment 5 Lei Yang 2022-01-27 04:24:33 UTC
Hit same issue

Test Version:
kernel-4.18.0-360.el8.mr1880_220122_0148.x86_64
qemu-kvm-6.2.0-5.module+el8.6.0+14025+ca131e0a.x86_64

Comment 6 Laurent Vivier 2022-02-02 11:00:54 UTC
(In reply to Lei Yang from comment #5)
> Hit same issue
> 
> Test Version:
> kernel-4.18.0-360.el8.mr1880_220122_0148.x86_64
> qemu-kvm-6.2.0-5.module+el8.6.0+14025+ca131e0a.x86_64

Is it possible for me to have access to the coredump of QEMU or to the machine to reproduce the problem myself?

Comment 8 Laurent Vivier 2022-02-07 18:04:21 UTC
(gdb) bt
#0  0x00007ffff488ba4f in raise () from /lib64/libc.so.6
#1  0x00007ffff485edb5 in abort () from /lib64/libc.so.6
#2  0x00007ffff58f9123 in g_assertion_message.cold ()
   from /lib64/libglib-2.0.so.0
#3  0x00007ffff595220e in g_assertion_message_expr ()
   from /lib64/libglib-2.0.so.0
#4  0x0000555555b41a41 in object_unref (objptr=0x55555699f6b0)
    at ../qom/object.c:1183
#5  object_unref (objptr=0x55555699f6b0) at ../qom/object.c:1177
#6  0x0000555555b41aaf in object_property_del_all (obj=0x555556970fa0)
    at ../qom/object.c:626
#7  object_finalize (data=0x555556970fa0) at ../qom/object.c:687
#8  object_unref (objptr=0x555556970fa0) at ../qom/object.c:1187
#9  0x0000555555b3ca51 in bus_free_bus_child (kid=0x55555697b750)
    at ../hw/core/qdev.c:55
#10 0x0000555555c6660b in call_rcu_thread (opaque=<optimized out>)
    at ../util/rcu.c:284
#11 0x0000555555c5d2f4 in qemu_thread_start (args=0x55555652d260)
    at ../util/qemu-thread-posix.c:556
#12 0x00007ffff4c0a17f in start_thread () from /lib64/libpthread.so.0
#13 0x00007ffff4876d83 in clone () from /lib64/libc.so.6

#6  0x0000555555b41aaf in object_property_del_all (obj=0x555556970fa0) at ../qom/object.c:626
626	                    prop->release(obj, prop->name, prop->opaque);

(gdb) p obj->ref
$2 = 0
(gdb) p *prop
$4 = {name = 0x7ff8d001f4b0 "vhost-vdpa\\x2fhost-notifier@0x55555699f5c0 mmaps\\x5b0\\x5d[0]", 
  type = 0x7ff8d001f3e0 "child<memory-region>", description = 0x0, get = 0x555555b43ab0 <object_get_child_property>, 
  set = 0x0, resolve = 0x555555b3fe50 <object_resolve_child_property>, 
  release = 0x555555b41b80 <object_finalize_child_property>, init = 0x0, opaque = 0x55555699f6b0, defval = 0x0}

"vhost-vdpa/host-notifier@0x55555699f5c0 mmaps[0][0]" (the property name above with the \x2f, \x5b and \x5d escapes decoded) is added by vhost_vdpa_host_notifier_init() using virtio_queue_set_host_notifier_mr().

A first guess is that vhost_vdpa_host_notifier_uninit() runs on the stop command and decreases obj->ref.

Comment 9 Laurent Vivier 2022-02-08 09:30:13 UTC
Tested with upstream QEMU (55ef0b702bc2), and it seems fixed.

Comment 10 Laurent Vivier 2022-02-09 08:21:01 UTC
(In reply to Laurent Vivier from comment #9)
> Tested with upstream QEMU (55ef0b702bc2), and it seems fixed.

It's not fixed: I was unable to reproduce the coredump because of another error:

vhost_set_features failed: Device or resource busy (16)
unable to start vhost net: 16: falling back on userspace virtio

Comment 11 Laurent Vivier 2022-02-11 17:13:40 UTC
I have sent a fix upstream:

https://patchew.org/QEMU/20220211170259.1388734-1-lvivier@redhat.com/mbox

Author: Laurent Vivier <lvivier>
Date:   Fri Feb 11 17:49:36 2022 +0100

    hw/virtio: vdpa: Fix leak of host-notifier memory-region
    
    If call virtio_queue_set_host_notifier_mr fails, should free
    host-notifier memory-region.
    
    This problem can trigger a coredump with some vDPA drivers (mlx5,
    but not with the vdpasim), if we unplug the virtio-net card from
    the guest after a stop/start.
    
    The same fix has been done for vhost-user:
      1f89d3b91e3e ("hw/virtio: Fix leak of host-notifier memory-region")
    
    Fixes: d0416d487bd5 ("vhost-vdpa: map virtqueue notification area if possible")
    Cc: jasowang
    Resolves: https://bugzilla.redhat.com/2027208
    Signed-off-by: Laurent Vivier <lvivier>

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 04ea43704f5d..11f696468dc1 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -431,6 +431,7 @@ static int vhost_vdpa_host_notifier_init(struct vhost_dev *dev, int queue_index)
     g_free(name);
 
     if (virtio_queue_set_host_notifier_mr(vdev, queue_index, &n->mr, true)) {
+        object_unparent(OBJECT(&n->mr));
         munmap(addr, page_size);
         goto err;
     }
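
The effect of the added object_unparent() call can be illustrated with a toy reference-counting model. This is a sketch in Python, not QEMU code: the Obj class, notifier_init() and the exact sequence of counts are hypothetical. It assumes that the host-notifier setup fails on each device start and that the same embedded region object is re-initialised after stop/cont. The thread itself only establishes that the region was not unparented on failure and that object_unref() asserted while the device was being finalized.

```python
class Obj:
    """Toy stand-in for a refcounted object with named child properties."""

    def __init__(self, name):
        self.name = name
        self.ref = 1
        self.children = {}  # property name -> child object

    def add_child(self, prop, child):
        child.ref += 1
        self.children[prop] = child

    def unparent(self, child):
        # Drop every property that points at child, releasing one ref each.
        for prop in [p for p, c in self.children.items() if c is child]:
            del self.children[prop]
            unref(child)


def unref(obj):
    if obj.ref <= 0:
        raise AssertionError(f"unref on {obj.name} with ref == 0")
    obj.ref -= 1


def notifier_init(owner, region, attempt, set_mr_ok, fixed):
    region.ref = 1                                # re-init resets the count (assumed)
    owner.add_child(f"host-notifier[{attempt}]", region)
    unref(region)                                 # owner now holds the only ref
    if not set_mr_ok:
        if fixed:
            owner.unparent(region)                # models the line added by the patch
        return False
    return True


def finalize(owner):
    # Release every child property, as happens when the device is freed.
    for prop, child in list(owner.children.items()):
        del owner.children[prop]
        unref(child)


def simulate(fixed):
    """Return True if the owner can be finalized without tripping the check."""
    owner, region = Obj("virtio-net"), Obj("host-notifier-mr")
    for attempt in range(2):                      # device start, then restart after cont
        notifier_init(owner, region, attempt, set_mr_ok=False, fixed=fixed)
    try:
        finalize(owner)
    except AssertionError:
        return False
    return True
```

In this model simulate(fixed=False) leaves two stale properties pointing at one region with a single reference, so the second release fails, while simulate(fixed=True) leaves nothing behind to release.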

Comment 14 Lei Yang 2022-02-14 03:40:29 UTC
==> Reproduced on qemu-kvm-6.2.0-5.module+el8.6.0+14025+ca131e0a.x86_64

1. Boot a guest without vdpa device
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-machine q35,memory-backend=mem-machine_mem \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-m 28672 \
-object memory-backend-ram,size=28672M,id=mem-machine_mem  \
-smp 32,maxcpus=32,cores=16,threads=1,dies=1,sockets=2  \
-cpu 'Cascadelake-Server',ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,kvm_pv_unhalt=on \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel860-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on  \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=off,order=cdn,once=c,strict=off \
-net none \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=4 \
-monitor stdio \
-qmp tcp:0:5555,server,nowait

2.hot plug vdpa device
$ telnet 10.73.178.85 5555
Trying 10.73.178.85...
Connected to 10.73.178.85.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 2, "major": 6}, "package": "qemu-kvm-6.2.0-5.module+el8.6.0+14025+ca131e0a"}, "capabilities": ["oob"]}}
{"execute":"qmp_capabilities"}
{"return": {}}
{'execute': 'netdev_add', 'arguments': {'type': 'vhost-vdpa', 'id': 'idKcVRcM', 'vhostdev': '/dev/vhost-vdpa-3'}}
{"return": {}}
{"execute": "device_add", "arguments": {"driver":"virtio-net-pci","netdev":"idKcVRcM","mac":"00:1a:4a:42:0b:01","id":"id7ARCys","bus":"pcie_extra_root_port_0","addr":"0x0"}}
{"return": {}}


3.stop guest via qmp
{'execute': 'stop'}
{"timestamp": {"seconds": 1644809353, "microseconds": 951537}, "event": "STOP"}
{"return": {}}


4.continue guest via qmp
{'execute': 'cont'}
{"timestamp": {"seconds": 1644809357, "microseconds": 199254}, "event": "RESUME"}
{"return": {}}


5. hot unplug vdpa device
{'execute': 'device_del', 'arguments': {'id': 'id7ARCys'}}
{"return": {}}
Connection closed by foreign host.

6. qemu crash


==> Repeated the above process; the test passes on qemu-kvm-6.2.0-5.el8.BZ2027208.

So this bug should be fixed in qemu-kvm-6.2.0-5.el8.BZ2027208.

Comment 15 Pei Zhang 2022-02-15 01:47:23 UTC
*** Bug 2039210 has been marked as a duplicate of this bug. ***

Comment 27 Lei Yang 2022-03-17 02:42:20 UTC
I tested it with qemu-kvm-6.2.0-9.module+el8.6.0+14495+7194fa43.x86_64; the issue no longer occurs.

Test Version:
qemu-kvm-6.2.0-9.module+el8.6.0+14495+7194fa43.x86_64
kernel-4.18.0-372.el8.x86_64

Test Steps
1. Boot a guest without vdpa device
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-machine q35,memory-backend=mem-machine_mem \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-m 28672 \
-object memory-backend-ram,size=28672M,id=mem-machine_mem  \
-smp 32,maxcpus=32,cores=16,threads=1,dies=1,sockets=2  \
-cpu 'Cascadelake-Server',ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,kvm_pv_unhalt=on \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel860-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on  \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=off,order=cdn,once=c,strict=off \
-net none \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=4 \
-monitor stdio \
-qmp tcp:0:5555,server,nowait

2. hotplug a vdpa device
$ telnet 10.73.178.83 5555
Trying 10.73.178.83...
Connected to 10.73.178.83.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 2, "major": 6}, "package": "qemu-kvm-6.2.0-9.module+el8.6.0+14495+7194fa43"}, "capabilities": ["oob"]}}
{"execute":"qmp_capabilities"}
{"return": {}}
{'execute': 'netdev_add', 'arguments': {'type': 'vhost-vdpa', 'id': 'idKcVRcM', 'vhostdev': '/dev/vhost-vdpa-7'}}
{"return": {}}
{"execute": "device_add", "arguments": {"driver":"virtio-net-pci","netdev":"idKcVRcM","mac":"00:1a:4a:42:0b:01","id":"id7ARCys","bus":"pcie_extra_root_port_0","addr":"0x0"}}
{"return": {}}

3.Stop guest
{'execute': 'stop'}
{"timestamp": {"seconds": 1647484484, "microseconds": 938175}, "event": "STOP"}
{"return": {}}

4. continue guest
{'execute': 'cont'}
{"timestamp": {"seconds": 1647484490, "microseconds": 139176}, "event": "RESUME"}
{"return": {}}

5. Hotunplug device
{'execute': 'device_del', 'arguments': {'id': 'id7ARCys'}}
{"return": {}}
{"timestamp": {"seconds": 1647484496, "microseconds": 410998}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/id7ARCys/virtio-backend"}}
{"timestamp": {"seconds": 1647484496, "microseconds": 462144}, "event": "DEVICE_DELETED", "data": {"device": "id7ARCys", "path": "/machine/peripheral/id7ARCys"}}
{'execute': 'netdev_del', 'arguments': {'id': 'idKcVRcM'}}
{"return": {}}

6. guest works well

Comment 32 Laurent Vivier 2022-03-24 08:57:17 UTC
Mirek,

do you know why this BZ is stuck in MODIFIED state?

Thanks

Comment 37 errata-xmlrpc 2022-05-10 13:24:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1759