Bug 2027208 - [virtual network][vDPA] qemu crash after hot unplug vdpa device
Summary: [virtual network][vDPA] qemu crash after hot unplug vdpa device
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: qemu-kvm
Version: 8.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Laurent Vivier
QA Contact: Lei Yang
URL:
Whiteboard:
Duplicates: 2039210
Depends On:
Blocks: 2059786 2060843
Reported: 2021-11-29 06:22 UTC by Lei Yang
Modified: 2022-11-30 02:30 UTC (History)
CC List: 14 users

Fixed In Version: qemu-kvm-6.2.0-9.module+el8.6.0+14480+c0a3aa0f
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2059786 2060843
Environment:
Last Closed: 2022-05-10 13:24:19 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Gitlab | redhat/rhel/src/qemu-kvm qemu-kvm merge_requests 122 | 0 | None | None | None | 2022-03-09 16:41:47 UTC
Red Hat Issue Tracker | RHELPLAN-104102 | 0 | None | None | None | 2021-11-29 06:24:11 UTC

Description Lei Yang 2021-11-29 06:22:42 UTC
Description of problem:
After hot plugging a vDPA device via QMP and then running stop and cont on the guest, QEMU crashes when the vDPA device is hot unplugged.

Note: If stop/cont is not executed and the vDPA device is only hot unplugged, the guest works well.

Version-Release number of selected component (if applicable):
qemu-kvm-6.1.0-5.module+el8.6.0+13430+8fdd5f85.x86_64
kernel-4.18.0-352.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest without a vDPA device
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-machine q35,memory-backend=mem-machine_mem \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-m 28672 \
-object memory-backend-ram,size=28672M,id=mem-machine_mem  \
-smp 32,maxcpus=32,cores=16,threads=1,dies=1,sockets=2  \
-cpu 'Cascadelake-Server',ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,kvm_pv_unhalt=on \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel860-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on  \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=off,order=cdn,once=c,strict=off \
-net none \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=4 \
-monitor stdio \
-qmp tcp:0:5555,server,nowait \



2. Hot plug the vDPA device
$ telnet 10.73.178.85 5555
Trying 10.73.178.85...
Connected to 10.73.178.85.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 1, "major": 6}, "package": "qemu-kvm-6.1.0-5.module+el8.6.0+13430+8fdd5f85"}, "capabilities": ["oob"]}}
{"execute":"qmp_capabilities"}
{"return": {}}
{'execute': 'netdev_add', 'arguments': {'type': 'vhost-vdpa', 'id': 'idKcVRcM', 'vhostdev': '/dev/vhost-vdpa-7'}}
{"return": {}}
{"execute": "device_add", "arguments": {"driver":"virtio-net-pci","netdev":"idKcVRcM","mac":"00:1a:4a:42:0b:01","id":"id7ARCys","bus":"pcie_extra_root_port_0","addr":"0x0"}}
{"return": {}}
{"timestamp": {"seconds": 1638161987, "microseconds": 181742}, "event": "NIC_RX_FILTER_CHANGED", "data": {"name": "id7ARCys", "path": "/machine/peripheral/id7ARCys/virtio-backend"}}

3. Stop the guest via QMP
{'execute': 'stop'}
{"timestamp": {"seconds": 1638165586, "microseconds": 984624}, "event": "STOP"}
{"return": {}}

4. Continue the guest via QMP
{'execute': 'cont'}
{"timestamp": {"seconds": 1638165588, "microseconds": 628139}, "event": "RESUME"}
{"return": {}}

5. hot unplug vdpa device
{'execute': 'device_del', 'arguments': {'id': 'id7ARCys'}}
{"return": {}}
Connection closed by foreign host.

6. QEMU crashes

(gdb) bt full
#0  0x00007f91b6900a4f in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f91b68d3db5 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x00007f91b796e123 in g_assertion_message.cold () from /lib64/libglib-2.0.so.0
No symbol table info available.
#3  0x00007f91b79c720e in g_assertion_message_expr () from /lib64/libglib-2.0.so.0
No symbol table info available.
#4  0x0000563e931991c1 in object_unref (objptr=0x563e960d7e20) at ../qom/object.c:1183
        obj = <optimized out>
        obj = <optimized out>
        __func__ = "object_unref"
        _g_boolean_var_ = <optimized out>
#5  object_unref (objptr=0x563e960d7e20) at ../qom/object.c:1177
        obj = 0x563e960d7e20
        __func__ = "object_unref"
        _g_boolean_var_ = <optimized out>
#6  0x0000563e9319922f in object_property_del_all (obj=0x563e95157740) at ../qom/object.c:626
        done = <optimized out>
        prop = 0x7f8a0801e4c0
        iter = {nextclass = 0x563e94f1cd90, iter = {dummy1 = 0x563e96174c00, dummy2 = 0x563e93b3fd00 <io_mem_unassigned>, dummy3 = 0x85c68c00, dummy4 = 3, dummy5 = 0, 
            dummy6 = 0xdc2678e900000003}}
        released = false
        done = <optimized out>
        prop = <optimized out>
        iter = <optimized out>
        released = <optimized out>
#7  object_finalize (data=0x563e95157740) at ../qom/object.c:687
        obj = 0x563e95157740
        ti = 0x563e94d9d210
        obj = <optimized out>
        ti = <optimized out>
        __func__ = "object_finalize"
        _g_boolean_var_ = <optimized out>
        _g_boolean_var_ = <optimized out>
#8  object_unref (objptr=0x563e95157740) at ../qom/object.c:1187
        obj = 0x563e95157740
        __func__ = "object_unref"
        _g_boolean_var_ = <optimized out>
#9  0x0000563e93193f21 in bus_free_bus_child (kid=0x563e956d0310) at ../hw/core/qdev.c:55
No locals.
#10 0x0000563e932b6a5b in call_rcu_thread (opaque=<optimized out>) at ../util/rcu.c:281
        tries = <optimized out>
        n = 1
        node = 0x563e956d0310
#11 0x0000563e932ad794 in qemu_thread_start (args=0x563e94d34200) at ../util/qemu-thread-posix.c:541
        __clframe = {__cancel_routine = <optimized out>, __cancel_arg = 0x0, __do_it = 1, __cancel_type = <optimized out>}
        qemu_thread_args = 0x563e94d34200
        start_routine = 0x563e932b6990 <call_rcu_thread>
        arg = 0x0
        r = <optimized out>
#12 0x00007f91b6c7f17a in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#13 0x00007f91b68ebd83 in clone () from /lib64/libc.so.6
No symbol table info available.


Actual results:
QEMU crashes.

Expected results:
guest works well

Additional info:
1. With the same steps, a tap device works well.
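
The reproduction steps above can also be driven from a script instead of a telnet session. The following is a minimal sketch, not part of the original report: the function names `build_repro_commands` and `run_repro` are hypothetical, and the default IDs, MAC address, and bus are simply the values used in the description. It assumes the guest was started with the `-qmp tcp:0:5555,server,nowait` option shown in step 1.

```python
import json
import socket


def build_repro_commands(vhostdev, netdev_id="idKcVRcM", device_id="id7ARCys",
                         bus="pcie_extra_root_port_0",
                         mac="00:1a:4a:42:0b:01"):
    """Return the QMP commands from the report, in the order they were sent."""
    return [
        {"execute": "qmp_capabilities"},
        {"execute": "netdev_add",
         "arguments": {"type": "vhost-vdpa", "id": netdev_id,
                       "vhostdev": vhostdev}},
        {"execute": "device_add",
         "arguments": {"driver": "virtio-net-pci", "netdev": netdev_id,
                       "mac": mac, "id": device_id, "bus": bus,
                       "addr": "0x0"}},
        {"execute": "stop"},
        {"execute": "cont"},
        {"execute": "device_del", "arguments": {"id": device_id}},
    ]


def run_repro(host, port, vhostdev):
    """Send the commands over a QMP TCP socket and return every message read.

    After each command the loop reads until a "return" or "error" reply
    arrives; asynchronous events seen in between are collected as well.
    """
    messages = []
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        greeting = stream.readline()  # QMP greeting banner
        if greeting:
            messages.append(json.loads(greeting))
        for command in build_repro_commands(vhostdev):
            stream.write(json.dumps(command) + "\n")
            stream.flush()
            while True:
                line = stream.readline()
                if not line:  # connection closed, for example because QEMU exited
                    return messages
                message = json.loads(line)
                messages.append(message)
                if "return" in message or "error" in message:
                    break
    return messages
```

A call such as `run_repro("10.73.178.85", 5555, "/dev/vhost-vdpa-7")` would replay the sequence against the host from the report. The sketch stops reading after the `device_del` reply, so the crash itself still has to be observed on the QEMU side.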

Comment 1 Laurent Vivier 2021-12-10 16:56:27 UTC
I'm not able to reproduce the problem with the vdpa simulator.

Do you use a real vDPA device?

What is the version of the kernel running in the guest?

Thanks

Comment 2 Lei Yang 2021-12-13 01:39:42 UTC
(In reply to Laurent Vivier from comment #1)
> I'm not able to reproduce the problem with the vdpa simulator.
> 
> Do you use a real vDPA device?
> 
> What is the version of the kernel running in the guest?
> 
> Thanks

Hi Laurent

Yes, I use a real vDPA device:
# lspci |grep ConnectX-6
3b:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
3b:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]

Guest kernel version:
kernel-4.18.0-353.el8.x86_64

Please feel free to tell me if you need me to provide an environment so that you can reproduce the problem.

Best Regards
Lei

Comment 3 Laurent Vivier 2021-12-13 13:05:30 UTC
(In reply to Lei Yang from comment #2)
> Hi Laurent
> 
> Yes I use real vdpa:
> # lspci |grep ConnectX-6
> 3b:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6
> Dx]
> 3b:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6
> Dx]
> 
> Guest kernel version:
> kernel-4.18.0-353.el8.x86_64
> 
> Please feel free to tell me if you need me to provide an environment so that
> you can reproduce the problem.

Could you try to reproduce the problem with the simulator?

You can use the following steps to create the device:

 # modprobe vhost-vdpa
 # modprobe vdpa_sim_net
 # vdpa dev add mgmtdev vdpasim_net name vdpasim0
 # ls /sys/devices/vdpasim0
   driver  power  subsystem  uevent  vhost-vdpa-0

and you can use /dev/vhost-vdpa-0 with "vhostdev" in netdev_add
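
The simulator setup above could be wrapped in a small helper for repeated test runs. This is only a sketch: the function names are hypothetical, the commands are exactly the three listed in this comment, and actually running them requires root privileges and the `vdpa` tool. By default the helper only returns the command lines without executing anything.

```python
import subprocess


def simulator_setup_commands(name="vdpasim0"):
    """Return the argv lists for creating a vdpa_sim_net device."""
    return [
        ["modprobe", "vhost-vdpa"],
        ["modprobe", "vdpa_sim_net"],
        ["vdpa", "dev", "add", "mgmtdev", "vdpasim_net", "name", name],
    ]


def create_simulator_device(name="vdpasim0", dry_run=True):
    """Run the setup commands (needs root), or only list them when dry_run is set."""
    commands = simulator_setup_commands(name)
    if not dry_run:
        for argv in commands:
            # check=True raises CalledProcessError if a step fails
            subprocess.run(argv, check=True)
    return [" ".join(argv) for argv in commands]
```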

Comment 4 Lei Yang 2021-12-14 12:58:31 UTC
(In reply to Laurent Vivier from comment #3)
> Could you try to reproduce the problem with the simulator?
> You can use the following steps to create the device:
> 
>  # modprobe vhost-vdpa
>  # modprobe vdpa_sim_net
>  # vdpa dev add mgmtdev vdpasim_net name vdpasim0
>  ls /sys/devices/vdpasim0
>    driver  power  subsystem  uevent  vhost-vdpa-0
> 
> and you can use /dev/vhost-vdpa-0 with "vhostdev" in netdev_add

Hi Laurent

I tested this scenario with the simulator and could not reproduce the problem.

Test Steps:
1. Create simulator vdpa device
[root@dell-per440-23 ~]# modprobe vhost-vdpa
[root@dell-per440-23 ~]# modprobe vdpa_sim_net
[root@dell-per440-23 ~]# vdpa dev add mgmtdev vdpasim_net name vdpasim0
[root@dell-per440-23 ~]# ls /sys/devices/vdpasim0
driver  power  subsystem  uevent  vhost-vdpa-0

2. Boot a guest without a vDPA device

3. Hot plug this device via QMP
$ telnet 10.73.178.83 5555
Trying 10.73.178.83...
Connected to 10.73.178.83.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 1, "major": 6}, "package": "qemu-kvm-6.1.0-5.module+el8.6.0+13430+8fdd5f85"}, "capabilities": ["oob"]}}
{"execute":"qmp_capabilities"}
{"return": {}}
{'execute': 'netdev_add', 'arguments': {'type': 'vhost-vdpa', 'id': 'idKcVRcM', 'vhostdev': '/dev/vhost-vdpa-0'}}
{"return": {}}
{"execute": "device_add", "arguments": {"driver":"virtio-net-pci","netdev":"idKcVRcM","mac":"00:1a:4a:42:0b:01","id":"id7ARCys","bus":"pcie_extra_root_port_0","addr":"0x0"}}
{"return": {}}
{"timestamp": {"seconds": 1639486083, "microseconds": 575719}, "event": "NIC_RX_FILTER_CHANGED", "data": {"name": "id7ARCys", "path": "/machine/peripheral/id7ARCys/virtio-backend"}}

4. Stop the guest via QMP
{'execute': 'stop'}
{"timestamp": {"seconds": 1638165586, "microseconds": 984624}, "event": "STOP"}
{"return": {}}

5. Continue the guest via QMP
{'execute': 'cont'}
{"timestamp": {"seconds": 1638165588, "microseconds": 628139}, "event": "RESUME"}
{"return": {}}

6. Hot unplug the vDPA device
{'execute': 'device_del', 'arguments': {'id': 'id7ARCys'}}
{"return": {}}
{"timestamp": {"seconds": 1639486122, "microseconds": 330863}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/id7ARCys/virtio-backend"}}
{"timestamp": {"seconds": 1639486122, "microseconds": 382079}, "event": "DEVICE_DELETED", "data": {"device": "id7ARCys", "path": "/machine/peripheral/id7ARCys"}}
{'execute': 'netdev_del', 'arguments': {'id': 'idKcVRcM'}}
{"return": {}}

Guest works well.

Best Regards
Lei

Comment 5 Lei Yang 2022-01-27 04:24:33 UTC
Hit the same issue.

Test Version:
kernel-4.18.0-360.el8.mr1880_220122_0148.x86_64
qemu-kvm-6.2.0-5.module+el8.6.0+14025+ca131e0a.x86_64

Comment 6 Laurent Vivier 2022-02-02 11:00:54 UTC
(In reply to Lei Yang from comment #5)
> Hit same issue
> 
> Test Version:
> kernel-4.18.0-360.el8.mr1880_220122_0148.x86_64
> qemu-kvm-6.2.0-5.module+el8.6.0+14025+ca131e0a.x86_64

Is it possible for me to have access to the coredump of QEMU or to the machine to reproduce the problem myself?

Comment 8 Laurent Vivier 2022-02-07 18:04:21 UTC
(gdb) bt
#0  0x00007ffff488ba4f in raise () from /lib64/libc.so.6
#1  0x00007ffff485edb5 in abort () from /lib64/libc.so.6
#2  0x00007ffff58f9123 in g_assertion_message.cold ()
   from /lib64/libglib-2.0.so.0
#3  0x00007ffff595220e in g_assertion_message_expr ()
   from /lib64/libglib-2.0.so.0
#4  0x0000555555b41a41 in object_unref (objptr=0x55555699f6b0)
    at ../qom/object.c:1183
#5  object_unref (objptr=0x55555699f6b0) at ../qom/object.c:1177
#6  0x0000555555b41aaf in object_property_del_all (obj=0x555556970fa0)
    at ../qom/object.c:626
#7  object_finalize (data=0x555556970fa0) at ../qom/object.c:687
#8  object_unref (objptr=0x555556970fa0) at ../qom/object.c:1187
#9  0x0000555555b3ca51 in bus_free_bus_child (kid=0x55555697b750)
    at ../hw/core/qdev.c:55
#10 0x0000555555c6660b in call_rcu_thread (opaque=<optimized out>)
    at ../util/rcu.c:284
#11 0x0000555555c5d2f4 in qemu_thread_start (args=0x55555652d260)
    at ../util/qemu-thread-posix.c:556
#12 0x00007ffff4c0a17f in start_thread () from /lib64/libpthread.so.0
#13 0x00007ffff4876d83 in clone () from /lib64/libc.so.6

#6  0x0000555555b41aaf in object_property_del_all (obj=0x555556970fa0) at ../qom/object.c:626
626	                    prop->release(obj, prop->name, prop->opaque);

(gdb) p obj->ref
$2 = 0
(gdb) p *prop
$4 = {name = 0x7ff8d001f4b0 "vhost-vdpa\\x2fhost-notifier@0x55555699f5c0 mmaps\\x5b0\\x5d[0]", 
  type = 0x7ff8d001f3e0 "child<memory-region>", description = 0x0, get = 0x555555b43ab0 <object_get_child_property>, 
  set = 0x0, resolve = 0x555555b3fe50 <object_resolve_child_property>, 
  release = 0x555555b41b80 <object_finalize_child_property>, init = 0x0, opaque = 0x55555699f6b0, defval = 0x0}

"vhost-vdpa/host-notifier@0x55555699f5c0 mmaps\\x5b0\\x5d[0]" is added by vhost_vdpa_host_notifier_init() using virtio_queue_set_host_notifier_mr().

A first guess is that vhost_vdpa_host_notifier_uninit() is called on the stop command and decreases obj->ref.
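
The property name in the gdb output is easier to read once the hexadecimal escapes are decoded. A minimal sketch, assuming only that each `\xNN` sequence (printed by gdb with a doubled backslash) stands for the character with that hexadecimal code; the helper name `decode_escaped_name` is hypothetical:

```python
import re


def decode_escaped_name(name):
    """Replace hexadecimal escapes, with one or two leading backslashes, by their character."""
    return re.sub(r"\\{1,2}x([0-9a-fA-F]{2})",
                  lambda match: chr(int(match.group(1), 16)),
                  name)


# Property name as printed by gdb in this comment (raw string keeps the backslashes).
raw = r"vhost-vdpa\\x2fhost-notifier@0x55555699f5c0 mmaps\\x5b0\\x5d[0]"
decoded = decode_escaped_name(raw)
print(decoded)
```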

Comment 9 Laurent Vivier 2022-02-08 09:30:13 UTC
Tested with upstream QEMU (55ef0b702bc2), and it seems fixed.

Comment 10 Laurent Vivier 2022-02-09 08:21:01 UTC
(In reply to Laurent Vivier from comment #9)
> Tested with upstream QEMU (55ef0b702bc2), and it seems fixed.

It's not fixed: I was unable to reproduce the coredump because of another error:

vhost_set_features failed: Device or resource busy (16)
unable to start vhost net: 16: falling back on userspace virtio

Comment 11 Laurent Vivier 2022-02-11 17:13:40 UTC
I have sent a fix upstream:

https://patchew.org/QEMU/20220211170259.1388734-1-lvivier@redhat.com/mbox

Author: Laurent Vivier <lvivier>
Date:   Fri Feb 11 17:49:36 2022 +0100

    hw/virtio: vdpa: Fix leak of host-notifier memory-region
    
    If call virtio_queue_set_host_notifier_mr fails, should free
    host-notifier memory-region.
    
    This problem can trigger a coredump with some vDPA drivers (mlx5,
    but not with the vdpasim), if we unplug the virtio-net card from
    the guest after a stop/start.
    
    The same fix has been done for vhost-user:
      1f89d3b91e3e ("hw/virtio: Fix leak of host-notifier memory-region")
    
    Fixes: d0416d487bd5 ("vhost-vdpa: map virtqueue notification area if possible")
    Cc: jasowang
    Resolves: https://bugzilla.redhat.com/2027208
    Signed-off-by: Laurent Vivier <lvivier>

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 04ea43704f5d..11f696468dc1 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -431,6 +431,7 @@ static int vhost_vdpa_host_notifier_init(struct vhost_dev *dev, int queue_index)
     g_free(name);
 
     if (virtio_queue_set_host_notifier_mr(vdev, queue_index, &n->mr, true)) {
+        object_unparent(OBJECT(&n->mr));
         munmap(addr, page_size);
         goto err;
     }
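
To illustrate why a child property left behind on an error path can end in an assertion at unplug time, here is a toy reference-counting model. It is not QEMU's QOM code: `ToyObject`, `notifier_init`, and `run_scenario` are hypothetical names, and the sequence it plays out (a failed init followed by a later init that reuses the same embedded storage) is an assumption about how the leak turns into the crash, not something confirmed in this report. The `fixed` flag mirrors the one-line `object_unparent()` call added by the patch.

```python
class ToyObject:
    """Minimal stand-in for a reference-counted object with child properties."""

    def __init__(self, name):
        self.name = name
        self.ref = 1
        self.children = {}  # property name -> child object

    def reinit(self):
        # Models initializing embedded storage in place: the count is reset.
        self.ref = 1

    def unref(self):
        if self.ref <= 0:
            raise RuntimeError(f"unref of {self.name} with ref == 0")
        self.ref -= 1
        if self.ref == 0:
            self.finalize()

    def add_child(self, prop, child):
        child.ref += 1  # the property holds one reference
        self.children[prop] = child

    def unparent(self, child):
        # Drop the first property that points at child, and its reference.
        for prop, obj in list(self.children.items()):
            if obj is child:
                del self.children[prop]
                child.unref()
                break

    def finalize(self):
        # Release every child property, as happens when a device goes away.
        for prop in list(self.children):
            self.children.pop(prop).unref()


def notifier_init(device, region, set_mr_ok, fixed):
    region.reinit()                                    # ref == 1
    prop = f"host-notifier[{len(device.children)}]"    # unique property name
    device.add_child(prop, region)                     # ref == 2
    region.unref()                                     # ref == 1, held by device
    if not set_mr_ok:
        if fixed:
            device.unparent(region)                    # the added cleanup
        return False
    return True


def run_scenario(fixed):
    device = ToyObject("virtio-net")
    region = ToyObject("host-notifier")
    notifier_init(device, region, set_mr_ok=False, fixed=fixed)  # init fails
    notifier_init(device, region, set_mr_ok=True, fixed=fixed)   # storage reused
    try:
        device.unref()  # unplug: the device is finalized
    except RuntimeError as exc:
        return f"crash: {exc}"
    return "ok"
```

Without the cleanup, two properties end up pointing at the same region while it holds a single reference, so the second release hits a zero count. That is the same shape as the `object_unref()` assertion reached from `object_property_del_all()` in the backtraces. With the cleanup the stale property never exists and finalization completes.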

Comment 14 Lei Yang 2022-02-14 03:40:29 UTC
==> Reproduced on qemu-kvm-6.2.0-5.module+el8.6.0+14025+ca131e0a.x86_64

1. Boot a guest without vdpa device
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-machine q35,memory-backend=mem-machine_mem \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-m 28672 \
-object memory-backend-ram,size=28672M,id=mem-machine_mem  \
-smp 32,maxcpus=32,cores=16,threads=1,dies=1,sockets=2  \
-cpu 'Cascadelake-Server',ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,kvm_pv_unhalt=on \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel860-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on  \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=off,order=cdn,once=c,strict=off \
-net none \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=4 \
-monitor stdio \
-qmp tcp:0:5555,server,nowait \

2. Hot plug the vDPA device
$ telnet 10.73.178.85 5555
Trying 10.73.178.85...
Connected to 10.73.178.85.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 2, "major": 6}, "package": "qemu-kvm-6.2.0-5.module+el8.6.0+14025+ca131e0a"}, "capabilities": ["oob"]}}
{"execute":"qmp_capabilities"}
{"return": {}}
{'execute': 'netdev_add', 'arguments': {'type': 'vhost-vdpa', 'id': 'idKcVRcM', 'vhostdev': '/dev/vhost-vdpa-3'}}
{"return": {}}
{"execute": "device_add", "arguments": {"driver":"virtio-net-pci","netdev":"idKcVRcM","mac":"00:1a:4a:42:0b:01","id":"id7ARCys","bus":"pcie_extra_root_port_0","addr":"0x0"}}
{"return": {}}


3. Stop the guest via QMP
{'execute': 'stop'}
{"timestamp": {"seconds": 1644809353, "microseconds": 951537}, "event": "STOP"}
{"return": {}}


4. Continue the guest via QMP
{'execute': 'cont'}
{"timestamp": {"seconds": 1644809357, "microseconds": 199254}, "event": "RESUME"}
{"return": {}}


5. hot unplug vdpa device
{'execute': 'device_del', 'arguments': {'id': 'id7ARCys'}}
{"return": {}}
Connection closed by foreign host.

6. QEMU crashes


==> Repeated the above process; the test passes on qemu-kvm-6.2.0-5.el8.BZ2027208.

So this bug should be fixed in qemu-kvm-6.2.0-5.el8.BZ2027208.

Comment 15 Pei Zhang 2022-02-15 01:47:23 UTC
*** Bug 2039210 has been marked as a duplicate of this bug. ***

Comment 27 Lei Yang 2022-03-17 02:42:20 UTC
I tested with qemu-kvm-6.2.0-9.module+el8.6.0+14495+7194fa43.x86_64 and the issue no longer occurs.

Test Version:
qemu-kvm-6.2.0-9.module+el8.6.0+14495+7194fa43.x86_64
kernel-4.18.0-372.el8.x86_64

Test Steps
1. Boot a guest without vdpa device
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-machine q35,memory-backend=mem-machine_mem \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-m 28672 \
-object memory-backend-ram,size=28672M,id=mem-machine_mem  \
-smp 32,maxcpus=32,cores=16,threads=1,dies=1,sockets=2  \
-cpu 'Cascadelake-Server',ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,kvm_pv_unhalt=on \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel860-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on  \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=off,order=cdn,once=c,strict=off \
-net none \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=4 \
-monitor stdio \
-qmp tcp:0:5555,server,nowait \

2. Hot plug a vDPA device
$ telnet 10.73.178.83 5555
Trying 10.73.178.83...
Connected to 10.73.178.83.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 2, "major": 6}, "package": "qemu-kvm-6.2.0-9.module+el8.6.0+14495+7194fa43"}, "capabilities": ["oob"]}}
{"execute":"qmp_capabilities"}
{"return": {}}
{'execute': 'netdev_add', 'arguments': {'type': 'vhost-vdpa', 'id': 'idKcVRcM', 'vhostdev': '/dev/vhost-vdpa-7'}}
{"return": {}}
{"execute": "device_add", "arguments": {"driver":"virtio-net-pci","netdev":"idKcVRcM","mac":"00:1a:4a:42:0b:01","id":"id7ARCys","bus":"pcie_extra_root_port_0","addr":"0x0"}}
{"return": {}}

3. Stop the guest
{'execute': 'stop'}
{"timestamp": {"seconds": 1647484484, "microseconds": 938175}, "event": "STOP"}
{"return": {}}

4. Continue the guest
{'execute': 'cont'}
{"timestamp": {"seconds": 1647484490, "microseconds": 139176}, "event": "RESUME"}
{"return": {}}

5. Hot unplug the device
{"return": {}}
{"timestamp": {"seconds": 1647484496, "microseconds": 410998}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/id7ARCys/virtio-backend"}}
{"timestamp": {"seconds": 1647484496, "microseconds": 462144}, "event": "DEVICE_DELETED", "data": {"device": "id7ARCys", "path": "/machine/peripheral/id7ARCys"}}
{'execute': 'netdev_del', 'arguments': {'id': 'idKcVRcM'}}
{"return": {}}

6. Guest works well
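
A passing run can be told apart from a crashing one by the `DEVICE_DELETED` events that follow `device_del`: the verification logs show two of them, while the failing runs end with "Connection closed by foreign host." The sketch below checks a captured QMP session for those events. The helper names are hypothetical, and lines that are not strict JSON (telnet chatter, or commands typed with single quotes) are skipped.

```python
import json


def device_deleted_events(qmp_lines, device_id):
    """Return DEVICE_DELETED events whose QOM path contains device_id."""
    events = []
    for line in qmp_lines:
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a strict-JSON QMP message
        if not isinstance(message, dict):
            continue
        if message.get("event") != "DEVICE_DELETED":
            continue
        path = message.get("data", {}).get("path", "")
        if device_id in path.split("/"):
            events.append(message)
    return events


def unplug_completed(qmp_lines, device_id):
    """True if an event reports the device itself (not only its backend) as deleted."""
    return any(event["data"].get("device") == device_id
               for event in device_deleted_events(qmp_lines, device_id))
```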

Comment 32 Laurent Vivier 2022-03-24 08:57:17 UTC
Mirek,

do you know why this BZ is stuck in MODIFIED state?

Thanks

Comment 37 errata-xmlrpc 2022-05-10 13:24:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1759

