Bug 1960137 - [incremental backup] qemu-kvm hangs when Rebooting the VM during full backup
Summary: [incremental backup] qemu-kvm hangs when Rebooting the VM during full backup
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: ---
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: 8.4
Assignee: Sergio Lopez
QA Contact: aihua liang
URL:
Whiteboard:
Depends On:
Blocks: 1892681 1957194 1973829
 
Reported: 2021-05-13 06:46 UTC by Ilan Zuckerman
Modified: 2021-11-16 08:37 UTC
CC List: 8 users

Fixed In Version: qemu-kvm-6.0.0-21.module+el8.5.0+11555+e0ab0d09
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1973829
Environment:
Last Closed: 2021-11-16 07:53:34 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
VM dump xml: VM_1_raw_disk_virtio_dump.xml; core dump of the qemu-kvm pid (gcore {qemu-kvm-pid}): core.380155; engine log + vdsm log + daemon.log from both engine and vdsm (9.90 MB, application/zip)
2021-05-13 06:46 UTC, Ilan Zuckerman


Links
Red Hat Product Errata RHBA-2021:4684 (last updated 2021-11-16 07:54:08 UTC)

Description Ilan Zuckerman 2021-05-13 06:46:32 UTC
Created attachment 1782624 [details]
VM dump xml: VM_1_raw_disk_virtio_dump.xml Core dump of qemu-kvm pid (gcore {qemu-kvm-pid}): core.380155 Engine log + vdsm log + daemon.log from both, engine and vdsm

Description of problem:

When rebooting the VM (from within the VM) during a full backup, right after the disk download has finished, the VM becomes 'not responding'.
At this stage, we are not able to power off the VM from the engine GUI due to "Cannot stop VM. The VM is during a backup operation.". The only way is to terminate it with kill -9 {qemu-kvm-pid};
a regular kill (without the -9) has no effect.



Version-Release number of selected component (if applicable):
rhv-4.4.6-8
qemu-kvm-core-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64



How reproducible:
I repeated the same flow (full backup plus reboot) with a virtio raw disk VM, as described in comment #25, four times.
The issue was seen in 4 out of the 4 reboots.
I also tried the same flow with a raw disk virtio-scsi VM, but in that configuration everything worked as expected.


Steps to Reproduce:
1. Create raw OS disk with fedora on the host:
virt-builder fedora-32 -o /var/tmp/fedora-32.raw --format=qcow2 --hostname=f32 --ssh-inject=root --root-password=password:XXX --selinux-relabel --install=qemu-guest-agent

2. Upload the disk to file storage SD
python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py /var/tmp/fedora-32.raw --disk-format=raw --disk-sparse --sd-name=nfs_0 -c engine

3. Edit the 'backup_vm.py' script to have the disk download repeated 5 times:
vi /usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py

diff --git a/sdk/examples/backup_vm.py b/sdk/examples/backup_vm.py
index f00f4e6..494716b 100755
--- a/sdk/examples/backup_vm.py
+++ b/sdk/examples/backup_vm.py
@@ -349,9 +349,10 @@ def download_backup(connection, backup_uuid, args, incremental=False):
     backup_type = "incremental" if incremental else "full"
     timestamp = time.strftime("%Y%m%d%H%M")
     for disk in backup_disks:
-        file_name = "{}.{}.{}.qcow2".format(disk.id, timestamp, backup_type)
-        disk_path = os.path.join(args.backup_dir, file_name)
-        download_disk(connection, backup_uuid, disk, disk_path, args, incremental=incremental)
+        for i in range(5):
+            file_name = "{}.{:02d}.qcow2".format(disk.id, i)
+            disk_path = os.path.join(args.backup_dir, file_name)
+            download_disk(connection, backup_uuid, disk, disk_path, args, incremental=incremental)


4. In the engine UI, create a blank VM, attach the uploaded disk to it as an OS disk, and start the VM

5. Run the full backup flow. When the disk finishes downloading, reboot the VM from within itself (via ssh). Repeat the reboots until the script is done:

[root@storage-ge13-vdsm1 examples]# python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py -c engine full 2fe28915-fe33-4f92-9612-7684bbb0ea98 --backup-dir /rhev/data-center/mnt/mantis-nfs-lif2.lab.eng.tlv2.redhat.com\:_nas01_ge__storage13__nfs__1/backup/raw/
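
For reference, this is a minimal sketch of how the reboot loop in step 5 could be scripted, assuming root ssh access to the guest; the guest address, backup directory and sleep interval are placeholders, not values from this setup:

#!/usr/bin/env python3
# Hypothetical helper for step 5: run backup_vm.py and keep rebooting the
# guest over ssh until the backup script exits. Guest address, backup
# directory and the 60s settle time are placeholders.
import subprocess
import time

GUEST = "root@192.0.2.10"   # placeholder guest address
BACKUP_CMD = [
    "python3", "/usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py",
    "-c", "engine", "full", "2fe28915-fe33-4f92-9612-7684bbb0ea98",
    "--backup-dir", "/path/to/backup/dir",   # placeholder backup directory
]

backup = subprocess.Popen(BACKUP_CMD)
while backup.poll() is None:                  # loop until backup_vm.py exits
    # 'reboot' drops the ssh connection, so a non-zero exit status is expected
    subprocess.run(["ssh", GUEST, "reboot"], check=False)
    time.sleep(60)                            # let the guest come back up
print("backup script exited with", backup.returncode)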



Actual results:

After the second reboot, the qemu-kvm process hangs and the VM becomes 'not responding':

[root@storage-ge13-vdsm1 examples]# python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py -c engine full 2fe28915-fe33-4f92-9612-7684bbb0ea98 --backup-dir /rhev/data-center/mnt/mantis-nfs-lif2.lab.eng.tlv2.redhat.com\:_nas01_ge__storage13__nfs__1/backup/raw/
[   0.0 ] Starting full backup for VM 2fe28915-fe33-4f92-9612-7684bbb0ea98
[   6.7 ] Waiting until backup 2eb2ea4d-bf57-4eec-a6fe-42a04e02b4fc is ready
[  28.0 ] Creating image transfer for disk 82b0903c-4076-4e80-9fd4-d7b80783913f
[  30.1 ] Image transfer 18b8cc2e-3c66-4663-8d6f-27d4eef2ac4f is ready
[ 100.00% ] 6.00 GiB, 70.75 seconds, 86.85 MiB/s                               
[ 100.9 ] Finalizing image transfer
[ 104.0 ] Creating image transfer for disk 82b0903c-4076-4e80-9fd4-d7b80783913f
[ 105.1 ] Image transfer a80f0c3f-95ad-4ed1-8a60-9394ce5a770e is ready
[ 100.00% ] 6.00 GiB, 73.14 seconds, 84.00 MiB/s                               
[ 178.2 ] Finalizing image transfer
[ 181.4 ] Creating image transfer for disk 82b0903c-4076-4e80-9fd4-d7b80783913f
[ 183.1 ] Image transfer a43573b0-802b-47cb-9b75-523085ac9716 is ready
[ 100.00% ] 6.00 GiB, 71.93 seconds, 85.42 MiB/s                               
[ 255.0 ] Finalizing image transfer
[ 257.6 ] Creating image transfer for disk 82b0903c-4076-4e80-9fd4-d7b80783913f
[ 259.7 ] Image transfer ee69b649-6747-4a93-816c-275aed3ed301 is ready
[ 100.00% ] 6.00 GiB, 73.66 seconds, 83.41 MiB/s                               
[ 333.4 ] Finalizing image transfer
[ 343.6 ] Creating image transfer for disk 82b0903c-4076-4e80-9fd4-d7b80783913f
[ 344.7 ] Image transfer 9c46a2be-d5cd-4200-b120-747d47c2061a is ready
[   8.33% ] 512.00 MiB, 70.47 seconds, 7.27 MiB/s                              
[ 415.2 ] Finalizing image transfer
[ 476.0 ] Finalizing backup
Traceback (most recent call last):
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py", line 420, in download_disk
    **extra_args)
  File "/usr/lib64/python3.6/site-packages/ovirt_imageio/client/_api.py", line 186, in download
    name="download")
  File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/io.py", line 69, in copy
    log.debug("Executor failed")
  File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/io.py", line 189, in __exit__
    self.stop()
  File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/io.py", line 166, in stop
    raise self._errors[0]
  File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/io.py", line 238, in _run
    handler.copy(req)
  File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/io.py", line 282, in copy
    self._src.write_to(self._dst, req.length, self._buf)
  File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/http.py", line 211, in write_to
    n = res.readinto(view[:step])
  File "/usr/lib64/python3.6/http/client.py", line 507, in readinto
    n = self.fp.readinto(b)
  File "/usr/lib64/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py", line 177, in cmd_full
    download_backup(connection, backup.id, args)
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py", line 376, in download_backup
    download_disk(connection, backup_uuid, disk, disk_path, args, incremental=incremental)
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py", line 423, in download_disk
    imagetransfer.finalize_transfer(connection, transfer, disk)
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/helpers/imagetransfer.py", line 332, in finalize_transfer
    .format(transfer.id))
RuntimeError: Timed out waiting for transfer 9c46a2be-d5cd-4200-b120-747d47c2061a to finalize

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py", line 457, in <module>
    main()
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py", line 161, in main
    args.command(args)
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py", line 180, in cmd_full
    stop_backup(connection, backup.id, args)
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py", line 335, in stop_backup
    backup_service.finalize()
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/services.py", line 33793, in finalize
    return self._internal_action(action, 'finalize', None, headers, query, wait)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 299, in _internal_action
    return future.wait() if wait else future
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 55, in wait
    return self._code(response)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 296, in callback
    self._check_fault(response)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 132, in _check_fault
    self._raise_error(response, body)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 118, in _raise_error
    raise error
ovirtsdk4.Error: Fault reason is "Operation Failed". Fault detail is "[]". HTTP response code is 400.



Expected results:
The VM shouldn't hang, and the backup flow should end successfully.

Additional info:
Adding the following:
VM dump xml: VM_1_raw_disk_virtio_dump.xml
Core dump of qemu-kvm pid (gcore {qemu-kvm-pid}): core.380155
Engine log + vdsm log + daemon.log from both engine and vdsm

Comment 1 Nir Soffer 2021-05-13 09:31:43 UTC
Adding more info from the RHV bug:

Backtrace of qemu after the vm becomes non-responsive:

[New LWP 380155]
[New LWP 380160]
[New LWP 380161]
[New LWP 380165]
[New LWP 380167]
[New LWP 380171]
[New LWP 380172]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/libexec/qemu-kvm'.
#0  0x00007f9daa00c65d in __lll_lock_wait () from /lib64/libpthread.so.0
[Current thread is 1 (Thread 0x7f9dad9def00 (LWP 380155))]

Thread 7 (Thread 0x7f9d43dff700 (LWP 380172)):
#0  0x00007f9daa0092fc in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1  0x00005641db21e04d in qemu_cond_wait_impl (cond=<optimized out>, mutex=0x5641de00be88, file=0x5641db2540d3 "../ui/vnc-jobs.c", line=215) at ../util/qemu-thread-posix.c:174
#2  0x00005641daec1d71 in vnc_worker_thread_loop (queue=queue@entry=0x5641de00be50) at ../ui/vnc-jobs.c:215
#3  0x00005641daec26a0 in vnc_worker_thread (arg=0x5641de00be50) at ../ui/vnc-jobs.c:325
#4  0x00005641db21dc94 in qemu_thread_start (args=0x5641de00bee0) at ../util/qemu-thread-posix.c:521
#5  0x00007f9daa00314a in start_thread () at /lib64/libpthread.so.0
#6  0x00007f9da9d32dc3 in clone () at /lib64/libc.so.6

Thread 6 (Thread 0x7f9d54fff700 (LWP 380171)):
#0  0x00007f9da9d27a41 in poll () at /lib64/libc.so.6
#1  0x00007f9dac839ab6 in g_main_context_iterate.isra () at /lib64/libglib-2.0.so.0
#2  0x00007f9dac839e72 in g_main_loop_run () at /lib64/libglib-2.0.so.0
#3  0x00007f9d9f6c58db in red_worker_main () at /lib64/libspice-server.so.1
#4  0x00007f9daa00314a in start_thread () at /lib64/libpthread.so.0
#5  0x00007f9da9d32dc3 in clone () at /lib64/libc.so.6

Thread 5 (Thread 0x7f9d575fe700 (LWP 380167)):
#0  0x00007f9da9d27b36 in ppoll () at /lib64/libc.so.6
#1  0x00005641db21efa9 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2  0x00005641db21efa9 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=-1) at ../util/qemu-timer.c:337
#3  0x00005641db21e909 in fdmon_poll_wait (ctx=0x5641dc16ba60, ready_list=0x7f9d575fd210, timeout=-1) at ../util/fdmon-poll.c:80
#4  0x00005641db21b511 in aio_poll (ctx=0x5641dc16ba60, blocking=blocking@entry=true) at ../util/aio-posix.c:607
#5  0x00005641db214d3a in aio_wait_bh_oneshot (ctx=0x5641dc1768e0, cb=cb@entry=0x5641db135170 <nbd_aio_detach_bh>, opaque=opaque@entry=0x5641de42b400) at ../util/aio-wait.c:71
#6  0x00005641db13414c in blk_aio_detach (opaque=0x5641de42b400) at ../nbd/server.c:1573
#7  0x00005641db18945b in bdrv_detach_aio_context (bs=0x5641dc1f0990) at ../block.c:6394
#8  0x00005641db18945b in bdrv_set_aio_context_ignore (bs=0x5641dc1f0990, new_context=new_context@entry=0x5641dc16ba60, ignore=ignore@entry=0x7f9d575fd350) at ../block.c:6502
#9  0x00005641db18938b in bdrv_set_aio_context_ignore (bs=bs@entry=0x5641de01a680, new_context=new_context@entry=0x5641dc16ba60, ignore=ignore@entry=0x7f9d575fd350) at ../block.c:6489
#10 0x00005641db189973 in bdrv_child_try_set_aio_context (bs=bs@entry=0x5641de01a680, ctx=0x5641dc16ba60, ignore_child=<optimized out>, errp=<optimized out>) at ../block.c:6606
#11 0x00005641db141f3b in blk_do_set_aio_context (blk=0x5641ddbe08f0, new_context=0x5641dc16ba60, update_root_node=update_root_node@entry=true, errp=errp@entry=0x0) at ../block/block-backend.c:2026
#12 0x00005641db144641 in blk_set_aio_context (blk=<optimized out>, new_context=<optimized out>, errp=errp@entry=0x0) at ../block/block-backend.c:2047
#13 0x00005641db087f34 in virtio_blk_data_plane_stop (vdev=<optimized out>) at ../hw/block/dataplane/virtio-blk.c:304
#14 0x00005641daf5b320 in virtio_bus_stop_ioeventfd (bus=bus@entry=0x5641ddbdcb48) at ../hw/virtio/virtio-bus.c:245
#15 0x00005641daf5ba6f in virtio_bus_stop_ioeventfd (bus=bus@entry=0x5641ddbdcb48) at ../hw/virtio/virtio-bus.c:237
#16 0x00005641daf40958 in virtio_pci_stop_ioeventfd (proxy=0x5641ddbd4a40) at ../hw/virtio/virtio-pci.c:1255
#17 0x00005641daf40958 in virtio_pci_common_write (opaque=0x5641ddbd4a40, addr=<optimized out>, val=<optimized out>, size=<optimized out>) at ../hw/virtio/virtio-pci.c:1255
#18 0x00005641db0cf508 in memory_region_write_accessor (mr=<optimized out>, addr=<optimized out>, value=<optimized out>, size=<optimized out>, shift=<optimized out>, mask=<optimized out>, attrs=...) at ../softmmu/memory.c:491
#19 0x00005641db0ce0fe in access_with_adjusted_size (addr=addr@entry=20, value=value@entry=0x7f9d575fd518, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=0x5641db0cf4b0 <memory_region_write_accessor>, mr=0x5641ddbd5420, attrs=...) at ../softmmu/memory.c:552
#20 0x00005641db0d14a3 in memory_region_dispatch_write (mr=mr@entry=0x5641ddbd5420, addr=20, data=<optimized out>, op=<optimized out>, attrs=attrs@entry=...) at ../softmmu/memory.c:1501
#21 0x00005641db07c957 in flatview_write_continue (fv=fv@entry=0x7f9d482a3ef0, addr=addr@entry=4263510036, attrs=..., ptr=ptr@entry=0x7f9dad9fd028, len=len@entry=1, addr1=<optimized out>, l=<optimized out>, mr=0x5641ddbd5420) at /usr/src/debug/qemu-kvm-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64/include/qemu/host-utils.h:164
#22 0x00005641db07cb76 in flatview_write (fv=0x7f9d482a3ef0, addr=4263510036, attrs=..., buf=0x7f9dad9fd028, len=1) at ../softmmu/physmem.c:2799
#23 0x00005641db08070f in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) at ../softmmu/physmem.c:2891
#24 0x00005641db0a41da in kvm_cpu_exec (cpu=cpu@entry=0x5641dc1f54c0) at ../accel/kvm/kvm-all.c:2553
#25 0x00005641db0dc8e5 in kvm_vcpu_thread_fn (arg=0x5641dc1f54c0) at ../accel/kvm/kvm-cpus.c:49
#26 0x00005641db21dc94 in qemu_thread_start (args=0x5641dc21cdd0) at ../util/qemu-thread-posix.c:521
#27 0x00007f9daa00314a in start_thread () at /lib64/libpthread.so.0
#28 0x00007f9da9d32dc3 in clone () at /lib64/libc.so.6

Thread 4 (Thread 0x7f9d57dff700 (LWP 380165)):
#0  0x00007f9da9d27a41 in poll () at /lib64/libc.so.6
#1  0x00007f9dac839ab6 in g_main_context_iterate.isra () at /lib64/libglib-2.0.so.0
#2  0x00007f9dac839e72 in g_main_loop_run () at /lib64/libglib-2.0.so.0
#3  0x00005641db12aa69 in iothread_run (opaque=0x5641dbff9b20) at ../iothread.c:80
#4  0x00005641db21dc94 in qemu_thread_start (args=0x5641dc1eac40) at ../util/qemu-thread-posix.c:521
#5  0x00007f9daa00314a in start_thread () at /lib64/libpthread.so.0
#6  0x00007f9da9d32dc3 in clone () at /lib64/libc.so.6

Thread 3 (Thread 0x7f9d9dbf0700 (LWP 380161)):
#0  0x00007f9da9d27b36 in ppoll () at /lib64/libc.so.6
#1  0x00005641db21ef65 in ppoll (__ss=0x0, __timeout=0x7f9d9dbef520, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2  0x00005641db21ef65 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=599999036818) at ../util/qemu-timer.c:349
#3  0x00005641db21e909 in fdmon_poll_wait (ctx=0x5641dc1768e0, ready_list=0x7f9d9dbef5a0, timeout=599999036818) at ../util/fdmon-poll.c:80
#4  0x00005641db21b511 in aio_poll (ctx=ctx@entry=0x5641dc1768e0, blocking=blocking@entry=true) at ../util/aio-posix.c:607
#5  0x00005641db13528d in nbd_aio_detach_bh (opaque=0x5641de42b400) at ../nbd/server.c:1562
#6  0x00005641db214c6f in aio_wait_bh (opaque=0x7f9d575fd260) at ../util/aio-wait.c:55
#7  0x00005641db212c5d in aio_bh_call (bh=0x7f9d98003620) at ../util/async.c:164
#8  0x00005641db212c5d in aio_bh_poll (ctx=ctx@entry=0x5641dc1768e0) at ../util/async.c:164
#9  0x00005641db21b122 in aio_poll (ctx=0x5641dc1768e0, blocking=blocking@entry=true) at ../util/aio-posix.c:659
#10 0x00005641db12aa3a in iothread_run (opaque=0x5641dc112f20) at ../iothread.c:73
#11 0x00005641db21dc94 in qemu_thread_start (args=0x5641dc176df0) at ../util/qemu-thread-posix.c:521
#12 0x00007f9daa00314a in start_thread () at /lib64/libpthread.so.0
#13 0x00007f9da9d32dc3 in clone () at /lib64/libc.so.6

Thread 2 (Thread 0x7f9da52e4700 (LWP 380160)):
#0  0x00007f9da9d2d52d in syscall () at /lib64/libc.so.6
#1  0x00005641db21e55f in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at /usr/src/debug/qemu-kvm-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64/include/qemu/futex.h:29
#2  0x00005641db21e55f in qemu_event_wait (ev=ev@entry=0x5641dbaa4a40 <rcu_gp_event>) at ../util/qemu-thread-posix.c:460
#3  0x00005641db23db9f in wait_for_readers () at ../util/rcu.c:135
#4  0x00005641db23db9f in synchronize_rcu () at ../util/rcu.c:171
#5  0x00005641db23df45 in call_rcu_thread (opaque=<optimized out>) at ../util/rcu.c:265
#6  0x00005641db21dc94 in qemu_thread_start (args=0x5641dbf863a0) at ../util/qemu-thread-posix.c:521
#7  0x00007f9daa00314a in start_thread () at /lib64/libpthread.so.0
#8  0x00007f9da9d32dc3 in clone () at /lib64/libc.so.6

Thread 1 (Thread 0x7f9dad9def00 (LWP 380155)):
#0  0x00007f9daa00c65d in __lll_lock_wait () at /lib64/libpthread.so.0
#1  0x00007f9daa005979 in pthread_mutex_lock () at /lib64/libpthread.so.0
#2  0x00005641db21dd7d in qemu_mutex_lock_impl (mutex=0x5641dbaa0740 <qemu_global_mutex>, file=0x5641db57ecc1 "../util/main-loop.c", line=242) at ../util/qemu-thread-posix.c:79
#3  0x00005641db08670e in qemu_mutex_lock_iothread_impl (file=file@entry=0x5641db57ecc1 "../util/main-loop.c", line=line@entry=242) at ../softmmu/cpus.c:485
#4  0x00005641db236a1d in os_host_main_loop_wait (timeout=<optimized out>) at ../util/main-loop.c:242
#5  0x00005641db236a1d in main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:520
#6  0x00005641db099eb1 in qemu_main_loop () at ../softmmu/vl.c:1679
#7  0x00005641dae77a02 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:50

Comment from Stefan:

Thread 3 vs Thread 5 looks suspicious:

Thread 5 (vcpu thread):
#5  0x00005641db214d3a in aio_wait_bh_oneshot (ctx=0x5641dc1768e0,

Thread 3 (IOThread)
#4  0x00005641db21b511 in aio_poll (ctx=ctx@entry=0x5641dc1768e0,

The vcpu thread is waiting for the IOThread to do something and then notify the vcpu thread (QEMU main loop).

I haven't tried following the code myself, but this looks like a clue regarding the deadlock.

Comment 2 Nir Soffer 2021-05-13 09:58:03 UTC
Notes on reproducing without RHV:

ovirt-imageio opens 5 connections to qemu nbd server:
- connection 1: extents connection
- connections 2-5: readers

The nbd connections are configured with "base:allocation" meta context
and structured replies.

Connection 1 performs a loop, sending NBD_CMD_BLOCK_STATUS
with a length of 2G, splitting the returned extents into 128 MiB chunks,
and feeding zero/copy requests to a queue.

Connections 2-5 consume the zero/copy requests from the queue and send
NBD_CMD_READ with a maximum length of 8 MiB.

All this happens on the client side; the requests are sent over
HTTP and translated to NBD on the server, but this should not matter
for reproducing.

So we are likely to have up to 4 concurrent NBD_CMD_READ during the 
download.

In comment 0 we can see that after we downloaded:

    [  8.33% ] 512.00 MiB, 70.47 seconds, 7.27 MiB/s 

the qemu NBD server stopped responding.

To reproduce, we start a full backup via libvirt and then start a loop
downloading the full backup 5 times. The purpose of the multiple downloads
is to check that we get the same data during the same backup. In a real
system the client will download the data once.

After the download has started, we reboot the VM from within the guest. When the
guest is up again, we reboot it again, repeating until all downloads
finish.
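
To make the connection layout above concrete, here is a rough, self-contained sketch of the same pattern written against the python-libnbd bindings (instead of imageio's HTTP backend). It keeps the shape that matters for reproducing (one producer splitting the image into 128 MiB chunks, four reader connections issuing up to 8 MiB reads concurrently) but omits the "base:allocation" block-status handling and zero-extent optimization; the NBD URI and sizes are illustrative:

#!/usr/bin/env python3
# Rough sketch of the client-side pattern described above, using python-libnbd.
# A producer splits the export into 128 MiB chunks and feeds a queue; four
# reader threads, each with its own NBD connection, consume the queue and read
# in 8 MiB steps, so up to 4 NBD_CMD_READ requests are in flight at once.
# The real client also uses NBD_CMD_BLOCK_STATUS ("base:allocation") to skip
# zero extents; that part is omitted here. The URI is a placeholder.
import queue
import threading
import nbd

URI = "nbd://127.0.0.1:10809/sda"   # placeholder export URI
CHUNK = 128 * 1024**2
STEP = 8 * 1024**2
READERS = 4

def producer(q):
    h = nbd.NBD()
    h.connect_uri(URI)
    size = h.get_size()
    for offset in range(0, size, CHUNK):
        q.put((offset, min(CHUNK, size - offset)))
    h.shutdown()
    for _ in range(READERS):
        q.put(None)                  # one stop marker per reader

def reader(q):
    h = nbd.NBD()
    h.connect_uri(URI)               # each reader gets its own NBD connection
    while True:
        item = q.get()
        if item is None:
            break
        offset, length = item
        while length > 0:
            n = min(STEP, length)
            h.pread(n, offset)       # data is discarded; we only generate load
            offset += n
            length -= n
    h.shutdown()

q = queue.Queue(maxsize=16)
threads = [threading.Thread(target=reader, args=(q,)) for _ in range(READERS)]
for t in threads:
    t.start()
producer(q)
for t in threads:
    t.join()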

Comment 3 Nir Soffer 2021-05-13 10:00:13 UTC
Bumping severity since, after this deadlock, the VM cannot be stopped. The only
way to stop it is to SIGKILL the qemu process.

Comment 4 Ilan Zuckerman 2021-05-13 10:31:06 UTC
Libvirt + qemu versions that this was tested on:

libvirt-7.0.0-14.module+el8.4.0+10886+79296686.x86_64
qemu-kvm-core-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64

Comment 5 John Ferlan 2021-05-13 16:13:47 UTC
Just as a data point, can we test the sequence against the recently ON_QA RHEL-AV 8.5.0?

qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d

I believe Jirka is working libvirt-7.3.0-1.el8 through the AV8.5.0 downstream process right now...

Thanks

Comment 6 aihua liang 2021-05-18 03:04:16 UTC
Can reproduce it on qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d.

Test Steps:
 1.Start guest with qemu cmds:
 /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine q35 \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 30720  \
    -smp 10,maxcpus=10,cores=5,threads=1,dies=1,sockets=2  \
    -cpu 'Cascadelake-Server-noTSX',+kvm_pv_unhalt \
    -chardev socket,server=on,path=/tmp/monitor-qmpmonitor1-20210512-234257-mOeaMK07,id=qmp_id_qmpmonitor1,wait=off  \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,server=on,path=/tmp/monitor-catch_monitor-20210512-234257-mOeaMK07,id=qmp_id_catch_monitor,wait=off  \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idXTMc9z \
    -chardev socket,server=on,path=/tmp/serial-serial0-20210512-234257-mOeaMK07,id=chardev_serial0,wait=off \
    -device isa-serial,id=serial0,chardev=chardev_serial0  \
    -chardev socket,id=seabioslog_id_20210512-234257-mOeaMK07,path=/tmp/seabios-20210512-234257-mOeaMK07,server=on,wait=off \
    -device isa-debugcon,chardev=seabioslog_id_20210512-234257-mOeaMK07,iobase=0x402 \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -object iothread,id=iothread1 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel850-64-virtio.raw,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=raw,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,write-cache=on,bus=pcie-root-port-2,addr=0x0,iothread=iothread1 \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:bb:ed:35:8d:44,id=idSFYXRM,netdev=id47qtZ5,bus=pcie-root-port-3,addr=0x0  \
    -netdev tap,id=id47qtZ5,vhost=on  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5 \
    -monitor stdio \
    -qmp tcp:0:3000,server,nowait \

  2. Start nbd server
     {"execute":"nbd-server-start","arguments":{"addr":{"type":"inet","data":{"host":"10.73.196.25","port":"10809"}}}}

  3. Create scratch.img
      #qemu-img create -f qcow2 -b /home/kvm_autotest_root/images/rhel850-64-virtio.raw -F raw scratch.img

  4. Add scratch.img
     {"execute":"blockdev-add","arguments":{"driver":"file","filename":"/home/scratch.img","node-name":"libvirt-3-storage","auto-read-only":true,"discard":"unmap"}}
     {"execute":"blockdev-add","arguments":{"node-name":"tmp","read-only":false,"driver":"qcow2","file":"libvirt-3-storage","backing":"drive_image1"}}

  5. Start backup channel
     { "execute": "transaction", "arguments": { "actions": [ {"type": "blockdev-backup", "data": { "device": "drive_image1", "target": "tmp", "sync": "none", "job-id":"j1" } }, {"type": "block-dirty-bitmap-add", "data": { "node": "drive_image1", "name": "bitmap0" } } ] } }
{"timestamp": {"seconds": 1621303220, "microseconds": 251587}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "j1"}}
{"timestamp": {"seconds": 1621303220, "microseconds": 251675}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "j1"}}
{"return": {}}

  6. Expose the backup image
    {"execute":"nbd-server-add","arguments":{"device":"tmp"}}

  7. In nbd client, create backup image
    #qemu-img create -f qcow2 back1.img 20G
    Note:qemu-kvm version in nbd client: qemu-kvm-4.2.0-29.module+el8.2.1+7712+3c3fe332.2
  
  8. Pull the backup image from the nbd client (a rough stand-in for copyif3.sh is sketched after these steps).
    #./copyif3.sh nbd://10.73.196.25:10809/tmp back1.img

  9. Repeatedly reboot the VM during step 8's copy.
    (guest)#reboot
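
The copyif3.sh script from step 8 is not included in this report; as a rough stand-in for generating similar read traffic, something like the sketch below could be used (python-libnbd, sequential copy of the export into a local file). Unlike the original back1.img, the output here is a raw image rather than qcow2, and the chunk size is illustrative:

#!/usr/bin/env python3
# Hypothetical stand-in for copyif3.sh (step 8): sequentially copy the exported
# "tmp" node into a local raw file with python-libnbd. The point is only to
# keep NBD reads in flight while the guest is rebooted in step 9.
import nbd

URI = "nbd://10.73.196.25:10809/tmp"   # export added with nbd-server-add
OUT = "back1.raw"                      # illustrative output path (raw, not qcow2)
STEP = 8 * 1024**2                     # 8 MiB reads

h = nbd.NBD()
h.connect_uri(URI)
size = h.get_size()
with open(OUT, "wb") as f:
    f.truncate(size)
    for offset in range(0, size, STEP):
        buf = h.pread(min(STEP, size - offset), offset)
        f.seek(offset)
        f.write(buf)
h.shutdown()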

Test Result:
  In step 9, qemu hangs.

Gdb info:
 (gdb) bt full
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
No locals.
#1  0x00007f350c639979 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5557916c98e0 <qemu_global_mutex>)
    at ../nptl/pthread_mutex_lock.c:80
        ignore1 = <optimized out>
        ignore2 = <optimized out>
        ignore3 = <optimized out>
        type = <optimized out>
        __PRETTY_FUNCTION__ = "__pthread_mutex_lock"
        id = <optimized out>
#2  0x0000555790e257ad in qemu_mutex_lock_impl (mutex=0x5557916c98e0 <qemu_global_mutex>, file=0x555791195119 "../util/main-loop.c", 
    line=252) at ../util/qemu-thread-posix.c:79
        err = <optimized out>
        __PRETTY_FUNCTION__ = "qemu_mutex_lock_impl"
        __func__ = "qemu_mutex_lock_impl"
#3  0x0000555790c711be in qemu_mutex_lock_iothread_impl (file=file@entry=0x555791195119 "../util/main-loop.c", line=line@entry=252)
    at ../softmmu/cpus.c:491
        bql_lock = <optimized out>
        __func__ = "qemu_mutex_lock_iothread_impl"
        _g_boolean_var_ = <optimized out>
#4  0x0000555790e2924d in os_host_main_loop_wait (timeout=<optimized out>) at ../util/main-loop.c:252
        context = 0x55579253e830
        ret = 1
        context = <optimized out>
        ret = <optimized out>
#5  main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:530
        mlpoll = {state = 0, timeout = 4294967295, pollfds = 0x555792544000}
        ret = <optimized out>
        timeout_ns = <optimized out>
#6  0x0000555790c3c9d9 in qemu_main_loop () at ../softmmu/runstate.c:725
No locals.
--Type <RET> for more, q to quit, c to continue without paging--
#7  0x0000555790a808c2 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:50
No locals.

Comment 8 Sergio Lopez 2021-05-21 10:34:42 UTC
I'm able to reproduce this issue with the latest upstream too. My last patch series addressing the issues while moving the NBD server between AIO contexts made this problem less likely to occur, but didn't fix the root cause.

Basically, we need a way to ensure we can stop all coroutines in the NBD server on blk_aio_detach, so that we get a clean slate on blk_aio_attached. This needs to be done _before_ entering the drained state in the block layer, as otherwise some requests won't be able to complete.

I think I have a solution that accomplishes that. I'll provide a test package ASAP.

Comment 13 aihua liang 2021-06-07 03:54:07 UTC
Tested with qemu-kvm-6.0.0-17.el8.slopezpa202106020749; the issue was not hit.

Detailed steps: as in comment 10.

Comment 14 John Ferlan 2021-06-09 18:41:10 UTC
Added the zstream? flag based on the Virt Block storage meeting from 09-Jun, where RHV noted that this should also get an 8.4.z fix since it's a hang situation.

Comment 22 Yanan Fu 2021-06-24 15:24:28 UTC
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Comment 23 aihua liang 2021-07-02 11:36:38 UTC
The bug has been fixed in qemu-kvm-6.0.0-21.module+el8.5.0+11555+e0ab0d09; setting the bug's status to "Verified".

Test Steps as in comment 10.

Comment 25 errata-xmlrpc 2021-11-16 07:53:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4684

