Description of problem:
QEMU and the guest hang when the guest is powered off after a live commit with data-plane.

Version-Release number of selected component (if applicable):
host kernel version: 3.10.0-931.el7.ppc64le
qemu version: qemu-kvm-rhev-2.12.0-9.el7.ppc64le
guest kernel version: 3.10.0-931.el7.ppc64le

How reproducible:
75%

Steps to Reproduce:
1. Launch a guest using data-plane:
/usr/libexec/qemu-kvm \
-name 'ppc-block-testvm1' \
-sandbox off \
-machine pseries \
-nodefaults \
-vga std \
-chardev socket,id=qmp_catch_monitorppc1,path=/var/tmp/monitor-catch_monitorppc1,server,nowait \
-mon chardev=qmp_catch_monitorppc1,mode=control \
-chardev socket,id=serial_id_serialppc1,path=/var/tmp/serial-serialppc1,server,nowait \
-device spapr-vty,reg=0x30000000,chardev=serial_id_serialppc1 \
-device nec-usb-xhci,id=usbppc1,bus=pci.0,addr=0x3 \
-object iothread,id=iothread0 \
-drive file=/home/qemukvm/rhel76.qcow2,id=drive_imageppc1,if=none,cache=none,format=qcow2,werror=stop,rerror=stop \
-m 8192 \
-smp 2,maxcpus=4,cores=2,threads=1,sockets=2 \
-device usb-tablet,id=usb-tabletppc1,bus=usbppc1.0,port=1 \
-device usb-kbd,id=usbppc1sb-kbdppc1,bus=usbppc1.0,port=2 \
-vnc :1 \
-boot menu=off,strict=off,order=cdn,once=c \
-rtc base=utc,clock=host \
-enable-kvm \
-qmp tcp:0:4441,server,nowait \
-netdev tap,id=tap1,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,vhost=on \
-device spapr-vlan,netdev=tap1,id=nic1,mac=1a:59:0a:4b:5a:94 \
-serial tcp:0:3331,server,nowait \
-monitor stdio \
-device virtio-scsi-pci,id=virtio_scsi_pcippc1,bus=pci.0,addr=0x4,iothread=iothread0 \
-device scsi-hd,id=imageppc1,drive=drive_imageppc1 \
-object iothread,id=iothread1 \
-device virtio-scsi-pci,id=datapscsi1,iothread=iothread1 \
-drive file=/home/qemukvm/image.qcow2,if=none,id=dataqcow,format=qcow2,cache=none,rerror=stop,werror=stop \
-device scsi-hd,drive=dataqcow,bus=datapscsi1.0,scsi-id=1,lun=1,id=scsi-disk1

2. Do "blockdev-snapshot-sync" and "block-commit":
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "dataqcow", "snapshot-file": "/home/qemukvm/testsn1", "format": "qcow2" } }
{"return": {}}
{ "execute": "block-commit", "arguments": { "device": "dataqcow", "base": "/home/qemukvm/image.qcow2", "top": "/home/qemukvm/testsn1", "speed": 1000000000 } }
{"timestamp": {"seconds": 1533868674, "microseconds": 941103}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "dataqcow"}}
{"timestamp": {"seconds": 1533868674, "microseconds": 941250}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "dataqcow"}}
{"return": {}}
{"timestamp": {"seconds": 1533868674, "microseconds": 942412}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "dataqcow"}}
{"timestamp": {"seconds": 1533868674, "microseconds": 942563}, "event": "BLOCK_JOB_READY", "data": {"device": "dataqcow", "len": 0, "offset": 0, "speed": 1000000000, "type": "commit"}}

3. Power off the guest, either
1) [root@dhcp70-153 ~]# poweroff
or
2) {"execute":"system_powerdown"}
{"return": {}}
{"timestamp": {"seconds": 1533869348, "microseconds": 700005}, "event": "POWERDOWN"}
{"timestamp": {"seconds": 1533869354, "microseconds": 215695}, "event": "SHUTDOWN", "data": {"guest": true}}

Actual results:
Both QEMU and the guest hang.

Expected results:
The guest powers down successfully and QEMU quits normally.

Additional info:
x86 also has this problem.
When qemu and guest hang:
gdb -batch -ex bt -p 14976
[New LWP 14998]
[New LWP 14997]
[New LWP 14996]
[New LWP 14995]
[New LWP 14979]
[New LWP 14978]
[New LWP 14977]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00003fff86c37a10 in ppoll () from /lib64/libc.so.6
#0 0x00003fff86c37a10 in ppoll () from /lib64/libc.so.6
#1 0x00000001127fc500 in qemu_poll_ns ()
#2 0x00000001127ff330 in aio_poll ()
#3 0x000000011273a6e4 in bdrv_drain_recurse ()
#4 0x000000011273a77c in bdrv_drain_recurse ()
#5 0x000000011273a77c in bdrv_drain_recurse ()
#6 0x000000011273b1d8 in bdrv_do_drained_begin ()
#7 0x00000001126cf99c in bdrv_child_cb_drained_begin ()
#8 0x000000011273b1b0 in bdrv_do_drained_begin ()
#9 0x000000011272a768 in blk_drain ()
#10 0x0000000112736518 in mirror_drain ()
#11 0x00000001126db9b0 in block_job_drain ()
#12 0x00000001126de648 in job_drain ()
#13 0x00000001126decc8 in job_finish_sync ()
#14 0x00000001126df4b0 in job_cancel_sync_all ()
#15 0x0000000112388c28 in main ()
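For anyone automating this reproducer, steps 2 and 3 above can be driven over the "-qmp tcp:0:4441,server,nowait" socket from the command line. The script below is a minimal sketch and not part of the original report: the host/port and the helper names (send_cmd, wait_event) are assumptions for illustration, and error handling is omitted.

#!/usr/bin/env python3
# Minimal QMP driver for the reproducer above (sketch; assumes -qmp tcp:0:4441).
import json
import socket

HOST, PORT = "127.0.0.1", 4441

sock = socket.create_connection((HOST, PORT))
rfile = sock.makefile("r")          # QMP messages are newline-separated JSON

def send_cmd(cmd):
    """Send one QMP command and return its 'return'/'error' reply,
    skipping any asynchronous events that arrive in between."""
    sock.sendall((json.dumps(cmd) + "\r\n").encode())
    while True:
        msg = json.loads(rfile.readline())
        if "return" in msg or "error" in msg:
            return msg
        print("event:", msg.get("event"), msg.get("data"))

def wait_event(name):
    """Block until the named asynchronous event is received."""
    while True:
        msg = json.loads(rfile.readline())
        if msg.get("event") == name:
            return msg

json.loads(rfile.readline())                    # QMP greeting
send_cmd({"execute": "qmp_capabilities"})       # enter command mode

# Step 2: snapshot + block-commit on the data-plane disk
send_cmd({"execute": "blockdev-snapshot-sync",
          "arguments": {"device": "dataqcow",
                        "snapshot-file": "/home/qemukvm/testsn1",
                        "format": "qcow2"}})
send_cmd({"execute": "block-commit",
          "arguments": {"device": "dataqcow",
                        "base": "/home/qemukvm/image.qcow2",
                        "top": "/home/qemukvm/testsn1",
                        "speed": 1000000000}})
wait_event("BLOCK_JOB_READY")

# Alternative path from comment 2: pivoting first avoids the hang:
#   send_cmd({"execute": "block-job-complete", "arguments": {"device": "dataqcow"}})
#   wait_event("BLOCK_JOB_COMPLETED")

# Step 3: power the guest off while the commit job is still in "ready" state.
# On the affected builds this is where QEMU hangs in job_cancel_sync_all().
send_cmd({"execute": "system_powerdown"})
wait_event("SHUTDOWN")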
Yihuang,

After step 2, you didn't issue a { "execute": "block-job-complete", "arguments": { "device": "dataqcow"}} command to switch the active block BEFORE powering off the guest. Could you redo the test with the 'block-job-complete' command included?

Thanks,
Niny
(In reply to Gu Nini from comment #1)
> Yihuang,
>
> After step 2, you didn't issue a { "execute": "block-job-complete",
> "arguments": { "device": "dataqcow"}} command to switch the active block
> BEFORE powering off the guest. Could you redo the test with the
> 'block-job-complete' command included?
>
> Thanks,
> Niny

Hi Niny,

Sorry, I missed the "block-job-complete" command before poweroff. When I do { "execute": "block-job-complete", "arguments": { "device": "dataqcow" } }, both the guest and QEMU can exit normally. The QMP output and gdb info are below:

{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "dataqcow", "snapshot-file": "/home/qemukvm/testsn1", "format": "qcow2" } }
{"return": {}}
{ "execute": "block-commit", "arguments": { "device": "dataqcow", "base": "/home/qemukvm/image.qcow2", "top": "/home/qemukvm/testsn1", "speed": 1000000000 } }
{"timestamp": {"seconds": 1533886670, "microseconds": 980390}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "dataqcow"}}
{"timestamp": {"seconds": 1533886670, "microseconds": 980541}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "dataqcow"}}
{"return": {}}
{"timestamp": {"seconds": 1533886670, "microseconds": 981704}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "dataqcow"}}
{"timestamp": {"seconds": 1533886670, "microseconds": 981819}, "event": "BLOCK_JOB_READY", "data": {"device": "dataqcow", "len": 0, "offset": 0, "speed": 1000000000, "type": "commit"}}
{ "execute": "block-job-complete", "arguments": { "device": "dataqcow" } }
{"return": {}}
{"timestamp": {"seconds": 1533886674, "microseconds": 267251}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "dataqcow"}}
{"timestamp": {"seconds": 1533886674, "microseconds": 267303}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "dataqcow"}}
{"timestamp": {"seconds": 1533886674, "microseconds": 267352}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "dataqcow", "len": 0, "offset": 0, "speed": 1000000000, "type": "commit"}}
{"timestamp": {"seconds": 1533886674, "microseconds": 267407}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "dataqcow"}}
{"timestamp": {"seconds": 1533886674, "microseconds": 267455}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "dataqcow"}}
{"timestamp": {"seconds": 1533886684, "microseconds": 271839}, "event": "SHUTDOWN", "data": {"guest": true}}
Connection closed by foreign host.

# gdb -batch -ex bt -p 31551
[New LWP 31731]
[New LWP 31720]
[New LWP 31719]
[New LWP 31718]
[New LWP 31717]
[New LWP 31716]
[New LWP 31715]
[New LWP 31714]
[New LWP 31573]
[New LWP 31572]
[New LWP 31571]
[New LWP 31570]
[New LWP 31554]
[New LWP 31553]
[New LWP 31552]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00003fff89807a10 in ppoll () from /lib64/libc.so.6
#0 0x00003fff89807a10 in ppoll () from /lib64/libc.so.6
#1 0x0000000133ecc500 in ppoll (__ss=0x0, __timeout=<optimized out>, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2 qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:334
#3 0x0000000133ecdcb8 in os_host_main_loop_wait (timeout=574967274) at util/main-loop.c:233
#4 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:497
#5 0x0000000133a58b78 in main_loop () at vl.c:1963
#6 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4768

But I still think this is a bug, because when I redo it on RHEL 8 without { "execute": "block-job-complete", "arguments": { "device": "dataqcow" } }, the guest and QEMU do not hang.

Finally, here is the updated gdb info from when the guest and QEMU hang:

# gdb -batch -ex bt -p 33152
[New LWP 33174]
[New LWP 33173]
[New LWP 33172]
[New LWP 33171]
[New LWP 33155]
[New LWP 33154]
[New LWP 33153]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00003fffa3697a10 in ppoll () from /lib64/libc.so.6
#0 0x00003fffa3697a10 in ppoll () from /lib64/libc.so.6
#1 0x000000010760c500 in ppoll (__ss=0x0, __timeout=<optimized out>, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2 qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:334
#3 0x000000010760f330 in aio_poll (ctx=0x10032d21680, blocking=<optimized out>) at util/aio-posix.c:629
#4 0x000000010754a6e4 in bdrv_drain_recurse (bs=0x10032db3800) at block/io.c:197
#5 0x000000010754b1d8 in bdrv_do_drained_begin (bs=0x10032db3800, recursive=<optimized out>, parent=0x0) at block/io.c:290
#6 0x00000001074df99c in bdrv_child_cb_drained_begin (child=<optimized out>) at block.c:821
#7 0x000000010754b1b0 in bdrv_parent_drained_begin (ignore=0x0, bs=0x10032db0400) at block/io.c:53
#8 bdrv_do_drained_begin (bs=0x10032db0400, recursive=<optimized out>, parent=0x0) at block/io.c:288
#9 0x00000001074df99c in bdrv_child_cb_drained_begin (child=<optimized out>) at block.c:821
#10 0x000000010754b1b0 in bdrv_parent_drained_begin (ignore=0x0, bs=0x10032da6800) at block/io.c:53
#11 bdrv_do_drained_begin (bs=0x10032da6800, recursive=<optimized out>, parent=0x0) at block/io.c:288
#12 0x000000010753a768 in blk_drain (blk=0x10032f1c8c0) at block/block-backend.c:1591
#13 0x0000000107546518 in mirror_drain (job=<optimized out>) at block/mirror.c:984
#14 0x00000001074eb9b0 in block_job_drain (job=0x10032e9be00) at blockjob.c:125
#15 0x00000001074ee648 in job_drain (job=0x10032e9be00) at job.c:529
#16 0x00000001074eecc8 in job_finish_sync (job=0x10032e9be00, finish=<optimized out>, errp=<optimized out>) at job.c:1004
#17 0x00000001074ef4b0 in job_cancel_sync (job=0x10032e9be00) at job.c:917
#18 job_cancel_sync_all () at job.c:928
#19 0x0000000107198c28 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4775
Thanks Yihuang. Agreed that it is a real bug hit by a negative test. Let's wait for the developers' response.
I could reproduce a core dump on RHEL 7.5.z; I think they have the same cause, so this bz is not a regression. Besides, I failed to reproduce either the guest hang or the core dump by powering off the guest after executing 'virsh blockcommit rhel76 --active' but before executing 'virsh blockjob rhel76 sda --pivot'.

#### core dump info in hmp:
qemu-kvm: block/io.c:199: bdrv_drain_recurse: Assertion `!wait_->need_kick' failed.
./vm22.sh: line 36: 14835 Aborted (core dumped) /usr/libexec/qemu-kvm -name 'avocado-vt-vm1'

#### gdb debug info:
(gdb) bt full
#0  0x00007f52c1623207 in raise () at /lib64/libc.so.6
#1  0x00007f52c16248f8 in abort () at /lib64/libc.so.6
#2  0x00007f52c161c026 in __assert_fail_base () at /lib64/libc.so.6
#3  0x00007f52c161c0d2 in () at /lib64/libc.so.6
#4  0x000055bab926b1e0 in bdrv_drain_recurse (bs=0x55babcfa2000, begin=true) at block/io.c:199
        waited_ = false
        busy_ = true
        wait_ = 0x55babcfa5234
        ctx_ = 0x55babb9bfb80
        bs_ = 0x55babcfa2000
        child = <optimized out>
        tmp = <optimized out>
        waited = <optimized out>
        __PRETTY_FUNCTION__ = "bdrv_drain_recurse"
#5  0x000055bab926baa6 in bdrv_drained_begin (bs=0x55babe63c800) at block/io.c:47
        c = <optimized out>
        next = 0x0
        bs = 0x55babe63c800
#6  0x000055bab926baa6 in bdrv_drained_begin (bs=0x55babe63c800) at block/io.c:276
        bs = 0x55babe63c800
#7  0x000055bab926bfcd in bdrv_co_drain_bh_cb (opaque=0x7f52b857ee60) at block/io.c:233
        data = 0x7f52b857ee60
        co = 0x55babbd8a8c0
        bs = 0x55babe63c800
#8  0x000055bab92e20f1 in aio_bh_poll (bh=0x55babe696cf0) at util/async.c:90
        bh = <optimized out>
        bhp = <optimized out>
        next = 0x55babdd46d80
        ret = 1
        deleted = false
#9  0x000055bab92e20f1 in aio_bh_poll (ctx=ctx@entry=0x55babb9bfb80) at util/async.c:118
        bh = <optimized out>
        bhp = <optimized out>
        next = 0x55babdd46d80
        ret = 1
        deleted = false
#10 0x000055bab92e53e4 in aio_poll (ctx=ctx@entry=0x55babb9bfb80, blocking=blocking@entry=false) at util/aio-posix.c:689
        node = <optimized out>
        i = <optimized out>
        ret = 0
        progress = <optimized out>
        timeout = <optimized out>
        start = 1366454032467482
        __PRETTY_FUNCTION__ = "aio_poll"
#11 0x000055bab926b061 in bdrv_drain_recurse (bs=0x55babcfa2000, begin=true) at block/io.c:199
        waited_ = true
        busy_ = true
        wait_ = 0x55babcfa5234
        ctx_ = 0x55babb9bfb80
        bs_ = 0x55babcfa2000
        child = <optimized out>
        tmp = <optimized out>
        waited = <optimized out>
        __PRETTY_FUNCTION__ = "bdrv_drain_recurse"
#12 0x000055bab926bbe9 in bdrv_drain (bs=0x55babcfa2000) at block/io.c:318
#13 0x000055bab925f4c6 in blk_drain (blk=<optimized out>) at block/block-backend.c:1479
#14 0x000055bab9227022 in block_job_drain (job=0x55babe60d680) at blockjob.c:206
#15 0x000055bab9227171 in block_job_finish_sync (job=job@entry=0x55babe60d680, finish=finish@entry=0x55bab92277e0 <block_job_cancel_err>, errp=errp@entry=0x0) at blockjob.c:406
        local_err = 0x0
        ret = <optimized out>
        __PRETTY_FUNCTION__ = "block_job_finish_sync"
#16 0x000055bab922726d in block_job_cancel_sync_all (job=0x55babe60d680) at blockjob.c:554
        job = 0x55babe60d680
        aio_context = 0x55babb9bfb80
#17 0x000055bab922726d in block_job_cancel_sync_all () at blockjob.c:565
        job = 0x55babe60d680
        aio_context = 0x55babb9bfb80
#18 0x000055bab92208f9 in bdrv_close_all () at block.c:3225
        __PRETTY_FUNCTION__ = "bdrv_close_all"
#19 0x000055bab8fc3bb4 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4833
        i = <optimized out>
        snapshot = <optimized out>
        linux_boot = <optimized out>
        initrd_filename = <optimized out>
        kernel_filename = <optimized out>
        kernel_cmdline = <optimized out>
        boot_order = <optimized out>
        boot_once = 0x55babb8faab8 "c"
        cyls = <optimized out>
        heads = <optimized out>
        secs = <optimized out>
        translation = <optimized out>
        opts = <optimized out>
        machine_opts = <optimized out>
        hda_opts = <optimized out>
        icount_opts = <optimized out>
        accel_opts = <optimized out>
        olist = <optimized out>
        optind = 69
        optarg = 0x7ffe011624cc "tcp:0:5555,server,nowait"
        loadvm = <optimized out>
        machine_class = 0x0
        cpu_model = <optimized out>
        vga_model = 0x7ffe01161e48 "std"
        qtest_chrdev = <optimized out>
        qtest_log = <optimized out>
        pid_file = <optimized out>
        incoming = <optimized out>
        defconfig = <optimized out>
        userconfig = <optimized out>
        nographic = <optimized out>
        display_type = <optimized out>
        display_remote = <optimized out>
        log_mask = <optimized out>
        log_file = <optimized out>
        trace_file = <optimized out>
        maxram_size = <optimized out>
        ram_slots = <optimized out>
        vmstate_dump_file = <optimized out>
        main_loop_err = 0x0
        err = 0x0
        list_data_dirs = <optimized out>
        bdo_queue = {sqh_first = 0x0, sqh_last = 0x7ffe011615a0}
        __func__ = "main"
        __FUNCTION__ = "main"
Gu Nini suggested that this could be related to bug 1601212, and after doing some debugging, I agree that it's very likely at least a related root cause. I'm not closing this as a duplicate just yet because it might be a different instance, but I expect that we can close it as such later.

In short, while debugging the other report, I found several bugs in two categories:

1. Block jobs making progress that would allow their .drained_poll callback to go from true to false, but the AioContext wasn't kicked. This is solved by calling bdrv_wakeup() for all BlockDriverStates involved in a job whenever state relevant for .drained_poll changes (essentially when a job coroutine yields or terminates).

2. Some instances of bugs where we get recursive locking of the I/O thread AioContext and drop only one of the locks in AIO_WAIT_WHILE(). This is a deadlock and the I/O thread can't make progress. Typical symptoms for this are I/O request functions spawning a coroutine and waiting for its result with BDRV_POLL_WHILE(), but bs->in_flight = 0 because the coroutine didn't even start running.

I'm hoping to send some patches upstream today, which should also make a difference for this report.
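To make the first category above concrete, the following is a standalone toy model only: it is not QEMU code and not the actual patches. One thread plays the drain loop blocked in poll() (like the ppoll() frames in the backtraces in this bug), the other plays a job whose drained_poll-style state flips; the names busy, kick and drain are invented for the illustration.

#!/usr/bin/env python3
# Toy model of category 1: a drain loop blocks in poll() until it is "kicked",
# and only then re-checks whether the job still has work in flight.  If the job
# finishes without kicking the loop, the loop stays blocked forever.
import os
import select
import threading
import time

def run_demo(kick):
    busy = [True]                   # stands in for the job's .drained_poll() state
    rfd, wfd = os.pipe()            # stands in for the AioContext event notifier

    def drain():                    # stands in for the AIO_WAIT_WHILE()-style loop
        while busy[0]:
            select.select([rfd], [], [])   # blocks with no timeout, like ppoll(..., -1)
            os.read(rfd, 1)

    def job_finish():
        busy[0] = False             # state relevant for drained_poll changes...
        if kick:
            os.write(wfd, b"x")     # ...and this is the bdrv_wakeup()-style kick

    t = threading.Thread(target=drain, daemon=True)
    t.start()
    time.sleep(0.2)                 # let the drain thread block (keeps the demo deterministic)
    job_finish()
    t.join(timeout=1.0)
    print(f"kick={kick}: {'drained' if not t.is_alive() else 'HUNG in poll()'}")

run_demo(kick=True)    # prints: kick=True: drained
run_demo(kick=False)   # prints: kick=False: HUNG in poll()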
The "Assertion `!wait_->need_kick' failed" bug is fixed in 2.12 by commit 7376eda7c2e0451e "block: make BDRV_POLL_WHILE() re-entrancy safe".

The hang problem, I think, is the same as bug 1601212. I agree we should keep this open until we have a definitive understanding of all these issues.
Can also reproduce it on qemu-kvm-rhev-2.12.0-18.el7_6.2.x86_64.
kernel version: 3.10.0-957.1.2.el7.x86_64
(In reply to Kevin Wolf from comment #5)
> Gu Nini suggested that this could be related to bug 1601212, and after doing
> some debugging, I agree that it's very likely at least a related root cause.
> I'm not closing this as a duplicate just yet because it might be a different
> instance, but I expect that we can close it as such later.
>
> In short, while debugging the other report, I found several bugs in two
> categories:
>
> 1. Block jobs making progress that would allow their .drained_poll callback
> to go from true to false, but the AioContext wasn't kicked. This is solved
> by calling bdrv_wakeup() for all BlockDriverStates involved in a job
> whenever state relevant for .drained_poll changes (essentially when a job
> coroutine yields or terminates).
>
> 2. Some instances of bugs where we get recursive locking of the I/O thread
> AioContext and drop only one of the locks in AIO_WAIT_WHILE(). This is a
> deadlock and the I/O thread can't make progress. Typical symptoms for this
> are I/O request functions spawning a coroutine and waiting for its result
> with BDRV_POLL_WHILE(), but bs->in_flight = 0 because the coroutine didn't
> even start running.
>
> I'm hoping to send some patches upstream today, which should also make a
> difference for this report.

Kevin, reassigning this to you for a follow-up. See also comment #6 and comment #7.
(In reply to aihua liang from comment #7)
> Can also reproduce it on qemu-kvm-rhev-2.12.0-18.el7_6.2.x86_64.
> kernel version: 3.10.0-957.1.2.el7.x86_64

Does the gdb backtrace of the hanging process still look the same?
Ran the test 20 times and only hit this issue with virtio_blk, not with virtio_scsi.

Gdb info when the guest and QEMU hang with virtio_blk:

# gdb -batch -ex bt -p 30607
warning: the debug information found in "/usr/lib/debug//usr/libexec/qemu-kvm.debug" does not match "/usr/libexec/qemu-kvm" (CRC mismatch).
warning: the debug information found in "/usr/lib/debug/usr/libexec/qemu-kvm.debug" does not match "/usr/libexec/qemu-kvm" (CRC mismatch).
[New LWP 30718]
[New LWP 30669]
[New LWP 30668]
[New LWP 30667]
[New LWP 30665]
[New LWP 30664]
[New LWP 30659]
[New LWP 30658]
[New LWP 30656]
[New LWP 30655]
[New LWP 30653]
[New LWP 30651]
[New LWP 30609]
[New LWP 30608]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00005599207370b5 in bdrv_do_drained_end ()
#0 0x00005599207370b5 in bdrv_do_drained_end ()
#1 0x00005599206ec5d9 in block_job_drain ()
#2 0x00005599206ec558 in block_job_detach_aio_context ()
#3 0x00005599206e8911 in bdrv_detach_aio_context ()
#4 0x00005599206e8ab6 in bdrv_set_aio_context ()
#5 0x00005599204dac58 in virtio_blk_data_plane_stop ()
#6 0x0000559920680145 in virtio_bus_stop_ioeventfd ()
#7 0x0000559920505317 in virtio_vmstate_change ()
#8 0x0000559920595642 in vm_state_notify ()
#9 0x00005599204a5cda in do_vm_stop ()
#10 0x00005599204619a1 in main ()

**********Details**************

Test Env:
kernel version: 3.10.0-957.1.2.el7.x86_64
qemu-kvm-rhev version: qemu-kvm-rhev-2.12.0-18.el7_6.2.x86_64

Test steps:
1. Start guest with qemu cmds:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-sandbox off \
-machine pc \
-nodefaults \
-device VGA,bus=pci.0,addr=0x2 \
-chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20181107-005924-PkIxnG9p,server,nowait \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20181107-005924-PkIxnG9p,server,nowait \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pvpanic,ioport=0x505,id=idkp9HYI \
-chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20181107-005924-PkIxnG9p,server,nowait \
-device isa-serial,chardev=serial_id_serial0 \
-chardev socket,id=seabioslog_id_20181107-005924-PkIxnG9p,path=/var/tmp/seabios-20181107-005924-PkIxnG9p,server,nowait \
-device isa-debugcon,chardev=seabioslog_id_20181107-005924-PkIxnG9p,iobase=0x402 \
-device ich9-usb-ehci1,id=usb1,addr=0x1d.7,multifunction=on,bus=pci.0 \
-device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=0x1d.0,firstport=0,bus=pci.0 \
-device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=0x1d.2,firstport=2,bus=pci.0 \
-device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=0x1d.4,firstport=4,bus=pci.0 \
-device virtio-net-pci,mac=9a:44:45:46:47:48,id=iddDGLIi,vectors=4,netdev=idDdrbRp,bus=pci.0,addr=0x7 \
-netdev tap,id=idDdrbRp,vhost=on \
-m 2048 \
-smp 10,maxcpus=10,cores=5,threads=1,sockets=2 \
-cpu host \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-vnc :0 \
-rtc base=localtime,clock=host,driftfix=slew \
-boot order=cdn,once=c,menu=off,strict=off \
-enable-kvm \
-monitor stdio \
-object iothread,id=iothread0 \
-drive if=none,id=drive_image1,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel76-64-virtio-scsi.qcow2 \
-device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,iothread=iothread0 \
-drive if=none,id=drive_data1,aio=threads,cache=none,format=qcow2,file=/home/data1.qcow2 \
-device virtio-blk-pci,id=data1,drive=drive_data1,bus=pci.0,iothread=iothread0 \

2. Create live snapshot on data disk.
{"execute":"qmp_capabilities"}
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive_data1", "snapshot-file": "/home/testsn1", "format": "qcow2" } }

3. Do block commit on data disk.
{ "execute": "block-commit", "arguments": { "device": "drive_data1", "base": "/home/data1.qcow2", "top": "/home/testsn1", "speed": 1000000000 } }

4. After the block commit job reaches ready status, power down the VM.
{"execute":"system_powerdown"}

Actual Result:
qemu and guest hang.

Pstack info:
[root@intel-e52650-16-4 ~]# pstack 30607
Thread 15 (Thread 0x7fa6d83f8700 (LWP 30608)):
#0 0x00007fa6dfbc31c9 in syscall () at /lib64/libc.so.6
#1 0x00005599207c1970 in qemu_event_wait ()
#2 0x00005599207d1e9e in call_rcu_thread ()
#3 0x00007fa6dfe9fdd5 in start_thread () at /lib64/libpthread.so.0
#4 0x00007fa6dfbc8ead in clone () at /lib64/libc.so.6
Thread 14 (Thread 0x7fa6d7bf7700 (LWP 30609)):
#0 0x00007fa6dfea64ed in __lll_lock_wait () at /lib64/libpthread.so.0
#1 0x00007fa6dfea1de6 in _L_lock_941 () at /lib64/libpthread.so.0
#2 0x00007fa6dfea1cdf in pthread_mutex_lock () at /lib64/libpthread.so.0
#3 0x00005599207c1299 in qemu_mutex_lock_impl ()
#4 0x00005599207bc35c in co_schedule_bh_cb ()
#5 0x00005599207bc4d1 in aio_bh_poll ()
#6 0x00005599207bf75b in aio_poll ()
#7 0x000055992058f70e in iothread_run ()
#8 0x00007fa6dfe9fdd5 in start_thread () at /lib64/libpthread.so.0
#9 0x00007fa6dfbc8ead in clone () at /lib64/libc.so.6
Thread 13 (Thread 0x7fa6d6af4700 (LWP 30651)):
#0 0x00007fa6dfbbe2cf in ppoll () at /lib64/libc.so.6
#1 0x00005599207bd96b in qemu_poll_ns ()
#2 0x00005599207bf6e7 in aio_poll ()
#3 0x000055992058f70e in iothread_run ()
#4 0x00007fa6dfe9fdd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fa6dfbc8ead in clone () at /lib64/libc.so.6
Thread 12 (Thread 0x7fa6d62f3700 (LWP 30653)):
#0 0x00007fa6dfea64ed in __lll_lock_wait () at /lib64/libpthread.so.0
#1 0x00007fa6dfea8cdc in _L_cond_lock_847 () at /lib64/libpthread.so.0
#2 0x00007fa6dfea8b71 in __pthread_mutex_cond_lock () at /lib64/libpthread.so.0
#3 0x00007fa6dfea39f5 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#4 0x00005599207c1549 in qemu_cond_wait_impl ()
#5 0x00005599204a3b9f in qemu_wait_io_event ()
#6 0x00005599204a52a0 in qemu_kvm_cpu_thread_fn ()
#7 0x00007fa6dfe9fdd5 in start_thread () at /lib64/libpthread.so.0
#8 0x00007fa6dfbc8ead in clone () at /lib64/libc.so.6
Thread 11 (Thread 0x7fa6d5af2700 (LWP 30655)):
#0 0x00007fa6dfea64ed in __lll_lock_wait () at /lib64/libpthread.so.0
#1 0x00007fa6dfea8cdc in _L_cond_lock_847 () at /lib64/libpthread.so.0
#2 0x00007fa6dfea8b71 in __pthread_mutex_cond_lock () at /lib64/libpthread.so.0
#3 0x00007fa6dfea39f5 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#4 0x00005599207c1549 in qemu_cond_wait_impl ()
#5 0x00005599204a3b9f in qemu_wait_io_event ()
#6 0x00005599204a52a0 in qemu_kvm_cpu_thread_fn ()
#7 0x00007fa6dfe9fdd5 in start_thread () at /lib64/libpthread.so.0
#8 0x00007fa6dfbc8ead in clone () at /lib64/libc.so.6
Thread 10 (Thread 0x7fa6d52f1700 (LWP 30656)):
#0 0x00007fa6dfea64ed in __lll_lock_wait () at /lib64/libpthread.so.0
#1 0x00007fa6dfea8cdc in _L_cond_lock_847 () at /lib64/libpthread.so.0
#2 0x00007fa6dfea8b71 in __pthread_mutex_cond_lock () at /lib64/libpthread.so.0
#3 0x00007fa6dfea39f5 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#4 0x00005599207c1549 in qemu_cond_wait_impl ()
#5 0x00005599204a3b9f in qemu_wait_io_event ()
#6 0x00005599204a52a0 in qemu_kvm_cpu_thread_fn ()
#7 0x00007fa6dfe9fdd5 in start_thread () at /lib64/libpthread.so.0
#8 0x00007fa6dfbc8ead in clone () at /lib64/libc.so.6
Thread 9 (Thread 0x7fa6d4af0700 (LWP 30658)):
#0 0x00007fa6dfea64ed in __lll_lock_wait () at /lib64/libpthread.so.0
#1 0x00007fa6dfea8cdc in _L_cond_lock_847 () at /lib64/libpthread.so.0
#2 0x00007fa6dfea8b71 in __pthread_mutex_cond_lock () at /lib64/libpthread.so.0
#3 0x00007fa6dfea39f5 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#4 0x00005599207c1549 in qemu_cond_wait_impl ()
#5 0x00005599204a3b9f in qemu_wait_io_event ()
#6 0x00005599204a52a0 in qemu_kvm_cpu_thread_fn ()
#7 0x00007fa6dfe9fdd5 in start_thread () at /lib64/libpthread.so.0
#8 0x00007fa6dfbc8ead in clone () at /lib64/libc.so.6
Thread 8 (Thread 0x7fa6d42ef700 (LWP 30659)):
#0 0x00007fa6dfea64ed in __lll_lock_wait () at /lib64/libpthread.so.0
#1 0x00007fa6dfea8cdc in _L_cond_lock_847 () at /lib64/libpthread.so.0
#2 0x00007fa6dfea8b71 in __pthread_mutex_cond_lock () at /lib64/libpthread.so.0
#3 0x00007fa6dfea39f5 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#4 0x00005599207c1549 in qemu_cond_wait_impl ()
#5 0x00005599204a3b9f in qemu_wait_io_event ()
#6 0x00005599204a52a0 in qemu_kvm_cpu_thread_fn ()
#7 0x00007fa6dfe9fdd5 in start_thread () at /lib64/libpthread.so.0
#8 0x00007fa6dfbc8ead in clone () at /lib64/libc.so.6
Thread 7 (Thread 0x7fa6d3aee700 (LWP 30664)):
#0 0x00007fa6dfea64ed in __lll_lock_wait () at /lib64/libpthread.so.0
#1 0x00007fa6dfea8cdc in _L_cond_lock_847 () at /lib64/libpthread.so.0
#2 0x00007fa6dfea8b71 in __pthread_mutex_cond_lock () at /lib64/libpthread.so.0
#3 0x00007fa6dfea39f5 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#4 0x00005599207c1549 in qemu_cond_wait_impl ()
#5 0x00005599204a3b9f in qemu_wait_io_event ()
#6 0x00005599204a52a0 in qemu_kvm_cpu_thread_fn ()
#7 0x00007fa6dfe9fdd5 in start_thread () at /lib64/libpthread.so.0
#8 0x00007fa6dfbc8ead in clone () at /lib64/libc.so.6
Thread 6 (Thread 0x7fa6d32ed700 (LWP 30665)):
#0 0x00007fa6dfea64ed in __lll_lock_wait () at /lib64/libpthread.so.0
#1 0x00007fa6dfea8cdc in _L_cond_lock_847 () at /lib64/libpthread.so.0
#2 0x00007fa6dfea8b71 in __pthread_mutex_cond_lock () at /lib64/libpthread.so.0
#3 0x00007fa6dfea39f5 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#4 0x00005599207c1549 in qemu_cond_wait_impl ()
#5 0x00005599204a3b9f in qemu_wait_io_event ()
#6 0x00005599204a52a0 in qemu_kvm_cpu_thread_fn ()
#7 0x00007fa6dfe9fdd5 in start_thread () at /lib64/libpthread.so.0
#8 0x00007fa6dfbc8ead in clone () at /lib64/libc.so.6
Thread 5 (Thread 0x7fa6d2aec700 (LWP 30667)):
#0 0x00007fa6dfea64ed in __lll_lock_wait () at /lib64/libpthread.so.0
#1 0x00007fa6dfea8cdc in _L_cond_lock_847 () at /lib64/libpthread.so.0
#2 0x00007fa6dfea8b71 in __pthread_mutex_cond_lock () at /lib64/libpthread.so.0
#3 0x00007fa6dfea39f5 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#4 0x00005599207c1549 in qemu_cond_wait_impl ()
#5 0x00005599204a3b9f in qemu_wait_io_event ()
#6 0x00005599204a52a0 in qemu_kvm_cpu_thread_fn ()
#7 0x00007fa6dfe9fdd5 in start_thread () at /lib64/libpthread.so.0
#8 0x00007fa6dfbc8ead in clone () at /lib64/libc.so.6
Thread 4 (Thread 0x7fa6d22eb700 (LWP 30668)):
#0 0x00007fa6dfea3965 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x00005599207c1549 in qemu_cond_wait_impl ()
#2 0x00005599204a3b9f in qemu_wait_io_event ()
#3 0x00005599204a52a0 in qemu_kvm_cpu_thread_fn ()
#4 0x00007fa6dfe9fdd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fa6dfbc8ead in clone () at /lib64/libc.so.6
Thread 3 (Thread 0x7fa6d1aea700 (LWP 30669)):
#0 0x00007fa6dfea3965 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x00005599207c1549 in qemu_cond_wait_impl ()
#2 0x00005599204a3b9f in qemu_wait_io_event ()
#3 0x00005599204a52a0 in qemu_kvm_cpu_thread_fn ()
#4 0x00007fa6dfe9fdd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fa6dfbc8ead in clone () at /lib64/libc.so.6
Thread 2 (Thread 0x7fa64f7ff700 (LWP 30718)):
#0 0x00007fa6dfea3965 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x00005599207c1549 in qemu_cond_wait_impl ()
#2 0x00005599206dc28f in vnc_worker_thread_loop ()
#3 0x00005599206dc858 in vnc_worker_thread ()
#4 0x00007fa6dfe9fdd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fa6dfbc8ead in clone () at /lib64/libc.so.6
Thread 1 (Thread 0x7fa6f92afdc0 (LWP 30607)):
#0 0x00005599206ec526 in child_job_drained_poll ()
#1 0x0000559920737568 in bdrv_drain_poll ()
#2 0x0000559920737568 in bdrv_drain_poll ()
#3 0x0000559920737568 in bdrv_drain_poll ()
#4 0x0000559920737790 in bdrv_do_drained_begin ()
#5 0x000055992072a3e8 in blk_drain ()
#6 0x0000559920732b0d in mirror_drain ()
#7 0x00005599206ec558 in block_job_detach_aio_context ()
#8 0x00005599206e8911 in bdrv_detach_aio_context ()
#9 0x00005599206e8ab6 in bdrv_set_aio_context ()
#10 0x00005599204dac58 in virtio_blk_data_plane_stop ()
#11 0x0000559920680145 in virtio_bus_stop_ioeventfd ()
#12 0x0000559920505317 in virtio_vmstate_change ()
#13 0x0000559920595642 in vm_state_notify ()
#14 0x00005599204a5cda in do_vm_stop ()
#15 0x00005599204619a1 in main ()
Tested with qemu-kvm-rhev-2.12.0-29.el7.x86_64 on RHEL 7.7 with -drive/-device; also hit this issue as described in comment 11.
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks
Tested on qemu-kvm-4.2.0-13.module+el8.2.0+5898+fb4bceae; no longer hit this issue, so closing it as CURRENTRELEASE.

Test Steps:
1. Start guest with qemu cmds (iothread enabled):
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-sandbox on \
-machine q35 \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x1 \
-m 14336 \
-smp 16,maxcpus=16,cores=8,threads=1,dies=1,sockets=2 \
-cpu 'EPYC',+kvm_pv_unhalt \
-chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20200203-033416-61dmcn92,server,nowait \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20200203-033416-61dmcn92,server,nowait \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pvpanic,ioport=0x505,id=idy8YPXp \
-chardev socket,path=/var/tmp/serial-serial0-20200203-033416-61dmcn92,server,nowait,id=chardev_serial0 \
-device isa-serial,id=serial0,chardev=chardev_serial0 \
-chardev socket,id=seabioslog_id_20200203-033416-61dmcn92,path=/var/tmp/seabios-20200203-033416-61dmcn92,server,nowait \
-device isa-debugcon,chardev=seabioslog_id_20200203-033416-61dmcn92,iobase=0x402 \
-device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
-device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \
-object iothread,id=iothread0 \
-device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
-blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel820-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device virtio-blk-pci,id=image1,drive=drive_image1,write-cache=on,bus=pcie.0-root-port-3,iothread=iothread0 \
-device pcie-root-port,id=pcie.0-root-port-6,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
-blockdev node-name=file_data1,driver=file,aio=threads,filename=/home/data.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_data1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_data1 \
-device virtio-blk-pci,id=data1,drive=drive_data1,write-cache=on,bus=pcie.0-root-port-6 \
-device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
-device virtio-net-pci,mac=9a:6c:ca:b7:36:85,id=idz4QyVp,netdev=idNnpx5D,bus=pcie.0-root-port-4,addr=0x0 \
-netdev tap,id=idNnpx5D,vhost=on \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-vnc :0 \
-rtc base=utc,clock=host,driftfix=slew \
-boot menu=off,order=cdn,once=c,strict=off \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
-monitor stdio \
-qmp tcp:0:3000,server,nowait \

2. Create the snapshot target:
{'execute':'blockdev-create','arguments':{'options': {'driver':'file','filename':'/root/sn1','size':21474836480},'job-id':'job1'}}
{'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_sn1','filename':'/root/sn1'}}
{'execute':'blockdev-create','arguments':{'options': {'driver': 'qcow2','file':'drive_sn1','size':21474836480,'backing-file':'/home/kvm_autotest_root/images/rhel820-64-virtio-scsi.qcow2','backing-fmt':'qcow2'},'job-id':'job2'}}
{'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn1','file':'drive_sn1','backing':null}}

3. Do live snapshot:
{"execute":"blockdev-snapshot","arguments":{"node":"drive_image1","overlay":"sn1"}}

4. Check block info:
(qemu) info block
sn1: /root/sn1 (qcow2)
    Attached to: /machine/peripheral/image1/virtio-backend
    Cache mode: writeback
    Backing file: /home/kvm_autotest_root/images/rhel820-64-virtio-scsi.qcow2 (chain depth: 1)

drive_data1: /home/data.qcow2 (qcow2)
    Attached to: /machine/peripheral/data1/virtio-backend
    Cache mode: writeback, direct

5. In the guest, dd a file:
(guest)# dd if=/dev/urandom of=te bs=1M count=1000

6. Do live commit:
{'execute': 'block-commit', 'arguments': { 'device': 'sn1','job-id':'j1'}}
{"timestamp": {"seconds": 1583215467, "microseconds": 248030}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "j1"}}
{"timestamp": {"seconds": 1583215467, "microseconds": 248129}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "j1"}}
{"return": {}}
{"timestamp": {"seconds": 1583215468, "microseconds": 494774}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "j1"}}
{"timestamp": {"seconds": 1583215468, "microseconds": 494862}, "event": "BLOCK_JOB_READY", "data": {"device": "j1", "len": 1050542080, "offset": 1050542080, "speed": 0, "type": "commit"}}

7. Power down the VM in the guest:
(guest)# powerdown

After step 7, the guest powers down successfully.
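One note for anyone scripting the verification above: blockdev-create in step 2 runs as a job, so an automated run has to wait for each create job to finish before sending the following blockdev-add. Below is a minimal sketch of that wait; it assumes the same newline-delimited QMP reading as the earlier script (here over the "-qmp tcp:0:3000,server,nowait" socket from the command line above), and wait_job_done is an invented helper name with no error handling.

import json

def wait_job_done(rfile, job_id):
    """Read QMP messages until the given job id reaches a terminal status."""
    while True:
        msg = json.loads(rfile.readline())
        if (msg.get("event") == "JOB_STATUS_CHANGE"
                and msg["data"]["id"] == job_id
                and msg["data"]["status"] in ("concluded", "null")):
            return msg["data"]["status"]

# For example, after sending the first blockdev-create with 'job-id': 'job1':
#   wait_job_done(rfile, "job1")
# ...and only then send the blockdev-add for node 'drive_sn1'.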