Bug 1655367

Summary: qemu deadlock when drive-mirror
Product: Red Hat Enterprise Linux Advanced Virtualization Reporter: Han Han <hhan>
Component: qemu-kvmAssignee: John Snow <jsnow>
qemu-kvm sub component: Storage QA Contact: aihua liang <aliang>
Status: CLOSED CURRENTRELEASE Docs Contact:
Severity: high    
Priority: high CC: aliang, coli, dyuan, juzhang, kanderso, knoel, kwolf, matt.s, mtessun, rbalakri, virt-maint, xutian, xuzhang, yduan, yisun
Version: 8.0Keywords: Automation, Regression, TestBlocker
Target Milestone: rc   
Target Release: 8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1794692 (view as bug list) Environment:
Last Closed: 2020-02-04 03:51:38 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Auto test logs none

Description Han Han 2018-12-03 02:49:40 UTC
Created attachment 1510779 [details]
Auto test logs

Description of problem:
When running a libvirt auto job blockcopy.positive_test.non_acl.local_disk.no_blockdev.no_shallow.finish_option, qemu will get a deadlock

Version-Release number of selected component (if applicable):
libvirt-4.9.0-1.module+el8+2241+fdba4e12.x86_64
qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Setup avocado-vt and tp-libvirt environment by following steps:
https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/libvirt-auto-doc-updater/lastSuccessfulBuild/artifact/libvirt-auto-doc/build/html/avocado_vt.html#installation

2. Run the case
# avocado run --vt-type libvirt blockcopy.positive_test.non_acl.local_disk.no_blockdev.no_shallow.finish_option
JOB ID     : 9fe34a07677a1d1e768b1a64d92b2d9db9a8cc1e
JOB LOG    : /root/avocado/job-results/job-2018-11-29T15.37-9fe34a0/job.log
 (1/1) type_specific.io-github-autotest-libvirt.virsh.blockcopy.positive_test.non_acl.local_disk.no_blockdev.no_shallow.finish_option:

This case will hang...
From job.log I find it hung at blockcopy:
2018-11-29 17:06:59,612 process          L0684 INFO | Command 'systemctl restart libvirtd.service' finished with 0 after 0.20259785652160645s
2018-11-29 17:06:59,613 process          L0596 INFO | Running 'virsh list'
2018-11-29 17:07:00,279 process          L0428 DEBUG| [stdout]  Id   Name             State   
2018-11-29 17:07:00,280 process          L0428 DEBUG| [stdout] --------------------------------
2018-11-29 17:07:00,281 process          L0428 DEBUG| [stdout]  1    avocado-vt-vm1   running 
2018-11-29 17:07:00,281 process          L0428 DEBUG| [stdout]
2018-11-29 17:07:00,281 process          L0684 INFO | Command 'virsh list' finished with 0 after 0.6579406261444092s
2018-11-29 17:07:00,385 virsh            L1397 DEBUG| Undefine VM avocado-vt-vm1
2018-11-29 17:07:00,431 virsh            L0655 DEBUG| Running virsh command: blockcopy avocado-vt-vm1 hda /var/tmp/avocado_bkvosn9y/2018-11-29-17.07.00.img --wait --finish --verbose
2018-11-29 17:07:00,432 process          L0596 INFO | Running '/bin/virsh blockcopy avocado-vt-vm1 hda /var/tmp/avocado_bkvosn9y/2018-11-29-17.07.00.img --wait --finish --verbose'


Actual results:
As above

Expected results:
Blockcopy didn't hang.

Additional info:
For the automation log , see the attachment
To find the source of this bug, I has tried these
1. Run the case with custom qmp monitor.
Set following custom xml in vm:
  <qemu:commandline>
    <qemu:arg value='-qmp'/>
    <qemu:arg value='tcp:localhost:4444,server,nowait'/>
  </qemu:commandline>

After vm started in automation case, tried to connect the custom qmp and run qmp command:
# telnet localhost 4444
Trying ::1...
Connected to localhost.
Escape character is '^]'.
{ "execute": "query-commands" }

qmp command didn't return anything.

2. Rerun the auto case on libvirt-4.9.0-1.module+el8+2241+fdba4e12.x86_64 qemu-kvm-2.12.0-42.module+el8+2173+537e5cb5. The case passed.

3. Check the blockcopy dst image:
# qemu-img info -U /var/tmp/avocado_bkvosn9y/2018-11-29-17.07.00.img
image: /var/tmp/avocado_bkvosn9y/2018-11-29-17.07.00.img
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 318M
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

The copy process seems stopped at 30%

4. Debug with strace and gstack:
# strace -p `pgrep qemu`                                           
strace: Process 18602 attached
futex(0x5631841fada0, FUTEX_WAIT_PRIVATE, 2, NULL

# gstack `pgrep qemu`
Thread 6 (Thread 0x7f8bd73ff700 (LWP 18652)):
#0  0x00007f8c3f12e781 in __GI___poll (fds=0x7f8bd00031e0, nfds=2, timeout=2147483647) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007f8c4370ab56 in g_main_context_iterate.isra () from target:/lib64/libglib-2.0.so.0
#2  0x00007f8c4370af12 in g_main_loop_run () from target:/lib64/libglib-2.0.so.0
#3  0x00007f8c405b2efe in red_worker_main () from target:/lib64/libspice-server.so.1
#4  0x00007f8c3f4092de in start_thread (arg=<optimized out>) at pthread_create.c:486
#5  0x00007f8c3f139a03 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 5 (Thread 0x7f8c34bc8700 (LWP 18617)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
#1  0x00007f8c3f40bb59 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631841fada0 <qemu_global_mutex>) at ../nptl/pthread_mutex_lock.c:80
#2  0x00005631839c166d in qemu_mutex_lock_impl (mutex=0x5631841fada0 <qemu_global_mutex>, file=0x563183a4ec28 "/builddir/build/BUILD/qemu-3.0.0/cpus.c", line=1768) at util/qemu-thread-posix.c:66
#3  0x000056318369d240 in qemu_mutex_lock_iothread () at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/cpus.c:1768
#4  0x00005631836c3650 in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/accel/kvm/kvm-all.c:2046
#5  0x000056318369d3fe in qemu_kvm_cpu_thread_fn (arg=0x5631845547f0) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/cpus.c:1215
#6  0x00007f8c3f4092de in start_thread (arg=<optimized out>) at pthread_create.c:486
#7  0x00007f8c3f139a03 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 4 (Thread 0x7f8c353c9700 (LWP 18616)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
#1  0x00007f8c3f40bc24 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631844acbb0) at ../nptl/pthread_mutex_lock.c:115
#2  0x00005631839c166d in qemu_mutex_lock_impl (mutex=mutex@entry=0x5631844acbb0, file=file@entry=0x563183b4a7ff "util/async.c", line=line@entry=511) at util/qemu-thread-posix.c:66
#3  0x00005631839bcd21 in aio_context_acquire (ctx=<optimized out>) at util/async.c:511
#4  aio_co_enter (ctx=<optimized out>, co=0x7f8c2019f5d0) at util/async.c:493
#5  0x000056318392ed56 in blk_aio_prwv (blk=0x5631844d5060, offset=offset@entry=1048576, bytes=512, iobuf=iobuf@entry=0x7f8c2017d310, co_entry=co_entry@entry=0x563183930620 <blk_aio_read_entry>, flags=flags@entry=0, cb=0x56318381d7d0 <ide_buffered_readv_cb>, opaque=0x7f8c2017d2f0) at block/block-backend.c:1377                                                                 
#6  0x000056318392fe14 in blk_aio_preadv (blk=<optimized out>, offset=offset@entry=1048576, qiov=qiov@entry=0x7f8c2017d310, flags=flags@entry=0, cb=cb@entry=0x56318381d7d0 <ide_buffered_readv_cb>, opaque=opaque@entry=0x7f8c2017d2f0) at block/block-backend.c:1471
#7  0x0000563183820210 in ide_buffered_readv (s=0x5631851584e8, sector_num=2048, iov=<optimized out>, nb_sectors=<optimized out>, cb=0x563183821200 <ide_sector_read_cb>, opaque=<optimized out>) at hw/ide/core.c:666
#8  0x0000563183820358 in ide_sector_read (s=s@entry=0x5631851584e8) at hw/ide/core.c:782
#9  0x00005631838204b0 in cmd_read_pio (s=0x5631851584e8, cmd=<optimized out>) at hw/ide/core.c:1498
#10 0x000056318382185c in ide_exec_cmd (bus=<optimized out>, val=<optimized out>) at hw/ide/core.c:2088
#11 0x00005631836b0176 in memory_region_write_accessor (mr=<optimized out>, addr=<optimized out>, value=<optimized out>, size=<optimized out>, shift=<optimized out>, mask=<optimized out>, attrs=...) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/memory.c:527                                                                                                  
#12 0x00005631836ae3d6 in access_with_adjusted_size (addr=addr@entry=7, value=value@entry=0x7f8c353c85b8, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=access_fn@entry=0x5631836b0130 <memory_region_write_accessor>, mr=0x5631851e7bc0, attrs=...) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/memory.c:594   
#13 0x00005631836b22d8 in memory_region_dispatch_write (mr=0x5631851e7bc0, addr=7, data=<optimized out>, size=1, attrs=...) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/memory.c:1473
#14 0x000056318365cfc3 in flatview_write_continue (fv=0x7f8c203061a0, addr=503, attrs=..., buf=0x7f8c4404a000 " \223\065", len=1, addr1=<optimized out>, l=<optimized out>, mr=0x5631851e7bc0) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/exec.c:3255
#15 0x000056318365d1e9 in flatview_write (fv=0x7f8c203061a0, addr=503, attrs=..., buf=0x7f8c4404a000 " \223\065", len=1) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/exec.c:3294
#16 0x0000563183660ca3 in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/exec.c:3384
#17 0x00005631836c3420 in kvm_handle_io (count=1, size=1, direction=<optimized out>, data=<optimized out>, attrs=..., port=503) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/accel/kvm/kvm-all.c:1741
#18 kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/accel/kvm/kvm-all.c:1981
#19 0x000056318369d3fe in qemu_kvm_cpu_thread_fn (arg=0x563184507240) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/cpus.c:1215
#20 0x00007f8c3f4092de in start_thread (arg=<optimized out>) at pthread_create.c:486
#21 0x00007f8c3f139a03 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 3 (Thread 0x7f8c378df700 (LWP 18607)):
#0  0x00007f8c3f12e876 in __GI_ppoll (fds=0x7f8c28000b20, nfds=1, timeout=<optimized out>, timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
#1  0x00005631839bdba9 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2  qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:322
#3  0x00005631839bfb54 in aio_poll (ctx=0x56318443cb30, blocking=blocking@entry=true) at util/aio-posix.c:645
#4  0x000056318379007e in iothread_run (opaque=0x56318443c860) at iothread.c:64
#5  0x00007f8c3f4092de in start_thread (arg=<optimized out>) at pthread_create.c:486
#6  0x00007f8c3f139a03 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 2 (Thread 0x7f8c380e0700 (LWP 18606)):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x00005631839c1daf in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at util/qemu-thread-posix.c:438
#2  qemu_event_wait (ev=ev@entry=0x563184231208 <rcu_call_ready_event>) at util/qemu-thread-posix.c:442
#3  0x00005631839d2f42 in call_rcu_thread (opaque=<optimized out>) at util/rcu.c:261
#4  0x00007f8c3f4092de in start_thread (arg=<optimized out>) at pthread_create.c:486
#5  0x00007f8c3f139a03 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 1 (Thread 0x7f8c44014e80 (LWP 18602)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
#1  0x00007f8c3f40bb59 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631841fada0 <qemu_global_mutex>) at ../nptl/pthread_mutex_lock.c:80
#2  0x00005631839c166d in qemu_mutex_lock_impl (mutex=0x5631841fada0 <qemu_global_mutex>, file=0x563183a4ec28 "/builddir/build/BUILD/qemu-3.0.0/cpus.c", line=1768) at util/qemu-thread-posix.c:66
#3  0x000056318369d240 in qemu_mutex_lock_iothread () at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/cpus.c:1768
#4  0x00005631839bea41 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:236
#5  main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:497
#6  0x0000563183795e49 in main_loop () at vl.c:1882
#7  0x0000563183657cd5 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4671

5. Reproduce both on ide and virtio disk.

6. I followed the auto test script and try to reproduce it by virsh:
1) Restart libvirt
2) virsh list
3) Start vm
4) restart libvirtd
5) virsh list
6) blockcopy with --finish

But I didn't reproduce it.


From 2, I can tell it is a qemu regression.
From 1,4, it is likely a dead lock.


Additional info:
Logs like libvirtd.log, vm.log, job.log are in the attachment.

Comment 1 aihua liang 2018-12-03 04:16:26 UTC
Can reproduce it on qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64 by auto, but can't by manual.

Note: can only reproduce it by auto.

Test cmds:
 python ConfigTest.py --guestname=RHEL.8.0 --platform=x86_64 --testcase=drive_mirror.simple_test.job_complete.2qcow2

Test result:
 Qemu hang and no response for cmd {"execute":"query-block-jobs"} during drive mirror running.

Test steps:
1.Start guest with qemu cmds:
  /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2  \
    -chardev socket,id=qmp_id_qmp1,path=/var/tmp/avocado_u854de7k/monitor-qmp1-20181202-010429-7e6szSC6,server,nowait \
    -mon chardev=qmp_id_qmp1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/avocado_u854de7k/monitor-catch_monitor-20181202-010429-7e6szSC6,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idVBz7m8  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/avocado_u854de7k/serial-serial0-20181202-010429-7e6szSC6,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20181202-010429-7e6szSC6,path=/var/tmp/avocado_u854de7k/seabios-20181202-010429-7e6szSC6,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20181202-010429-7e6szSC6,iobase=0x402 \
    -device ich9-usb-ehci1,id=usb1,addr=0x1d.7,multifunction=on,bus=pci.0 \
    -device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=0x1d.0,firstport=0,bus=pci.0 \
    -device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=0x1d.2,firstport=2,bus=pci.0 \
    -device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=0x1d.4,firstport=4,bus=pci.0 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x3 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel80-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -device virtio-net-pci,mac=9a:52:53:54:55:56,id=idqPHl5R,vectors=4,netdev=idXAJWqI,bus=pci.0,addr=0x4  \
    -netdev tap,id=idXAJWqI,vhost=on,vhostfd=18,fd=16 \
    -m 14336  \
    -smp 10,maxcpus=10,cores=5,threads=1,sockets=2  \
    -cpu 'Broadwell',+kvm_pv_unhalt \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm

2. Start block mirror:
    Send command: {'execute': 'drive-mirror', 'arguments': {'mode': 'absolute-paths', 'speed': 0, 'format': 'qcow2', 'device': 'drive_image1', 'target': '/home/kvm_autotest_root/images/target1.qcow2', 'sync': 'full'}, 'id': 'iae9eNSV'}

3. During block mirror running, query block jobs every 2 or 3 seconds:
   Send command: {'execute': 'query-block-jobs', 'id': '3fHN0vmM'}
01:04:49 DEBUG| (monitor avocado-vt-vm1.qmp1) Response to 'query-block-jobs' (re-formated)
01:04:49 DEBUG| (monitor avocado-vt-vm1.qmp1)        auto-finalize: True
01:04:49 DEBUG| (monitor avocado-vt-vm1.qmp1)        io-status: ok
01:04:49 DEBUG| (monitor avocado-vt-vm1.qmp1)        device: drive_image1
01:04:49 DEBUG| (monitor avocado-vt-vm1.qmp1)        auto-dismiss: True
01:04:49 DEBUG| (monitor avocado-vt-vm1.qmp1)        busy: True
01:04:49 DEBUG| (monitor avocado-vt-vm1.qmp1)        len: 0
01:04:49 DEBUG| (monitor avocado-vt-vm1.qmp1)        offset: 0
01:04:49 DEBUG| (monitor avocado-vt-vm1.qmp1)        status: running
01:04:49 DEBUG| (monitor avocado-vt-vm1.qmp1)        paused: False
01:04:49 DEBUG| (monitor avocado-vt-vm1.qmp1)        speed: 0
01:04:49 DEBUG| (monitor avocado-vt-vm1.qmp1)        ready: False
01:04:49 DEBUG| (monitor avocado-vt-vm1.qmp1)        type: mirror


Actual Result:
 After 27s later, qemu hang and no response for query-block-jobs cmd.

Additional info:
 ➜  /home pstack 13733
Thread 1 (Thread 0x7fd1f6de6700 (LWP 13733)):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x0000559a16472daf in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at util/qemu-thread-posix.c:438
#2  qemu_event_wait (ev=ev@entry=0x559a16ce2208 <rcu_call_ready_event>) at util/qemu-thread-posix.c:442
#3  0x0000559a16483f42 in call_rcu_thread (opaque=<optimized out>) at util/rcu.c:261
#4  0x00007fd1fe10f2de in start_thread (arg=<optimized out>) at pthread_create.c:486
#5  0x00007fd1fde3fa03 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
➜  /home pstack 13734
Thread 1 (Thread 0x7fd1f65e5700 (LWP 13734)):
#0  0x00007fd1fde34876 in __GI_ppoll (fds=0x7fd1e8000b20, nfds=1, timeout=<optimized out>, timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
#1  0x0000559a1646eba9 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2  qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:322
#3  0x0000559a16470b54 in aio_poll (ctx=0x559a18583b30, blocking=blocking@entry=true) at util/aio-posix.c:645
#4  0x0000559a1624107e in iothread_run (opaque=0x559a18583860) at iothread.c:64
#5  0x00007fd1fe10f2de in start_thread (arg=<optimized out>) at pthread_create.c:486
#6  0x00007fd1fde3fa03 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
➜  /home pstack 13739
Thread 1 (Thread 0x7fd1e7fff700 (LWP 13739)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
#1  0x00007fd1fe111c24 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x559a185f3510) at ../nptl/pthread_mutex_lock.c:115
#2  0x0000559a1647266d in qemu_mutex_lock_impl (mutex=0x559a185f3510, file=file@entry=0x559a165fb7ff "util/async.c", line=line@entry=511) at util/qemu-thread-posix.c:66
#3  0x0000559a1646ddb9 in aio_context_acquire (ctx=<optimized out>) at util/async.c:511
#4  0x0000559a1618fd0b in virtio_blk_data_plane_stop (vdev=<optimized out>) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/hw/block/dataplane/virtio-blk.c:281
#5  0x0000559a1632fd8f in virtio_bus_stop_ioeventfd (bus=bus@entry=0x559a19542318) at hw/virtio/virtio-bus.c:246
#6  0x0000559a163304ef in virtio_bus_stop_ioeventfd (bus=bus@entry=0x559a19542318) at hw/virtio/virtio-bus.c:238
#7  0x0000559a16333268 in virtio_pci_stop_ioeventfd (proxy=0x559a1953a220) at hw/virtio/virtio-pci.c:1282
#8  virtio_pci_common_write (opaque=0x559a1953a220, addr=<optimized out>, val=<optimized out>, size=<optimized out>) at hw/virtio/virtio-pci.c:1282
#9  0x0000559a16161176 in memory_region_write_accessor (mr=<optimized out>, addr=<optimized out>, value=<optimized out>, size=<optimized out>, shift=<optimized out>, mask=<optimized out>, attrs=...) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/memory.c:527
#10 0x0000559a1615f3d6 in access_with_adjusted_size (addr=addr@entry=20, value=value@entry=0x7fd1e7ffe5b8, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=access_fn@entry=0x559a16161130 <memory_region_write_accessor>, mr=0x559a1953abf0, attrs=...) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/memory.c:594
#11 0x0000559a161632d8 in memory_region_dispatch_write (mr=0x559a1953abf0, addr=20, data=<optimized out>, size=1, attrs=...) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/memory.c:1473
#12 0x0000559a1610dfc3 in flatview_write_continue (fv=0x7fd1e03cacc0, addr=4273963028, attrs=..., buf=0x7fd202d4f028 "", len=1, addr1=<optimized out>, l=<optimized out>, mr=0x559a1953abf0) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/exec.c:3255
#13 0x0000559a1610e1e9 in flatview_write (fv=0x7fd1e03cacc0, addr=4273963028, attrs=..., buf=0x7fd202d4f028 "", len=1) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/exec.c:3294
#14 0x0000559a16111ca3 in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/exec.c:3384
#15 0x0000559a16174468 in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/accel/kvm/kvm-all.c:1991
#16 0x0000559a1614e3fe in qemu_kvm_cpu_thread_fn (arg=0x559a1864f240) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/cpus.c:1215
#17 0x00007fd1fe10f2de in start_thread (arg=<optimized out>) at pthread_create.c:486
#18 0x00007fd1fde3fa03 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
➜  /home pstack 13740
Thread 1 (Thread 0x7fd1e77fe700 (LWP 13740)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
#1  0x00007fd1fe111b59 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x559a16cabda0 <qemu_global_mutex>) at ../nptl/pthread_mutex_lock.c:80
#2  0x0000559a1647266d in qemu_mutex_lock_impl (mutex=0x559a16cabda0 <qemu_global_mutex>, file=0x559a164ffc28 "/builddir/build/BUILD/qemu-3.0.0/cpus.c", line=1768) at util/qemu-thread-posix.c:66
#3  0x0000559a1614e240 in qemu_mutex_lock_iothread () at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/cpus.c:1768
#4  0x0000559a1610c935 in prepare_mmio_access (mr=<optimized out>, mr=<optimized out>) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/exec.c:3219
#5  0x0000559a1611189f in flatview_read_continue (fv=0x7fd1e03cacc0, addr=4228198436, attrs=..., buf=<optimized out>, len=4, addr1=<optimized out>, l=<optimized out>, mr=0x559a194b4150) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/exec.c:3314
#6  0x0000559a16111ac6 in flatview_read (fv=0x7fd1e03cacc0, addr=4228198436, attrs=..., buf=0x7fd202d4c028 "", len=4) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/exec.c:3354
#7  0x0000559a16111bf3 in address_space_read_full (as=<optimized out>, addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/exec.c:3367
#8  0x0000559a16174468 in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/accel/kvm/kvm-all.c:1991
#9  0x0000559a1614e3fe in qemu_kvm_cpu_thread_fn (arg=0x559a1869c030) at /usr/src/debug/qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64/cpus.c:1215
#10 0x00007fd1fe10f2de in start_thread (arg=<optimized out>) at pthread_create.c:486
#11 0x00007fd1fde3fa03 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
➜  /home pstack 13746
Thread 1 (Thread 0x7fd1e67ff700 (LWP 13746)):
#0  0x00007fd1fde34781 in __GI___poll (fds=0x7fd1dc0031e0, nfds=2, timeout=2147483647) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007fd202410b56 in g_main_context_poll (priority=<optimized out>, n_fds=2, fds=0x7fd1dc0031e0, timeout=<optimized out>, context=0x559a198d9700) at gmain.c:4204
#2  g_main_context_iterate (context=0x559a198d9700, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3898
#3  0x00007fd202410f12 in g_main_loop_run (loop=0x7fd1dc002100) at gmain.c:4099
#4  0x00007fd1ff2b8efe in red_worker_main () from target:/lib64/libspice-server.so.1
#5  0x00007fd1fe10f2de in start_thread (arg=<optimized out>) at pthread_create.c:486
#6  0x00007fd1fde3fa03 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95


Also hit it on qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.

Comment 2 aihua liang 2018-12-03 05:11:27 UTC
Pstack info when test on qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate:

# pstack 6787
Thread 13 (Thread 0x7fd599dff700 (LWP 6808)):
#0  0x00007fd5d149f4cc in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1  0x000055cfe16c165d in qemu_cond_wait_impl (cond=<optimized out>, mutex=0x55cfe3d13668, file=0x55cfe1826d87 "ui/vnc-jobs.c", line=214) at util/qemu-thread-posix.c:161
#2  0x000055cfe15de571 in vnc_worker_thread_loop (queue=queue@entry=0x55cfe3d13630) at ui/vnc-jobs.c:214
#3  0x000055cfe15deb30 in vnc_worker_thread (arg=0x55cfe3d13630) at ui/vnc-jobs.c:324
#4  0x000055cfe16c12a4 in qemu_thread_start (args=0x55cfe40863e0) at util/qemu-thread-posix.c:498
#5  0x00007fd5d14992de in start_thread () at /lib64/libpthread.so.0
#6  0x00007fd5d11c9a03 in clone () at /lib64/libc.so.6
Thread 12 (Thread 0x7fd59b7fe700 (LWP 6802)):
#0  0x00007fd5d11c03fb in ioctl () at /lib64/libc.so.6
#1  0x000055cfe13b8f69 in kvm_vcpu_ioctl (cpu=cpu@entry=0x55cfe2e2dd30, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:2156
#2  0x000055cfe13b902a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:1993
#3  0x000055cfe1392876 in qemu_kvm_cpu_thread_fn (arg=0x55cfe2e2dd30) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/cpus.c:1277
#4  0x000055cfe16c12a4 in qemu_thread_start (args=0x55cfe2e4ffb0) at util/qemu-thread-posix.c:498
#5  0x00007fd5d14992de in start_thread () at /lib64/libpthread.so.0
#6  0x00007fd5d11c9a03 in clone () at /lib64/libc.so.6
Thread 11 (Thread 0x7fd59bfff700 (LWP 6801)):
#0  0x00007fd5d11c03fb in ioctl () at /lib64/libc.so.6
#1  0x000055cfe13b8f69 in kvm_vcpu_ioctl (cpu=cpu@entry=0x55cfe2e0b270, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:2156
#2  0x000055cfe13b902a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:1993
#3  0x000055cfe1392876 in qemu_kvm_cpu_thread_fn (arg=0x55cfe2e0b270) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/cpus.c:1277
#4  0x000055cfe16c12a4 in qemu_thread_start (args=0x55cfe2e2d4f0) at util/qemu-thread-posix.c:498
#5  0x00007fd5d14992de in start_thread () at /lib64/libpthread.so.0
#6  0x00007fd5d11c9a03 in clone () at /lib64/libc.so.6
Thread 10 (Thread 0x7fd5b8ff9700 (LWP 6800)):
#0  0x00007fd5d11c03fb in ioctl () at /lib64/libc.so.6
#1  0x000055cfe13b8f69 in kvm_vcpu_ioctl (cpu=cpu@entry=0x55cfe2de87b0, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:2156
#2  0x000055cfe13b902a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:1993
#3  0x000055cfe1392876 in qemu_kvm_cpu_thread_fn (arg=0x55cfe2de87b0) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/cpus.c:1277
#4  0x000055cfe16c12a4 in qemu_thread_start (args=0x55cfe2e0aa30) at util/qemu-thread-posix.c:498
#5  0x00007fd5d14992de in start_thread () at /lib64/libpthread.so.0
#6  0x00007fd5d11c9a03 in clone () at /lib64/libc.so.6
Thread 9 (Thread 0x7fd5b97fa700 (LWP 6799)):
#0  0x00007fd5d14a28dd in __lll_lock_wait () at /lib64/libpthread.so.0
#1  0x00007fd5d149bc24 in pthread_mutex_lock () at /lib64/libpthread.so.0
#2  0x000055cfe16c138d in qemu_mutex_lock_impl (mutex=0x55cfe2c7e9e0, file=0x55cfe184dd5f "util/async.c", line=511) at util/qemu-thread-posix.c:66
#3  0x000055cfe13ef7a3 in virtio_scsi_dataplane_stop (vdev=0x55cfe3fd2370) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/hw/scsi/virtio-scsi-dataplane.c:210
#4  0x000055cfe157bb1f in virtio_bus_stop_ioeventfd (bus=bus@entry=0x55cfe3fd22f8) at hw/virtio/virtio-bus.c:246
#5  0x000055cfe157c27f in virtio_bus_stop_ioeventfd (bus=bus@entry=0x55cfe3fd22f8) at hw/virtio/virtio-bus.c:238
#6  0x000055cfe157eff8 in virtio_pci_stop_ioeventfd (proxy=0x55cfe3fca200) at hw/virtio/virtio-pci.c:1282
#7  0x000055cfe157eff8 in virtio_pci_common_write (opaque=0x55cfe3fca200, addr=<optimized out>, val=<optimized out>, size=<optimized out>) at hw/virtio/virtio-pci.c:1282
#8  0x000055cfe13a5653 in memory_region_write_accessor (mr=<optimized out>, addr=<optimized out>, value=<optimized out>, size=<optimized out>, shift=<optimized out>, mask=<optimized out>, attrs=...) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/memory.c:504
#9  0x000055cfe13a3806 in access_with_adjusted_size (addr=addr@entry=20, value=value@entry=0x7fd5b97f9548, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=access_fn@entry=0x55cfe13a5600 <memory_region_write_accessor>, mr=0x55cfe3fcabd0, attrs=...) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/memory.c:570
#10 0x000055cfe13a75d0 in memory_region_dispatch_write (mr=0x55cfe3fcabd0, addr=20, data=<optimized out>, size=1, attrs=...) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/memory.c:1452
#11 0x000055cfe13516b3 in flatview_write_continue (fv=0x7fd5b459ba80, addr=4261412884, attrs=..., buf=0x7fd5d5f5c028 "", len=1, addr1=<optimized out>, l=<optimized out>, mr=0x55cfe3fcabd0) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/exec.c:3233
#12 0x000055cfe13518d9 in flatview_write (fv=0x7fd5b459ba80, addr=4261412884, attrs=..., buf=0x7fd5d5f5c028 "", len=1) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/exec.c:3272
#13 0x000055cfe13559d3 in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/exec.c:3362
#14 0x000055cfe13b91e8 in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:2043
#15 0x000055cfe1392876 in qemu_kvm_cpu_thread_fn (arg=0x55cfe2dc5cf0) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/cpus.c:1277
#16 0x000055cfe16c12a4 in qemu_thread_start (args=0x55cfe2de7f70) at util/qemu-thread-posix.c:498
#17 0x00007fd5d14992de in start_thread () at /lib64/libpthread.so.0
#18 0x00007fd5d11c9a03 in clone () at /lib64/libc.so.6
Thread 8 (Thread 0x7fd5b9ffb700 (LWP 6798)):
#0  0x00007fd5d14a28dd in __lll_lock_wait () at /lib64/libpthread.so.0
#1  0x00007fd5d149bb59 in pthread_mutex_lock () at /lib64/libpthread.so.0
#2  0x000055cfe16c138d in qemu_mutex_lock_impl (mutex=0x55cfe1eff900 <qemu_global_mutex>, file=0x55cfe17492b8 "/builddir/build/BUILD/qemu-3.1.0/exec.c", line=3197) at util/qemu-thread-posix.c:66
#3  0x000055cfe13926ae in qemu_mutex_lock_iothread_impl (file=<optimized out>, line=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/cpus.c:1845
#4  0x000055cfe1350709 in prepare_mmio_access (mr=<optimized out>, mr=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/exec.c:3197
#5  0x000055cfe1351722 in flatview_write_continue (fv=0x7fd5b440ee00, addr=980, attrs=..., buf=0x7fd5d5f60000 "\016\065", len=2, addr1=<optimized out>, l=<optimized out>, mr=0x55cfe3dc1640) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/exec.c:3228
#6  0x000055cfe13518d9 in flatview_write (fv=0x7fd5b440ee00, addr=980, attrs=..., buf=0x7fd5d5f60000 "\016\065", len=2) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/exec.c:3272
#7  0x000055cfe13559d3 in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/exec.c:3362
#8  0x000055cfe13b9270 in kvm_handle_io (count=1, size=2, direction=<optimized out>, data=<optimized out>, attrs=..., port=980) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:1787
#9  0x000055cfe13b9270 in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:2033
#10 0x000055cfe1392876 in qemu_kvm_cpu_thread_fn (arg=0x55cfe2da2a00) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/cpus.c:1277
#11 0x000055cfe16c12a4 in qemu_thread_start (args=0x55cfe2dc54b0) at util/qemu-thread-posix.c:498
#12 0x00007fd5d14992de in start_thread () at /lib64/libpthread.so.0
#13 0x00007fd5d11c9a03 in clone () at /lib64/libc.so.6
Thread 7 (Thread 0x7fd5ba7fc700 (LWP 6797)):
#0  0x00007fd5d11c03fb in ioctl () at /lib64/libc.so.6
#1  0x000055cfe13b8f69 in kvm_vcpu_ioctl (cpu=cpu@entry=0x55cfe2d7ff40, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:2156
#2  0x000055cfe13b902a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:1993
#3  0x000055cfe1392876 in qemu_kvm_cpu_thread_fn (arg=0x55cfe2d7ff40) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/cpus.c:1277
#4  0x000055cfe16c12a4 in qemu_thread_start (args=0x55cfe2da21c0) at util/qemu-thread-posix.c:498
#5  0x00007fd5d14992de in start_thread () at /lib64/libpthread.so.0
#6  0x00007fd5d11c9a03 in clone () at /lib64/libc.so.6
Thread 6 (Thread 0x7fd5baffd700 (LWP 6796)):
#0  0x00007fd5d11c03fb in ioctl () at /lib64/libc.so.6
#1  0x000055cfe13b8f69 in kvm_vcpu_ioctl (cpu=cpu@entry=0x55cfe2d5d2a0, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:2156
#2  0x000055cfe13b902a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:1993
#3  0x000055cfe1392876 in qemu_kvm_cpu_thread_fn (arg=0x55cfe2d5d2a0) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/cpus.c:1277
#4  0x000055cfe16c12a4 in qemu_thread_start (args=0x55cfe2d7f730) at util/qemu-thread-posix.c:498
#5  0x00007fd5d14992de in start_thread () at /lib64/libpthread.so.0
#6  0x00007fd5d11c9a03 in clone () at /lib64/libc.so.6
Thread 5 (Thread 0x7fd5bb7fe700 (LWP 6795)):
#0  0x00007fd5d11c03fb in ioctl () at /lib64/libc.so.6
#1  0x000055cfe13b8f69 in kvm_vcpu_ioctl (cpu=cpu@entry=0x55cfe2d3aa10, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:2156
#2  0x000055cfe13b902a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:1993
#3  0x000055cfe1392876 in qemu_kvm_cpu_thread_fn (arg=0x55cfe2d3aa10) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/cpus.c:1277
#4  0x000055cfe16c12a4 in qemu_thread_start (args=0x55cfe2d5ca60) at util/qemu-thread-posix.c:498
#5  0x00007fd5d14992de in start_thread () at /lib64/libpthread.so.0
#6  0x00007fd5d11c9a03 in clone () at /lib64/libc.so.6
Thread 4 (Thread 0x7fd5bbfff700 (LWP 6794)):
#0  0x00007fd5d11c03fb in ioctl () at /lib64/libc.so.6
#1  0x000055cfe13b8f69 in kvm_vcpu_ioctl (cpu=cpu@entry=0x55cfe2d17450, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:2156
#2  0x000055cfe13b902a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:1993
#3  0x000055cfe1392876 in qemu_kvm_cpu_thread_fn (arg=0x55cfe2d17450) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/cpus.c:1277
#4  0x000055cfe16c12a4 in qemu_thread_start (args=0x55cfe2d3a1d0) at util/qemu-thread-posix.c:498
#5  0x00007fd5d14992de in start_thread () at /lib64/libpthread.so.0
#6  0x00007fd5d11c9a03 in clone () at /lib64/libc.so.6
Thread 3 (Thread 0x7fd5c8ee5700 (LWP 6793)):
#0  0x00007fd5d11c03fb in ioctl () at /lib64/libc.so.6
#1  0x000055cfe13b8f69 in kvm_vcpu_ioctl (cpu=cpu@entry=0x55cfe2cce470, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:2156
#2  0x000055cfe13b902a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/accel/kvm/kvm-all.c:1993
#3  0x000055cfe1392876 in qemu_kvm_cpu_thread_fn (arg=0x55cfe2cce470) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/cpus.c:1277
#4  0x000055cfe16c12a4 in qemu_thread_start (args=0x55cfe2cec070) at util/qemu-thread-posix.c:498
#5  0x00007fd5d14992de in start_thread () at /lib64/libpthread.so.0
#6  0x00007fd5d11c9a03 in clone () at /lib64/libc.so.6
Thread 2 (Thread 0x7fd5ca169700 (LWP 6788)):
#0  0x00007fd5d11c425d in syscall () at /lib64/libc.so.6
#1  0x000055cfe16c1acf in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at util/qemu-thread-posix.c:438
#2  0x000055cfe16c1acf in qemu_event_wait (ev=ev@entry=0x55cfe1f35d48 <rcu_call_ready_event>) at util/qemu-thread-posix.c:442
#3  0x000055cfe16d3202 in call_rcu_thread (opaque=<optimized out>) at util/rcu.c:261
#4  0x000055cfe16c12a4 in qemu_thread_start (args=0x55cfe2bc3260) at util/qemu-thread-posix.c:498
#5  0x00007fd5d14992de in start_thread () at /lib64/libpthread.so.0
#6  0x00007fd5d11c9a03 in clone () at /lib64/libc.so.6
Thread 1 (Thread 0x7fd5d60aaec0 (LWP 6787)):
#0  0x00007fd5d14a28dd in __lll_lock_wait () at /lib64/libpthread.so.0
#1  0x00007fd5d149bb59 in pthread_mutex_lock () at /lib64/libpthread.so.0
#2  0x000055cfe16c138d in qemu_mutex_lock_impl (mutex=0x55cfe1eff900 <qemu_global_mutex>, file=0x55cfe184e074 "util/main-loop.c", line=236) at util/qemu-thread-posix.c:66
#3  0x000055cfe13926ae in qemu_mutex_lock_iothread_impl (file=file@entry=0x55cfe184e074 "util/main-loop.c", line=line@entry=236) at /usr/src/debug/qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64/cpus.c:1845
#4  0x000055cfe16be48d in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:236
#5  0x000055cfe16be48d in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:497
#6  0x000055cfe1490749 in main_loop () at vl.c:1910
#7  0x000055cfe134ca37 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4670

Comment 4 Need Real Name 2018-12-05 22:19:35 UTC
I think I may have bumped into this, let me know, otherwise I'll start a new bug.

Source server: linux 4.19.5 qemu-3.0.0 from source, libvirt 4.9
Dest server: linux 4.18.19 qemu-3.0.0 from source, libvirt 4.9

When this VM is running on source server:

/usr/bin/qemu-system-x86_64 -name guest=testvm,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-13-testvm/master-key.aes -machine pc-q35-3.0,accel=kvm,usb=off,dump-guest-core=off -cpu Skylake-Server-IBRS,ss=on,hypervisor=on,tsc_adjust=on,clflushopt=on,umip=on,pku=on,ssbd=on,xsaves=on,topoext=on,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_vpindex,hv_runtime,hv_synic,hv_stimer,hv_reset,hv_vendor_id=KVM Hv -m 4096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -object iothread,id=iothread1 -uuid 3b00b788-ee91-4e45-80a6-c7319da71225 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=23,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -boot strict=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-pci-bridge,id=pci.3,bus=pci.1,addr=0x0 -device pcie-root-port,port=0x12,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x3 -device piix3-usb-uhci,id=usb,bus=pci.3,addr=0x1 -device virtio-scsi-pci,iothread=iothread1,id=scsi0,bus=pci.4,addr=0x0 -drive file=/dev/zvol/datastore/vm/testvm-vda,format=raw,if=none,id=drive-scsi0-0-0-0,cache=writeback,aio=threads -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2,write-cache=on -drive if=none,id=drive-sata0-0-4,media=cdrom,readonly=on -device ide-cd,bus=ide.4,drive=drive-sata0-0-4,id=sata0-0-4,bootindex=1 -netdev tap,fd=25,id=hostnet0,vhost=on,vhostfd=26 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a2:b7:a1,bus=pci.2,addr=0x0 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 127.0.0.1:0 -device cirrus-vga,id=video0,bus=pcie.0,addr=0x1 -s -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on

I try to migrate it and the disks to the other side:

virsh migrate --live --undefinesource --persistent --verbose --copy-storage-all testvm qemu+ssh://wasvirt1/system

We get to 99% then hang with both sides in the pause state.

Source server is stuck here:
(gdb) bt full
#0  0x00007f327994f3c1 in ppoll () at /lib64/libc.so.6
#1  0x000000000086167b in qemu_poll_ns (fds=<optimized out>, nfds=nfds@entry=1, timeout=<optimized out>) at util/qemu-timer.c:322
#2  0x0000000000863302 in aio_poll (ctx=0x21044e0, blocking=blocking@entry=true) at util/aio-posix.c:629
        node = <optimized out>
        i = <optimized out>
        ret = 0
        progress = <optimized out>
        timeout = <optimized out>
        start = <optimized out>
        __PRETTY_FUNCTION__ = "aio_poll"
#3  0x00000000007e0d52 in nbd_client_close (bs=0x2ba2400) at block/nbd-client.c:62
        waited_ = <optimized out>
        wait_ = 0x2ba563c
        ctx_ = 0x2109bb0
        bs_ = 0x2ba2400
        client = 0x31287e0
        client = <optimized out>
        request = {handle = 0, from = 0, len = 0, flags = 0, type = 2}
#4  0x00000000007e0d52 in nbd_client_close (bs=0x2ba2400) at block/nbd-client.c:965
        client = <optimized out>
        request = {handle = 0, from = 0, len = 0, flags = 0, type = 2}
#5  0x00000000007de5ca in nbd_close (bs=<optimized out>) at block/nbd.c:491
        s = 0x31287e0
#6  0x00000000007823d6 in bdrv_unref (bs=0x2ba2400) at block.c:3352
        ban = <optimized out>
        ban_next = <optimized out>
        child = <optimized out>
        next = <optimized out>
#7  0x00000000007823d6 in bdrv_unref (bs=0x2ba2400) at block.c:3560
#8  0x00000000007823d6 in bdrv_unref (bs=0x2ba2400) at block.c:4616
#9  0x0000000000782403 in bdrv_unref (bs=0x2af96f0) at block.c:3359
        ban = <optimized out>
        ban_next = <optimized out>
        child = <optimized out>
        next = <optimized out>
#10 0x0000000000782403 in bdrv_unref (bs=0x2af96f0) at block.c:3560
#11 0x0000000000782403 in bdrv_unref (bs=0x2af96f0) at block.c:4616
#12 0x0000000000785784 in block_job_remove_all_bdrv (job=job@entry=0x2f32570) at blockjob.c:200
        c = 0x23bac30
        l = 0x20dd330 = {0x23bac30, 0x2b89410}
#13 0x00000000007ceb5f in mirror_exit (job=0x2f32570, opaque=0x7f326407a350) at block/mirror.c:700
        s = 0x2f32570
        bjob = 0x2f32570
        data = 0x7f326407a350
        bs_opaque = 0x30d5600
        replace_aio_context = <optimized out>
        src = 0x2131080
        target_bs = 0x2af96f0
        mirror_top_bs = 0x210eb70
        local_err = 0x0
#14 0x0000000000786452 in job_defer_to_main_loop_bh (opaque=0x7f32640786a0) at job.c:973
        data = 0x7f32640786a0
        job = <optimized out>
        aio_context = 0x2109bb0
#15 0x000000000085fd3f in aio_bh_poll (ctx=ctx@entry=0x21044e0) at util/async.c:118
---Type <return> to continue, or q <return> to quit---
        bh = <optimized out>
        bhp = <optimized out>
        next = 0x2ea86e0
        ret = 1
        deleted = false
#16 0x00000000008631b0 in aio_dispatch (ctx=0x21044e0) at util/aio-posix.c:436
#17 0x000000000085fc1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
        ctx = <optimized out>
#18 0x00007f327f17d797 in g_main_context_dispatch () at /usr/lib64/libglib-2.0.so.0
#19 0x00000000008622ed in main_loop_wait () at util/main-loop.c:215
        context = 0x2104900
        pfds = <optimized out>
        context = 0x2104900
        ret = 1
        ret = 1
        timeout = 4294967295
        timeout_ns = <optimized out>
#20 0x00000000008622ed in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238
        context = 0x2104900
        ret = 1
        ret = 1
        timeout = 4294967295
        timeout_ns = <optimized out>
#21 0x00000000008622ed in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497
        ret = 1
        timeout = 4294967295
        timeout_ns = <optimized out>
#22 0x0000000000595dee in main_loop () at vl.c:1866
#23 0x000000000041f35d in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4644
        i = <optimized out>
        snapshot = 0
        linux_boot = <optimized out>
        initrd_filename = 0x0
        kernel_filename = <optimized out>
        kernel_cmdline = <optimized out>
        boot_order = 0x918f44 "cad"
        boot_once = 0x0
        ds = <optimized out>
        opts = <optimized out>
        machine_opts = <optimized out>
        icount_opts = <optimized out>
        accel_opts = 0x0
        olist = <optimized out>
        optind = 71
        optarg = 0x7ffdfc94df69 "timestamp=on"
        loadvm = 0x0
        machine_class = 0x0
        cpu_model = 0x7ffdfc94d864 "Skylake-Server-IBRS,ss=on,hypervisor=on,tsc_adjust=on,clflushopt=on,umip=on,pku=on,ssbd=on,xsaves=on,topoext=on,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_vpindex,hv_runtime,hv_synic,hv_stimer"...
        vga_model = 0x0
        qtest_chrdev = 0x0
        qtest_log = 0x0
        pid_file = <optimized out>
        incoming = 0x0
        userconfig = <optimized out>
---Type <return> to continue, or q <return> to quit---
        nographic = false
        display_remote = <optimized out>
        log_mask = <optimized out>
        log_file = <optimized out>
        trace_file = <optimized out>
        maxram_size = 4294967296
        ram_slots = 0
        vmstate_dump_file = 0x0
        main_loop_err = 0x0
        err = 0x0
        list_data_dirs = false
        dir = <optimized out>
        dirs = <optimized out>
        bdo_queue = {sqh_first = 0x0, sqh_last = 0x7ffdfc94c170}
        __func__ = "main"

Strace shows:
ppoll([{fd=9, events=POLLIN|POLLERR|POLLHUP}], 1, NULL, NULL, 8

Which points to this:

ls -al /proc/2286/fd/9
lrwx------    1 root     users           64 Dec  5 13:04 /proc/2286/fd/9 -> anon_inode:[eventfd]

The dest side is stuck here:

(gdb) bt full
#0  0x00007f21f070d3c1 in ppoll () at /lib64/libc.so.6
#1  0x0000000000861659 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=2999926258) at util/qemu-timer.c:334
        ts = {tv_sec = 2, tv_nsec = 999926258}
Python Exception <class 'gdb.error'> That operation is not available on integers of more than 8 bytes.:
#2  0x00000000008622a4 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:233
        context = 0x2142900
        ret = <optimized out>
        ret = -1295041038
        timeout = 4294967295
        timeout_ns = <optimized out>
#3  0x00000000008622a4 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497
        ret = -1295041038
        timeout = 4294967295
        timeout_ns = <optimized out>
#4  0x0000000000595dee in main_loop () at vl.c:1866
#5  0x000000000041f35d in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4644
        i = <optimized out>
        snapshot = 0
        linux_boot = <optimized out>
        initrd_filename = 0x0
        kernel_filename = <optimized out>
        kernel_cmdline = <optimized out>
        boot_order = 0x918f44 "cad"
        boot_once = 0x0
        ds = <optimized out>
        opts = <optimized out>
        machine_opts = <optimized out>
        icount_opts = <optimized out>
        accel_opts = 0x0
        olist = <optimized out>
        optind = 73
        optarg = 0x7ffdd6ee8f69 "timestamp=on"
        loadvm = 0x0
        machine_class = 0x0
        cpu_model = 0x7ffdd6ee8854 "Skylake-Server-IBRS,ss=on,hypervisor=on,tsc_adjust=on,clflushopt=on,umip=on,pku=on,ssbd=on,xsaves=on,topoext=on,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_vpindex,hv_runtime,hv_synic,hv_stimer"...
        vga_model = 0x0
        qtest_chrdev = 0x0
        qtest_log = 0x0
        pid_file = <optimized out>
        incoming = 0x7ffdd6ee8f0a "defer"
        userconfig = <optimized out>
        nographic = false
        display_remote = <optimized out>
        log_mask = <optimized out>
        log_file = <optimized out>
        trace_file = <optimized out>
        maxram_size = 4294967296
        ram_slots = 0
        vmstate_dump_file = 0x0
        main_loop_err = 0x0
        err = 0x0
        list_data_dirs = false
        dir = <optimized out>
        dirs = <optimized out>
        bdo_queue = {sqh_first = 0x0, sqh_last = 0x7ffdd6ee6630}
---Type <return> to continue, or q <return> to quit---
        __func__ = "main"

Strace show this over and over
ppoll([{fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=9, events=POLLIN}, {fd=10, events=POLLIN}, {fd=21, events=POLLIN}, {fd=22, events=POLLIN}, {fd=23, events=POLLIN}, {fd=24, events=POLLIN}, {fd=27, events=POLLIN}], 9, {0, 594527977}, NULL, 8) = 0 (Timeout)

lrwx------    1 root     users           64 Dec  5 13:15 /proc/20170/fd/10 -> anon_inode:[eventfd]
lrwx------    1 root     users           64 Dec  5 13:15 /proc/20170/fd/21 -> socket:[42631161]
lrwx------    1 root     users           64 Dec  5 13:15 /proc/20170/fd/22 -> socket:[42631165]
lrwx------    1 root     users           64 Dec  5 13:15 /proc/20170/fd/23 -> socket:[42631167]
lrwx------    1 root     users           64 Dec  5 13:15 /proc/20170/fd/24 -> socket:[42631168]
lrwx------    1 root     users           64 Dec  5 13:15 /proc/20170/fd/27 -> socket:[42690422]
lrwx------    1 root     users           64 Dec  5 13:15 /proc/20170/fd/6 -> anon_inode:[eventfd]
lrwx------    1 root     users           64 Dec  5 13:15 /proc/20170/fd/7 -> anon_inode:[signalfd]
lrwx------    1 root     users           64 Dec  5 13:15 /proc/20170/fd/9 -> anon_inode:[eventfd]

Comment 5 Need Real Name 2018-12-05 23:54:15 UTC
I'm not running redhat, so I'll report upstream, thought my bt might be helpful...

Comment 6 John Snow 2018-12-10 23:17:41 UTC
aliang: Thank you for the test steps; have you reproduced this in this way? (That is, without using avocado?)

Can you tell me in this context what rhel80-64-virtio-scsi.qcow2 contains and what it might be doing on boot? (Is it just an idle image or is it running some kind of block activity, etc?)

Comment 7 aihua liang 2018-12-11 01:57:44 UTC
(In reply to John Snow from comment #6)
> aliang: Thank you for the test steps; have you reproduced this in this way?
> (That is, without using avocado?)
> 
> Can you tell me in this context what rhel80-64-virtio-scsi.qcow2 contains
> and what it might be doing on boot? (Is it just an idle image or is it
> running some kind of block activity, etc?)

Hi, John

  I tried to reproduce it by manual with the test steps and the block-query block interval greater than 3s, but failed. 
  rhel80-64-virtio-scsi.qcow2 is an image used as system disk, that is to say, we installed rhel8 on it.
  So in this test, we're running drive-mirror on system disk.
  I'll try my best to reproduce the bug by manual with less block-query-block interval, and will update the progress later.



aliang

Comment 9 aihua liang 2019-01-11 09:13:38 UTC
Qemu also hang when set buf-size when doing block mirror.

 kernel version: 4.18.0-57.el8.x86_64
 qemu-kvm version:qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64

Test Steps:
  1. Start guest with cmds:
       /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20190107-223041-G541jCra,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20190107-223041-G541jCra,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=id0S5usU  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20190107-223041-G541jCra,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20190107-223041-G541jCra,path=/var/tmp/seabios-20190107-223041-G541jCra,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20190107-223041-G541jCra,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 \
    -device virtio-net-pci,mac=9a:a0:a1:a2:a3:a4,id=idawKNIF,vectors=4,netdev=idIaOUAY,bus=pci.0,addr=0x5  \
    -netdev tap,id=idIaOUAY,vhost=on \
    -m 7168  \
    -smp 8,maxcpus=8,cores=4,threads=1,sockets=2  \
    -cpu host,+kvm_pv_unhalt,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time \
    -object secret,id=sec0,data=backing \
    -blockdev node-name=my_drive,driver=iscsi,cache.direct=off,cache.no-flush=on,transport=tcp,portal=10.73.194.81,target=iqn.2019-01.com.example:aliang,lun=1 \
    -blockdev driver=luks,node-name=drive_image1,file=my_drive,key-secret=sec0 \
    -device ide-hd,drive=drive_image1,id=image1,bus=ide.0,unit=0 \
    -drive id=drive_winutils,if=none,snapshot=off,aio=threads,cache=none,media=cdrom,file=/home/kvm_autotest_root/iso/windows/winutils.iso \
    -device ide-cd,id=winutils,drive=drive_winutils,bus=ide.0,unit=1 \
    -drive id=drive_unattended,if=none,snapshot=off,aio=threads,cache=none,media=cdrom,file=/home/kvm_autotest_root/iso/windows/virtio-win.iso.el8 \
    -device ide-cd,id=unattended,drive=drive_unattended,bus=ide.1,unit=0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot menu=off,strict=off,order=cdn,once=d \
    -enable-kvm \
    -monitor stdio \

2. Create mirror image in previous:
    {'execute':'blockdev-create','arguments':{'options': {'driver':'file','filename':'/root/sn1','size':32212254720},'job-id':'job1'}}
    {'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_sn1','filename':'/root/sn1'}}
    {'execute':'blockdev-create','arguments':{'options': {'driver': 'qcow2','file':'drive_sn1','size':32212254720},'job-id':'job2'}}
    {'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn1','file':'drive_sn1'}}
    {'execute':'job-dismiss','arguments':{'id':'job1'}}
    {'execute':'job-dismiss','arguments':{'id':'job2'}}

3. Do block mirror with buf-size setting:
    {'execute':'blockdev-mirror','arguments':{'device':'drive_image1','target':'sn1','sync':'full','granularity':65536,'buf-size':655360,'job-id':'j1'}}

4. Check block job status during mirroring.
    {"execute":"query-block-jobs"}
    {"return": [{"auto-finalize": true, "io-status": "ok", "device": "j1", "auto-dismiss": true, "busy": true, "len": 32212975616, "offset": 4499046400, "status": "running", "paused": false, "speed": 0, "ready": false, "type": "mirror"}]}

    {"execute":"query-block-jobs"}
    {"return": [{"auto-finalize": true, "io-status": "ok", "device": "j1", "auto-dismiss": true, "busy": true, "len": 32282705920, "offset": 13958512640, "status": "running", "paused": false, "speed": 0, "ready": false, "type": "mirror"}]}

    {"execute":"query-block-jobs"}
    {"return": [{"auto-finalize": true, "io-status": "ok", "device": "j1", "auto-dismiss": true, "busy": true, "len": 32287846400, "offset": 32211759104, "status": "running", "paused": false, "speed": 0, "ready": false, "type": "mirror"}]}

    {"execute":"query-block-jobs"}
    {"return": [{"auto-finalize": true, "io-status": "ok", "device": "j1", "auto-dismiss": true, "busy": true, "len": 32287846400, "offset": 32211759104, "status": "running", "paused": false, "speed": 0, "ready": false, "type": "mirror"}]}

    {"execute":"query-block-jobs"}
    {"return": [{"auto-finalize": true, "io-status": "ok", "device": "j1", "auto-dismiss": true, "busy": true, "len": 32287846400, "offset": 32211759104, "status": "running", "paused": false, "speed": 0, "ready": false, "type": "mirror"}]}

5. Check gdb info
    #gdb -p 17388
    (gdb) bt full
#0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
#1  0x00005595aeb623e6 in kvm_vm_ioctl (s=s@entry=0x5595af85bed0, type=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:2138
        ret = <optimized out>
        arg = 0x7fff0040c650
        ap = {{gp_offset = 16, fp_offset = 21909, overflow_arg_area = 0x7fff0040c650, reg_save_area = 0x7fff0040c610}}
#2  0x00005595aeb63109 in kvm_set_irq (s=0x5595af85bed0, irq=irq@entry=8, level=level@entry=1)
    at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:958
        event = {{irq = 1, status = 1}, level = 1}
        ret = <optimized out>
        __PRETTY_FUNCTION__ = "kvm_set_irq"
#3  0x00005595aebc8937 in kvm_pic_set_irq (opaque=<optimized out>, irq=8, level=1) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/hw/i386/kvm/i8259.c:115
        delivered = <optimized out>
#4  0x00005595aeb97806 in qemu_irq_raise (irq=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/include/hw/irq.h:16
#5  0x00005595aeb97806 in rtc_policy_slew_deliver_irq (s=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/hw/timer/mc146818rtc.c:149
#6  0x00005595aeb978d2 in rtc_periodic_timer (opaque=0x5595af840450) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/hw/timer/mc146818rtc.c:279
        s = 0x5595af840450
#7  0x00005595aee44027 in timerlist_run_timers (timer_list=0x5595af84ca10) at util/qemu-timer.c:574
        ts = <optimized out>
        current_time = 1547196543718171000
        progress = <optimized out>
        cb = 0x5595aeb97870 <rtc_periodic_timer>
        opaque = <optimized out>
        need_replay_checkpoint = <optimized out>
        timer_list = 0x5595af84ca10
#8  0x00005595aee442ae in qemu_clock_run_timers (type=<optimized out>) at util/qemu-timer.c:708
        progress = false
        type = <optimized out>
#9  0x00005595aee442ae in qemu_clock_run_all_timers () at util/qemu-timer.c:708
        progress = false
        type = <optimized out>
#10 0x00005595aee447a5 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:503
        timeout = 4294967295
        timeout_ns = <optimized out>
#11 0x00005595aec37e99 in main_loop () at vl.c:1910
#12 0x00005595aeaf83f4 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4681
        i = <optimized out>
        snapshot = 0
        linux_boot = <optimized out>
        initrd_filename = 0x0
        kernel_filename = 0x0
        kernel_cmdline = <optimized out>
        boot_order = 0x5595af7e0b80 "cdn"
        boot_once = <optimized out>
        ds = <optimized out>
        opts = <optimized out>
        machine_opts = <optimized out>
        icount_opts = <optimized out>
--Type <RET> for more, q to quit, c to continue without paging--
        accel_opts = 0x0
        olist = <optimized out>
        optind = 67
        optarg = 0x7fff0040f566 "stdio"
        loadvm = 0x0
        machine_class = <optimized out>
        cpu_model = 0x7fff0040f196 "host,+kvm_pv_unhalt,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time"
        vga_model = 0x0
        qtest_chrdev = 0x0
        qtest_log = 0x0
        incoming = 0x0
        userconfig = <optimized out>
        nographic = false
        display_remote = <optimized out>
        log_mask = <optimized out>
        log_file = <optimized out>
        trace_file = <optimized out>
        maxram_size = 7516192768
        ram_slots = 0
        vmstate_dump_file = 0x0
        main_loop_err = 0x0
        err = 0x0
        list_data_dirs = <optimized out>
        dir = <optimized out>
        bdo_queue = {sqh_first = 0x0, sqh_last = 0x7fff0040c880}
        __func__ = "main"

6. Check pstack info
   # pstack 17388
Thread 11 (Thread 0x7f054bfff700 (LWP 17453)):
#0  0x00007f057ee7c48c in futex_wait_cancelable (private=0, expected=0, futex_word=0x5595b07e3798) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  0x00007f057ee7c48c in __pthread_cond_wait_common (abstime=0x0, mutex=0x5595b07e37a8, cond=0x5595b07e3770) at pthread_cond_wait.c:502
#2  0x00007f057ee7c48c in __pthread_cond_wait (cond=0x5595b07e3770, mutex=mutex@entry=0x5595b07e37a8) at pthread_cond_wait.c:655
#3  0x00005595aee4791d in qemu_cond_wait_impl (cond=<optimized out>, mutex=0x5595b07e37a8, file=0x5595aefab4c7 "ui/vnc-jobs.c", line=214) at util/qemu-thread-posix.c:161
#4  0x00005595aed7b691 in vnc_worker_thread_loop (queue=queue@entry=0x5595b07e3770) at ui/vnc-jobs.c:214
#5  0x00005595aed7bc50 in vnc_worker_thread (arg=0x5595b07e3770) at ui/vnc-jobs.c:324
#6  0x00005595aee47564 in qemu_thread_start (args=0x5595b0826ae0) at util/qemu-thread-posix.c:498
#7  0x00007f057ee762de in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007f057eba6a63 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 10 (Thread 0x7f055e7fc700 (LWP 17451)):
#0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
#1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af9ca170, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:2156
#2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:1993
#3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af9ca170) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
#4  0x00005595aee47564 in qemu_thread_start (args=0x5595af9ec4d0) at util/qemu-thread-posix.c:498
#5  0x00007f057ee762de in start_thread (arg=<optimized out>) at pthread_create.c:486
#6  0x00007f057eba6a63 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 9 (Thread 0x7f055effd700 (LWP 17450)):
#0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
#1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af9a6f80, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:2156
#2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:1993
#3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af9a6f80) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
#4  0x00005595aee47564 in qemu_thread_start (args=0x5595af9c92e0) at util/qemu-thread-posix.c:498
#5  0x00007f057ee762de in start_thread (arg=<optimized out>) at pthread_create.c:486
#6  0x00007f057eba6a63 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 8 (Thread 0x7f055f7fe700 (LWP 17449)):
#0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
#1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af9843b0, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:2156
#2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:1993
#3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af9843b0) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
#4  0x00005595aee47564 in qemu_thread_start (args=0x5595af9a6710) at util/qemu-thread-posix.c:498
#5  0x00007f057ee762de in start_thread (arg=<optimized out>) at pthread_create.c:486
#6  0x00007f057eba6a63 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 7 (Thread 0x7f055ffff700 (LWP 17448)):
#0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
#1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af9617e0, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:2156
#2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:1993
#3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af9617e0) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
#4  0x00005595aee47564 in qemu_thread_start (args=0x5595af983b40) at util/qemu-thread-posix.c:498
#5  0x00007f057ee762de in start_thread (arg=<optimized out>) at pthread_create.c:486
#6  0x00007f057eba6a63 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 6 (Thread 0x7f0574bff700 (LWP 17447)):
#0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
#1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af93e9e0, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:2156
#2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:1993
#3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af93e9e0) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
#4  0x00005595aee47564 in qemu_thread_start (args=0x5595af960fa0) at util/qemu-thread-posix.c:498
#5  0x00007f057ee762de in start_thread (arg=<optimized out>) at pthread_create.c:486
#6  0x00007f057eba6a63 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 5 (Thread 0x7f0575400700 (LWP 17446)):
#0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
#1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af91bdc0, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:2156
#2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:1993
#3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af91bdc0) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
#4  0x00005595aee47564 in qemu_thread_start (args=0x5595af93e170) at util/qemu-thread-posix.c:498
#5  0x00007f057ee762de in start_thread (arg=<optimized out>) at pthread_create.c:486
#6  0x00007f057eba6a63 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 4 (Thread 0x7f0575c01700 (LWP 17445)):
#0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
#1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af8f7690, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:2156
#2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:1993
#3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af8f7690) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
#4  0x00005595aee47564 in qemu_thread_start (args=0x5595af91b550) at util/qemu-thread-posix.c:498
#5  0x00007f057ee762de in start_thread (arg=<optimized out>) at pthread_create.c:486
#6  0x00007f057eba6a63 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 3 (Thread 0x7f0576402700 (LWP 17444)):
#0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
#1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af8ab770, type=type@entry=44672) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:2156
#2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/kvm-all.c:1993
#3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af8ab770) at /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
#4  0x00005595aee47564 in qemu_thread_start (args=0x5595af8cd0a0) at util/qemu-thread-posix.c:498
#5  0x00007f057ee762de in start_thread (arg=<optimized out>) at pthread_create.c:486
#6  0x00007f057eba6a63 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 2 (Thread 0x7f0577b40700 (LWP 17389)):
#0  0x00007f057eba12bd in syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x00005595aee47d8f in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at util/qemu-thread-posix.c:438
#2  0x00005595aee47d8f in qemu_event_wait (ev=ev@entry=0x5595af6ab8e8 <rcu_call_ready_event>) at util/qemu-thread-posix.c:442
#3  0x00005595aee594c2 in call_rcu_thread (opaque=<optimized out>) at util/rcu.c:261
#4  0x00005595aee47564 in qemu_thread_start (args=0x5595af793260) at util/qemu-thread-posix.c:498
#5  0x00007f057ee762de in start_thread (arg=<optimized out>) at pthread_create.c:486
#6  0x00007f057eba6a63 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 1 (Thread 0x7f0583a8cec0 (LWP 17388)):
#0  0x00007f057eb9b8d6 in __GI_ppoll (fds=0x5595af7b24f0, nfds=12, timeout=<optimized out>, timeout@entry=0x7fff0040c6f0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
#1  0x00005595aee43765 in ppoll (__ss=0x0, __timeout=0x7fff0040c6f0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2  0x00005595aee43765 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=455741) at util/qemu-timer.c:334
#3  0x00005595aee44735 in os_host_main_loop_wait (timeout=455741) at util/main-loop.c:233
#4  0x00005595aee44735 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:497
#5  0x00005595aec37e99 in main_loop () at vl.c:1910
#6  0x00005595aeaf83f4 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4681

7. Check block job status:
   {"execute":"query-block-jobs"}
    ---> no response

Qemu and guest hang...

Comment 10 aihua liang 2019-03-28 08:57:49 UTC
When running testcase "block_stream_negative" with dataplane enable on qemu-kvm-3.1.0-20.module+el8+2904+e658c755.x86_64, also hit this issue.

{'execute': 'block-stream', 'arguments': {'device': 'drive_image1', 'speed': 10485760}, 'id': 'bYJuNEbM'}

block_mirror no progress with info:
 {'execute': 'query-block-jobs', 'id': 'n1wS5W7l'}
05:49:35 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1) Response to 'query-block-jobs' (re-formated)
05:49:35 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1)        auto-finalize: True
05:49:35 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1)        io-status: ok
05:49:35 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1)        device: drive_image1
05:49:35 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1)        auto-dismiss: True
05:49:35 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1)        busy: True
05:49:35 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1)        len: 21474836480
05:49:35 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1)        offset: 3927113728
05:49:35 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1)        status: running
05:49:35 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1)        paused: False
05:49:35 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1)        speed: 10485760
05:49:35 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1)        ready: False
05:49:35 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1)        type: stream

Will try to reproduce it by manual later, if possible, will give the detail info.

Comment 11 aihua liang 2019-10-14 09:03:31 UTC
Also hit this issue in qemu-kvm-rhev-2.12.0-37.el7.x86_64.

Comment 12 Ademar Reis 2019-11-20 16:35:29 UTC
(In reply to aihua liang from comment #9)
> Qemu also hang when set buf-size when doing block mirror.
> 
>  kernel version: 4.18.0-57.el8.x86_64
>  qemu-kvm version:qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64
> 
> Test Steps:
>   1. Start guest with cmds:
>        /usr/libexec/qemu-kvm \
>     -name 'avocado-vt-vm1'  \
>     -sandbox off  \
>     -machine pc  \
>     -nodefaults \
>     -device VGA,bus=pci.0,addr=0x2  \
>     -chardev
> socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20190107-
> 223041-G541jCra,server,nowait \
>     -mon chardev=qmp_id_qmpmonitor1,mode=control  \
>     -chardev
> socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20190107-
> 223041-G541jCra,server,nowait \
>     -mon chardev=qmp_id_catch_monitor,mode=control \
>     -device pvpanic,ioport=0x505,id=id0S5usU  \
>     -chardev
> socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20190107-223041-
> G541jCra,server,nowait \
>     -device isa-serial,chardev=serial_id_serial0  \
>     -chardev
> socket,id=seabioslog_id_20190107-223041-G541jCra,path=/var/tmp/seabios-
> 20190107-223041-G541jCra,server,nowait \
>     -device
> isa-debugcon,chardev=seabioslog_id_20190107-223041-G541jCra,iobase=0x402 \
>     -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 \
>     -device
> virtio-net-pci,mac=9a:a0:a1:a2:a3:a4,id=idawKNIF,vectors=4,netdev=idIaOUAY,
> bus=pci.0,addr=0x5  \
>     -netdev tap,id=idIaOUAY,vhost=on \
>     -m 7168  \
>     -smp 8,maxcpus=8,cores=4,threads=1,sockets=2  \
>     -cpu host,+kvm_pv_unhalt,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time
> \
>     -object secret,id=sec0,data=backing \
>     -blockdev
> node-name=my_drive,driver=iscsi,cache.direct=off,cache.no-flush=on,
> transport=tcp,portal=10.73.194.81,target=iqn.2019-01.com.example:aliang,
> lun=1 \
>     -blockdev
> driver=luks,node-name=drive_image1,file=my_drive,key-secret=sec0 \
>     -device ide-hd,drive=drive_image1,id=image1,bus=ide.0,unit=0 \
>     -drive
> id=drive_winutils,if=none,snapshot=off,aio=threads,cache=none,media=cdrom,
> file=/home/kvm_autotest_root/iso/windows/winutils.iso \
>     -device ide-cd,id=winutils,drive=drive_winutils,bus=ide.0,unit=1 \
>     -drive
> id=drive_unattended,if=none,snapshot=off,aio=threads,cache=none,media=cdrom,
> file=/home/kvm_autotest_root/iso/windows/virtio-win.iso.el8 \
>     -device ide-cd,id=unattended,drive=drive_unattended,bus=ide.1,unit=0 \
>     -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
>     -vnc :0  \
>     -rtc base=localtime,clock=host,driftfix=slew  \
>     -boot menu=off,strict=off,order=cdn,once=d \
>     -enable-kvm \
>     -monitor stdio \
> 
> 2. Create mirror image in previous:
>     {'execute':'blockdev-create','arguments':{'options':
> {'driver':'file','filename':'/root/sn1','size':32212254720},'job-id':'job1'}}
>    
> {'execute':'blockdev-add','arguments':{'driver':'file','node-name':
> 'drive_sn1','filename':'/root/sn1'}}
>     {'execute':'blockdev-create','arguments':{'options': {'driver':
> 'qcow2','file':'drive_sn1','size':32212254720},'job-id':'job2'}}
>    
> {'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn1',
> 'file':'drive_sn1'}}
>     {'execute':'job-dismiss','arguments':{'id':'job1'}}
>     {'execute':'job-dismiss','arguments':{'id':'job2'}}
> 
> 3. Do block mirror with buf-size setting:
>    
> {'execute':'blockdev-mirror','arguments':{'device':'drive_image1','target':
> 'sn1','sync':'full','granularity':65536,'buf-size':655360,'job-id':'j1'}}
> 
> 4. Check block job status during mirroring.
>     {"execute":"query-block-jobs"}
>     {"return": [{"auto-finalize": true, "io-status": "ok", "device": "j1",
> "auto-dismiss": true, "busy": true, "len": 32212975616, "offset":
> 4499046400, "status": "running", "paused": false, "speed": 0, "ready":
> false, "type": "mirror"}]}
> 
>     {"execute":"query-block-jobs"}
>     {"return": [{"auto-finalize": true, "io-status": "ok", "device": "j1",
> "auto-dismiss": true, "busy": true, "len": 32282705920, "offset":
> 13958512640, "status": "running", "paused": false, "speed": 0, "ready":
> false, "type": "mirror"}]}
> 
>     {"execute":"query-block-jobs"}
>     {"return": [{"auto-finalize": true, "io-status": "ok", "device": "j1",
> "auto-dismiss": true, "busy": true, "len": 32287846400, "offset":
> 32211759104, "status": "running", "paused": false, "speed": 0, "ready":
> false, "type": "mirror"}]}
> 
>     {"execute":"query-block-jobs"}
>     {"return": [{"auto-finalize": true, "io-status": "ok", "device": "j1",
> "auto-dismiss": true, "busy": true, "len": 32287846400, "offset":
> 32211759104, "status": "running", "paused": false, "speed": 0, "ready":
> false, "type": "mirror"}]}
> 
>     {"execute":"query-block-jobs"}
>     {"return": [{"auto-finalize": true, "io-status": "ok", "device": "j1",
> "auto-dismiss": true, "busy": true, "len": 32287846400, "offset":
> 32211759104, "status": "running", "paused": false, "speed": 0, "ready":
> false, "type": "mirror"}]}
> 
> 5. Check gdb info
>     #gdb -p 17388
>     (gdb) bt full
> #0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
> #1  0x00005595aeb623e6 in kvm_vm_ioctl (s=s@entry=0x5595af85bed0,
> type=<optimized out>) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:2138
>         ret = <optimized out>
>         arg = 0x7fff0040c650
>         ap = {{gp_offset = 16, fp_offset = 21909, overflow_arg_area =
> 0x7fff0040c650, reg_save_area = 0x7fff0040c610}}
> #2  0x00005595aeb63109 in kvm_set_irq (s=0x5595af85bed0, irq=irq@entry=8,
> level=level@entry=1)
>     at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:958
>         event = {{irq = 1, status = 1}, level = 1}
>         ret = <optimized out>
>         __PRETTY_FUNCTION__ = "kvm_set_irq"
> #3  0x00005595aebc8937 in kvm_pic_set_irq (opaque=<optimized out>, irq=8,
> level=1) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/hw/i386/kvm/
> i8259.c:115
>         delivered = <optimized out>
> #4  0x00005595aeb97806 in qemu_irq_raise (irq=<optimized out>) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/include/hw/
> irq.h:16
> #5  0x00005595aeb97806 in rtc_policy_slew_deliver_irq (s=<optimized out>) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/hw/timer/
> mc146818rtc.c:149
> #6  0x00005595aeb978d2 in rtc_periodic_timer (opaque=0x5595af840450) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/hw/timer/
> mc146818rtc.c:279
>         s = 0x5595af840450
> #7  0x00005595aee44027 in timerlist_run_timers (timer_list=0x5595af84ca10)
> at util/qemu-timer.c:574
>         ts = <optimized out>
>         current_time = 1547196543718171000
>         progress = <optimized out>
>         cb = 0x5595aeb97870 <rtc_periodic_timer>
>         opaque = <optimized out>
>         need_replay_checkpoint = <optimized out>
>         timer_list = 0x5595af84ca10
> #8  0x00005595aee442ae in qemu_clock_run_timers (type=<optimized out>) at
> util/qemu-timer.c:708
>         progress = false
>         type = <optimized out>
> #9  0x00005595aee442ae in qemu_clock_run_all_timers () at
> util/qemu-timer.c:708
>         progress = false
>         type = <optimized out>
> #10 0x00005595aee447a5 in main_loop_wait (nonblocking=<optimized out>) at
> util/main-loop.c:503
>         timeout = 4294967295
>         timeout_ns = <optimized out>
> #11 0x00005595aec37e99 in main_loop () at vl.c:1910
> #12 0x00005595aeaf83f4 in main (argc=<optimized out>, argv=<optimized out>,
> envp=<optimized out>) at vl.c:4681
>         i = <optimized out>
>         snapshot = 0
>         linux_boot = <optimized out>
>         initrd_filename = 0x0
>         kernel_filename = 0x0
>         kernel_cmdline = <optimized out>
>         boot_order = 0x5595af7e0b80 "cdn"
>         boot_once = <optimized out>
>         ds = <optimized out>
>         opts = <optimized out>
>         machine_opts = <optimized out>
>         icount_opts = <optimized out>
> --Type <RET> for more, q to quit, c to continue without paging--
>         accel_opts = 0x0
>         olist = <optimized out>
>         optind = 67
>         optarg = 0x7fff0040f566 "stdio"
>         loadvm = 0x0
>         machine_class = <optimized out>
>         cpu_model = 0x7fff0040f196
> "host,+kvm_pv_unhalt,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time"
>         vga_model = 0x0
>         qtest_chrdev = 0x0
>         qtest_log = 0x0
>         incoming = 0x0
>         userconfig = <optimized out>
>         nographic = false
>         display_remote = <optimized out>
>         log_mask = <optimized out>
>         log_file = <optimized out>
>         trace_file = <optimized out>
>         maxram_size = 7516192768
>         ram_slots = 0
>         vmstate_dump_file = 0x0
>         main_loop_err = 0x0
>         err = 0x0
>         list_data_dirs = <optimized out>
>         dir = <optimized out>
>         bdo_queue = {sqh_first = 0x0, sqh_last = 0x7fff0040c880}
>         __func__ = "main"
> 
> 6. Check pstack info
>    # pstack 17388
> Thread 11 (Thread 0x7f054bfff700 (LWP 17453)):
> #0  0x00007f057ee7c48c in futex_wait_cancelable (private=0, expected=0,
> futex_word=0x5595b07e3798) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
> #1  0x00007f057ee7c48c in __pthread_cond_wait_common (abstime=0x0,
> mutex=0x5595b07e37a8, cond=0x5595b07e3770) at pthread_cond_wait.c:502
> #2  0x00007f057ee7c48c in __pthread_cond_wait (cond=0x5595b07e3770,
> mutex=mutex@entry=0x5595b07e37a8) at pthread_cond_wait.c:655
> #3  0x00005595aee4791d in qemu_cond_wait_impl (cond=<optimized out>,
> mutex=0x5595b07e37a8, file=0x5595aefab4c7 "ui/vnc-jobs.c", line=214) at
> util/qemu-thread-posix.c:161
> #4  0x00005595aed7b691 in vnc_worker_thread_loop
> (queue=queue@entry=0x5595b07e3770) at ui/vnc-jobs.c:214
> #5  0x00005595aed7bc50 in vnc_worker_thread (arg=0x5595b07e3770) at
> ui/vnc-jobs.c:324
> #6  0x00005595aee47564 in qemu_thread_start (args=0x5595b0826ae0) at
> util/qemu-thread-posix.c:498
> #7  0x00007f057ee762de in start_thread (arg=<optimized out>) at
> pthread_create.c:486
> #8  0x00007f057eba6a63 in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> Thread 10 (Thread 0x7f055e7fc700 (LWP 17451)):
> #0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
> #1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af9ca170,
> type=type@entry=44672) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:2156
> #2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:1993
> #3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af9ca170) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
> #4  0x00005595aee47564 in qemu_thread_start (args=0x5595af9ec4d0) at
> util/qemu-thread-posix.c:498
> #5  0x00007f057ee762de in start_thread (arg=<optimized out>) at
> pthread_create.c:486
> #6  0x00007f057eba6a63 in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> Thread 9 (Thread 0x7f055effd700 (LWP 17450)):
> #0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
> #1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af9a6f80,
> type=type@entry=44672) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:2156
> #2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:1993
> #3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af9a6f80) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
> #4  0x00005595aee47564 in qemu_thread_start (args=0x5595af9c92e0) at
> util/qemu-thread-posix.c:498
> #5  0x00007f057ee762de in start_thread (arg=<optimized out>) at
> pthread_create.c:486
> #6  0x00007f057eba6a63 in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> Thread 8 (Thread 0x7f055f7fe700 (LWP 17449)):
> #0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
> #1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af9843b0,
> type=type@entry=44672) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:2156
> #2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:1993
> #3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af9843b0) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
> #4  0x00005595aee47564 in qemu_thread_start (args=0x5595af9a6710) at
> util/qemu-thread-posix.c:498
> #5  0x00007f057ee762de in start_thread (arg=<optimized out>) at
> pthread_create.c:486
> #6  0x00007f057eba6a63 in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> Thread 7 (Thread 0x7f055ffff700 (LWP 17448)):
> #0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
> #1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af9617e0,
> type=type@entry=44672) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:2156
> #2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:1993
> #3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af9617e0) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
> #4  0x00005595aee47564 in qemu_thread_start (args=0x5595af983b40) at
> util/qemu-thread-posix.c:498
> #5  0x00007f057ee762de in start_thread (arg=<optimized out>) at
> pthread_create.c:486
> #6  0x00007f057eba6a63 in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> Thread 6 (Thread 0x7f0574bff700 (LWP 17447)):
> #0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
> #1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af93e9e0,
> type=type@entry=44672) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:2156
> #2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:1993
> #3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af93e9e0) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
> #4  0x00005595aee47564 in qemu_thread_start (args=0x5595af960fa0) at
> util/qemu-thread-posix.c:498
> #5  0x00007f057ee762de in start_thread (arg=<optimized out>) at
> pthread_create.c:486
> #6  0x00007f057eba6a63 in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> Thread 5 (Thread 0x7f0575400700 (LWP 17446)):
> #0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
> #1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af91bdc0,
> type=type@entry=44672) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:2156
> #2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:1993
> #3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af91bdc0) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
> #4  0x00005595aee47564 in qemu_thread_start (args=0x5595af93e170) at
> util/qemu-thread-posix.c:498
> #5  0x00007f057ee762de in start_thread (arg=<optimized out>) at
> pthread_create.c:486
> #6  0x00007f057eba6a63 in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> Thread 4 (Thread 0x7f0575c01700 (LWP 17445)):
> #0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
> #1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af8f7690,
> type=type@entry=44672) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:2156
> #2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:1993
> #3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af8f7690) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
> #4  0x00005595aee47564 in qemu_thread_start (args=0x5595af91b550) at
> util/qemu-thread-posix.c:498
> #5  0x00007f057ee762de in start_thread (arg=<optimized out>) at
> pthread_create.c:486
> #6  0x00007f057eba6a63 in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> Thread 3 (Thread 0x7f0576402700 (LWP 17444)):
> #0  0x00007f057eb9d45b in ioctl () at ../sysdeps/unix/syscall-template.S:78
> #1  0x00005595aeb64749 in kvm_vcpu_ioctl (cpu=cpu@entry=0x5595af8ab770,
> type=type@entry=44672) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:2156
> #2  0x00005595aeb6480a in kvm_cpu_exec (cpu=<optimized out>) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/accel/kvm/
> kvm-all.c:1993
> #3  0x00005595aeb3e066 in qemu_kvm_cpu_thread_fn (arg=0x5595af8ab770) at
> /usr/src/debug/qemu-kvm-3.1.0-3.module+el8+2614+d714d2bb.x86_64/cpus.c:1281
> #4  0x00005595aee47564 in qemu_thread_start (args=0x5595af8cd0a0) at
> util/qemu-thread-posix.c:498
> #5  0x00007f057ee762de in start_thread (arg=<optimized out>) at
> pthread_create.c:486
> #6  0x00007f057eba6a63 in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> Thread 2 (Thread 0x7f0577b40700 (LWP 17389)):
> #0  0x00007f057eba12bd in syscall () at
> ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
> #1  0x00005595aee47d8f in qemu_futex_wait (val=<optimized out>, f=<optimized
> out>) at util/qemu-thread-posix.c:438
> #2  0x00005595aee47d8f in qemu_event_wait (ev=ev@entry=0x5595af6ab8e8
> <rcu_call_ready_event>) at util/qemu-thread-posix.c:442
> #3  0x00005595aee594c2 in call_rcu_thread (opaque=<optimized out>) at
> util/rcu.c:261
> #4  0x00005595aee47564 in qemu_thread_start (args=0x5595af793260) at
> util/qemu-thread-posix.c:498
> #5  0x00007f057ee762de in start_thread (arg=<optimized out>) at
> pthread_create.c:486
> #6  0x00007f057eba6a63 in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> Thread 1 (Thread 0x7f0583a8cec0 (LWP 17388)):
> #0  0x00007f057eb9b8d6 in __GI_ppoll (fds=0x5595af7b24f0, nfds=12,
> timeout=<optimized out>, timeout@entry=0x7fff0040c6f0,
> sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
> #1  0x00005595aee43765 in ppoll (__ss=0x0, __timeout=0x7fff0040c6f0,
> __nfds=<optimized out>, __fds=<optimized out>) at
> /usr/include/bits/poll2.h:77
> #2  0x00005595aee43765 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized
> out>, timeout=timeout@entry=455741) at util/qemu-timer.c:334
> #3  0x00005595aee44735 in os_host_main_loop_wait (timeout=455741) at
> util/main-loop.c:233
> #4  0x00005595aee44735 in main_loop_wait (nonblocking=<optimized out>) at
> util/main-loop.c:497
> #5  0x00005595aec37e99 in main_loop () at vl.c:1910
> #6  0x00005595aeaf83f4 in main (argc=<optimized out>, argv=<optimized out>,
> envp=<optimized out>) at vl.c:4681
> 
> 7. Check block job status:
>    {"execute":"query-block-jobs"}
>     ---> no response
> 
> Qemu and guest hang...


Kevin: does it ring a bell? Can you please try reproducing this? I see that both -blockdev and -drive are being used, perhaps we should find a simpler reproducer?

Comment 13 John Snow 2020-01-23 22:54:16 UTC
Hi, doing triage on my bugs again: aliang, we can't reproduce this "manually", only with avocado still, right? I'm less familiar with Avocado and need to consult with e.g. Cleber Rosa to see if I can reproduce it. This looks like a pretty straightforward mirror operation and I don't have any ready-made hunches or suspects right now.

Looks like it's not a recent regression; if it happens on 2.12 and up through 3.1, do we know if it happens on 4.2?

Next, it looks like the job is choking at 99%; possibly either:

(1) The query command is racing with job cleanup, which creates complications, or
(2) The job is failing all on its own.

Something isn't quiescing/cleaning up correctly at the end. I wonder if the query command is truly necessary to replicate the hang? Could the query be an important piece?


So, aliang:

- Is the only way to reproduce this inside avocado for now?

Comment 14 Kevin Wolf 2020-01-24 09:44:38 UTC
(In reply to Ademar Reis from comment #12)
> Kevin: does it ring a bell? Can you please try reproducing this? I see that
> both -blockdev and -drive are being used, perhaps we should find a simpler
> reproducer?

First of all, this should be a separate BZ because it has an entirely different backtrace than the original report. Not every case where QEMU hangs is the same bug.

The pattern of the originally reported bug:
A vcpu thread holds qemu_global_mutex and tries to acquire the AioContext lock of a BlockBackend, but apparently someone else already holds it, so it hangs. Everyone else hangs because they're trying to acquire qemu_global_mutex, so this looks much like a deadlock. It's not obvious from the backtrace who holds the AioContext lock and why (I think that someone holds it in the circumstances of this backtrace is the bug). If we get a hanging process for inspection, looking at the internal mutex state in gdb would tell us at least who owns the lock.

The pattern of comment 9:
Nobody is trying to get a lock or anything. Everything is just sitting there idle. The problem in step 4 seems to be that the block job doesn't move forward any more immediately before it should finish. Getting a coroutine backtrace of the job would be helpful there. Step 7 says that at some point, the monitor doesn't respond any more. Unfortunately the backtrace seems to be from before the hang because a monitor hang should be visible there.

If I understand right, the original bug hasn't been reproduced outside of Avocado, but the unrelated bug from comment 9 looks more like manual testing.

Comment 15 Kevin Wolf 2020-01-24 11:07:05 UTC
I created bug 1794692 to track the bug from comment 9. Let's continue there and keep this BZ for the originally reported bug.

Resetting fields that were set for what is bug 1794692 now.

Comment 16 aihua liang 2020-02-04 03:51:38 UTC
Run auto test on qemu-kvm-4.2.0-8.module+el8.2.0+5607+dc756904, don't hit this issue any more.
Set bug's status to "Closed->CurrentRelease".

Comment 18 yisun 2020-02-13 04:35:22 UTC
this specific case passed:
https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/libvirt/view/RHEL-8.2%20x86_64/job/libvirt-RHEL-8.2-runtest-x86_64-function-block_copy_no_shallow/6/testReport/rhel.virsh/blockcopy/positive_test_non_acl_local_disk_no_blockdev_no_shallow_finish_option/

with qemu and libvirt versions as follow:
libvirt	libvirt-6.0.0-1.module+el8.2.0+5453+31b2b136.x86_64
qemu-kvm	qemu-kvm-4.2.0-6.module+el8.2.0+5453+31b2b136.x86_64

Latter qemu-kvm version will be tracked to see if have problem.