Description of problem:
fail to quit source guest after storage vm migration with data-plane

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.12.0-7.el7.x86_64
kernel-3.10.0-918.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
Same steps to reproduce as Bug 1503437
1. boot up dest guest with an empty image and data-plane.
# qemu-img create -f qcow2 mirror.qcow2 20G
# cat dest.sh
/usr/libexec/qemu-kvm \
-name guest=test-virt1 \
-machine pc,accel=kvm,usb=off,vmport=off,dump-guest-core=off \
-cpu SandyBridge \
-m 4G \
-smp 4,sockets=4,cores=1,threads=1 \
-boot strict=on \
-object iothread,id=iothread0 \
-device virtio-scsi-pci,bus=pci.0,addr=0x5,iothread=iothread0,id=scsi0 \
-drive file=mirror.qcow2,format=qcow2,snapshot=off,cache=none,if=none,aio=native,id=img0 \
-device scsi-hd,bus=scsi0.0,drive=img0,scsi-id=0,lun=0,id=scsi-disk0,bootindex=0 \
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=51:54:12:b3:20:61,bus=pci.0,addr=0x3 \
-device qxl-vga \
-vnc :2 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 \
-monitor stdio \
-qmp tcp:0:5555,server,nowait \
-usbdevice tablet \
-incoming tcp:0:6666 \

2. export dest img0 as NBD server.
# telnet 127.0.0.1 5555
{ "execute": "qmp_capabilities" }
{ "execute": "nbd-server-start", "arguments": { "addr": { "type": "inet","data": { "host":"10.66.144.33", "port": "9000" } } } }
{"execute":"nbd-server-add","arguments":{"device":"img0", "writable": true}}

3. boot up src guest with data-plane.
# cat src.sh
/usr/libexec/qemu-kvm \
-name guest=test-virt0 \
-machine pc,accel=kvm,usb=off,vmport=off,dump-guest-core=off \
-cpu SandyBridge \
-m 4G \
-smp 4,sockets=4,cores=1,threads=1 \
-boot strict=on \
-object iothread,id=iothread0 \
-device virtio-scsi-pci,bus=pci.0,addr=0x5,iothread=iothread0,id=scsi0 \
-drive file=/home/kvm_autotest_root/images/rhel76-64-virtio-scsi.qcow2,format=qcow2,snapshot=off,cache=none,if=none,aio=native,id=img0 \
-device scsi-hd,bus=scsi0.0,drive=img0,scsi-id=0,lun=0,id=scsi-disk0,bootindex=0 \
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=51:54:12:b3:20:61,bus=pci.0,addr=0x3 \
-device qxl-vga \
-vnc :1 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 \
-monitor stdio \
-qmp tcp:0:4444,server,nowait \
-usbdevice tablet \

4. invoke drive mirror on src.
# telnet 127.0.0.1 4444
{ "execute": "qmp_capabilities" }
{ "execute": "drive-mirror", "arguments": { "device": "img0", "target": "nbd://10.66.144.33:9000/img0", "sync": "full", "format": "raw", "mode": "existing" } }

5. after block-mirror reaches ready state, do migration to dest.
{"execute": "migrate","arguments":{"uri": "tcp:10.66.144.33:6666"}}

6. quit src guest.
(qemu) q

Actual results:
dest guest continues to work after migration.
failed to quit src guest.

Expected results:
could quit src guest.

Additional info:
1. no such issue without dataplane.
2. bt info
# gdb -batch -ex bt -p 23672
[New LWP 23699]
[New LWP 23696]
[New LWP 23690]
[New LWP 23689]
[New LWP 23688]
[New LWP 23687]
[New LWP 23686]
[New LWP 23674]
[New LWP 23673]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007f4fc7ef62cf in __GI_ppoll (fds=0x556e6568c3c0, nfds=2, timeout=<optimized out>, timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:56
56 result = INLINE_SYSCALL (ppoll, 5, fds, nfds, timeout, sigmask,
#0 0x00007f4fc7ef62cf in __GI_ppoll (fds=0x556e6568c3c0, nfds=2, timeout=<optimized out>, timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:56
#1 0x0000556e627097cb in qemu_poll_ns (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2 0x0000556e627097cb in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=-1) at util/qemu-timer.c:322
#3 0x0000556e6270b515 in aio_poll (ctx=0x556e655bb7c0, blocking=blocking@entry=true) at util/aio-posix.c:629
#4 0x0000556e62686caa in bdrv_flush (bs=bs@entry=0x556e65696800) at block/io.c:2531
#5 0x0000556e626379ab in bdrv_unref (bs=0x556e65696800) at block.c:3326
#6 0x0000556e626379ab in bdrv_unref (bs=0x556e65696800) at block.c:3514
#7 0x0000556e626379ab in bdrv_unref (bs=0x556e65696800) at block.c:4614
#8 0x0000556e6263af44 in block_job_remove_all_bdrv (job=job@entry=0x556e664b6000) at blockjob.c:177
#9 0x0000556e62681224 in mirror_exit (job=0x556e664b6000, opaque=0x556e65a066b8) at block/mirror.c:572
#10 0x0000556e6263baf2 in job_defer_to_main_loop_bh (opaque=0x556e65ff5d20) at job.c:968
#11 0x0000556e62708331 in aio_bh_poll (bh=0x556e673ab290) at util/async.c:90
#12 0x0000556e62708331 in aio_bh_poll (ctx=ctx@entry=0x556e655bb7c0) at util/async.c:118
#13 0x0000556e6270b764 in aio_poll (ctx=0x556e655bb7c0, blocking=blocking@entry=true) at util/aio-posix.c:689
#14 0x0000556e6263cd0a in job_finish_sync (job=0x556e664b6000, finish=<optimized out>, errp=<optimized out>) at job.c:1007
#15 0x0000556e6263d1c5 in job_cancel_sync_all (job=0x556e664b6000) at job.c:917
#16 0x0000556e6263d1c5 in job_cancel_sync_all () at job.c:928
#17 0x0000556e623b1fc6 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4775
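For anyone who wants to script steps 4 and 5 instead of typing them into telnet, here is a minimal sketch in Python. It only uses the QMP commands already shown in the report (qmp_capabilities, drive-mirror, migrate) and the BLOCK_JOB_READY event. The helper names (qmp_cmd, mirror_cmd, migrate_cmd, is_job_ready, run_source_side) are made up for illustration, and the assumption that the QMP socket is reachable on 127.0.0.1 comes from the telnet lines in the report. Treat it as a starting point, not a tested reproducer.

```python
import json
import socket


def qmp_cmd(name, **arguments):
    """Serialise one QMP command as a single JSON line."""
    msg = {"execute": name}
    if arguments:
        msg["arguments"] = arguments
    return json.dumps(msg)


def mirror_cmd(device, nbd_host, nbd_port):
    """Step 4: drive-mirror onto the NBD export named after the device."""
    target = "nbd://%s:%d/%s" % (nbd_host, nbd_port, device)
    return qmp_cmd("drive-mirror", device=device, target=target,
                   sync="full", format="raw", mode="existing")


def migrate_cmd(dst_host, port):
    """Step 5: migrate to the destination's -incoming port."""
    return qmp_cmd("migrate", uri="tcp:%s:%d" % (dst_host, port))


def is_job_ready(line, device):
    """True if `line` is a BLOCK_JOB_READY event for `device`."""
    try:
        msg = json.loads(line)
    except ValueError:
        return False
    if not isinstance(msg, dict):
        return False
    return (msg.get("event") == "BLOCK_JOB_READY"
            and msg.get("data", {}).get("device") == device)


def run_source_side(qmp_port, device, dst_host, nbd_port, migrate_port):
    """Hypothetical driver for steps 4 and 5; nothing calls it automatically."""
    with socket.create_connection(("127.0.0.1", qmp_port)) as sock:
        f = sock.makefile("rw", encoding="utf-8")
        f.readline()  # skip the QMP greeting banner
        for cmd in (qmp_cmd("qmp_capabilities"),
                    mirror_cmd(device, dst_host, nbd_port)):
            f.write(cmd + "\n")
            f.flush()
        for line in f:  # block until the mirror job reports ready
            if is_job_ready(line, device):
                break
        else:
            raise ConnectionError("QMP closed before BLOCK_JOB_READY")
        f.write(migrate_cmd(dst_host, migrate_port) + "\n")
        f.flush()
```

With the values from this report the call would be run_source_side(4444, "img0", "10.66.144.33", 9000, 6666). Step 6 (quit on the src monitor) is left manual on purpose, since that is where the hang is observed.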
(In reply to Longxiang Lyu from comment #0)
> Description of problem:
> fail to quit source guest after storage vm migration with data-plane
>
> Version-Release number of selected component (if applicable):
> qemu-kvm-rhev-2.12.0-7.el7.x86_64
> kernel-3.10.0-918.el7.x86_64
>
> How reproducible:
> 100%
>
> Steps to Reproduce:
> Same steps to reproduce as Bug 1503437

Likely the same root cause.
Can you please test whether this still happens with the recent drain/iothread fixes?
Hi Kevin,

Tested on qemu-kvm-rhev-2.12.0-18.el7_6.3.x86_64, still hit this issue:

Gdb info:
# gdb -batch -ex bt -p 31130
[New LWP 31162]
[New LWP 31160]
[New LWP 31159]
[New LWP 31158]
[New LWP 31157]
[New LWP 31156]
[New LWP 31155]
[New LWP 31154]
[New LWP 31153]
[New LWP 31152]
[New LWP 31151]
[New LWP 31150]
[New LWP 31132]
[New LWP 31131]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007fd4985f62cf in ppoll () from /lib64/libc.so.6
#0 0x00007fd4985f62cf in ppoll () at /lib64/libc.so.6
#1 0x000055fbf857440b in qemu_poll_ns (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2 0x000055fbf857440b in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:322
#3 0x000055fbf8576187 in aio_poll (ctx=0x55fbfb6877c0, blocking=blocking@entry=true) at util/aio-posix.c:645
#4 0x000055fbf84f92fa in nbd_client_close (bs=0x55fbfcf3e000) at block/nbd-client.c:62
#5 0x000055fbf84f92fa in nbd_client_close (bs=0x55fbfcf3e000) at block/nbd-client.c:961
#6 0x000055fbf84f6c2a in nbd_close (bs=<optimized out>) at block/nbd.c:491
#7 0x000055fbf849fc82 in bdrv_unref (bs=0x55fbfcf3e000) at block.c:3392
#8 0x000055fbf849fc82 in bdrv_unref (bs=0x55fbfcf3e000) at block.c:3576
#9 0x000055fbf849fc82 in bdrv_unref (bs=0x55fbfcf3e000) at block.c:4654
#10 0x000055fbf849fcaf in bdrv_unref (bs=0x55fbfb762800) at block.c:3399
#11 0x000055fbf849fcaf in bdrv_unref (bs=0x55fbfb762800) at block.c:3576
#12 0x000055fbf849fcaf in bdrv_unref (bs=0x55fbfb762800) at block.c:4654
#13 0x000055fbf84e17e1 in blk_remove_bs (blk=blk@entry=0x55fbfcfe2580) at block/block-backend.c:784
#14 0x000055fbf84e1a5f in blk_unref (blk=0x55fbfcfe2580) at block/block-backend.c:402
#15 0x000055fbf84e1a5f in blk_unref (blk=0x55fbfcfe2580) at block/block-backend.c:458
#16 0x000055fbf84a4f58 in job_finish_sync (job=job@entry=0x55fbfcfe22c0, finish=finish@entry=0x55fbf84a5550 <job_cancel_err>, errp=errp@entry=0x0) at job.c:989
#17 0x000055fbf84a55a5 in job_cancel_sync_all (job=0x55fbfcfe22c0) at job.c:936
#18 0x000055fbf84a55a5 in job_cancel_sync_all () at job.c:947
#19 0x000055fbf8217b26 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4775

Reproduce steps:
1. Create an empty disk in dst
# qemu-img create -f qcow2 /home/kvm_autotest_root/images/rhel76-64-virtio-scsi.qcow2 20G

2. In dst, start guest with cmds:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-sandbox off \
-machine pc \
-nodefaults \
-device VGA,bus=pci.0,addr=0x2 \
-chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20181107-005924-PkIxnG9p,server,nowait \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20181107-005924-PkIxnG9p,server,nowait \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pvpanic,ioport=0x505,id=idkp9HYI \
-chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20181107-005924-PkIxnG9p,server,nowait \
-device isa-serial,chardev=serial_id_serial0 \
-chardev socket,id=seabioslog_id_20181107-005924-PkIxnG9p,path=/var/tmp/seabios-20181107-005924-PkIxnG9p,server,nowait \
-device isa-debugcon,chardev=seabioslog_id_20181107-005924-PkIxnG9p,iobase=0x402 \
-device ich9-usb-ehci1,id=usb1,addr=0x1d.7,multifunction=on,bus=pci.0 \
-device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=0x1d.0,firstport=0,bus=pci.0 \
-device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=0x1d.2,firstport=2,bus=pci.0 \
-device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=0x1d.4,firstport=4,bus=pci.0 \
-device virtio-net-pci,mac=9a:44:45:46:47:48,id=iddDGLIi,vectors=4,netdev=idDdrbRp,bus=pci.0,addr=0x7 \
-netdev tap,id=idDdrbRp,vhost=on \
-m 2048 \
-smp 10,maxcpus=10,cores=5,threads=1,sockets=2 \
-cpu SandyBridge \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-vnc :0 \
-rtc base=localtime,clock=host,driftfix=slew \
-boot order=cdn,once=c,menu=off,strict=off \
-enable-kvm \
-monitor stdio \
-object iothread,id=iothread0 \
-device virtio-scsi-pci,id=scsi0,iothread=iothread0 \
-drive if=none,id=drive_image1,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel76-64-virtio-scsi.qcow2 \
-device scsi-hd,id=image1,drive=drive_image1,bootindex=0,bus=scsi0.0 \
-incoming tcp:0:5000 \

2. In dst, start nbd server and expose the system disk.
{ "execute": "nbd-server-start", "arguments": { "addr": { "type": "inet","data": { "host": "10.73.194.83", "port": "3333"}}}}
{ "execute": "nbd-server-add", "arguments": { "device": "drive_data1","writable": true } }

3. In src, start guest with qemu cmds:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-sandbox off \
-machine pc \
-nodefaults \
-device VGA,bus=pci.0,addr=0x2 \
-chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20181107-005924-PkIxnG9p,server,nowait \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20181107-005924-PkIxnG9p,server,nowait \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pvpanic,ioport=0x505,id=idkp9HYI \
-chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20181107-005924-PkIxnG9p,server,nowait \
-device isa-serial,chardev=serial_id_serial0 \
-chardev socket,id=seabioslog_id_20181107-005924-PkIxnG9p,path=/var/tmp/seabios-20181107-005924-PkIxnG9p,server,nowait \
-device isa-debugcon,chardev=seabioslog_id_20181107-005924-PkIxnG9p,iobase=0x402 \
-device ich9-usb-ehci1,id=usb1,addr=0x1d.7,multifunction=on,bus=pci.0 \
-device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=0x1d.0,firstport=0,bus=pci.0 \
-device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=0x1d.2,firstport=2,bus=pci.0 \
-device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=0x1d.4,firstport=4,bus=pci.0 \
-device virtio-net-pci,mac=9a:44:45:46:47:48,id=iddDGLIi,vectors=4,netdev=idDdrbRp,bus=pci.0,addr=0x7 \
-netdev tap,id=idDdrbRp,vhost=on \
-m 2048 \
-smp 10,maxcpus=10,cores=5,threads=1,sockets=2 \
-cpu SandyBridge \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-vnc :0 \
-rtc base=localtime,clock=host,driftfix=slew \
-boot order=cdn,once=c,menu=off,strict=off \
-enable-kvm \
-monitor stdio \
-object iothread,id=iothread0 \
-device virtio-scsi-pci,id=scsi0,iothread=iothread0 \
-drive if=none,id=drive_image1,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel76-64-virtio-scsi.qcow2 \
-device scsi-hd,id=image1,drive=drive_image1,bootindex=0,bus=scsi0.0 \

4. In src, do block mirror:
{ "execute": "drive-mirror", "arguments": { "device": "drive_image1", "target": "nbd://10.73.194.83:3333/drive_image1", "sync": "full", "format": "raw", "mode": "existing" } }

5. In src, after the mirror reaches ready state, migrate from src to dst.
{"timestamp": {"seconds": 1544692298, "microseconds": 282821}, "event": "BLOCK_JOB_READY", "data": {"device": "drive_image1", "len": 21474836480, "offset": 21474836480, "speed": 0, "type": "mirror"}}
{"execute": "migrate","arguments":{"uri": "tcp:10.73.194.83:5000"}}
{"return": {}}
{"timestamp": {"seconds": 1544692540, "microseconds": 562189}, "event": "STOP"}

6. In dst, quit guest
(qemu)quit ----> vm quit successfully.
7. In src, quit guest
(qemu)quit ---> src hang with pstack info:
# pstack 31130
Thread 15 (Thread 0x7fd490e30700 (LWP 31131)):
#0 0x00007fd4985fb1c9 in syscall () at /lib64/libc.so.6
#1 0x000055fbf8578410 in qemu_event_wait (val=<optimized out>, f=<optimized out>) at /usr/src/debug/qemu-2.12.0/include/qemu/futex.h:29
#2 0x000055fbf8578410 in qemu_event_wait (ev=ev@entry=0x55fbf91ffbe8 <rcu_call_ready_event>) at util/qemu-thread-posix.c:445
#3 0x000055fbf858893e in call_rcu_thread (opaque=<optimized out>) at util/rcu.c:261
#4 0x00007fd4988d7dd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fd498600ead in clone () at /lib64/libc.so.6
Thread 14 (Thread 0x7fd49062f700 (LWP 31132)):
#0 0x00007fd4985f62cf in ppoll () at /lib64/libc.so.6
#1 0x000055fbf857440b in qemu_poll_ns (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2 0x000055fbf857440b in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:322
#3 0x000055fbf8576187 in aio_poll (ctx=0x55fbfb687900, blocking=blocking@entry=true) at util/aio-posix.c:645
#4 0x000055fbf8345d5e in iothread_run (opaque=0x55fbfb6a5ce0) at iothread.c:64
#5 0x00007fd4988d7dd5 in start_thread () at /lib64/libpthread.so.0
#6 0x00007fd498600ead in clone () at /lib64/libc.so.6
Thread 13 (Thread 0x7fd48f62d700 (LWP 31150)):
#0 0x00007fd4985f62cf in ppoll () at /lib64/libc.so.6
#1 0x000055fbf857440b in qemu_poll_ns (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2 0x000055fbf857440b in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:322
#3 0x000055fbf8576187 in aio_poll (ctx=0x55fbfb687b80, blocking=blocking@entry=true) at util/aio-posix.c:645
#4 0x000055fbf8345d5e in iothread_run (opaque=0x55fbfb6a5ea0) at iothread.c:64
#5 0x00007fd4988d7dd5 in start_thread () at /lib64/libpthread.so.0
#6 0x00007fd498600ead in clone () at /lib64/libc.so.6
Thread 12 (Thread 0x7fd48ee2c700 (LWP 31151)):
#0 0x00007fd4988db965 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x000055fbf8577fe9 in qemu_cond_wait_impl (cond=<optimized out>, mutex=mutex@entry=0x55fbf8dc60e0 <qemu_global_mutex>, file=file@entry=0x55fbf860c308 "/builddir/build/BUILD/qemu-2.12.0/cpus.c", line=line@entry=1176) at util/qemu-thread-posix.c:164
#2 0x000055fbf8259d1f in qemu_wait_io_event (cpu=cpu@entry=0x55fbfb8da000) at /usr/src/debug/qemu-2.12.0/cpus.c:1176
#3 0x000055fbf825b420 in qemu_kvm_cpu_thread_fn (arg=0x55fbfb8da000) at /usr/src/debug/qemu-2.12.0/cpus.c:1220
#4 0x00007fd4988d7dd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fd498600ead in clone () at /lib64/libc.so.6
Thread 11 (Thread 0x7fd48e62b700 (LWP 31152)):
#0 0x00007fd4988db965 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x000055fbf8577fe9 in qemu_cond_wait_impl (cond=<optimized out>, mutex=mutex@entry=0x55fbf8dc60e0 <qemu_global_mutex>, file=file@entry=0x55fbf860c308 "/builddir/build/BUILD/qemu-2.12.0/cpus.c", line=line@entry=1176) at util/qemu-thread-posix.c:164
#2 0x000055fbf8259d1f in qemu_wait_io_event (cpu=cpu@entry=0x55fbfb93e000) at /usr/src/debug/qemu-2.12.0/cpus.c:1176
#3 0x000055fbf825b420 in qemu_kvm_cpu_thread_fn (arg=0x55fbfb93e000) at /usr/src/debug/qemu-2.12.0/cpus.c:1220
#4 0x00007fd4988d7dd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fd498600ead in clone () at /lib64/libc.so.6
Thread 10 (Thread 0x7fd48de2a700 (LWP 31153)):
#0 0x00007fd4988db965 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x000055fbf8577fe9 in qemu_cond_wait_impl (cond=<optimized out>, mutex=mutex@entry=0x55fbf8dc60e0 <qemu_global_mutex>, file=file@entry=0x55fbf860c308 "/builddir/build/BUILD/qemu-2.12.0/cpus.c", line=line@entry=1176) at util/qemu-thread-posix.c:164
#2 0x000055fbf8259d1f in qemu_wait_io_event (cpu=cpu@entry=0x55fbfb95c000) at /usr/src/debug/qemu-2.12.0/cpus.c:1176
#3 0x000055fbf825b420 in qemu_kvm_cpu_thread_fn (arg=0x55fbfb95c000) at /usr/src/debug/qemu-2.12.0/cpus.c:1220
#4 0x00007fd4988d7dd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fd498600ead in clone () at /lib64/libc.so.6
Thread 9 (Thread 0x7fd48d629700 (LWP 31154)):
#0 0x00007fd4988db965 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x000055fbf8577fe9 in qemu_cond_wait_impl (cond=<optimized out>, mutex=mutex@entry=0x55fbf8dc60e0 <qemu_global_mutex>, file=file@entry=0x55fbf860c308 "/builddir/build/BUILD/qemu-2.12.0/cpus.c", line=line@entry=1176) at util/qemu-thread-posix.c:164
#2 0x000055fbf8259d1f in qemu_wait_io_event (cpu=cpu@entry=0x55fbfb97c000) at /usr/src/debug/qemu-2.12.0/cpus.c:1176
#3 0x000055fbf825b420 in qemu_kvm_cpu_thread_fn (arg=0x55fbfb97c000) at /usr/src/debug/qemu-2.12.0/cpus.c:1220
#4 0x00007fd4988d7dd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fd498600ead in clone () at /lib64/libc.so.6
Thread 8 (Thread 0x7fd48ce28700 (LWP 31155)):
#0 0x00007fd4988db965 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x000055fbf8577fe9 in qemu_cond_wait_impl (cond=<optimized out>, mutex=mutex@entry=0x55fbf8dc60e0 <qemu_global_mutex>, file=file@entry=0x55fbf860c308 "/builddir/build/BUILD/qemu-2.12.0/cpus.c", line=line@entry=1176) at util/qemu-thread-posix.c:164
#2 0x000055fbf8259d1f in qemu_wait_io_event (cpu=cpu@entry=0x55fbfb99e000) at /usr/src/debug/qemu-2.12.0/cpus.c:1176
#3 0x000055fbf825b420 in qemu_kvm_cpu_thread_fn (arg=0x55fbfb99e000) at /usr/src/debug/qemu-2.12.0/cpus.c:1220
#4 0x00007fd4988d7dd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fd498600ead in clone () at /lib64/libc.so.6
Thread 7 (Thread 0x7fd48c627700 (LWP 31156)):
#0 0x00007fd4988db965 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x000055fbf8577fe9 in qemu_cond_wait_impl (cond=<optimized out>, mutex=mutex@entry=0x55fbf8dc60e0 <qemu_global_mutex>, file=file@entry=0x55fbf860c308 "/builddir/build/BUILD/qemu-2.12.0/cpus.c", line=line@entry=1176) at util/qemu-thread-posix.c:164
#2 0x000055fbf8259d1f in qemu_wait_io_event (cpu=cpu@entry=0x55fbfb9be000) at /usr/src/debug/qemu-2.12.0/cpus.c:1176
#3 0x000055fbf825b420 in qemu_kvm_cpu_thread_fn (arg=0x55fbfb9be000) at /usr/src/debug/qemu-2.12.0/cpus.c:1220
#4 0x00007fd4988d7dd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fd498600ead in clone () at /lib64/libc.so.6
Thread 6 (Thread 0x7fd48be26700 (LWP 31157)):
#0 0x00007fd4988db965 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x000055fbf8577fe9 in qemu_cond_wait_impl (cond=<optimized out>, mutex=mutex@entry=0x55fbf8dc60e0 <qemu_global_mutex>, file=file@entry=0x55fbf860c308 "/builddir/build/BUILD/qemu-2.12.0/cpus.c", line=line@entry=1176) at util/qemu-thread-posix.c:164
#2 0x000055fbf8259d1f in qemu_wait_io_event (cpu=cpu@entry=0x55fbfb9e0000) at /usr/src/debug/qemu-2.12.0/cpus.c:1176
#3 0x000055fbf825b420 in qemu_kvm_cpu_thread_fn (arg=0x55fbfb9e0000) at /usr/src/debug/qemu-2.12.0/cpus.c:1220
#4 0x00007fd4988d7dd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fd498600ead in clone () at /lib64/libc.so.6
Thread 5 (Thread 0x7fd48b625700 (LWP 31158)):
#0 0x00007fd4988db965 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x000055fbf8577fe9 in qemu_cond_wait_impl (cond=<optimized out>, mutex=mutex@entry=0x55fbf8dc60e0 <qemu_global_mutex>, file=file@entry=0x55fbf860c308 "/builddir/build/BUILD/qemu-2.12.0/cpus.c", line=line@entry=1176) at util/qemu-thread-posix.c:164
#2 0x000055fbf8259d1f in qemu_wait_io_event (cpu=cpu@entry=0x55fbfba00000) at /usr/src/debug/qemu-2.12.0/cpus.c:1176
#3 0x000055fbf825b420 in qemu_kvm_cpu_thread_fn (arg=0x55fbfba00000) at /usr/src/debug/qemu-2.12.0/cpus.c:1220
#4 0x00007fd4988d7dd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fd498600ead in clone () at /lib64/libc.so.6
Thread 4 (Thread 0x7fd48ae24700 (LWP 31159)):
#0 0x00007fd4988db965 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x000055fbf8577fe9 in qemu_cond_wait_impl (cond=<optimized out>, mutex=mutex@entry=0x55fbf8dc60e0 <qemu_global_mutex>, file=file@entry=0x55fbf860c308 "/builddir/build/BUILD/qemu-2.12.0/cpus.c", line=line@entry=1176) at util/qemu-thread-posix.c:164
#2 0x000055fbf8259d1f in qemu_wait_io_event (cpu=cpu@entry=0x55fbfba20000) at /usr/src/debug/qemu-2.12.0/cpus.c:1176
#3 0x000055fbf825b420 in qemu_kvm_cpu_thread_fn (arg=0x55fbfba20000) at /usr/src/debug/qemu-2.12.0/cpus.c:1220
#4 0x00007fd4988d7dd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fd498600ead in clone () at /lib64/libc.so.6
Thread 3 (Thread 0x7fd48a623700 (LWP 31160)):
#0 0x00007fd4988db965 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x000055fbf8577fe9 in qemu_cond_wait_impl (cond=<optimized out>, mutex=mutex@entry=0x55fbf8dc60e0 <qemu_global_mutex>, file=file@entry=0x55fbf860c308 "/builddir/build/BUILD/qemu-2.12.0/cpus.c", line=line@entry=1176) at util/qemu-thread-posix.c:164
#2 0x000055fbf8259d1f in qemu_wait_io_event (cpu=cpu@entry=0x55fbfba42000) at /usr/src/debug/qemu-2.12.0/cpus.c:1176
#3 0x000055fbf825b420 in qemu_kvm_cpu_thread_fn (arg=0x55fbfba42000) at /usr/src/debug/qemu-2.12.0/cpus.c:1220
#4 0x00007fd4988d7dd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fd498600ead in clone () at /lib64/libc.so.6
Thread 2 (Thread 0x7fd4083ff700 (LWP 31162)):
#0 0x00007fd4988db965 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x000055fbf8577fe9 in qemu_cond_wait_impl (cond=cond@entry=0x55fbfb650a20, mutex=mutex@entry=0x55fbfb650a58, file=file@entry=0x55fbf86e1d07 "ui/vnc-jobs.c", line=line@entry=212) at util/qemu-thread-posix.c:164
#2 0x000055fbf8492b1f in vnc_worker_thread_loop (queue=queue@entry=0x55fbfb650a20) at ui/vnc-jobs.c:212
#3 0x000055fbf84930e8 in vnc_worker_thread (arg=0x55fbfb650a20) at ui/vnc-jobs.c:319
#4 0x00007fd4988d7dd5 in start_thread () at /lib64/libpthread.so.0
#5 0x00007fd498600ead in clone () at /lib64/libc.so.6
Thread 1 (Thread 0x7fd4b1ce8dc0 (LWP 31130)):
#0 0x00007fd4985f62cf in ppoll () at /lib64/libc.so.6
#1 0x000055fbf857440b in qemu_poll_ns (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2 0x000055fbf857440b in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:322
#3 0x000055fbf8576187 in aio_poll (ctx=0x55fbfb6877c0, blocking=blocking@entry=true) at util/aio-posix.c:645
#4 0x000055fbf84f92fa in nbd_client_close (bs=0x55fbfcf3e000) at block/nbd-client.c:62
#5 0x000055fbf84f92fa in nbd_client_close (bs=0x55fbfcf3e000) at block/nbd-client.c:961
#6 0x000055fbf84f6c2a in nbd_close (bs=<optimized out>) at block/nbd.c:491
#7 0x000055fbf849fc82 in bdrv_unref (bs=0x55fbfcf3e000) at block.c:3392
#8 0x000055fbf849fc82 in bdrv_unref (bs=0x55fbfcf3e000) at block.c:3576
#9 0x000055fbf849fc82 in bdrv_unref (bs=0x55fbfcf3e000) at block.c:4654
#10 0x000055fbf849fcaf in bdrv_unref (bs=0x55fbfb762800) at block.c:3399
#11 0x000055fbf849fcaf in bdrv_unref (bs=0x55fbfb762800) at block.c:3576
#12 0x000055fbf849fcaf in bdrv_unref (bs=0x55fbfb762800) at block.c:4654
#13 0x000055fbf84e17e1 in blk_remove_bs (blk=blk@entry=0x55fbfcfe2580) at block/block-backend.c:784
#14 0x000055fbf84e1a5f in blk_unref (blk=0x55fbfcfe2580) at block/block-backend.c:402
#15 0x000055fbf84e1a5f in blk_unref (blk=0x55fbfcfe2580) at block/block-backend.c:458
#16 0x000055fbf84a4f58 in job_finish_sync (job=job@entry=0x55fbfcfe22c0, finish=finish@entry=0x55fbf84a5550 <job_cancel_err>, errp=errp@entry=0x0) at job.c:989
#17 0x000055fbf84a55a5 in job_cancel_sync_all (job=0x55fbfcfe22c0) at job.c:936
#18 0x000055fbf84a55a5 in job_cancel_sync_all () at job.c:947
#19 0x000055fbf8217b26 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4775
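A dump this long is easier to triage once each thread is reduced to one line. Below is a small Python sketch that groups the pstack output by thread and labels each one from the function names in its frames. The labels and the set of "lock" function names (LOCK_FUNCS) are assumptions chosen for this illustration, not something QEMU or pstack defines, and summarise is a hypothetical helper name. It works on the text whether or not the line breaks survived.

```python
import re

# "Thread 14 (Thread 0x7fd49062f700 (LWP 31132)):" -> thread number, LWP
THREAD_RE = re.compile(r"Thread (\d+) \(Thread 0x[0-9a-f]+ \(LWP (\d+)\)\):")
# "#3 0x000055fbf8576187 in aio_poll (...)" -> function name
FRAME_RE = re.compile(r"#\d+\s+0x[0-9a-f]+ in (\w+)")

# Assumed names that would indicate a thread stuck acquiring a mutex.
LOCK_FUNCS = {"__lll_lock_wait", "pthread_mutex_lock", "qemu_mutex_lock_impl"}
# Frames that only say "sleeping in poll", not who asked for it.
POLL_FUNCS = {"ppoll", "qemu_poll_ns", "aio_poll"}


def summarise(dump):
    """Map thread number -> short description of what the thread is doing."""
    parts = THREAD_RE.split(dump)
    summary = {}
    # split() yields [preamble, tid, lwp, frames, tid, lwp, frames, ...]
    for i in range(1, len(parts), 3):
        tid = int(parts[i])
        funcs = FRAME_RE.findall(parts[i + 2])
        if LOCK_FUNCS & set(funcs):
            state = "blocked on lock"
        elif "iothread_run" in funcs and "aio_poll" in funcs:
            state = "iothread idle in aio_poll"
        elif "main" in funcs:
            caller = next((f for f in funcs if f not in POLL_FUNCS), "poll")
            state = "main thread in " + caller
        else:
            state = ("other: " + funcs[0]) if funcs else "no frames"
        summary[tid] = state
    return summary
```

Feeding it the dump above would be expected to report threads 14 and 13 as idle iothreads and thread 1 as the main thread inside nbd_client_close, with no thread flagged as blocked on a lock.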
Thanks for retesting!

This has a different backtrace now, so maybe the original bug is fixed and we hit a different bug now. The new backtrace ends in some NBD client code and looks very similar to bug 1634219.

Also, thank you for providing the backtrace for all threads, this is the information that was still missing in the other bug! It shows that the iothreads are not waiting for a lock, but idle. This means we're probably missing some notification (aio_wait_kick) in the NBD code.
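To make the "missing notification" hypothesis concrete, here is a toy model in Python. It is not QEMU code and does not claim to mirror how aio_wait_kick or the NBD client are implemented. It only shows the general pattern: a waiter sleeps until someone wakes it, a worker finishes the outstanding request, and if the worker forgets the wakeup the waiter stays asleep although the condition it waits for is already true. All names (DrainWaiter, complete, wait_idle, run) are hypothetical, and the timeout exists only so the demo terminates; the blocking aio_poll() in the backtrace has no such timeout.

```python
import threading


class DrainWaiter:
    """Toy stand-in for 'main thread waits for in-flight requests'."""

    def __init__(self):
        self.cond = threading.Condition()
        self.in_flight = 1                 # one outstanding request
        self.sleeping = threading.Event()  # set once the waiter goes to sleep

    def complete(self, kick):
        """Worker side: finish the request, optionally wake the waiter."""
        self.sleeping.wait()               # only finish after the waiter sleeps
        with self.cond:
            self.in_flight -= 1
            if kick:
                self.cond.notify_all()     # the analogue of the missing kick

    def wait_idle(self, timeout):
        """Waiter side: True if woken and idle, False if nobody woke us."""
        with self.cond:
            while self.in_flight > 0:
                self.sleeping.set()
                if not self.cond.wait(timeout):
                    return False           # timed out: a real blocking wait hangs
            return True


def run(kick, timeout=0.5):
    waiter = DrainWaiter()
    worker = threading.Thread(target=waiter.complete, args=(kick,))
    worker.start()
    woke = waiter.wait_idle(timeout)
    worker.join()
    return woke
```

run(True) returns True because the worker wakes the waiter after finishing. run(False) returns False: the request count has dropped to zero, yet the waiter never learns about it. That second case matches the shape of the dump, where the main thread sits in a blocking poll while the iothreads are idle.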
(In reply to Kevin Wolf from comment #6)
> Thanks for retesting!
>
> This has a different backtrace now, so maybe the original bug is fixed and
> we hit a different bug now. The new backtrace ends in some NBD client code
> and looks very similar to bug 1634219.

Reassigning to Eric, who's taking care of bug 1634219.
While the context is slightly different, this is the same bug as BZ#1634219.

*** This bug has been marked as a duplicate of bug 1634219 ***