Bug 2186181 - QEMU core dump when query-jobs is run after creating the target mirror node (iothread enabled)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Assignee: Stefan Hajnoczi
QA Contact: aihua liang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-04-12 10:34 UTC by aihua liang
Modified: 2023-07-03 08:31 UTC
7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-06-30 06:38:46 UTC
Type: ---
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHELPLAN-154513 0 None None None 2023-04-12 10:35:46 UTC

Description aihua liang 2023-04-12 10:34:17 UTC
Description of problem:
QEMU core dump when query-jobs is run after creating the target mirror node


Version-Release number of selected component (if applicable):
kernel version: 5.14.0-295.el9.x86_64
qemu-kvm version: qemu-kvm-8.0.0-0.rc1.el9.candidate


How reproducible:
2/5

Steps to Reproduce:
1. Start the guest with the qemu-kvm command line:
/usr/libexec/qemu-kvm \
 	-S  \
 	-name 'avocado-vt-vm1'  \
 	-sandbox on  \
 	-blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/OVMF/OVMF_CODE.secboot.fd", "auto-read-only": true, "discard": "unmap"}' \
 	-blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' \
 	-blockdev '{"node-name": "file_ovmf_vars", "driver": "file", "filename": "/root/avocado/data/avocado-vt/avocado-vt-vm1_rhel930-64-virtio_qcow2_filesystem_VARS.fd", "auto-read-only": true, "discard": "unmap"}' \
 	-blockdev '{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": false, "file": "file_ovmf_vars"}' \
 	-machine q35,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
 	-device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' \
 	-device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}'  \
 	-nodefaults \
 	-device '{"driver": "VGA", "bus": "pcie.0", "addr": "0x2"}' \
 	-m 30720 \
 	-object '{"size": 32212254720, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}'  \
 	-smp 12,maxcpus=12,cores=6,threads=1,dies=1,sockets=2  \
 	-cpu 'Cascadelake-Server-noTSX',+kvm_pv_unhalt \
 	-chardev socket,path=/var/tmp/avocado_22sdgjjv/monitor-qmpmonitor1-20230412-035446-fRTermpy,id=qmp_id_qmpmonitor1,wait=off,server=on  \
 	-mon chardev=qmp_id_qmpmonitor1,mode=control \
 	-chardev socket,path=/var/tmp/avocado_22sdgjjv/monitor-catch_monitor-20230412-035446-fRTermpy,id=qmp_id_catch_monitor,wait=off,server=on  \
 	-mon chardev=qmp_id_catch_monitor,mode=control \
 	-device '{"ioport": 1285, "driver": "pvpanic", "id": "idXYf3vj"}' \
 	-chardev socket,path=/var/tmp/avocado_22sdgjjv/serial-serial0-20230412-035446-fRTermpy,id=chardev_serial0,wait=off,server=on \
 	-device '{"id": "serial0", "driver": "isa-serial", "chardev": "chardev_serial0"}'  \
 	-chardev socket,id=seabioslog_id_20230412-035446-fRTermpy,path=/var/tmp/avocado_22sdgjjv/seabios-20230412-035446-fRTermpy,server=on,wait=off \
 	-device isa-debugcon,chardev=seabioslog_id_20230412-035446-fRTermpy,iobase=0x402 \
 	-device '{"id": "pcie-root-port-1", "port": 1, "driver": "pcie-root-port", "addr": "0x1.0x1", "bus": "pcie.0", "chassis": 2}' \
 	-device '{"driver": "qemu-xhci", "id": "usb1", "bus": "pcie-root-port-1", "addr": "0x0"}' \
 	-blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/kvm_autotest_root/images/rhel930-64-virtio.qcow2", "cache": {"direct": true, "no-flush": false}}' \
 	-object '{"qom-type": "iothread", "id": "iothread0"}' \
 	-blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' \
 	-device '{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' \
 	-device '{"driver": "virtio-blk-pci", "id": "image1", "drive": "drive_image1", "bootindex": 0, "write-cache": "on", "bus": "pcie-root-port-2", "addr": "0x0", "iothread": "iothread0"}' \
 	-device '{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' \
 	-device '{"driver": "virtio-net-pci", "mac": "9a:af:02:d9:a7:c4", "id": "idHizN18", "netdev": "idSWxZ3M", "bus": "pcie-root-port-3", "addr": "0x0"}'  \
 	-netdev tap,id=idSWxZ3M,vhost=on,vhostfd=16,fd=12  \
 	-vnc :0  \
 	-rtc base=utc,clock=host,driftfix=slew  \
 	-boot menu=off,order=cdn,once=c,strict=off \
 	-chardev socket,id=char_vtpm_avocado-vt-vm1_tpm0,path=/root/avocado/data/avocado-vt/swtpm/avocado-vt-vm1_tpm0_swtpm.sock \
 	-tpmdev emulator,chardev=char_vtpm_avocado-vt-vm1_tpm0,id=emulator_vtpm_avocado-vt-vm1_tpm0 \
 	-device '{"id": "tpm-crb_vtpm_avocado-vt-vm1_tpm0", "tpmdev": "emulator_vtpm_avocado-vt-vm1_tpm0", "driver": "tpm-crb"}' \
 	-enable-kvm \
 	-device '{"id": "pcie_extra_root_port_0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x3", "chassis": 5}'

2. Continue the VM
 {'execute': 'cont', 'id': 'XKDhbUzA'}

3. Create a file in the guest
 (guest)# dd if=/dev/urandom of=/var/tmp/IA2q bs=1M count=10 oflag=direct
 (guest)# md5sum /var/tmp/IA2q > /var/tmp/IA2q.md5 && sync

4. Create the mirror target file node
  {'execute': 'blockdev-create', 'arguments': {'options': {'driver': 'file', 'filename': '/tmp/tmp_target_path/mirror1.qcow2', 'size': 21474836480}, 'job-id': 'file_mirror1'}, 'id': 'IJIXcNXa'}

5. Check job status
  {'execute': 'query-jobs', 'id': 'LubWCdSf'}

6. Dismiss job
  {'execute': 'job-dismiss', 'arguments': {'id': 'file_mirror1'}, 'id': 'KOupRvUY'}

7. Add mirror target file node
  {'execute': 'blockdev-add', 'arguments': {'node-name': 'file_mirror1', 'driver': 'file', 'filename': '/tmp/tmp_target_path/mirror1.qcow2', 'aio': 'threads', 'auto-read-only': True, 'discard': 'unmap'}, 'id': 'nzSslZfn'}

8. Create mirror target format node
  {'execute': 'blockdev-create', 'arguments': {'options': {'driver': 'qcow2', 'file': 'file_mirror1', 'size': 21474836480}, 'job-id': 'drive_mirror1'}, 'id': 'sIHz0l2u'}
{"timestamp": {"seconds": 1681286134, "microseconds": 135983}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "drive_mirror1"}}
{"timestamp": {"seconds": 1681286134, "microseconds": 136017}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "drive_mirror1"}}
 {"return": {}, "id": "sIHz0l2u"}

9. Check block job status
  {'execute': 'query-jobs', 'id': 'XFqDVXUh'}
{"return": [{"current-progress": 0, "status": "running", "total-progress": 1, "type": "create", "id": "drive_mirror1"}], "id": "XFqDVXUh"}

10. Check job status again
  {'execute': 'query-jobs', 'id': 'RNNblrXT'}

Actual results:
After step 10, there is no response to query-jobs, and QEMU dumps core with the following info:
 [qemu output] /tmp/aexpect_j9SQ86Ot/aexpect-5jxdpzjj.sh: line 1: 307497 Segmentation fault  	(core dumped) MALLOC_PERTURB_=1 /usr/libexec/qemu-kvm -S -name 'avocado-vt-vm1' -sandbox on -blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/OVMF/OVMF_CODE.secboot.fd", "auto-read-only": true, "discard": "unmap"}' -blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' -blockdev '{"node-name": "file_ovmf_vars", "driver": "file", "filename": "/root/avocado/data/avocado-vt/avocado-vt-vm1_rhel930-64-virtio_qcow2_filesystem_VARS.fd", "auto-read-only": true, "discard": "unmap"}' -blockdev '{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": false, "file": "file_ovmf_vars"}' -machine q35,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars -device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' -device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}' -nodefaults -device '{"driver": "VGA", "bus": "pcie.0", "addr": "0x2"}' -m 30720 -object '{"size": 32212254720, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}' -smp 12,maxcpus=12,cores=6,threads=1,dies=1,sockets=2 -cpu 'Cascadelake-Server-noTSX',+kvm_pv_unhalt -chardev socket,path=/var/tmp/avocado_22sdgjjv/monitor-qmpmonitor1-20230412-035446-fRTermpy,id=qmp_id_qmpmonitor1,wait=off,server=on -mon chardev=qmp_id_qmpmonitor1,mode=control -chardev socket,path=/var/tmp/avocado_22sdgjjv/monitor-catch_monitor-20230412-035446-fRTermpy,id=qmp_id_catch_monitor,wait=off,server=on -mon chardev=qmp_id_catch_monitor,mode=control ...

Expected results:
The mirror node can be created and added successfully, and the mirror executes successfully.

Additional info:
Coredump info:
 coredumpctl debug 307497
       	PID: 307497 (qemu-kvm)
       	UID: 0 (root)
       	GID: 0 (root)
    	Signal: 11 (SEGV)
 	Timestamp: Wed 2023-04-12 03:55:34 EDT (19min ago)
  Command Line: /usr/libexec/qemu-kvm -S -name avocado-vt-vm1 -sandbox on -blockdev $'{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/OVMF/OVMF_CODE.secboot.fd", "auto-read-only": true, "discard": "unmap"}' -blockdev $'{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' -blockdev $'{"node-name": "file_ovmf_vars", "driver": "file", "filename": "/root/avocado/data/avocado-vt/avocado-vt-vm1_rhel930-64-virtio_qcow2_filesystem_VARS.fd", "auto-read-only": true, "discard": "unmap"}' -blockdev $'{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": false, "file": "file_ovmf_vars"}' -machine q35,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars -device $'{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' -device $'{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}' -nodefaults -device $'{"driver": "VGA", "bus": "pcie.0", "addr": "0x2"}' -m 30720 -object $'{"size": 32212254720, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}' -smp 12,maxcpus=12,cores=6,threads=1,dies=1,sockets=2 -cpu Cascadelake-Server-noTSX,+kvm_pv_unhalt -chardev socket,path=/var/tmp/avocado_22sdgjjv/monitor-qmpmonitor1-20230412-035446-fRTermpy,id=qmp_id_qmpmonitor1,wait=off,server=on -mon chardev=qmp_id_qmpmonitor1,mode=control -chardev socket,path=/var/tmp/avocado_22sdgjjv/monitor-catch_monitor-20230412-035446-fRTermpy,id=qmp_id_catch_monitor,wait=off,server=on -mon chardev=qmp_id_catch_monitor,mode=control -device $'{"ioport": 1285, "driver": "pvpanic", "id": "idXYf3vj"}' -chardev socket,path=/var/tmp/avocado_22sdgjjv/serial-serial0-20230412-035446-fRTermpy,id=chardev_serial0,wait=off,server=on -device $'{"id": "serial0", "driver": "isa-serial", "chardev": "chardev_serial0"}' -chardev socket,id=seabioslog_id_20230412-035446-fRTermpy,path=/var/tmp/avocado_22sdgjjv/seabios-20230412-035446-fRTermpy,server=on,wait=off -device isa-debugcon,chardev=seabioslog_id_20230412-035446-fRTermpy,iobase=0x402 -device $'{"id": "pcie-root-port-1", "port": 1, "driver": "pcie-root-port", "addr": "0x1.0x1", "bus": "pcie.0", "chassis": 2}' -device $'{"driver": "qemu-xhci", "id": "usb1", "bus": "pcie-root-port-1", "addr": "0x0"}' -device $'{"driver": "usb-tablet", "id": "usb-tablet1", "bus": "usb1.0", "port": "1"}' -blockdev $'{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/kvm_autotest_root/images/rhel930-64-virtio.qcow2", "cache": {"direct": true, "no-flush": false}}' -object $'{"qom-type": "iothread", "id": "iothread0"}' -blockdev $'{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' -device $'{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' -device $'{"driver": "virtio-blk-pci", "id": "image1", "drive": "drive_image1", "bootindex": 0, "write-cache": "on", "bus": "pcie-root-port-2", "addr": "0x0", "iothread": "iothread0"}' -device $'{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' -device $'{"driver": "virtio-net-pci", "mac": "9a:af:02:d9:a7:c4", "id": "idHizN18", "netdev": "idSWxZ3M", "bus": "pcie-root-port-3", "addr": "0x0"}' -netdev tap,id=idSWxZ3M,vhost=on,vhostfd=16,fd=12 -vnc :0 -rtc 
base=utc,clock=host,driftfix=slew -boot menu=off,order=cdn,once=c,strict=off -chardev socket,id=char_vtpm_avocado-vt-vm1_tpm0,path=/root/avocado/data/avocado-vt/swtpm/avocado-vt-vm1_tpm0_swtpm.sock -tpmdev emulator,chardev=char_vtpm_avocado-vt-vm1_tpm0,id=emulator_vtpm_avocado-vt-vm1_tpm0 -device $'{"id": "tpm-crb_vtpm_avocado-vt-vm1_tpm0", "tpmdev": "emulator_vtpm_avocado-vt-vm1_tpm0", "driver": "tpm-crb"}' -enable-kvm -device $'{"id": "pcie_extra_root_port_0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x3", "chassis": 5}'
	Executable: /usr/libexec/qemu-kvm
 Control Group: /user.slice/user-0.slice/session-11.scope
      	Unit: session-11.scope
     	Slice: user-0.slice
   	Session: 11
 	Owner UID: 0 (root)
   	Boot ID: 9e198a9379fe4bf09c651e58127fa82a
	Machine ID: 17118f5bdfe643f79903478703dce496
  	Hostname: dell-per440-27.lab.eng.pek2.redhat.com
   	Storage: /var/lib/systemd/coredump/core.qemu-kvm.0.9e198a9379fe4bf09c651e58127fa82a.307497.1681286134000000.zst (present)
  Size on Disk: 557.5M
   	Message: Process 307497 (qemu-kvm) of user 0 dumped core.
           	 
            	Stack trace of thread 307499:
            	#0  0x00007fe1a00ae1c3 _int_malloc (libc.so.6 + 0xae1c3)
            	#1  0x00007fe1a00b012e __libc_calloc (libc.so.6 + 0xb012e)
            	#2  0x00007fe1a06dfb41 g_malloc0 (libglib-2.0.so.0 + 0x5ab41)
            	#3  0x0000560428975c47 qemu_coroutine_new (qemu-kvm + 0xa02c47)
            	#4  0x0000560428974500 qemu_coroutine_create (qemu-kvm + 0xa01500)
            	#5  0x0000560428762a71 blk_io_plug (qemu-kvm + 0x7efa71)
            	#6  0x00005604285de750 virtio_blk_handle_vq (qemu-kvm + 0x66b750)
            	#7  0x00005604286243a4 virtio_queue_host_notifier_aio_poll_ready (qemu-kvm + 0x6b13a4)
            	#8  0x00005604289578c1 aio_dispatch_handler (qemu-kvm + 0x9e48c1)
            	#9  0x0000560428958613 aio_poll (qemu-kvm + 0x9e5613)
            	#10 0x000056042876631d bdrv_poll_co (qemu-kvm + 0x7f331d)
            	#11 0x0000560428762a7d blk_io_plug (qemu-kvm + 0x7efa7d)
            	#12 0x00005604285de750 virtio_blk_handle_vq (qemu-kvm + 0x66b750)
            	#13 0x00005604286243a4 virtio_queue_host_notifier_aio_poll_ready (qemu-kvm + 0x6b13a4)
            	#14 0x00005604289578c1 aio_dispatch_handler (qemu-kvm + 0x9e48c1)
            	#15 0x0000560428958613 aio_poll (qemu-kvm + 0x9e5613)
            	#16 0x000056042876631d bdrv_poll_co (qemu-kvm + 0x7f331d)
            	#17 0x0000560428762a7d blk_io_plug (qemu-kvm + 0x7efa7d)
            	#18 0x00005604285de750 virtio_blk_handle_vq (qemu-kvm + 0x66b750)
            	#19 0x00005604286243a4 virtio_queue_host_notifier_aio_poll_ready (qemu-kvm + 0x6b13a4)
            	#20 0x00005604289578c1 aio_dispatch_handler (qemu-kvm + 0x9e48c1)
            	#21 0x0000560428958613 aio_poll (qemu-kvm + 0x9e5613)
            	#22 0x000056042876631d bdrv_poll_co (qemu-kvm + 0x7f331d)
            	#23 0x0000560428762a7d blk_io_plug (qemu-kvm + 0x7efa7d)
            	#24 0x00005604285de750 virtio_blk_handle_vq (qemu-kvm + 0x66b750)
            	#25 0x00005604286243a4 virtio_queue_host_notifier_aio_poll_ready (qemu-kvm + 0x6b13a4)
            	#26 0x00005604289578c1 aio_dispatch_handler (qemu-kvm + 0x9e48c1)
            	#27 0x0000560428958613 aio_poll (qemu-kvm + 0x9e5613)
            	#28 0x000056042876631d bdrv_poll_co (qemu-kvm + 0x7f331d)
            	#29 0x0000560428762a7d blk_io_plug (qemu-kvm + 0x7efa7d)
            	#30 0x00005604285de750 virtio_blk_handle_vq (qemu-kvm + 0x66b750)
            	#31 0x00005604286243a4 virtio_queue_host_notifier_aio_poll_ready (qemu-kvm + 0x6b13a4)
            	#32 0x00005604289578c1 aio_dispatch_handler (qemu-kvm + 0x9e48c1)
            	#33 0x0000560428958613 aio_poll (qemu-kvm + 0x9e5613)
            	#34 0x000056042876631d bdrv_poll_co (qemu-kvm + 0x7f331d)
            	#35 0x0000560428762a7d blk_io_plug (qemu-kvm + 0x7efa7d)
            	#36 0x00005604285de750 virtio_blk_handle_vq (qemu-kvm + 0x66b750)
            	#37 0x00005604286243a4 virtio_queue_host_notifier_aio_poll_ready (qemu-kvm + 0x6b13a4)
            	#38 0x00005604289578c1 aio_dispatch_handler (qemu-kvm + 0x9e48c1)
            	#39 0x0000560428958613 aio_poll (qemu-kvm + 0x9e5613)
            	#40 0x000056042876631d bdrv_poll_co (qemu-kvm + 0x7f331d)
            	#41 0x0000560428762a7d blk_io_plug (qemu-kvm + 0x7efa7d)
            	#42 0x00005604285de750 virtio_blk_handle_vq (qemu-kvm + 0x66b750)
            	#43 0x00005604286243a4 virtio_queue_host_notifier_aio_poll_ready (qemu-kvm + 0x6b13a4)
            	#44 0x00005604289578c1 aio_dispatch_handler (qemu-kvm + 0x9e48c1)
            	#45 0x0000560428958613 aio_poll (qemu-kvm + 0x9e5613)
            	#46 0x000056042876631d bdrv_poll_co (qemu-kvm + 0x7f331d)
            	#47 0x0000560428762a7d blk_io_plug (qemu-kvm + 0x7efa7d)
            	#48 0x00005604285de750 virtio_blk_handle_vq (qemu-kvm + 0x66b750)
            	#49 0x00005604286243a4 virtio_queue_host_notifier_aio_poll_ready (qemu-kvm + 0x6b13a4)
            	#50 0x00005604289578c1 aio_dispatch_handler (qemu-kvm + 0x9e48c1)
            	#51 0x0000560428958613 aio_poll (qemu-kvm + 0x9e5613)
            	#52 0x000056042876631d bdrv_poll_co (qemu-kvm + 0x7f331d)
            	#53 0x0000560428762a7d blk_io_plug (qemu-kvm + 0x7efa7d)
            	#54 0x00005604285de750 virtio_blk_handle_vq (qemu-kvm + 0x66b750)
            	#55 0x00005604286243a4 virtio_queue_host_notifier_aio_poll_ready (qemu-kvm + 0x6b13a4)
            	#56 0x00005604289578c1 aio_dispatch_handler (qemu-kvm + 0x9e48c1)
            	#57 0x0000560428958613 aio_poll (qemu-kvm + 0x9e5613)
            	#58 0x000056042876631d bdrv_poll_co (qemu-kvm + 0x7f331d)
            	#59 0x0000560428762a7d blk_io_plug (qemu-kvm + 0x7efa7d)
            	#60 0x00005604285de750 virtio_blk_handle_vq (qemu-kvm + 0x66b750)
            	#61 0x00005604286243a4 virtio_queue_host_notifier_aio_poll_ready (qemu-kvm + 0x6b13a4)
            	#62 0x00005604289578c1 aio_dispatch_handler (qemu-kvm + 0x9e48c1)
            	#63 0x0000560428958613 aio_poll (qemu-kvm + 0x9e5613)
           	 
            	Stack trace of thread 307498:
            	#0  0x00007fe1a003ee5d syscall (libc.so.6 + 0x3ee5d)
            	#1  0x000056042895bc5f qemu_event_wait (qemu-kvm + 0x9e8c5f)
            	#2  0x000056042896821b call_rcu_thread (qemu-kvm + 0x9f521b)
            	#3  0x000056042895bf0a qemu_thread_start (qemu-kvm + 0x9e8f0a)
            	#4  0x00007fe1a009f832 start_thread (libc.so.6 + 0x9f832)
            	#5  0x00007fe1a003f450 __clone3 (libc.so.6 + 0x3f450)
           	 
            	Stack trace of thread 307505:
            	#0  0x00007fe1a009c590 __GI___lll_lock_wait (libc.so.6 + 0x9c590)
            	#1  0x00007fe1a00a2c52 __pthread_mutex_lock.5 (libc.so.6 + 0xa2c52)
            	#2  0x000056042895af6f qemu_mutex_lock_impl (qemu-kvm + 0x9e7f6f)
            	#3  0x000056042865af9d flatview_write_continue (qemu-kvm + 0x6e7f9d)
            	#4  0x000056042865ad11 flatview_write (qemu-kvm + 0x6e7d11)
            	#5  0x000056042865f14c address_space_write (qemu-kvm + 0x6ec14c)
            	#6  0x000056042870323e kvm_cpu_exec (qemu-kvm + 0x79023e)
            	#7  0x00005604287057aa kvm_vcpu_thread_fn (qemu-kvm + 0x7927aa)
            	#8  0x000056042895bf0a qemu_thread_start (qemu-kvm + 0x9e8f0a)
            	#9  0x00007fe1a009f832 start_thread (libc.so.6 + 0x9f832)
            	#10 0x00007fe1a003f450 __clone3 (libc.so.6 + 0x3f450)
           	 
            	Stack trace of thread 307497:
            	#0  0x00007fe1a009c590 __GI___lll_lock_wait (libc.so.6 + 0x9c590)
            	#1  0x00007fe1a00a2cad __pthread_mutex_lock.5 (libc.so.6 + 0xa2cad)
            	#2  0x000056042895af6f qemu_mutex_lock_impl (qemu-kvm + 0x9e7f6f)
            	#3  0x00005604287afe68 bdrv_co_yield_to_drain (qemu-kvm + 0x83ce68)
            	#4  0x00005604287b57a9 bdrv_drain_all_end (qemu-kvm + 0x8427a9)
            	#5  0x000056042876d16a bdrv_replace_child_noperm (qemu-kvm + 0x7fa16a)
            	#6  0x000056042876d044 bdrv_root_unref_child (qemu-kvm + 0x7fa044)
            	#7  0x000056042879b366 blk_unref (qemu-kvm + 0x828366)
            	#8  0x00005604287e1ea7 qcow2_co_create (qemu-kvm + 0x86eea7)
            	#9  0x00005604287a7a21 blockdev_create_run (qemu-kvm + 0x834a21)
            	#10 0x000056042877d011 job_co_entry (qemu-kvm + 0x80a011)
            	#11 0x0000560428975e26 coroutine_trampoline (qemu-kvm + 0xa02e26)
            	#12 0x00007fe1a002a360 n/a (libc.so.6 + 0x2a360)
            	#13 0x0000000000000000 n/a (n/a + 0x0)
            	ELF object binary architecture: AMD x86-64

Note:
 Hit this issue via automation; will investigate further to check whether it is a regression and whether it is iothread-related.

Comment 5 Kevin Wolf 2023-04-20 08:03:47 UTC
I have no idea why malloc() segfaults; this sounds a bit scary. I hope it's not memory corruption. Maybe it's a stack overflow that just coincidentally hits there?

Anyway, I don't think the recursion shown in the stack trace is intended. Stefan, can you have a look?

Comment 8 Stefan Hajnoczi 2023-04-27 21:52:26 UTC
The recursion is not intended. QEMU's virtqueue polling sees that there are new requests waiting to be processed, but blk_io_plug() makes a nested aio_poll() call. This creates an infinite call chain that will cause stack exhaustion.
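
As a rough illustration (simplified pseudo-C reconstructed from the stack trace above, not the actual QEMU source; the signatures and bodies here are assumptions made only to show the call shape), the unintended cycle looks like this:

    #include <stdbool.h>

    typedef struct AioContext AioContext;   /* opaque stand-in */

    static bool aio_poll(AioContext *ctx, bool blocking);

    /* blk_io_plug() waits on a coroutine, which internally runs a
     * nested event loop. */
    static void blk_io_plug(AioContext *ctx)
    {
        aio_poll(ctx, true);              /* nested aio_poll() */
    }

    /* The virtqueue handler plugs the queue before popping requests. */
    static void virtio_blk_handle_vq(AioContext *ctx)
    {
        blk_io_plug(ctx);                 /* re-enters the event loop... */
        /* ...so the code that would pop the requests is never reached */
    }

    /* Level-triggered polling still sees the non-empty virtqueue and
     * dispatches the handler again: infinite recursion until the stack
     * is exhausted. */
    static bool aio_poll(AioContext *ctx, bool blocking)
    {
        virtio_blk_handle_vq(ctx);
        return true;
    }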

I have patches that eliminate the blk_io_plug() API. They were not intended as a bug fix; rather, they are needed for the upcoming QEMU multi-queue block layer changes. However, they are not a complete solution, because they only eliminate blk_io_plug(); there may be other nested aio_poll() calls.

Two solutions come to mind:
1. Disable polling in nested aio_poll() calls, because polling is level-triggered instead of edge-triggered. If we read() the ioeventfd instead of relying on virtqueue memory polling, then this infinite call chain cannot happen (see the sketch after this list).
2. Always pop virtqueue requests before calling aio_poll().

#1 fixes the entire class of bugs.
#2 is a per-device solution and there is no way to verify that all cases have been fixed other than auditing the whole codebase.
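
A minimal sketch of approach #1, under the assumption that the AioContext tracks its own nesting depth (the field and helper names below are hypothetical, not the actual fix):

    #include <stdbool.h>

    /* Hypothetical context holding only what the sketch needs. */
    typedef struct AioContext {
        int poll_depth;                   /* hypothetical nesting counter */
    } AioContext;

    /* Stubs for illustration: polling guest memory (level-triggered)
     * vs. dispatching fd handlers via read() on ioeventfds (edge-triggered). */
    static void run_poll_handlers(AioContext *ctx)    { (void)ctx; }
    static void dispatch_fd_handlers(AioContext *ctx) { (void)ctx; }

    static bool aio_poll(AioContext *ctx, bool blocking)
    {
        (void)blocking;
        ctx->poll_depth++;

        /* Memory polling is level-triggered: a non-empty virtqueue keeps
         * firing. Run it only in the outermost aio_poll(); nested calls
         * fall back to reading the ioeventfd, which is edge-triggered and
         * therefore cannot re-dispatch the same pending requests. */
        if (ctx->poll_depth == 1) {
            run_poll_handlers(ctx);
        }
        dispatch_fd_handlers(ctx);

        ctx->poll_depth--;
        return true;
    }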

Comment 17 Stefan Hajnoczi 2023-05-24 20:05:18 UTC
I will post a backport for testing that includes Kevin's blk_co_unref() fix and my aio_poll() fix.

Comment 18 Stefan Hajnoczi 2023-05-26 11:02:39 UTC
Please test this rpm that includes Kevin's recent blk_co_unref() fix:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=52881197

Comment 22 aihua liang 2023-06-20 01:58:19 UTC
Tested on qemu-kvm-8.0.0-5.el9 with the blockdev_full_backup cases; all pass.
 (12/16) Host_RHEL.m9.u3.ovmf.qcow2.virtio_blk.up.virtio_net.Guest.RHEL.9.3.0.x86_64.io-github-autotest-qemu.blockdev_full_backup.simple_test.manual_completed.auto_compress.src_cluster_size_2M.q35: PASS (141.43 s)
 (13/16) Host_RHEL.m9.u3.ovmf.qcow2.virtio_blk.up.virtio_net.Guest.RHEL.9.3.0.x86_64.io-github-autotest-qemu.blockdev_full_backup.during_reboot.with_data_plane.q35: PASS (187.11 s)
 (14/16) Host_RHEL.m9.u3.ovmf.qcow2.virtio_blk.up.virtio_net.Guest.RHEL.9.3.0.x86_64.io-github-autotest-qemu.blockdev_full_backup.during_reboot.q35: PASS (187.11 s)
 (15/16) Host_RHEL.m9.u3.ovmf.qcow2.virtio_blk.up.virtio_net.Guest.RHEL.9.3.0.x86_64.io-github-autotest-qemu.blockdev_full_backup.during_stress.with_data_plane.q35: PASS (155.82 s)
 (16/16) Host_RHEL.m9.u3.ovmf.qcow2.virtio_blk.up.virtio_net.Guest.RHEL.9.3.0.x86_64.io-github-autotest-qemu.blockdev_full_backup.during_stress.q35: PASS (161.15 s)

Tested on qemu-kvm-8.0.0-5.el9 with the cases differential_backup, blockdev_mirror_vm_reboot, blockdev_mirror_stress, blockdev_commit_install, blockdev_mirror_no_space, blockdev_mirror_vm_stop_cont, and blockdev_snapshot_chains, 100 times in total; all pass.
 (96/100) repeat7.Host_RHEL.m9.u3.ovmf.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.9.3.0.x86_64.io-github-autotest-qemu.blockdev_mirror_vm_stop_cont.q35: PASS (482.48 s)
 (97/100) repeat7.Host_RHEL.m9.u3.ovmf.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.9.3.0.x86_64.io-github-autotest-qemu.differential_backup.q35: PASS (107.34 s)
 (96/100) repeat7.Host_RHEL.m9.u3.ovmf.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.9.3.0.x86_64.io-github-autotest-qemu.blockdev_snapshot_chains.q35: PASS (441.93 s)
 (97/100) repeat7.Host_RHEL.m9.u3.ovmf.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.9.3.0.x86_64.io-github-autotest-qemu.blockdev_mirror_vm_reboot.q35: PASS (387.34 s)
 (98/100) repeat7.Host_RHEL.m9.u3.ovmf.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.9.3.0.x86_64.io-github-autotest-qemu.blockdev_commit_install.q35: PASS (840.45 s)
 (99/100) repeat7.Host_RHEL.m9.u3.ovmf.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.9.3.0.x86_64.io-github-autotest-qemu.blockdev_mirror_no_space.q35: PASS (74.00 s)
 (100/100) Host_RHEL.m9.u3.ovmf.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.9.3.0.x86_64.io-github-autotest-qemu.blockdev_mirror_stress.q35: PASS (508.58 s)


Also ran the regression tests; all cases pass.

