Bug 1824363 - Qemu core dump when do snapshot with same node and overlay that not existed in snapshot chain
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Kevin Wolf
QA Contact: aihua liang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-04-16 02:31 UTC by aihua liang
Modified: 2022-05-17 12:24 UTC (History)

Fixed In Version: qemu-kvm-6.2.0-1.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-17 12:23:22 UTC
Type: Bug
Target Upstream Version:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2022:2307 | 0 | None | None | None | 2022-05-17 12:24:06 UTC

Description aihua liang 2020-04-16 02:31:27 UTC
Description of problem:
 QEMU dumps core when a snapshot is taken with the same node given as both node and overlay, and that node does not exist in the snapshot chain.

Version-Release number of selected component (if applicable):
 kernel version:4.18.0-175.el8.x86_64
 qemu-kvm version:qemu-kvm-4.2.0-18.module+el8.2.0+6278+dfae3426

How reproducible:
100%

Steps to Reproduce:
 1. Start the guest with the following QEMU command line:
    /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine q35  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x1 \
    -m 4096  \
    -smp 16,maxcpus=16,cores=8,threads=1,dies=1,sockets=2  \
    -cpu 'EPYC',+kvm_pv_unhalt  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20200203-033416-61dmcn92,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20200203-033416-61dmcn92,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idy8YPXp \
    -chardev socket,path=/var/tmp/serial-serial0-20200203-033416-61dmcn92,server,nowait,id=chardev_serial0 \
    -device isa-serial,id=serial0,chardev=chardev_serial0  \
    -chardev socket,id=seabioslog_id_20200203-033416-61dmcn92,path=/var/tmp/seabios-20200203-033416-61dmcn92,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20200203-033416-61dmcn92,iobase=0x402 \
    -device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
    -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \
    -object iothread,id=iothread0 \
    -object iothread,id=iothread1 \
    -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
    -blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel820-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,write-cache=on,bus=pcie.0-root-port-3,iothread=iothread0 \
    -device pcie-root-port,id=pcie.0-root-port-6,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
    -blockdev node-name=file_data1,driver=file,aio=threads,filename=/home/data.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_data1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_data1 \
    -device virtio-blk-pci,id=data1,drive=drive_data1,write-cache=on,bus=pcie.0-root-port-6,iothread=iothread1 \
    -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
    -device virtio-net-pci,mac=9a:6c:ca:b7:36:85,id=idz4QyVp,netdev=idNnpx5D,bus=pcie.0-root-port-4,addr=0x0  \
    -netdev tap,id=idNnpx5D,vhost=on \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
    -monitor stdio \
    -qmp tcp:0:3000,server,nowait

 2. Create a snapshot target in advance
     {'execute':'blockdev-create','arguments':{'options':{'driver':'file','filename':'/root/sn1','size':21474836480},'job-id':'job1'}}
     {'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_sn1','filename':'/root/sn1'}}
     {'execute':'blockdev-create','arguments':{'options':{'driver': 'qcow2','file':'drive_sn1','size':21474836480,'backing-file':'/home/data.qcow2','backing-fmt':'qcow2'},'job-id':'job2'}}
     {'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn1','file':'drive_sn1','backing':null}}
     {'execute':'job-dismiss','arguments':{'id':'job1'}}
     {'execute':'job-dismiss','arguments':{'id':'job2'}}

 3. Take a snapshot from sn1 to sn1
     {"execute":"blockdev-snapshot","arguments":{"node":"sn1","overlay":"sn1"}}
Ncat: Connection reset by peer.
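
The QMP commands in steps 2 and 3 use single-quoted strings, which QEMU's parser accepts but strict JSON tools do not. The following is a minimal Python sketch, not part of the original report, that rebuilds the same command sequence as strict JSON lines. The helper names (`reproducer_commands`, `to_qmp_lines`) are hypothetical; the node names, paths, size and job IDs are copied from the steps above. It assumes the `qmp_capabilities` handshake is sent first, as QMP requires before any other command.

```python
import json

SIZE = 21474836480  # 20 GiB, the size used in the report


def cmd(execute, arguments):
    """Build one QMP command as a plain dict."""
    return {"execute": execute, "arguments": arguments}


def reproducer_commands(name="sn1", backing_file="/home/data.qcow2"):
    """Return the QMP commands of steps 2 and 3 as dicts, in order."""
    path = "/root/" + name
    file_node = "drive_" + name
    return [
        # Step 2: create the image file and its qcow2 layer, then add both nodes.
        cmd("blockdev-create", {
            "options": {"driver": "file", "filename": path, "size": SIZE},
            "job-id": "job1"}),
        cmd("blockdev-add", {
            "driver": "file", "node-name": file_node, "filename": path}),
        cmd("blockdev-create", {
            "options": {"driver": "qcow2", "file": file_node, "size": SIZE,
                        "backing-file": backing_file, "backing-fmt": "qcow2"},
            "job-id": "job2"}),
        cmd("blockdev-add", {
            "driver": "qcow2", "node-name": name, "file": file_node,
            "backing": None}),  # None serializes to JSON null
        cmd("job-dismiss", {"id": "job1"}),
        cmd("job-dismiss", {"id": "job2"}),
        # Step 3: the negative test, node and overlay are the same node.
        cmd("blockdev-snapshot", {"node": name, "overlay": name}),
    ]


def to_qmp_lines(commands):
    """Serialize commands as JSON lines, preceded by the QMP handshake."""
    handshake = {"execute": "qmp_capabilities"}
    return [json.dumps(c) for c in [handshake] + list(commands)]


if __name__ == "__main__":
    for line in to_qmp_lines(reproducer_commands()):
        print(line)
```

The blockdev-create jobs run asynchronously, so the printed lines are meant to be sent one at a time to the QMP socket (tcp port 3000 in step 1), letting each job conclude before the next command.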

Actual results:
 After step 3, QEMU dumps core with the following output:
    (qemu) qemu-kvm: block.c:2416: bdrv_replace_child_noperm: Assertion `new_bs->quiesce_counter <= new_bs_quiesce_counter' failed.
bug.txt: line 42: 18428 Aborted                 (core dumped) /usr/libexec/qemu-kvm -name 'avocado-vt-vm1' -sandbox on -machine q35 -nodefaults ...

 gdb info:
  (gdb) bt
#0  0x00007f7916b5870f in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f7916b42b25 in __GI_abort () at abort.c:79
#2  0x00007f7916b429f9 in __assert_fail_base
    (fmt=0x7f7916ca8c28 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
assertion=0x5577e0b96700 "new_bs->quiesce_counter <=
new_bs_quiesce_counter", file=0x5577e0af7630 "block.c", line=2416,
function=<optimized out>) at assert.c:92
#3  0x00007f7916b50cc6 in __GI___assert_fail
    (assertion=assertion@entry=0x5577e0b96700 "new_bs->quiesce_counter
<= new_bs_quiesce_counter", file=file@entry=0x5577e0af7630 "block.c",
line=line@entry=2416, function=function@entry=0x5577e0b98040
<__PRETTY_FUNCTION__.31656> "bdrv_replace_child_noperm") at
assert.c:101
#4  0x00005577e0945017 in bdrv_replace_child_noperm
    (child=child@entry=0x5577e2d5df70, new_bs=new_bs@entry=0x5577e31e56f0)
    at block.c:2416
#5  0x00005577e0949f33 in bdrv_replace_child
    (child=child@entry=0x5577e2d5df70, new_bs=new_bs@entry=0x5577e31e56f0)
    at block.c:2453
#6  0x00005577e094aec8 in bdrv_root_attach_child
    (child_bs=child_bs@entry=0x5577e31e56f0,
child_name=child_name@entry=0x5577e0b9cd52 "backing",
child_role=child_role@entry=0x5577e1194080 <child_backing>,
ctx=<optimized out>, perm=<optimized out>, shared_perm=<optimized out>,
opaque=0x5577e31e56f0, errp=0x7ffd1b263a50) at block.c:2557
#7  0x00005577e094b0a5 in bdrv_attach_child
    (parent_bs=parent_bs@entry=0x5577e31e56f0,
child_bs=child_bs@entry=0x5577e31e56f0,
child_name=child_name@entry=0x5577e0b9cd52 "backing",
child_role=child_role@entry=0x5577e1194080 <child_backing>,
errp=errp@entry=0x7ffd1b263a50) at block.c:6058
#8  0x00005577e094b1f5 in bdrv_set_backing_hd
    (bs=bs@entry=0x5577e31e56f0,
backing_hd=backing_hd@entry=0x5577e31e56f0,
errp=errp@entry=0x7ffd1b263a50) at block.c:2709
#9  0x00005577e094b4aa in bdrv_append
    (bs_new=0x5577e31e56f0, bs_top=0x5577e31e56f0,
errp=errp@entry=0x7ffd1b263ab0)
    at block.c:4402
#10 0x00005577e07e8b70 in external_snapshot_prepare
    (common=0x5577e32a3940, errp=0x7ffd1b263b38) at blockdev.c:1683
#11 0x00005577e07ebd02 in qmp_transaction
    (dev_list=dev_list@entry=0x7ffd1b263bc0,
has_props=has_props@entry=false, props=0x5577e2c5f950,
props@entry=0x0, errp=errp@entry=0x7ffd1b263bf8) at blockdev.c:2470
#12 0x00005577e07ebfa5 in blockdev_do_action
    (errp=<optimized out>, action=0x7ffd1b263bb0) at blockdev.c:1118
#13 0x00005577e07ebfa5 in qmp_blockdev_snapshot
    (node=<optimized out>, overlay=<optimized out>,
errp=errp@entry=0x7ffd1b263bf8)
    at blockdev.c:1160
#14 0x00005577e0905a08 in qmp_marshal_blockdev_snapshot
    (args=<optimized out>, ret=<optimized out>, errp=0x7ffd1b263c68)
    at qapi/qapi-commands-block-core.c:343
#15 0x00005577e09c9f9c in do_qmp_dispatch
    (errp=0x7ffd1b263c60, allow_oob=<optimized out>,
request=<optimized out>, cmds=0x5577e12b5d80 <qmp_commands>) at
qapi/qmp-dispatch.c:132
#16 0x00005577e09c9f9c in qmp_dispatch
    (cmds=0x5577e12b5d80 <qmp_commands>, request=<optimized out>,
allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175
#17 0x00005577e08e7ed1 in monitor_qmp_dispatch
    (mon=0x5577e251f2f0, req=<optimized out>) at monitor/qmp.c:145
#18 0x00005577e08e856a in monitor_qmp_bh_dispatcher (data=<optimized out>)
    at monitor/qmp.c:234
#19 0x00005577e0a11996 in aio_bh_call (bh=0x5577e241da20) at util/async.c:117
#20 0x00005577e0a11996 in aio_bh_poll (ctx=ctx@entry=0x5577e241c5d0)
    at util/async.c:117
#21 0x00005577e0a14d84 in aio_dispatch (ctx=0x5577e241c5d0) at
util/aio-posix.c:459
#22 0x00005577e0a11872 in aio_ctx_dispatch
    (source=<optimized out>, callback=<optimized out>,
user_data=<optimized out>)
    at util/async.c:260
#23 0x00007f791b3e067d in g_main_dispatch (context=0x5577e24aa110) at
gmain.c:3176
#24 0x00007f791b3e067d in g_main_context_dispatch
    (context=context@entry=0x5577e24aa110) at gmain.c:3829
#25 0x00005577e0a13e38 in glib_pollfds_poll () at util/main-loop.c:219
#26 0x00005577e0a13e38 in os_host_main_loop_wait (timeout=<optimized out>)
    at util/main-loop.c:242
#27 0x00005577e0a13e38 in main_loop_wait (nonblocking=<optimized out>)
    at util/main-loop.c:518
#28 0x00005577e07f60b1 in main_loop () at vl.c:1828
#29 0x00005577e06a1ff2 in main
    (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
at vl.c:4504

Expected results:
 After step 3, the snapshot fails with an appropriate error message.

Comment 2 aihua liang 2020-04-16 02:38:22 UTC
As this is a negative test and cannot be triggered through libvirt, I am setting its priority to "low".

Comment 3 aihua liang 2020-09-11 09:25:25 UTC
Tested on qemu-kvm-5.1.0-5.module+el8.3.0+7975+b80d25f1; the issue still occurs.

Comment 5 John Ferlan 2021-09-08 21:49:55 UTC
Move RHEL-AV bugs to RHEL9. If necessary to resolve in RHEL8, then clone to the current RHEL8 release.

Comment 7 RHEL Program Management 2021-10-16 07:27:06 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 8 aihua liang 2021-10-18 08:28:37 UTC
Tested on qemu-kvm-6.1.0-5.el9; the core dump still occurs.

Hi Kevin,

 Do we plan to fix it? If so, I will reopen the bug.

Thanks,
Aliang

Comment 9 Kevin Wolf 2021-10-18 13:01:01 UTC
Oh, this one didn't even have an assignee.

Yes, I'm reopening it. I'll fix it upstream and then we'll get it from the 6.2 rebase in time for 9.0-GA.

Comment 11 aihua liang 2021-12-17 06:39:46 UTC
Tested with qemu-kvm-6.2.0-1.el9; the issue no longer occurs.

Test Steps:
 1. Start the guest with the following QEMU command line:
    /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine q35,memory-backend=mem-machine_mem \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 30720 \
    -object memory-backend-ram,size=30720M,id=mem-machine_mem  \
    -smp 10,maxcpus=10,cores=5,threads=1,dies=1,sockets=2  \
    -cpu 'Cascadelake-Server-noTSX',+kvm_pv_unhalt \
    -chardev socket,wait=off,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20211215-212014-u83qUkY3,server=on  \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,wait=off,id=qmp_id_catch_monitor,path=/tmp/monitor-catch_monitor-20211215-212014-u83qUkY3,server=on  \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=ida8F4GE \
    -chardev socket,wait=off,id=chardev_serial0,path=/tmp/serial-serial0-20211215-212014-u83qUkY3,server=on \
    -device isa-serial,id=serial0,chardev=chardev_serial0  \
    -chardev socket,id=seabioslog_id_20211215-212014-u83qUkY3,path=/tmp/seabios-20211215-212014-u83qUkY3,server=on,wait=off \
    -device isa-debugcon,chardev=seabioslog_id_20211215-212014-u83qUkY3,iobase=0x402 \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    --object iothread,id=iothread1 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -device pcie-root-port,id=pcie.0-root-port-6,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
    -blockdev node-name=file_data1,driver=file,aio=threads,filename=/home/data.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_data1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_data1 \
    -device virtio-blk-pci,id=data1,drive=drive_data1,write-cache=on,bus=pcie.0-root-port-6,iothread=iothread1 \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:37:88:01:97:b6,id=idvRVbq8,netdev=idSXhwTw,bus=pcie-root-port-3,addr=0x0  \
    -netdev tap,id=idSXhwTw,vhost=on  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5 \
    -monitor stdio

 2. Create the target node
    {'execute':'blockdev-create','arguments':{'options':{'driver':'file','filename':'/root/sn1','size':21474836480},'job-id':'job1'}}
{"timestamp": {"seconds": 1639722920, "microseconds": 864008}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job1"}}
{"timestamp": {"seconds": 1639722920, "microseconds": 864055}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job1"}}
{"return": {}}
{"timestamp": {"seconds": 1639722921, "microseconds": 762531}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job1"}}
{"timestamp": {"seconds": 1639722921, "microseconds": 762575}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job1"}}
{"timestamp": {"seconds": 1639722921, "microseconds": 762596}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job1"}}
    {'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_sn1','filename':'/root/sn1'}}
{"return": {}}
    {'execute':'blockdev-create','arguments':{'options': {'driver': 'qcow2','file':'drive_sn1','size':21474836480,'backing-file':'/home/data.qcow2','backing-fmt':'qcow2'},'job-id':'job2'}}
{"timestamp": {"seconds": 1639722937, "microseconds": 619983}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job2"}}
{"timestamp": {"seconds": 1639722937, "microseconds": 620029}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job2"}}
{"return": {}}
{"timestamp": {"seconds": 1639722937, "microseconds": 621572}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job2"}}
{"timestamp": {"seconds": 1639722937, "microseconds": 621602}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job2"}}
{"timestamp": {"seconds": 1639722937, "microseconds": 621622}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job2"}}
    {'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn1','file':'drive_sn1','backing':null}}
{"return": {}}
    {'execute':'job-dismiss','arguments':{'id':'job1'}}
    {'execute':'job-dismiss','arguments':{'id':'job2'}}
{"timestamp": {"seconds": 1639722951, "microseconds": 844576}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job1"}}
{"return": {}}
{"timestamp": {"seconds": 1639722951, "microseconds": 844937}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job2"}}
{"return": {}}
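
The JOB_STATUS_CHANGE events above show each blockdev-create job moving through created, running, waiting, pending and concluded before job-dismiss is issued. As a rough illustration, not part of the original test, the Python sketch below collects the status history per job from such output lines and checks whether a job has reached "concluded". The function names are hypothetical, and the sample events are the job1 lines copied from the log above.

```python
import json

# Server output for job1, copied from the transcript in this comment.
SAMPLE = [
    '{"timestamp": {"seconds": 1639722920, "microseconds": 864008}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job1"}}',
    '{"timestamp": {"seconds": 1639722920, "microseconds": 864055}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job1"}}',
    '{"return": {}}',
    '{"timestamp": {"seconds": 1639722921, "microseconds": 762531}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job1"}}',
    '{"timestamp": {"seconds": 1639722921, "microseconds": 762575}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job1"}}',
    '{"timestamp": {"seconds": 1639722921, "microseconds": 762596}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job1"}}',
]


def job_states(lines):
    """Map each job id to the ordered list of statuses seen in its events."""
    states = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        # Command replies such as {"return": {}} carry no "event" key.
        if msg.get("event") != "JOB_STATUS_CHANGE":
            continue
        data = msg["data"]
        states.setdefault(data["id"], []).append(data["status"])
    return states


def ready_to_dismiss(states, job_id):
    """True once the most recent status of the job is 'concluded'."""
    history = states.get(job_id, [])
    return bool(history) and history[-1] == "concluded"


if __name__ == "__main__":
    states = job_states(SAMPLE)
    print(states)
    print(ready_to_dismiss(states, "job1"))
```

Only server output is parsed here; the single-quoted command lines typed by the tester are not strict JSON and would need to be filtered out first.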

 3. Take a snapshot from sn1 to sn1
   {"execute":"blockdev-snapshot","arguments":{"node":"sn1","overlay":"sn1"}}


Test Result:
  In step 3, the snapshot failed with:
{"error": {"class": "GenericError", "desc": "Making 'sn1' a backing child of 'sn1' would create a cycle"}}
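
The error text indicates that the request is now rejected because attaching sn1 as its own backing child would form a loop in the block graph. The Python sketch below is a toy model of that kind of check, written for illustration only. It is not QEMU's implementation: the `Node` class and both functions are hypothetical, and only backing links are modelled, whereas the real block graph has other child types as well.

```python
class Node:
    """A block node with at most one backing child (toy model)."""

    def __init__(self, name, backing=None):
        self.name = name
        self.backing = backing


def reaches(start, target):
    """Return True if target is start itself or lies in start's backing chain."""
    node = start
    seen = set()
    while node is not None and id(node) not in seen:
        if node is target:
            return True
        seen.add(id(node))
        node = node.backing
    return False


def snapshot(node, overlay):
    """Make node the backing child of overlay, refusing to create a cycle."""
    # If overlay is already reachable from node, linking overlay -> node
    # would close a loop, so fail instead of modifying the graph.
    if reaches(node, overlay):
        raise ValueError(
            "Making '%s' a backing child of '%s' would create a cycle"
            % (node.name, overlay.name))
    overlay.backing = node


if __name__ == "__main__":
    sn1 = Node("sn1")
    try:
        snapshot(sn1, sn1)  # same node as node and overlay, as in step 3
    except ValueError as err:
        print(err)
```

In this model a valid snapshot, for example with drive_data1 as node and sn1 as overlay, succeeds, while a later attempt to put sn1 underneath drive_data1 is refused for the same reason.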

Comment 12 Yanan Fu 2021-12-20 12:44:29 UTC
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Comment 16 aihua liang 2021-12-23 06:06:09 UTC
Per comment 11 and comment 12, setting the bug's status to "VERIFIED".

Comment 18 errata-xmlrpc 2022-05-17 12:23:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: qemu-kvm), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2307

