Bug 1824363
| Summary: | Qemu core dump when do snapshot with same node and overlay that not existed in snapshot chain | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | aihua liang <aliang> |
| Component: | qemu-kvm | Assignee: | Kevin Wolf <kwolf> |
| qemu-kvm sub component: | Block Jobs | QA Contact: | aihua liang <aliang> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | | |
| Priority: | low | CC: | coli, jinzhao, juzhang, kwolf, mrezanin, ngu, qzhang, virt-maint |
| Version: | 9.0 | Keywords: | EasyFix, Reopened, Triaged |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | qemu-kvm-6.2.0-1.el9 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-05-17 12:23:22 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
As it's a negative test and it can't be triggered by libvirt, set its priority to "low".

Tested on qemu-kvm-5.1.0-5.module+el8.3.0+7975+b80d25f1, still hit this issue.

Move RHEL-AV bugs to RHEL9. If necessary to resolve in RHEL8, then clone to the current RHEL8 release.

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Tested on qemu-kvm-6.1.0-5.el9, still hit this core dump issue.

Hi Kevin,
Do we plan to fix it? If yes, I will reopen it.
Thanks,
Aliang

Oh, this one didn't even have an assignee. Yes, I'm reopening it. I'll fix it upstream and then we'll get it from the 6.2 rebase in time for 9.0-GA.

Tested with qemu-kvm-6.2.0-1.el9, don't hit this issue any more.
Test Steps:
1.Start with qemu cmd:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-sandbox on \
-machine q35,memory-backend=mem-machine_mem \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-m 30720 \
-object memory-backend-ram,size=30720M,id=mem-machine_mem \
-smp 10,maxcpus=10,cores=5,threads=1,dies=1,sockets=2 \
-cpu 'Cascadelake-Server-noTSX',+kvm_pv_unhalt \
-chardev socket,wait=off,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20211215-212014-u83qUkY3,server=on \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,wait=off,id=qmp_id_catch_monitor,path=/tmp/monitor-catch_monitor-20211215-212014-u83qUkY3,server=on \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pvpanic,ioport=0x505,id=ida8F4GE \
-chardev socket,wait=off,id=chardev_serial0,path=/tmp/serial-serial0-20211215-212014-u83qUkY3,server=on \
-device isa-serial,id=serial0,chardev=chardev_serial0 \
-chardev socket,id=seabioslog_id_20211215-212014-u83qUkY3,path=/tmp/seabios-20211215-212014-u83qUkY3,server=on,wait=off \
-device isa-debugcon,chardev=seabioslog_id_20211215-212014-u83qUkY3,iobase=0x402 \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
--object iothread,id=iothread1 \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
-device pcie-root-port,id=pcie.0-root-port-6,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
-blockdev node-name=file_data1,driver=file,aio=threads,filename=/home/data.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_data1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_data1 \
-device virtio-blk-pci,id=data1,drive=drive_data1,write-cache=on,bus=pcie.0-root-port-6,iothread=iothread1 \
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
-device virtio-net-pci,mac=9a:37:88:01:97:b6,id=idvRVbq8,netdev=idSXhwTw,bus=pcie-root-port-3,addr=0x0 \
-netdev tap,id=idSXhwTw,vhost=on \
-vnc :0 \
-rtc base=utc,clock=host,driftfix=slew \
-boot menu=off,order=cdn,once=c,strict=off \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5 \
-monitor stdio \
2.Create target node
{'execute':'blockdev-create','arguments':{'options':{'driver':'file','filename':'/root/sn1','size':21474836480},'job-id':'job1'}}
{"timestamp": {"seconds": 1639722920, "microseconds": 864008}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job1"}}
{"timestamp": {"seconds": 1639722920, "microseconds": 864055}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job1"}}
{"return": {}}
{"timestamp": {"seconds": 1639722921, "microseconds": 762531}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job1"}}
{"timestamp": {"seconds": 1639722921, "microseconds": 762575}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job1"}}
{"timestamp": {"seconds": 1639722921, "microseconds": 762596}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job1"}}
{'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_sn1','filename':'/root/sn1'}}
{"return": {}}
{'execute':'blockdev-create','arguments':{'options': {'driver': 'qcow2','file':'drive_sn1','size':21474836480,'backing-file':'/home/data.qcow2','backing-fmt':'qcow2'},'job-id':'job2'}}
{"timestamp": {"seconds": 1639722937, "microseconds": 619983}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job2"}}
{"timestamp": {"seconds": 1639722937, "microseconds": 620029}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job2"}}
{"return": {}}
{"timestamp": {"seconds": 1639722937, "microseconds": 621572}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job2"}}
{"timestamp": {"seconds": 1639722937, "microseconds": 621602}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job2"}}
{"timestamp": {"seconds": 1639722937, "microseconds": 621622}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job2"}}
{'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn1','file':'drive_sn1','backing':null}}
{"return": {}}
{'execute':'job-dismiss','arguments':{'id':'job1'}}
{'execute':'job-dismiss','arguments':{'id':'job2'}}
{"timestamp": {"seconds": 1639722951, "microseconds": 844576}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job1"}}
{"return": {}}
{"timestamp": {"seconds": 1639722951, "microseconds": 844937}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job2"}}
{"return": {}}
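The step 2 sequence can also be generated by a script rather than typed into the monitor. The following is only a sketch: the function name `build_snapshot_target_commands` and its parameters are hypothetical, and the commands, node names, paths, and size are copied from the transcript above. It only builds and prints the JSON; it does not talk to QEMU.

```python
import json


def build_snapshot_target_commands(image_path, file_node, overlay_node,
                                   backing_file, size_bytes):
    """Return the QMP commands from step 2 as a list of dicts, in order.

    Hypothetical helper for illustration; the command names and argument
    keys are the ones shown in the transcript.
    """
    return [
        # Create the raw file that will hold the overlay image.
        {"execute": "blockdev-create",
         "arguments": {"options": {"driver": "file",
                                   "filename": image_path,
                                   "size": size_bytes},
                       "job-id": "job1"}},
        # Open that file as a protocol node.
        {"execute": "blockdev-add",
         "arguments": {"driver": "file",
                       "node-name": file_node,
                       "filename": image_path}},
        # Format it as qcow2 with a backing file recorded in the header.
        {"execute": "blockdev-create",
         "arguments": {"options": {"driver": "qcow2",
                                   "file": file_node,
                                   "size": size_bytes,
                                   "backing-file": backing_file,
                                   "backing-fmt": "qcow2"},
                       "job-id": "job2"}},
        # Open the qcow2 node without attaching a backing node ("backing": null).
        {"execute": "blockdev-add",
         "arguments": {"driver": "qcow2",
                       "node-name": overlay_node,
                       "file": file_node,
                       "backing": None}},
        # Dismiss both concluded jobs.
        {"execute": "job-dismiss", "arguments": {"id": "job1"}},
        {"execute": "job-dismiss", "arguments": {"id": "job2"}},
    ]


commands = build_snapshot_target_commands(
    "/root/sn1", "drive_sn1", "sn1", "/home/data.qcow2", 21474836480)
for command in commands:
    print(json.dumps(command))
```

Python's `None` is serialized as JSON `null`, which matches the `'backing':null` argument used when adding the `sn1` node.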
3.Do snapshot from sn1 to sn1
{"execute":"blockdev-snapshot","arguments":{"node":"sn1","overlay":"sn1"}}
Test Result:
In step3, snapshot failed with info:
{"error": {"class": "GenericError", "desc": "Making 'sn1' a backing child of 'sn1' would create a cycle"}}
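The error message above says that using `sn1` as both node and overlay would create a cycle. As a rough illustration of that idea only, here is a toy model in which each node has at most one backing link. This is not QEMU's implementation; the names `would_create_cycle`, `snapshot`, and the `backing_of` dictionary are hypothetical, and real block graphs also have other child types.

```python
def would_create_cycle(backing_of, overlay, node):
    """Return True if making `node` the backing child of `overlay` closes a loop.

    `backing_of` maps a node name to the name of its backing node, or None.
    Toy model only: walks the backing chain starting at `node` and looks
    for `overlay` on the way down.
    """
    current = node
    seen = set()
    while current is not None and current not in seen:
        if current == overlay:
            return True
        seen.add(current)
        current = backing_of.get(current)
    return False


def snapshot(backing_of, node, overlay):
    """Attach `node` as the backing child of `overlay`, refusing cycles."""
    if would_create_cycle(backing_of, overlay, node):
        raise ValueError(
            f"Making '{node}' a backing child of '{overlay}' would create a cycle")
    backing_of[overlay] = node


# Node names taken from the test steps; none has a backing node attached yet.
graph = {"drive_data1": None, "sn1": None}

try:
    snapshot(graph, node="sn1", overlay="sn1")
except ValueError as err:
    print(err)

# The intended use, with a different node under the overlay, is accepted.
snapshot(graph, node="drive_data1", overlay="sn1")
```

The point of the check is that the request is rejected before the graph is modified, so the caller gets an error reply instead of a process abort.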
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

As per comment 11 and comment 12, set the bug's status to "VERIFIED".

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (new packages: qemu-kvm), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2022:2307
Description of problem:
Qemu core dump when do snapshot with same node and overlay that not existed in snapshot chain

Version-Release number of selected component (if applicable):
kernel version:4.18.0-175.el8.x86_64
qemu-kvm version:qemu-kvm-4.2.0-18.module+el8.2.0+6278+dfae3426

How reproducible:
100%

Steps to Reproduce:
1.Start guest with qemu cmds:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-sandbox on \
-machine q35 \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x1 \
-m 4096 \
-smp 16,maxcpus=16,cores=8,threads=1,dies=1,sockets=2 \
-cpu 'EPYC',+kvm_pv_unhalt \
-chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20200203-033416-61dmcn92,server,nowait \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20200203-033416-61dmcn92,server,nowait \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pvpanic,ioport=0x505,id=idy8YPXp \
-chardev socket,path=/var/tmp/serial-serial0-20200203-033416-61dmcn92,server,nowait,id=chardev_serial0 \
-device isa-serial,id=serial0,chardev=chardev_serial0 \
-chardev socket,id=seabioslog_id_20200203-033416-61dmcn92,path=/var/tmp/seabios-20200203-033416-61dmcn92,server,nowait \
-device isa-debugcon,chardev=seabioslog_id_20200203-033416-61dmcn92,iobase=0x402 \
-device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
-device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \
-object iothread,id=iothread0 \
-object iothread,id=iothread1 \
-device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
-blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel820-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device virtio-blk-pci,id=image1,drive=drive_image1,write-cache=on,bus=pcie.0-root-port-3,iothread=iothread0 \
-device pcie-root-port,id=pcie.0-root-port-6,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
-blockdev node-name=file_data1,driver=file,aio=threads,filename=/home/data.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_data1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_data1 \
-device virtio-blk-pci,id=data1,drive=drive_data1,write-cache=on,bus=pcie.0-root-port-6,iothread=iothread1 \
-device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
-device virtio-net-pci,mac=9a:6c:ca:b7:36:85,id=idz4QyVp,netdev=idNnpx5D,bus=pcie.0-root-port-4,addr=0x0 \
-netdev tap,id=idNnpx5D,vhost=on \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-vnc :0 \
-rtc base=utc,clock=host,driftfix=slew \
-boot menu=off,order=cdn,once=c,strict=off \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
-monitor stdio \
-qmp tcp:0:3000,server,nowait \

2. Create a snapshot target in advance
{'execute':'blockdev-create','arguments':{'options': {'driver':'file','filename':'/root/sn1','size':21474836480},'job-id':'job1'}}
{'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_sn1','filename':'/root/sn1'}}
{'execute':'blockdev-create','arguments':{'options': {'driver': 'qcow2','file':'drive_sn1','size':21474836480,'backing-file':'/home/data.qcow2','backing-fmt':'qcow2'},'job-id':'job2'}}
{'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn1','file':'drive_sn1','backing':null}}
{'execute':'job-dismiss','arguments':{'id':'job1'}}
{'execute':'job-dismiss','arguments':{'id':'job2'}}

3. Do snapshot from sn1 to sn1
{"execute":"blockdev-snapshot","arguments":{"node":"sn1","overlay":"sn1"}}
Ncat: Connection reset by peer.

Actual results:
After step3, qemu core dump with info:
(qemu) qemu-kvm: block.c:2416: bdrv_replace_child_noperm: Assertion `new_bs->quiesce_counter <= new_bs_quiesce_counter' failed.
bug.txt: line 42: 18428 Aborted (core dumped) /usr/libexec/qemu-kvm -name 'avocado-vt-vm1' -sandbox on -machine q35 -nodefaults ...

gdb info:
(gdb) bt
#0 0x00007f7916b5870f in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f7916b42b25 in __GI_abort () at abort.c:79
#2 0x00007f7916b429f9 in __assert_fail_base (fmt=0x7f7916ca8c28 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5577e0b96700 "new_bs->quiesce_counter <= new_bs_quiesce_counter", file=0x5577e0af7630 "block.c", line=2416, function=<optimized out>) at assert.c:92
#3 0x00007f7916b50cc6 in __GI___assert_fail (assertion=assertion@entry=0x5577e0b96700 "new_bs->quiesce_counter <= new_bs_quiesce_counter", file=file@entry=0x5577e0af7630 "block.c", line=line@entry=2416, function=function@entry=0x5577e0b98040 <__PRETTY_FUNCTION__.31656> "bdrv_replace_child_noperm") at assert.c:101
#4 0x00005577e0945017 in bdrv_replace_child_noperm (child=child@entry=0x5577e2d5df70, new_bs=new_bs@entry=0x5577e31e56f0) at block.c:2416
#5 0x00005577e0949f33 in bdrv_replace_child (child=child@entry=0x5577e2d5df70, new_bs=new_bs@entry=0x5577e31e56f0) at block.c:2453
#6 0x00005577e094aec8 in bdrv_root_attach_child (child_bs=child_bs@entry=0x5577e31e56f0, child_name=child_name@entry=0x5577e0b9cd52 "backing", child_role=child_role@entry=0x5577e1194080 <child_backing>, ctx=<optimized out>, perm=<optimized out>, shared_perm=<optimized out>, opaque=0x5577e31e56f0, errp=0x7ffd1b263a50) at block.c:2557
#7 0x00005577e094b0a5 in bdrv_attach_child (parent_bs=parent_bs@entry=0x5577e31e56f0, child_bs=child_bs@entry=0x5577e31e56f0, child_name=child_name@entry=0x5577e0b9cd52 "backing", child_role=child_role@entry=0x5577e1194080 <child_backing>, errp=errp@entry=0x7ffd1b263a50) at block.c:6058
#8 0x00005577e094b1f5 in bdrv_set_backing_hd (bs=bs@entry=0x5577e31e56f0, backing_hd=backing_hd@entry=0x5577e31e56f0, errp=errp@entry=0x7ffd1b263a50) at block.c:2709
#9 0x00005577e094b4aa in bdrv_append (bs_new=0x5577e31e56f0, bs_top=0x5577e31e56f0, errp=errp@entry=0x7ffd1b263ab0) at block.c:4402
#10 0x00005577e07e8b70 in external_snapshot_prepare (common=0x5577e32a3940, errp=0x7ffd1b263b38) at blockdev.c:1683
#11 0x00005577e07ebd02 in qmp_transaction (dev_list=dev_list@entry=0x7ffd1b263bc0, has_props=has_props@entry=false, props=0x5577e2c5f950, props@entry=0x0, errp=errp@entry=0x7ffd1b263bf8) at blockdev.c:2470
#12 0x00005577e07ebfa5 in blockdev_do_action (errp=<optimized out>, action=0x7ffd1b263bb0) at blockdev.c:1118
#13 0x00005577e07ebfa5 in qmp_blockdev_snapshot (node=<optimized out>, overlay=<optimized out>, errp=errp@entry=0x7ffd1b263bf8) at blockdev.c:1160
#14 0x00005577e0905a08 in qmp_marshal_blockdev_snapshot (args=<optimized out>, ret=<optimized out>, errp=0x7ffd1b263c68) at qapi/qapi-commands-block-core.c:343
#15 0x00005577e09c9f9c in do_qmp_dispatch (errp=0x7ffd1b263c60, allow_oob=<optimized out>, request=<optimized out>, cmds=0x5577e12b5d80 <qmp_commands>) at qapi/qmp-dispatch.c:132
#16 0x00005577e09c9f9c in qmp_dispatch (cmds=0x5577e12b5d80 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175
#17 0x00005577e08e7ed1 in monitor_qmp_dispatch (mon=0x5577e251f2f0, req=<optimized out>) at monitor/qmp.c:145
#18 0x00005577e08e856a in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:234
#19 0x00005577e0a11996 in aio_bh_call (bh=0x5577e241da20) at util/async.c:117
#20 0x00005577e0a11996 in aio_bh_poll (ctx=ctx@entry=0x5577e241c5d0) at util/async.c:117
#21 0x00005577e0a14d84 in aio_dispatch (ctx=0x5577e241c5d0) at util/aio-posix.c:459
#22 0x00005577e0a11872 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
#23 0x00007f791b3e067d in g_main_dispatch (context=0x5577e24aa110) at gmain.c:3176
#24 0x00007f791b3e067d in g_main_context_dispatch (context=context@entry=0x5577e24aa110) at gmain.c:3829
#25 0x00005577e0a13e38 in glib_pollfds_poll () at util/main-loop.c:219
#26 0x00005577e0a13e38 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
#27 0x00005577e0a13e38 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
#28 0x00005577e07f60b1 in main_loop () at vl.c:1828
#29 0x00005577e06a1ff2 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4504

Expected results:
After step3, snapshot fail with the correct error info.
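A regression check for this bug needs to tell "QEMU replied with an error" apart from "the connection dropped because QEMU aborted". The sketch below is one possible way to do that on the QMP side, assuming the monitor output is available as one JSON message per line. The helper names `first_reply` and `check_snapshot_rejected` are hypothetical and not part of any QEMU tooling; the sample messages are the ones quoted in this report.

```python
import json


def first_reply(lines):
    """Return the first command reply, skipping asynchronous events.

    A reply carries a "return" or an "error" key; messages with neither
    (such as "event" notifications) are ignored. Returns None if the
    stream ends before any reply arrives.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if "return" in message or "error" in message:
            return message
    return None


def check_snapshot_rejected(lines):
    """Classify the outcome of the step 3 blockdev-snapshot command."""
    reply = first_reply(lines)
    if reply is None:
        return "no reply (connection lost; QEMU may have aborted)"
    if "error" in reply:
        return "rejected: " + reply["error"]["desc"]
    return "unexpectedly succeeded"


# Reply seen on the fixed build, preceded here by an unrelated event line.
fixed_output = [
    '{"timestamp": {"seconds": 1639722951, "microseconds": 844937}, '
    '"event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job2"}}',
    '{"error": {"class": "GenericError", "desc": '
    '"Making \'sn1\' a backing child of \'sn1\' would create a cycle"}}',
]
print(check_snapshot_rejected(fixed_output))

# On the affected build the stream ends without any reply.
print(check_snapshot_rejected([]))
```

Treating an empty stream as a failure matters here because the affected build never sends an error reply at all; the client only sees the connection reset.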