Description of problem:
The 'job-cancel' QMP command doesn't have the same semantics as 'block-job-cancel', which actually waits for a mirror job to sync, so using 'job-cancel' for a mirror job is wrong.

Version-Release number of selected component (if applicable):
upstream and all versions using blockdev
Fixed upstream:

commit 59543dfad61d1be7226f62bc72b4f3fec6112b6b
Author: Peter Krempa <pkrempa>
Date:   Wed Apr 21 15:47:59 2021 +0200

    qemuDomainBlockJobAbort: Don't use 'job-cancel' instead of 'block-job-cancel'

    'block-job-cancel' has one very important semantic difference to
    'job-cancel', documented in qemu as:

      Note that if you issue 'block-job-cancel' after 'drive-mirror' has
      indicated (via the event BLOCK_JOB_READY) that the source and
      destination are synchronized, then the event triggered by this
      command changes to BLOCK_JOB_COMPLETED, to indicate that the
      mirroring has ended and the destination now has a point-in-time copy
      tied to the time of the cancellation.

    Since libvirt advertises the block copy job as having the synchronous
    abort feature we must not use 'job-cancel' here.

    Fixes: 4817b5ca1d0
    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Michal Privoznik <mprivozn>
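To make the semantic difference in the commit message concrete, here is a minimal Python sketch. It is not libvirt code: the helper names `build_block_job_cancel` and `expected_final_event` are hypothetical. It only builds the QMP request in the shape seen on the monitor and encodes the rule from the quoted qemu documentation, assuming that a mirror job cancelled before BLOCK_JOB_READY ends with BLOCK_JOB_CANCELLED.

```python
import json


def build_block_job_cancel(device, request_id):
    """Build the QMP request for 'block-job-cancel' (hypothetical helper)."""
    return {
        "execute": "block-job-cancel",
        "arguments": {"device": device},
        "id": request_id,
    }


def expected_final_event(ready_seen):
    """Event expected after 'block-job-cancel' on a mirror job.

    Per the qemu note quoted above: once BLOCK_JOB_READY has been seen,
    the cancellation is reported as BLOCK_JOB_COMPLETED and the
    destination is a point-in-time copy. Before that point the job is
    assumed to end with BLOCK_JOB_CANCELLED.
    """
    return "BLOCK_JOB_COMPLETED" if ready_seen else "BLOCK_JOB_CANCELLED"


request = build_block_job_cancel("copy-vda-libvirt-3-format", "libvirt-398")
print(json.dumps(request, separators=(",", ":")))
print(expected_final_event(ready_seen=True))
```

The job name and request id are taken from the monitor trace in this report; any other values would work the same way.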
Tested on:
libvirt-7.3.0-1.module+el8.5.0+11004+f4810536.x86_64
qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d.x86_64

Steps:
1. Start a VM
2. Trace the qemu monitor with systemtap:
# stap /usr/share/doc/libvirt-docs/examples/systemtap/qemu-monitor.stp
3. Do blockcopy and blockjob --abort:
# virsh blockcopy pc vda /var/lib/libvirt/images/pc --transient-job
Block Copy started

# virsh blockjob pc vda --abort

The log of qemu-monitor after blockjob --abort:
194.571 > 0x7ffb8c04a080 {"execute":"block-job-cancel","arguments":{"device":"copy-vda-libvirt-3-format"},"id":"libvirt-398"}
194.572 < 0x7ffb8c04a080 {"return": {}, "id": "libvirt-398"}
194.577 ! 0x7ffb8c04a080 {"timestamp": {"seconds": 1621402736, "microseconds": 552155}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "copy-vda-libvirt-3-format"}}
194.577 ! 0x7ffb8c04a080 {"timestamp": {"seconds": 1621402736, "microseconds": 552223}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "copy-vda-libvirt-3-format"}}
194.577 ! 0x7ffb8c04a080 {"timestamp": {"seconds": 1621402736, "microseconds": 552305}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "copy-vda-libvirt-3-format", "len": 10737811456, "offset": 10737811456, "speed": 0, "type": "mirror"}}
194.577 ! 0x7ffb8c04a080 {"timestamp": {"seconds": 1621402736, "microseconds": 552344}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "copy-vda-libvirt-3-format"}}
194.578 < 0x7ffb8c04a080 {"return": [{"current-progress": 10737811456, "status": "concluded", "total-progress": 10737811456, "type": "mirror", "id": "copy-vda-libvirt-3-format"}], "id": "libvirt-399"}
194.579 ! 0x7ffb8c04a080 {"timestamp": {"seconds": 1621402736, "microseconds": 554354}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "copy-vda-libvirt-3-format"}}
194.581 < 0x7ffb8c04a080 {"return": {}, "id": "libvirt-400"}
194.577 > 0x7ffb8c04a080 {"execute":"query-jobs","id":"libvirt-399"}
194.578 > 0x7ffb8c04a080 {"execute":"job-dismiss","arguments":{"id":"copy-vda-libvirt-3-format"},"id":"libvirt-400"}
194.582 > 0x7ffb8c04a080 {"execute":"blockdev-del","arguments":{"node-name":"libvirt-4-format"},"id":"libvirt-401"}
194.583 > 0x7ffb8c04a080 {"execute":"blockdev-del","arguments":{"node-name":"libvirt-4-storage"},"id":"libvirt-402"}
194.583 < 0x7ffb8c04a080 {"return": {}, "id": "libvirt-401"}
194.584 < 0x7ffb8c04a080 {"return": {}, "id": "libvirt-402"}

block-job-cancel is used instead of job-cancel. PASS
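The manual check of the trace could be automated along these lines. This is a rough sketch, not part of libvirt or its test suite. It assumes the systemtap script prints one message per line in the form `<time> <direction> <monitor pointer> <JSON>`, as in the trace above, and the function names are hypothetical.

```python
import json


def parse_monitor_line(line):
    """Split one trace line into (time, direction, message), or None.

    Direction is '>' for commands sent to qemu, '<' for replies and
    '!' for events, matching the trace format shown in this report.
    """
    parts = line.strip().split(None, 3)
    if len(parts) != 4 or parts[1] not in (">", "<", "!"):
        return None
    try:
        # json.JSONDecodeError is a subclass of ValueError.
        return float(parts[0]), parts[1], json.loads(parts[3])
    except ValueError:
        return None


def commands_sent(log_text):
    """Return the names of all QMP commands sent, in trace order."""
    names = []
    for line in log_text.splitlines():
        parsed = parse_monitor_line(line)
        if parsed and parsed[1] == ">" and "execute" in parsed[2]:
            names.append(parsed[2]["execute"])
    return names


# Three lines copied from the trace above.
SAMPLE = """\
194.571 > 0x7ffb8c04a080 {"execute":"block-job-cancel","arguments":{"device":"copy-vda-libvirt-3-format"},"id":"libvirt-398"}
194.572 < 0x7ffb8c04a080 {"return": {}, "id": "libvirt-398"}
194.577 > 0x7ffb8c04a080 {"execute":"query-jobs","id":"libvirt-399"}
"""

sent = commands_sent(SAMPLE)
print("block-job-cancel" in sent and "job-cancel" not in sent)
```

Comparing exact command names matters here, because a plain substring search for `job-cancel` would also match `block-job-cancel`.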
Though I know it is correct to use block-job-cancel, what's the difference to the users after this patch? Is it required patching to z stream?
(In reply to Han Han from comment #6)
> Though I know it is correct to use block-job-cancel, what's the difference
> to the users after this patch?

The difference is only for cases which would use:

# virsh blockcopy ....

and then, once the job is in the synchronised phase:

# virsh blockjob --abort

with the intention of keeping the destination image as a backup. In such a case, full consistency of the image is ensured by qemu only when 'block-job-cancel' is used, but not when 'job-cancel' is used.

> Is it required patching to z stream?

No. I don't think this is a common use case.
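For illustration, the affected workflow can be written down as the two virsh invocations used in the test comment of this report. The Python sketch below only builds the argument lists and does not run anything. The helper name `backup_via_blockcopy_commands` is hypothetical, and waiting for the job to reach the synchronised phase between the two commands is left to the caller.

```python
import shlex


def backup_via_blockcopy_commands(domain, disk, dest):
    """Return (start, abort) argv lists for the backup-by-blockcopy flow.

    The abort command should only be issued once the copy job has
    reached the synchronised phase; this sketch does not check that.
    """
    start = ["virsh", "blockcopy", domain, disk, dest, "--transient-job"]
    abort = ["virsh", "blockjob", domain, disk, "--abort"]
    return start, abort


start, abort = backup_via_blockcopy_commands(
    "pc", "vda", "/var/lib/libvirt/images/pc"
)
print(shlex.join(start))
print(shlex.join(abort))
```

The lists could be passed to `subprocess.run` on a host with libvirt installed; that step is omitted so the sketch stays runnable anywhere.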
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:4684