Bug 1951555 - Fix incorrect use of job-cancel and block-job-cancel
Summary: Fix incorrect use of job-cancel and block-job-cancel
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.4
Assignee: Peter Krempa
QA Contact: Han Han
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-04-20 12:31 UTC by Peter Krempa
Modified: 2021-11-16 08:27 UTC (History)

Fixed In Version: libvirt-7.3.0-1.el8
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-16 07:52:40 UTC
Type: Bug
Target Upstream Version: 7.3.0
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2021:4684 0 None None None 2021-11-16 07:53:23 UTC

Description Peter Krempa 2021-04-20 12:31:09 UTC
Description of problem:
The 'job-cancel' QMP command does not have the same semantics as 'block-job-cancel', which actually waits for a mirror job to sync. Using 'job-cancel' for mirror jobs is therefore wrong.

Version-Release number of selected component (if applicable):
upstream and all versions using blockdev

Comment 1 Peter Krempa 2021-04-27 12:09:07 UTC
Fixed upstream:

commit 59543dfad61d1be7226f62bc72b4f3fec6112b6b
Author: Peter Krempa <pkrempa>
Date:   Wed Apr 21 15:47:59 2021 +0200

    qemuDomainBlockJobAbort: Don't use 'job-cancel' instead of 'block-job-cancel'
    
    'block-job-cancel' has one very important semantic difference to
    'job-cancel', documented in qemu as:
    
      Note that if you issue 'block-job-cancel' after 'drive-mirror' has indicated
      (via the event BLOCK_JOB_READY) that the source and destination are
      synchronized, then the event triggered by this command changes to
      BLOCK_JOB_COMPLETED, to indicate that the mirroring has ended and the
      destination now has a point-in-time copy tied to the time of the cancellation.
    
    Since libvirt advertises the block copy job as having the synchronous
    abort feature we must not use 'job-cancel' here.
    
    Fixes: 4817b5ca1d0
    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Michal Privoznik <mprivozn>
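
As a rough illustration of the command this fix settles on, the Python sketch below builds the QMP request for aborting a block copy job. It is not libvirt's implementation (libvirt is written in C). The helper name build_block_job_cancel is hypothetical, and the job name and request id are simply the values that appear in the monitor trace in comment 5.

```python
import json


def build_block_job_cancel(job_name, request_id):
    """Return a QMP 'block-job-cancel' request as a compact JSON string.

    Hypothetical helper for illustration only; the message layout mirrors
    the request visible in the monitor trace of this bug report.
    """
    message = {
        "execute": "block-job-cancel",
        "arguments": {"device": job_name},
        "id": request_id,
    }
    # Compact separators give the same wire format as seen in the trace.
    return json.dumps(message, separators=(",", ":"))


request = build_block_job_cancel("copy-vda-libvirt-3-format", "libvirt-398")
print(request)
```

The point of the sketch is only that the block-job flavour of the command is sent for the copy job, addressed through its "device" argument.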

Comment 5 Han Han 2021-05-19 06:11:35 UTC
Tested with:
libvirt-7.3.0-1.module+el8.5.0+11004+f4810536.x86_64
qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d.x86_64

Steps:
1. Start a VM.
2. Trace the qemu monitor with systemtap:
# stap /usr/share/doc/libvirt-docs/examples/systemtap/qemu-monitor.stp

3. Run blockcopy, then blockjob --abort:
# virsh blockcopy pc vda /var/lib/libvirt/images/pc --transient-job
Block Copy started

# virsh blockjob pc vda --abort

The qemu monitor log after blockjob --abort:
194.571 > 0x7ffb8c04a080 {"execute":"block-job-cancel","arguments":{"device":"copy-vda-libvirt-3-format"},"id":"libvirt-398"}                                                                
194.572 < 0x7ffb8c04a080 {"return": {}, "id": "libvirt-398"}
194.577 ! 0x7ffb8c04a080 {"timestamp": {"seconds": 1621402736, "microseconds": 552155}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "copy-vda-libvirt-3-format"}}      
194.577 ! 0x7ffb8c04a080 {"timestamp": {"seconds": 1621402736, "microseconds": 552223}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "copy-vda-libvirt-3-format"}}      
194.577 ! 0x7ffb8c04a080 {"timestamp": {"seconds": 1621402736, "microseconds": 552305}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "copy-vda-libvirt-3-format", "len": 10737811456, "offset": 10737811456, "speed": 0, "type": "mirror"}}
194.577 ! 0x7ffb8c04a080 {"timestamp": {"seconds": 1621402736, "microseconds": 552344}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "copy-vda-libvirt-3-format"}}    
194.578 < 0x7ffb8c04a080 {"return": [{"current-progress": 10737811456, "status": "concluded", "total-progress": 10737811456, "type": "mirror", "id": "copy-vda-libvirt-3-format"}], "id": "libvirt-399"}
194.579 ! 0x7ffb8c04a080 {"timestamp": {"seconds": 1621402736, "microseconds": 554354}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "copy-vda-libvirt-3-format"}}         
194.581 < 0x7ffb8c04a080 {"return": {}, "id": "libvirt-400"}
194.577 > 0x7ffb8c04a080 {"execute":"query-jobs","id":"libvirt-399"}
194.578 > 0x7ffb8c04a080 {"execute":"job-dismiss","arguments":{"id":"copy-vda-libvirt-3-format"},"id":"libvirt-400"}                                                                         
194.582 > 0x7ffb8c04a080 {"execute":"blockdev-del","arguments":{"node-name":"libvirt-4-format"},"id":"libvirt-401"}                                                                          
194.583 > 0x7ffb8c04a080 {"execute":"blockdev-del","arguments":{"node-name":"libvirt-4-storage"},"id":"libvirt-402"}                                                                         
194.583 < 0x7ffb8c04a080 {"return": {}, "id": "libvirt-401"}
194.584 < 0x7ffb8c04a080 {"return": {}, "id": "libvirt-402"}

block-job-cancel is used instead of job-cancel. PASS
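
The manual check above could be automated along these lines. This is a minimal sketch, assuming the trace format seen in the log (timestamp, direction marker, monitor handle, JSON payload) and assuming '>' marks a command sent to qemu, '<' a reply and '!' an event, which is inferred from the trace rather than taken from the systemtap script. The function names and the shortened TRACE sample are illustrative only.

```python
import json

# Three lines copied from the trace above, shortened for the example.
TRACE = """\
194.571 > 0x7ffb8c04a080 {"execute":"block-job-cancel","arguments":{"device":"copy-vda-libvirt-3-format"},"id":"libvirt-398"}
194.572 < 0x7ffb8c04a080 {"return": {}, "id": "libvirt-398"}
194.577 > 0x7ffb8c04a080 {"execute":"query-jobs","id":"libvirt-399"}
"""


def parse_monitor_line(line):
    """Split one trace line into (direction, payload dict)."""
    # maxsplit=3 keeps the JSON payload intact even if it contains spaces.
    _timestamp, direction, _handle, payload = line.strip().split(None, 3)
    return direction, json.loads(payload)


def commands_sent(lines):
    """Return the names of all QMP commands sent, in trace order."""
    sent = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        direction, message = parse_monitor_line(line)
        if direction == ">":
            sent.append(message["execute"])
    return sent


sent = commands_sent(TRACE.splitlines())
uses_block_job_cancel = "block-job-cancel" in sent and "job-cancel" not in sent
print(sent, "PASS" if uses_block_job_cancel else "FAIL")
```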

Comment 6 Han Han 2021-05-19 06:23:04 UTC
Though I know it is correct to use block-job-cancel, what difference does this patch make for users?
Does it need to be backported to the z-stream?

Comment 7 Peter Krempa 2021-05-19 06:30:14 UTC
(In reply to Han Han from comment #6)
> Though I know it is correct to use block-job-cancel, what difference does
> this patch make for users?

The difference only matters for cases which would use:

 # virsh blockcopy ....

and then, once the job is in the synchronised phase,

 # virsh blockjob --abort

with the intention of keeping the destination image as a backup. In that case, qemu guarantees full consistency of the image only when 'block-job-cancel' is used, not when 'job-cancel' is used.

> Does it need to be backported to the z-stream?

No. I don't think this is a common use case.
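
For the backup use case described above, a management script might tell the two outcomes apart by the terminal block job event. The sketch below is an assumption-laden illustration, not libvirt code. It relies on the qemu documentation quoted in comment 1, which says a cancel issued after BLOCK_JOB_READY ends in BLOCK_JOB_COMPLETED, and on BLOCK_JOB_CANCELLED being the event qemu emits for a job cancelled otherwise. The function name is hypothetical.

```python
def destination_is_point_in_time_copy(event_names):
    """Classify the outcome of an aborted mirror job from its events.

    Returns True if the job ended with BLOCK_JOB_COMPLETED (destination is a
    point-in-time copy), False if it ended with BLOCK_JOB_CANCELLED, and None
    if neither terminal event has been seen yet.
    """
    for name in event_names:
        if name == "BLOCK_JOB_COMPLETED":
            return True
        if name == "BLOCK_JOB_CANCELLED":
            return False
    return None


# Event sequence as seen in the trace in comment 5.
events = ["JOB_STATUS_CHANGE", "JOB_STATUS_CHANGE", "BLOCK_JOB_COMPLETED"]
print(destination_is_point_in_time_copy(events))
```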

Comment 9 errata-xmlrpc 2021-11-16 07:52:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4684

