Bug 1119387

Summary: The default behavior of abort block job with pivot flag isn't sync
Product: Red Hat Enterprise Linux 7 Reporter: Eric Blake <eblake>
Component: libvirt Assignee: Eric Blake <eblake>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.0 CC: dyuan, eblake, mzhan, pkrempa, rbalakri, shyu, virt-bugs, xuhj, zhwang
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: libvirt-1.2.7-1.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1119385 Environment:
Last Closed: 2015-03-05 07:41:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1119173, 1119385    
Bug Blocks:    

Description Eric Blake 2014-07-14 16:27:06 UTC
Cloning to RHEL 7

+++ This bug was initially created as a clone of Bug #1119385 +++

Cloning to RHEL 6 - 'virsh blockcopy' is not obeying its docs.

+++ This bug was initially created as a clone of Bug #1119173 +++

Description of problem:
http://libvirt.org/html/libvirt-libvirt.html#virDomainBlockJobAbort

The API virDomainBlockJobAbort should be synchronous by default. But with the VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT flag, the API is asynchronous.


    if (disk->mirror && mode == BLOCK_JOB_ABORT &&
        (flags & VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)) {
        ret = qemuDomainBlockPivot(conn, driver, vm, device, disk);
        goto endjob;
    }

After calling qemuDomainBlockPivot, the code jumps to endjob. That skips the
code that simulates blocking.
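
The effect of that early jump can be illustrated with a small self-contained model. This is only a sketch under stated assumptions: `model_job`, `model_pivot`, `model_poll`, and `model_abort_buggy` are hypothetical names rather than libvirt functions, and the tick counter merely stands in for the time the hypervisor needs to finish the job.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a block job; not a libvirt type. */
struct model_job {
    bool active;      /* true until the job has really ended */
    int  ticks_left;  /* polls needed before the job ends */
};

/* Models a pivot request that returns before the job has ended. */
static int model_pivot(struct model_job *job)
{
    job->ticks_left = 3;   /* the job still has work to flush */
    return 0;
}

/* Models one polling step of the wait loop. */
static void model_poll(struct model_job *job)
{
    if (job->ticks_left > 0 && --job->ticks_left == 0)
        job->active = false;
}

/* Mirrors the reported flow: the pivot branch jumps straight to endjob. */
static int model_abort_buggy(struct model_job *job, bool pivot)
{
    int ret = 0;

    if (pivot) {
        ret = model_pivot(job);
        goto endjob;               /* skips the wait loop below */
    }

    job->ticks_left = 3;           /* plain abort request */
    while (job->active)
        model_poll(job);

 endjob:
    return ret;
}
```

In this model, calling `model_abort_buggy(&job, true)` returns success while `job.active` is still true, whereas the non-pivot path returns only after the job has ended.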

Version-Release number of selected component (if applicable):
Trunk code

How reproducible:


Steps to Reproduce:
1. virDomainBlockRebase with VIR_DOMAIN_BLOCK_REBASE_COPY flag
2. virDomainBlockJobAbort with VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT flag


Actual results:
virDomainBlockJobAbort is async

Expected results:
virDomainBlockJobAbort is sync

Additional info:

--- Additional comment from Peter Krempa on 2014-07-14 02:37:40 MDT ---

From what I understand, the "block-job-complete" qmp command called by qemuDomainBlockPivot() is inherently synchronous and thus we don't need to simulate the blocking of the API.

--- Additional comment from Xu He Jie on 2014-07-14 07:13:26 MDT ---

Judging from this test code, http://git.qemu.org/?p=qemu.git;a=blob_plain;f=tests/qemu-iotests/041;hb=HEAD, block-job-complete is async. Even if you are right that 'block-job-complete' is sync, that would also mean the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag doesn't work.

--- Additional comment from Eric Blake on 2014-07-14 09:59:54 MDT ---

Blech.  qemu's block-job-complete is inherently async - it merely requests the end of a job and returns control immediately; but the job may take a while longer to actually end because of time spent flushing sectors to disk.  Looks like I introduced the bug when adding blockcopy in commit eaba79d, and that I merely need to update where the goto jumps.
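
A rough model of "update where the goto jumps" might look like the following. It is a sketch, not the actual patch: `model_job`, `model_request_end`, and `model_abort_fixed` are hypothetical names, the `waitjob` label is invented for illustration, and waiting only when the request succeeded is a modeling choice.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a block job; not a libvirt type. */
struct model_job {
    bool active;      /* true until the job has really ended */
    int  ticks_left;  /* polls needed before the job ends */
};

/* Models an async request: records pending work and returns at once. */
static int model_request_end(struct model_job *job, int ticks)
{
    job->ticks_left = ticks;
    return 0;
}

/* The pivot branch now jumps to the wait loop instead of past it. */
static int model_abort_fixed(struct model_job *job, bool pivot)
{
    int ret;

    if (pivot) {
        ret = model_request_end(job, 3);   /* pivot: more data to flush */
        goto waitjob;                      /* previously skipped the loop */
    }

    ret = model_request_end(job, 1);       /* plain cancel */

 waitjob:
    /* Modeling choice: wait only when the request itself succeeded. */
    while (ret == 0 && job->active) {
        if (--job->ticks_left == 0)
            job->active = false;
    }
    return ret;
}
```

With the jump retargeted, both the pivot and the non-pivot paths in this model return only once `job.active` has become false.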

--- Additional comment from Eric Blake on 2014-07-14 10:10:03 MDT ---

We've explicitly documented that the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag can, but not must, return control before the job has actually ended.  Ignoring the async flag is not a problem. But when the flag is not present, and we have promised sync behavior, then libvirt does need to wait for the event.
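
That contract can be written down as a small predicate. This is a sketch only: the `MODEL_` flag names and `model_must_wait` are hypothetical, and the bit values are illustrative rather than copied from libvirt's headers.

```c
#include <stdbool.h>

/* Illustrative flag bits; the real definitions live in libvirt's headers. */
enum {
    MODEL_ABORT_ASYNC = 1 << 0,
    MODEL_ABORT_PIVOT = 1 << 1
};

/*
 * True when the caller was promised that the job has ended on return.
 * With the async flag set, returning early is allowed but not required,
 * so waiting anyway would not violate the contract.
 */
static bool model_must_wait(unsigned int flags)
{
    return !(flags & MODEL_ABORT_ASYNC);
}
```

Under this reading, the pivot flag has no influence on whether waiting is required; only the absence of the async flag does.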

--- Additional comment from Eric Blake on 2014-07-14 10:24:23 MDT ---

Upstream patch proposed:
https://www.redhat.com/archives/libvir-list/2014-July/msg00664.html

Comment 1 Eric Blake 2014-07-16 13:28:34 UTC
Will be fixed in 7.1 by virtue of rebase:

commit 97c59b9c46f915c48cd5db96ada40f060553bcae
Author: Eric Blake <eblake>
Date:   Mon Jul 14 10:13:18 2014 -0600

    blockjob: wait for pivot to complete
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1119173 documents that
    commit eaba79d was flawed in the implementation of the
    VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag when it comes to completing
    a blockcopy.  Basically, the qemu pivot action is async (the QMP
    command returns immediately, but the user must wait for the
    BLOCK_JOB_COMPLETE event to know that all I/O related to the job
    has finally been flushed), but the libvirt command was documented
    as synchronous by default.  As active block commit will also be
    using this code, it is worth fixing now.
    
    * src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Don't skip wait
    loop after pivot.
    
    Signed-off-by: Eric Blake <eblake>

Comment 3 Shanzhi Yu 2014-11-26 10:12:28 UTC
Verified this bug with 1.2.8-9.el7.
The steps are the same as in comment 5 and comment 6 of bug 1119385:
https://bugzilla.redhat.com/show_bug.cgi?id=1119385#c5
https://bugzilla.redhat.com/show_bug.cgi?id=1119385#c6

Comment 5 errata-xmlrpc 2015-03-05 07:41:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0323.html