Bug 1119173 - The default behavior of abort block job with pivot flag isn't sync
Summary: The default behavior of abort block job with pivot flag isn't sync
Alias: None
Product: Virtualization Tools
Classification: Community
Component: libvirt
Version: unspecified
Hardware: x86_64
OS: Linux
Target Milestone: ---
Assignee: Eric Blake
QA Contact:
Depends On:
Blocks: 1119385 1119387
Reported: 2014-07-14 08:22 UTC by Xu He Jie
Modified: 2014-07-16 13:28 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1119385 (view as bug list)
Last Closed: 2014-07-16 13:28:03 UTC


Description Xu He Jie 2014-07-14 08:22:39 UTC
Description of problem:

The API virDomainBlockJobAbort should be synchronous by default, but with the VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT flag the API is asynchronous.

    if (disk->mirror && mode == BLOCK_JOB_ABORT &&
        (flags & VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)) {
        ret = qemuDomainBlockPivot(conn, driver, vm, device, disk);
        goto endjob;
    }

After calling qemuDomainBlockPivot, the code jumps to endjob, which skips
the code that simulates blocking until the job has ended.
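
To make the control-flow problem concrete, here is a minimal, self-contained sketch. It is not the libvirt source: struct fake_job, request_pivot, request_cancel, job_still_running and the SK_* flag values are all hypothetical stand-ins. The "buggy" variant mirrors the goto described above, and the "fixed" variant shows one possible way to let the pivot path reach the wait loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this sketch only (not libvirt's enum). */
enum { SK_ABORT_ASYNC = 1 << 0, SK_ABORT_PIVOT = 1 << 1 };

/* A fake block job that keeps running for a fixed number of polls. */
struct fake_job {
    int polls_until_done;  /* polls remaining before the job has ended */
    int polls_made;        /* how often the caller checked the job */
};

/* Both requests only ask for the job to end; they return immediately. */
static int request_pivot(struct fake_job *job)  { assert(job); return 0; }
static int request_cancel(struct fake_job *job) { assert(job); return 0; }

/* Returns true while the job has not ended yet. */
static bool job_still_running(struct fake_job *job)
{
    assert(job);
    job->polls_made++;
    if (job->polls_until_done > 0) {
        job->polls_until_done--;
        return true;
    }
    return false;
}

/* Buggy shape: the pivot branch jumps straight past the wait loop. */
static int abort_job_buggy(struct fake_job *job, unsigned int flags)
{
    int ret;

    if (flags & SK_ABORT_PIVOT) {
        ret = request_pivot(job);
        goto endjob;                 /* never waits, even without ASYNC */
    }
    ret = request_cancel(job);
    if (ret == 0 && !(flags & SK_ABORT_ASYNC)) {
        while (job_still_running(job))
            ;                        /* simulate blocking */
    }
 endjob:
    return ret;
}

/* Possible fix: both branches share the wait loop. */
static int abort_job_fixed(struct fake_job *job, unsigned int flags)
{
    int ret;

    if (flags & SK_ABORT_PIVOT)
        ret = request_pivot(job);
    else
        ret = request_cancel(job);
    if (ret < 0)
        goto endjob;
    if (!(flags & SK_ABORT_ASYNC)) {
        while (job_still_running(job))
            ;                        /* simulate blocking */
    }
 endjob:
    return ret;
}
```

With a job that needs three more polls, abort_job_buggy(&job, SK_ABORT_PIVOT) returns without polling at all, while abort_job_fixed polls until the job reports that it has ended.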

Version-Release number of selected component (if applicable):
Trunk code

How reproducible:

Steps to Reproduce:
1. virDomainBlockRebase with VIR_DOMAIN_BLOCK_REBASE_COPY flag
2. virDomainBlockJobAbort with VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT flag

Actual results:
virDomainBlockJobAbort is async

Expected results:
virDomainBlockJobAbort is sync

Additional info:

Comment 1 Peter Krempa 2014-07-14 08:37:40 UTC
From what I understand, the "block-job-complete" qmp command called by qemuDomainBlockPivot() is inherently synchronous and thus we don't need to simulate the blocking of the API.

Comment 2 Xu He Jie 2014-07-14 13:13:26 UTC
Judging from this test code, http://git.qemu.org/?p=qemu.git;a=blob_plain;f=tests/qemu-iotests/041;hb=HEAD, block-job-complete is async. Even if you are right that 'block-job-complete' is sync, that would also mean the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag doesn't work.

Comment 3 Eric Blake 2014-07-14 15:59:54 UTC
Blech.  qemu's block-job-complete is inherently async - it merely requests the end of a job and returns control immediately; but the job may take a while longer to actually end because of time spent flushing sectors to disk.  Looks like I introduced the bug when adding blockcopy in commit eaba79d, and that I merely need to update where the goto jumps.

Comment 4 Eric Blake 2014-07-14 16:10:03 UTC
We've explicitly documented that the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag may, but need not, return control before the job has actually ended.  Ignoring the async flag is not a problem. But when the flag is not present, and we have promised sync behavior, then libvirt does need to wait for the event.
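
The contract described here can be written down as a small truth table. The sketch below is only an illustration of that reasoning; SKETCH_ABORT_ASYNC and return_is_valid are hypothetical names, and the flag value is a placeholder rather than the real VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC constant.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder flag for this sketch only. */
enum { SKETCH_ABORT_ASYNC = 1 << 0 };

/*
 * Returns true if returning to the caller is allowed under the
 * documented contract, given the flags that were passed and whether
 * the job has actually ended at the time of return.
 */
static bool return_is_valid(unsigned int flags, bool job_ended)
{
    if (flags & SKETCH_ABORT_ASYNC)
        return true;      /* async: early return is allowed, not required */
    return job_ended;     /* sync (default): must have waited for the end */
}
```

Only one of the four combinations violates the contract: no async flag and a job that is still running on return. That is the case the pivot path was hitting.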

Comment 5 Eric Blake 2014-07-14 16:24:23 UTC
Upstream patch proposed:

Comment 6 Eric Blake 2014-07-16 13:28:03 UTC
Fixed for 1.2.7:

commit 97c59b9c46f915c48cd5db96ada40f060553bcae
Author: Eric Blake <eblake@redhat.com>
Date:   Mon Jul 14 10:13:18 2014 -0600

    blockjob: wait for pivot to complete

    https://bugzilla.redhat.com/show_bug.cgi?id=1119173 documents that
    commit eaba79d was flawed in the implementation of the
    VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag when it comes to completing
    a blockcopy.  Basically, the qemu pivot action is async (the QMP
    command returns immediately, but the user must wait for the
    BLOCK_JOB_COMPLETE event to know that all I/O related to the job
    has finally been flushed), but the libvirt command was documented
    as synchronous by default.  As active block commit will also be
    using this code, it is worth fixing now.

    * src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Don't skip wait
    loop after pivot.

    Signed-off-by: Eric Blake <eblake@redhat.com>
