Bug 1443493 - Improve live block device job status reporting via virDomainBlockJobInfo()
Summary: Improve live block device job status reporting via virDomainBlockJobInfo()
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Michal Privoznik
QA Contact: Han Han
URL:
Whiteboard:
Depends On: 1372613
Blocks: 1442266
Reported: 2017-04-19 11:02 UTC by Nir Soffer
Modified: 2017-08-02 01:30 UTC (History)

Fixed In Version: libvirt-3.2.0-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1372613
Environment:
Last Closed: 2017-08-02 00:05:54 UTC
Target Upstream Version:
Embargoed:


Attachments
Testing program src code (1.31 KB, text/x-csrc)
2017-06-16 03:14 UTC, Han Han


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2017:1846 0 normal SHIPPED_LIVE libvirt bug fix and enhancement update 2017-08-01 18:02:50 UTC

Description Nir Soffer 2017-04-19 11:02:10 UTC
+++ This bug was initially created as a clone of Bug #1372613 +++

In OpenStack Nova, we're trying to analyze a potential race
condition[0].  The operation flow is something like: perform a live
shallow blockRebase(), check for progress with blockJobInfo(), followed
by a blockJobAbort() (QMP 'block-job-cancel') to get a live
point-in-time snapshot, then convert the resulting shallow copy with a
backing file into a flattened image with `qemu-img convert`, and Nova
uploads that into the image repository (Glance).  Nova uses some
wrappers around the above APIs, and calls them this way, in its
_live_snapshot() method[1]:

            [...]
            dev.rebase(disk_delta, copy=True, reuse_ext=True, shallow=True)

            while dev.wait_for_job():
                time.sleep(0.5)

            dev.abort_job()
            [...]

Looking at a failed blockRebase() operation from libvirt debug log, the
root cause turns out to be libvirt issuing QMP command
'block-job-cancel' (blockJobAbort()) when the block device job status
is "ready": false, which results in a corrupted destination file.  (In
a successful case, QMP command 'block-job-cancel' should be issued when
the job status is: "ready": true).  

My libvirtd log analysis in the Nova bug is here[2] from a failed case,
which has QMP traffic back-n-forth.

    - - -

So, I'm trying to understand how libvirt reports the "cur" and "end"
values.  I've read the comment in the virDomainBlockJobInfo struct, but it
wasn't crystal clear.  It states:

/*
 * The following fields provide an indication of block job progress.  @cur
 * indicates the current position and will be between 0 and @end.  @end is
 * the final cursor position for this operation and represents completion.
 * To approximate progress, divide @cur by @end.
 */
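
A minimal sketch of the division that comment suggests, written in Python for brevity. The helper name `progress_fraction` is hypothetical and not part of libvirt; returning `None` when `end` is 0 is an assumption made here to show that the ratio alone cannot tell whether such a job is finished.

```python
from typing import Optional


def progress_fraction(cur: int, end: int) -> Optional[float]:
    """Approximate block job progress as cur / end.

    Hypothetical helper, not a libvirt API. Returns None when end is 0,
    because 0/0 carries no information about progress.
    """
    if end == 0:
        return None
    return cur / end


print(progress_fraction(0, 786432))       # 0.0
print(progress_fraction(786432, 786432))  # 1.0
print(progress_fraction(0, 0))            # None
```

The `None` case is exactly the ambiguity discussed in this report: a caller that only compares `cur` with `end` sees 0 == 0 and may conclude that the job is complete.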

Now, if a job hasn't started, what would libvirt report?  Talking to
Michal Privoznik on IRC:

[mprivozn]

    I wonder if libvirt should report something else than start=0 end=0
    in order to tell the caller that job hasn't been started yet.  I
    suspect that libvirt does not report correctly whether job has
    started already, which in turns forces Nova to use some workarounds.

[kashyap]

    So, if the job hasn't started yet, what should libvirt report? 

[mprivozn]

    That's the question. We can't change the [virDomainBlockJobInfo]
    struct (otherwise we won't be ABI compatible), so we can't really
    add a boolean there 'bool job_started'.

    Or we can introduce new "job type" which wouldn't really be a job
    type, but we will fill status.job with it to say explicitly job
    hasn't started yet


    - - -

I realize that a copy job has two phases, as clearly stated in the API
documentation of virDomainBlockRebase()[3]:

  "[...] The job transitions to the second phase when the job info
  states cur == end, and remains alive to mirror all further changes to
  both source and destination.  The user must call
  virDomainBlockJobAbort() to end the mirroring while choosing whether
  to revert to source or pivot to the destination. [...]"

--- Additional comment from Kashyap Chamarthy on 2016-09-02 04:29:21 EDT ---

Eric Blake's response from the upstream libvirt mailing list thread, when I asked for any thoughts on how to improve things here:

http://www.redhat.com/archives/libvir-list/2016-September/msg00017.html

Re-posting it here:

On 09/01/2016 08:57 AM, Kashyap Chamarthy wrote:
> So, I'm trying to understand how libvirt reports the "cur" and "end"
> values.  I've read the virDomainBlockJobInfo() struct, it wasn't crystal
> clear.  It states:
> 
> /*
>  * The following fields provide an indication of block job progress.  @cur
>  * indicates the current position and will be between 0 and @end.  @end is
>  * the final cursor position for this operation and represents completion.
>  * To approximate progress, divide @cur by @end.
>  */

Libvirt is (more or less) reporting numbers from qemu.  What's more,
@end need not be the same between calls; qemu is free to change the end
value as it comes up with more work to do (or sees that less work
remains than initially estimated), all that REALLY matters is that the
ratio between the two numbers converges, and is < 1 while busy, and == 1
when complete.  Except that there are cases in qemu where a block job
really has 0 work to do.

Prior to qemu exposing the "ready":true/false flag, libvirt had to guess
whether equal numbers really meant done, or if it merely meant nearly
done.  But now that we have "ready", I think the sanest course of action
for libvirt is to fudge the numbers from qemu.  After all, we've already
documented (both in libvirt and in qemu) that @end is not fixed, so much
as a moving target (it just doesn't move very much on operations where
we have a good grasp on how much work remains from the start, like a
deep copy; but is more prone to move on operations like live commit that
are influenced by how active the guest is at writing data that we are
attempting to commit at the same time).

> 
> Now, if a job hasn't started, what would libvirt report?  Talking to
> Michal Privoznik on IRC:
> 
> [mprivozn]
> 
>     I wonder if libvirt should report something else than start=0 end=0
>     in order to tell the caller that job hasn't been started yet.  I
>     suspect that libvirt does not report correctly whether job has
>     started already, which in turns forces Nova to use some workarounds.
> 
> [kashyap]
> 
>     So, if the job hasn't started yet, what should libvirt report? 
> 
> [mprivozn]
> 
>     That's the question. We can't change the [virDomainBlockJobInfo]
>     struct (otherwise we won't be ABI compatible), so we can't really
>     add a boolean there 'bool job_started'.
> 
>     Or we can introduce new "job type" which wouldn't really be a job
>     type, but we will fill status.job with it to say explicitly job
>     hasn't started yet
> 
> 
> So, any other thoughts here, on how to proceed here?

My preference would be:

If qemu doesn't report anything (because the job is not started yet),
then libvirt should report cur=0, end=1 (the job still has 100% to go).

If qemu reports 0/0 and "done":false, then libvirt should report cur=0,
end=1 (that is, we fudge the end to be larger, because the job is not
done yet).

If qemu reports 0/0 and "done":true (because the job was really a
no-op), then libvirt should report cur=1, end=1 (the job is 100% complete).

If qemu reports 0/0 and lacks "done" (older qemu), then libvirt just has
to guess.  I'm not sure which guess is most appropriate; maybe libvirt
itself will have to set up a timer and report 0/1 the first time, and
only report 1/1 after a minimum time has elapsed, to make sure qemu has
had a chance to do something about the job.  Or maybe we don't worry
about it, and just have libvirt report 0/0 because we really don't know
any better.

If qemu reports a/b, where a < b and b > 0, use those numbers as is.  We
don't even have to check "done".

If qemu reports a/a, where a > 0, then also check "done". If
"done":false is present, report a-1/a (the job is not quite done); if
"done" is absent or "done":true is present, report a/a (the job is done).


> I realize that a copy job has two phases, as clearly stated in the API
> documentation of virDomainBlockRebase()[3]:
> 
>   "[...] The job transitions to the second phase when the job info
>   states cur == end, and remains alive to mirror all further changes to
>   both source and destination.  The user must call
>   virDomainBlockJobAbort() to end the mirroring while choosing whether
>   to revert to source or pivot to the destination. [...]"

We may also want to enhance the libvirt documentation to mention cur ==
non-zero end as the phase change point, mentioning that 0/0 is a special
case (hopefully reserved for older qemu).
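
The preference list above can be read as a small decision table. The following is a sketch of that table in Python, not the code from the libvirt patches. The function name `fudge_progress` and its argument layout are hypothetical: `raw` is the `(cur, end)` pair as reported by qemu, or `None` when qemu reports nothing, and `done` is the flag as named in the list above, with `None` meaning the flag is absent (older qemu). For the undecided "older qemu" case the sketch picks the simpler option of passing 0/0 through.

```python
from typing import Optional, Tuple


def fudge_progress(raw: Optional[Tuple[int, int]],
                   done: Optional[bool] = None) -> Tuple[int, int]:
    """Map qemu's report to the (cur, end) pair shown to the caller.

    Hypothetical illustration of the rules in the mail above.
    """
    if raw is None:
        # Nothing reported: the job has not started, 100% still to go.
        return (0, 1)

    cur, end = raw
    if cur != end:
        # a/b with a < b: use the numbers as they are.
        return (cur, end)

    if end == 0:
        if done is True:
            return (1, 1)   # the job really was a no-op
        if done is False:
            return (0, 1)   # not done yet, so enlarge the end
        return (0, 0)       # flag absent: we do not know any better

    # a/a with a > 0: only an explicit done=False means "not quite done".
    if done is False:
        return (end - 1, end)
    return (cur, end)


print(fudge_progress(None))                 # (0, 1)
print(fudge_progress((0, 0), done=False))   # (0, 1)
print(fudge_progress((0, 0), done=True))    # (1, 1)
print(fudge_progress((786432, 786432), done=False))  # (786431, 786432)
```

With this mapping, a caller that waits for `cur == end` cannot be misled by a job that has not started, as long as the flag is available.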

--- Additional comment from Michal Privoznik on 2016-09-14 06:47:41 EDT ---

I've just pushed the patches upstream:

commit 988218ca3f0de571d97d02365db9ffa0845c18da
Author:     Michal Privoznik <mprivozn>
AuthorDate: Fri Sep 2 09:45:44 2016 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Wed Sep 14 12:44:42 2016 +0200

    virDomainGetBlockJobInfo: Fix corner case when qemu reports no info
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1372613
    
    Apparently, some management applications use the following code
    pattern when waiting for a block job to finish:
    
      while (1) {
        virDomainGetBlockJobInfo(dom, disk, info, flags);
    
        if (info.cur == info.end)
            break;
    
        sleep(1);
      }
    
    Problem with this approach is in its corner cases. In case of
    QEMU, libvirt merely passes what has been reported on the monitor.
    However, if the block job hasn't started yet, qemu reports cur ==
    end == 0 which tricks mgmt apps into thinking job is complete.
    
    The solution is to mangle cur/end values as described here [1].
    
    1: https://www.redhat.com/archives/libvir-list/2016-September/msg00017.html
    
    Signed-off-by: Michal Privoznik <mprivozn>

commit 5d213b34de442490c66ee54b17d15e68cfeb2174
Author:     Michal Privoznik <mprivozn>
AuthorDate: Fri Sep 2 08:38:19 2016 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Wed Sep 14 12:44:42 2016 +0200

    qemuDomainGetBlockJobInfo: Move info translation into separate func
    
    Even though we merely just pass to users whatever qemu provided
    on the monitor, we still do some translation. For instance we
    turn bytes into mebibytes, or fix job type if needed. However, in
    the future there is more fixing to be done so this code deserves
    its own function.
    
    Signed-off-by: Michal Privoznik <mprivozn>
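
For a management application that may still run against an unfixed libvirt, the polling pattern from the first commit message can be made more defensive. The sketch below is one possible approach, not what Nova or RHV actually do. `wait_for_copy_phase` and `get_info` are hypothetical names; `get_info` stands in for whatever call returns the current `(cur, end)` pair, for example a wrapper around virDomainGetBlockJobInfo().

```python
import time
from typing import Callable, Tuple


def wait_for_copy_phase(get_info: Callable[[], Tuple[int, int]],
                        interval: float = 1.0,
                        max_polls: int = 600) -> bool:
    """Poll until cur == end with a non-zero end.

    Returns True once that condition is met and False if max_polls
    is exhausted first. Hypothetical helper for illustration only.
    """
    for _ in range(max_polls):
        cur, end = get_info()
        # 0/0 is treated as "no progress information yet", not as done.
        if end != 0 and cur == end:
            return True
        time.sleep(interval)
    return False


# Simulated reports: job not started, job running, job in second phase.
reports = iter([(0, 0), (0, 786432), (786432, 786432)])
print(wait_for_copy_phase(lambda: next(reports), interval=0))  # True
```

The trade-off is that a job which genuinely has no work and stays at 0/0 only ends when `max_polls` runs out, which is why reporting sane numbers from libvirt itself is the cleaner fix.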

--- Additional comment from Nir Soffer on 2017-04-19 03:08:26 EDT ---

Michal, is this fix available in 7.3?

We need this for bug 1442266.

--- Additional comment from Peter Krempa on 2017-04-19 03:56:35 EDT ---

No. Also this is the upstream bugzilla.

Comment 2 Nir Soffer 2017-04-19 11:10:30 UTC
This will allow RHV to correctly handle blockJobInfo returning cur=0 and end=0.
Currently RHV assumes that the block job has completed and invokes blockJobAbort
too early.

I discussed this bug with Michal on irc, and he thinks we can backport the fix
to 7.3.

Comment 4 Han Han 2017-06-15 10:47:14 UTC
Hi Michal, I tried to reproduce this bug on libvirt-2.0.0-10.el7_3.9.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.10.x86_64, but failed.
My code, blockjob.c:
...
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>             /* sleep() */
#include <libvirt/libvirt.h>

int main(int argc, char *argv[])
{
    virConnectPtr conn;
    virDomainPtr dom;
    virDomainBlockJobInfo info;
    const char *domName, *disk;

    if (argc < 4) {
        fprintf(stderr, "Usage: %s DOM_NAME TARGET_DISK REUSE_DISK\n", argv[0]);
        return 1;
    }
    domName = argv[1];
    disk = argv[2];

    conn = virConnectOpen("qemu:///system");
    if (conn == NULL) {
        fprintf(stderr, "Failed to open connection to qemu:///system\n");
        return 1;
    }
    dom = virDomainLookupByName(conn, domName);
    if (dom == NULL) {
        fprintf(stderr, "Failed to find domain %s\n", domName);
        virConnectClose(conn);
        return 1;
    }
    /* Start a shallow copy job into the pre-created image argv[3]. */
    virDomainBlockRebase(dom, disk, argv[3], 0,
                         VIR_DOMAIN_BLOCK_REBASE_COPY |
                         VIR_DOMAIN_BLOCK_REBASE_SHALLOW |
                         VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT);
    /* Poll until cur == end; this is the pattern under test. */
    while (1) {
        virDomainGetBlockJobInfo(dom, disk, &info, 0);
        printf("blockjob info: bw %lu, cur %llu, end %llu\n", info.bandwidth,
                info.cur, info.end);
        if (info.cur == info.end)
            break;
        sleep(1);
    }
    virDomainBlockJobAbort(dom, disk, 0);
    printf("The end\n");
    virDomainFree(dom);
    virConnectClose(conn);
    return 0;
}
...

Compile the code:
# gcc blockjob.c -o blockjob `pkg-config libvirt --libs` -g

Start the VM:
# virsh list 
 Id    Name                           State
----------------------------------------------------
 7     V                              running

# virsh domblklist V
Target     Source
------------------------------------------------
vda        /exports/nfs.s1
vdb        /exports/vdb

# qemu-img info /exports/nfs.s1
image: /exports/nfs.s1
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 1.1M
cluster_size: 65536
backing file: /exports/nfs.1497518937
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

# qemu-img create -f qcow2 /exports/nfs.s3 10G                                                                                                 
Formatting '/exports/nfs.s3', fmt=qcow2 size=10737418240 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16

Then run the code:
# ./blockjob V vda /exports/nfs.s3
blockjob info: bw 0, cur 0, end 786432
blockjob info: bw 0, cur 786432, end 786432

Check /exports/nfs.s3:
# qemu-img info /exports/nfs.s3
image: /exports/nfs.s3
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 1.1M
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

I didn't see 'cur == end == 0', and neither BlockRebase nor BlockJobAbort failed.
Could you give me some ideas about how to hit the corner case?

Comment 5 Michal Privoznik 2017-06-15 11:48:47 UTC
(In reply to Han Han from comment #4)

> I didn't get any event like 'cur == end == 0' or BlockRebase failed or
> BlockJobAbort failed.
> Could you give some ideas about hitting the corner case?

While your program is running you need to abort the job from a different terminal. The JobAbort() call you have in your code is called only after the job has finished.

Comment 6 Han Han 2017-06-16 03:11:54 UTC
Reproduced on libvirt-2.0.0-10.el7_3.9.x86_64 qemu-kvm-rhev-2.6.0-28.el7_3.10.x86_64.

To hit the corner case, call virDomainGetBlockJobInfo() and virDomainBlockJobAbort() in parallel and check whether cur==end==0.

1. Compile the test code
# gcc blockjob.c -o blockjob `pkg-config libvirt --libs` -g

2. Prepare a running VM with a large disk, which makes the corner case easier to hit.
# virsh list 
 Id    Name                           State
----------------------------------------------------
 12    V                              running
# virsh domblklist V
Target     Source
------------------------------------------------
vda        /exports/nfs.qcow2
# qemu-img info /exports/nfs.qcow2
image: /exports/nfs.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 8.0G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

3. Create an image with the same virtual size as the guest image, to be reused as the copy destination.
# qemu-img create -f qcow2 /exports/nfs.s3 10G
Formatting '/exports/nfs.s3', fmt=qcow2 size=10737418240 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16

4. Run the test program in this format:
./blockjob DOM_NAME TARGET_DISK REUSE_DISK

# ./blockjob V vda /exports/nfs.s3            
blockjob info: bw 0, cur 0, end 0
Abort start
Abort finished

The rebase job has started but is not ready, yet cur==end==0 is reported. The corner case is hit.

  
Verified on libvirt-3.2.0-10.el7.x86_64 qemu-kvm-rhev-2.9.0-10.el7.x86_64:
Repeat steps 1-4, then run the test program:
# ./blockjob test vda /tmp/aaa
blockjob info: bw 0, cur 0, end 1
blockjob info: bw 0, cur 0, end 6115360768
Abort start
blockjob info: bw 0, cur 6356992, end 6115360768
Abort finished
blockjob info: bw 0, cur 0, end 0

As the patch says, when the rebase job has started but is not ready, virDomainGetBlockJobInfo() should report cur==0, end==1.
Expected result. Bug fixed.
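
The two transcripts above can also be compared mechanically. The sketch below parses the "blockjob info" lines printed by the test program and reports whether cur==end==0 was seen before the abort began. The helper names are hypothetical, and ignoring every line after "Abort start" is an assumption of this sketch (the job no longer exists once the abort has gone through, so a later 0/0 is not treated as the corner case).

```python
import re

LINE_RE = re.compile(r"blockjob info: bw (\d+), cur (\d+), end (\d+)")


def polls_before_abort(output):
    """Return the (cur, end) pairs printed before the 'Abort start' line."""
    pairs = []
    for line in output.splitlines():
        line = line.strip()
        if line == "Abort start":
            break
        match = LINE_RE.match(line)
        if match:
            pairs.append((int(match.group(2)), int(match.group(3))))
    return pairs


def hits_corner_case(output):
    """True if any poll before the abort reported cur == end == 0."""
    return any(cur == 0 and end == 0
               for cur, end in polls_before_abort(output))


OLD_OUTPUT = """blockjob info: bw 0, cur 0, end 0
Abort start
Abort finished
"""

NEW_OUTPUT = """blockjob info: bw 0, cur 0, end 1
blockjob info: bw 0, cur 0, end 6115360768
Abort start
blockjob info: bw 0, cur 6356992, end 6115360768
Abort finished
blockjob info: bw 0, cur 0, end 0
"""

print(hits_corner_case(OLD_OUTPUT))  # True
print(hits_corner_case(NEW_OUTPUT))  # False
```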

Comment 7 Han Han 2017-06-16 03:14:45 UTC
Created attachment 1288237 [details]
Testing program src code

Comment 8 errata-xmlrpc 2017-08-02 00:05:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846

Comment 9 errata-xmlrpc 2017-08-02 01:30:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846

