Bug 1461303 - RHEL7.4:Live merge fails: libvirtError: Requested operation is not valid: can't keep relative backing relationship
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.4
Hardware: x86_64
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 7.4
Assignee: Peter Krempa
QA Contact: Han Han
URL:
Whiteboard:
Depends On:
Blocks: 1464002
Reported: 2017-06-14 07:29 UTC by Elad
Modified: 2019-04-28 13:31 UTC (History)

Fixed In Version: libvirt-3.2.0-12.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1464002 (view as bug list)
Environment:
Last Closed: 2017-08-02 01:34:35 UTC
Target Upstream Version:
Embargoed:


Attachments
logs from engine and hosts (without libvirtd.log) (8.93 MB, application/x-gzip)
2017-06-14 08:22 UTC, Elad
VM xml as reported by libvirt (6.88 KB, text/plain)
2017-06-15 09:46 UTC, Ala Hino
Scripts in comment28 (1.64 KB, application/x-gzip)
2017-06-22 06:23 UTC, Han Han


Links
System: Red Hat Product Errata
ID: RHEA-2017:1846
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: libvirt bug fix and enhancement update
Last Updated: 2017-08-01 18:02:50 UTC

Description Elad 2017-06-14 07:29:50 UTC
Description of problem:
Live merge fails with the following libvirt error in vdsm:

2017-06-11 05:52:07,880+0300 ERROR (jsonrpc/5) [virt.vm] (vmId='3a2f7d53-32c6-4739-a707-8c0355c5b13e') Live merge failed (job: 044df6b6-4d24-4b4d-aae3-ccc4dc1fe8ce) (vm:4878)
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 4874, in merge
    flags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 69, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 941, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 678, in blockCommit
    if ret == -1: raise libvirtError ('virDomainBlockCommit() failed', dom=self)
libvirtError: Requested operation is not valid: can't keep relative backing relationship
2017-06-11 05:52:07,893+0300 DEBUG (jsonrpc/2) [storage.Misc.excCmd] SUCCESS: <err> = ''; <rc> = 0 (commands:93)



Comment 1 Elad 2017-06-14 07:58:43 UTC
Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux Server 7.4 Beta (Maipo)
vdsm-4.19.18-1.el7ev.x86_64
libvirt-daemon-3.2.0-4.el7.x86_64
qemu-kvm-rhev-2.9.0-9.el7.x86_64
sanlock-3.5.0-1.el7.x86_64
selinux-policy-3.13.1-145.el7.noarch
ovirt-engine-4.1.3.2-0.1.el7.noarch

How reproducible:
Found in RHV automation during the first attempt to execute the live merge test plan [1] on RHEL 7.4. One of the 5 executed cases failed with this error.

[1]
https://polarion.engineering.redhat.com/polarion/#/project/RHEVM3/workitem?id=RHEVM3-6037

Steps to Reproduce:
From https://polarion.engineering.redhat.com/polarion/#/project/RHEVM3/workitem?id=RHEVM3-6052
1. Create a VM with 4 disks and an OS installed
2. Write file 1 to all disks
3. Create snapshot 1 of the VM with all disks
4. Write file 2 to all disks
5. Create snapshot 2 of the VM with all disks
6. Write file 3 to all disks
7. Create snapshot 3 of the VM with all disks
8. Start writing to one of the disks using dd and delete snapshot 2

Actual results:
Live snapshot merge failed with the mentioned exception

Expected results:
Live merge should succeed

Additional info:
Logs from hosts and engine

Comment 2 Allon Mureinik 2017-06-14 08:21:58 UTC
(In reply to Elad from comment #1)
> Additional info:
> Logs from hosts and engine
I think you forgot them :-(

Comment 3 Elad 2017-06-14 08:22:28 UTC
Created attachment 1287558 [details]
logs from engine and hosts (without libvirtd.log)

libvirtd.log:
http://file.tlv.redhat.com/ebenahar/libvirtd.log.tar.gz

Comment 4 Elad 2017-06-14 08:24:06 UTC
Provided

Comment 5 Allon Mureinik 2017-06-14 08:28:15 UTC
Targeting to 4.1.3 and treating as a blocker unless we can show this isn't a regression.
RHEL 7.4 is quite close, and we need to future-proof 4.1.3 against it.

Comment 6 Ala Hino 2017-06-14 20:16:36 UTC
My analysis below:

There is no need for a VM with 4 disks, nor to install an OS or copy data. All my tests were done on a VM with one disk and no OS installed.

1. When creating a single snapshot and then deleting (live merging) it, the merge succeeds

2. When creating two snapshots and then deleting any one, I am getting the following error:

2017-06-14 22:58:16,542+0300 ERROR (jsonrpc/7) [jsonrpc.JsonRpcServer] Internal server error (__init__:577)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 572, in _handle_request
    res = method(**params)
  File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 202, in _dynamicMethod
    result = fn(*methodArgs)
  File "/usr/share/vdsm/API.py", line 838, in merge
    return v.merge(drive, baseVolUUID, topVolUUID, bandwidth, jobUUID)
  File "<string>", line 2, in merge
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/share/vdsm/virt/vm.py", line 4891, in merge
    capacity, alloc, physical = self._getExtendInfo(drive)
  File "/usr/share/vdsm/virt/vm.py", line 834, in _getExtendInfo
    capacity, alloc, physical = self._dom.blockInfo(drive.path, 0)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 69, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 941, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 694, in blockInfo
    if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
libvirtError: internal error: failed to query the maximum written offset of block device 'sda'

3. When creating three snapshots and then deleting the first or the second, I am getting the following *different* exception:

2017-06-14 22:41:54,074+0300 ERROR (jsonrpc/7) [virt.vm] (vmId='eb457d4d-3b9c-4a0f-b050-6ae5d5903910') Live merge failed (job: 23c7ac0f-a85e-4bea-85c8-3ba722bbb93e) (vm:4878)
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 4874, in merge
    flags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 69, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 941, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 678, in blockCommit
    if ret == -1: raise libvirtError ('virDomainBlockCommit() failed', dom=self)
libvirtError: Requested operation is not valid: can't keep relative backing relationship

However, when deleting the third one (the base of the active layer), I get the same error as in scenario #2 above.

I shared this info with the libvirt developers (pkrempa and eblake) and am waiting for their advice.
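
The two failure modes above can be told apart mechanically in a vdsm log. A minimal sketch, assuming the log is available as text and that each failure ends with a "libvirtError:" line as in the tracebacks above; the function name count_libvirt_errors is hypothetical and is not part of vdsm:

```python
import re
from collections import Counter

# Matches the final line of the tracebacks quoted above (no leading whitespace).
_ERROR_RE = re.compile(r"^libvirtError: (?P<message>.+)$", re.MULTILINE)


def count_libvirt_errors(log_text):
    """Count each distinct libvirtError message found in log_text."""
    return Counter(m.group("message").strip() for m in _ERROR_RE.finditer(log_text))


# Excerpt assembled from the two tracebacks in this comment.
sample = """\
    if ret == -1: raise libvirtError ('virDomainBlockCommit() failed', dom=self)
libvirtError: Requested operation is not valid: can't keep relative backing relationship
    if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
libvirtError: internal error: failed to query the maximum written offset of block device 'sda'
"""

for message, count in count_libvirt_errors(sample).items():
    print(count, message)
```

Grouping by message makes it easier to see which snapshot scenario produced which error when several merges are attempted in one run.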

Comment 8 Ala Hino 2017-06-15 09:46:00 UTC
Created attachment 1287971 [details]
VM xml as reported by libvirt

Comment 13 Ala Hino 2017-06-19 09:18:00 UTC
Moving to Red Hat Enterprise Linux 7 and assigning bug to Peter.

Comment 16 Peter Krempa 2017-06-19 11:12:37 UTC
I've figured it out. This was caused by commit:

commit 7456c4f5f064f692e5f89a9ee3ef0fb54041e23b
Author: Peter Krempa <pkrempa>
Date:   Fri Dec 16 07:10:46 2016 +0100

    qemu: snapshot: Don't redetect backing chain after snapshot
    
    Libvirt is able to properly model what happens to the backing chain
    after a snapshot so there's no real need to redetect the data.
    Additionally with the _REUSE_EXT flag this might end up in redetecting
    wrong data if the user puts wrong backing chain reference into the
    snapshot image.

Since that commit libvirt does not redetect the backing chain. This is not wrong in itself, but libvirt then does not load the data necessary to keep the relative relationship when the _REUSE_EXT flag is used.
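
A toy model of the condition described above, not libvirt's actual code: assume each overlay image carries the relative backing path string read from its header (called rel_path here; the class and function names are hypothetical). Under that assumption, a relative commit can only be kept if the string was loaded for every overlay involved.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Image:
    path: str
    # Relative backing path as stored in the image header, e.g. "b".
    # None models "this string was never loaded".
    rel_path: Optional[str] = None


def can_keep_relative(overlays: List[Image]) -> bool:
    """True only if every overlay has its relative backing path loaded."""
    return all(image.rel_path is not None for image in overlays)


# Overlays added by snapshots, with the backing path string not loaded:
not_loaded = [
    Image("/var/lib/libvirt/images/d"),
    Image("/var/lib/libvirt/images/c"),
]
# The same overlays with the string loaded:
loaded = [
    Image("/var/lib/libvirt/images/d", rel_path="c"),
    Image("/var/lib/libvirt/images/c", rel_path="b"),
]

print(can_keep_relative(not_loaded))  # False
print(can_keep_relative(loaded))      # True
```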

Comment 17 Han Han 2017-06-20 07:13:23 UTC
In my testing, this bug can be reproduced on an RHV 4.1 and RHEL 7.4 host.

Comment 18 Peter Krempa 2017-06-20 07:30:26 UTC
Reproducer using only libvirt:

Create a backing chain of images:

# qemu-img create -f qcow2 a 10M
Formatting 'a', fmt=qcow2 size=10485760 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16

# qemu-img create -f qcow2 -o backing_fmt=qcow2 -b a b 
Formatting 'b', fmt=qcow2 size=10485760 backing_file=a backing_fmt=qcow2 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16

# qemu-img create -f qcow2 -o backing_fmt=qcow2 -b b c
Formatting 'c', fmt=qcow2 size=10485760 backing_file=b backing_fmt=qcow2 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16

# qemu-img create -f qcow2 -o backing_fmt=qcow2 -b c d
Formatting 'd', fmt=qcow2 size=10485760 backing_file=c backing_fmt=qcow2 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16

# qemu-img info --backing-chain /var/lib/libvirt/images/d 
image: /var/lib/libvirt/images/d
file format: qcow2
virtual size: 10M (10485760 bytes)
disk size: 196K
cluster_size: 65536
backing file: c (actual path: /var/lib/libvirt/images/c)
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /var/lib/libvirt/images/c
file format: qcow2
virtual size: 10M (10485760 bytes)
disk size: 196K
cluster_size: 65536
backing file: b (actual path: /var/lib/libvirt/images/b)
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /var/lib/libvirt/images/b
file format: qcow2
virtual size: 10M (10485760 bytes)
disk size: 196K
cluster_size: 65536
backing file: a (actual path: /var/lib/libvirt/images/a)
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /var/lib/libvirt/images/a
file format: qcow2
virtual size: 10M (10485760 bytes)
disk size: 196K
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
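
The human-readable output above can be reduced to a list of (image, backing file) pairs for quick comparison before and after a commit. A minimal sketch, assuming the text layout shown above; parse_backing_chain is a hypothetical helper, not part of qemu or libvirt:

```python
def parse_backing_chain(info_text):
    """Return [(image, backing_file_or_None), ...] in the order printed."""
    chain = []
    for line in info_text.splitlines():
        line = line.strip()
        if line.startswith("image: "):
            chain.append([line[len("image: "):], None])
        elif line.startswith("backing file: ") and chain:
            # Drop the trailing "(actual path: ...)" annotation.
            value = line[len("backing file: "):]
            chain[-1][1] = value.split(" (actual path:")[0]
    return [tuple(entry) for entry in chain]


# Two-image excerpt of the output shown above.
sample = """\
image: /var/lib/libvirt/images/b
file format: qcow2
backing file: a (actual path: /var/lib/libvirt/images/a)
backing file format: qcow2

image: /var/lib/libvirt/images/a
file format: qcow2
"""

for image, backing in parse_backing_chain(sample):
    print(image, "<-", backing)
```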

Use the /var/lib/libvirt/images/a (or equivalent) base image to start a VM.

Create a few snapshots using the pre-created image files:

virsh snapshot-create-as --reuse-external --disk-only --no-metadata relsnap --diskspec hda,file=/var/lib/libvirt/images/b

virsh snapshot-create-as --reuse-external --disk-only --no-metadata relsnap --diskspec hda,file=/var/lib/libvirt/images/c

virsh snapshot-create-as --reuse-external --disk-only --no-metadata relsnap --diskspec hda,file=/var/lib/libvirt/images/d

Attempt to merge the image c into b:

virsh blockcommit --keep-relative relsnap hda --top /var/lib/libvirt/images/c --base /var/lib/libvirt/images/b
error: Requested operation is not valid: can't keep relative backing relationship
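
The image-creation part of this reproducer can be generated for a chain of any length. A minimal sketch that only builds the qemu-img argument lists used above and does not run them; chain_commands is a hypothetical helper, and it assumes every image in the chain uses the same format:

```python
def chain_commands(names, fmt="qcow2", size="10M"):
    """Return qemu-img argv lists: names[0] is the base, the rest stack on top."""
    commands = [["qemu-img", "create", "-f", fmt, names[0], size]]
    for backing, overlay in zip(names, names[1:]):
        commands.append(
            ["qemu-img", "create", "-f", fmt,
             "-o", f"backing_fmt={fmt}", "-b", backing, overlay]
        )
    return commands


for argv in chain_commands(["a", "b", "c", "d"]):
    print(" ".join(argv))
```

Each list could be passed to subprocess.run from the directory that holds the images, so that the backing file names stay relative as in the reproducer.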

Comment 19 Peter Krempa 2017-06-20 07:56:30 UTC
The expected result is:

$ virsh blockcommit --keep-relative relsnap hda --top /var/lib/libvirt/images/c --base /var/lib/libvirt/images/b
Block Commit started

$ virsh dumpxml relsnap
[...]
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/d'/>
      <backingStore type='file' index='1'>
        <format type='qcow2'/>
        <source file='/var/lib/libvirt/images/b'/>
        <backingStore type='file' index='2'>
          <format type='qcow2'/>
          <source file='/var/lib/libvirt/images/a'/>
          <backingStore/>
        </backingStore>
      </backingStore>
      <target dev='hda' bus='ide'/>
      <alias name='ide0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>

# qemu-img info --backing-chain /var/lib/libvirt/images/d 
image: /var/lib/libvirt/images/d
file format: qcow2
virtual size: 10M (10485760 bytes)
disk size: 196K
cluster_size: 65536
backing file: b (actual path: /var/lib/libvirt/images/b)
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /var/lib/libvirt/images/b
file format: qcow2
virtual size: 10M (10485760 bytes)
disk size: 196K
cluster_size: 65536
backing file: a (actual path: /var/lib/libvirt/images/a)
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /var/lib/libvirt/images/a
file format: qcow2
virtual size: 10M (10485760 bytes)
disk size: 196K
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
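
The nested backingStore elements in the disk XML above can be flattened into a list to verify the chain after the commit. A minimal sketch using only the Python standard library; disk_chain is a hypothetical helper, and the sample is a trimmed copy of the disk element shown above:

```python
import xml.etree.ElementTree as ET


def disk_chain(disk_xml):
    """Return [(format, file), ...] from the active image down to the base."""
    disk = ET.fromstring(disk_xml)
    chain = [(disk.find("driver").get("type"), disk.find("source").get("file"))]
    node = disk.find("backingStore")
    # An empty <backingStore/> element terminates the chain.
    while node is not None and node.find("source") is not None:
        chain.append((node.find("format").get("type"), node.find("source").get("file")))
        node = node.find("backingStore")
    return chain


sample = """
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/d'/>
  <backingStore type='file' index='1'>
    <format type='qcow2'/>
    <source file='/var/lib/libvirt/images/b'/>
    <backingStore type='file' index='2'>
      <format type='qcow2'/>
      <source file='/var/lib/libvirt/images/a'/>
      <backingStore/>
    </backingStore>
  </backingStore>
  <target dev='hda' bus='ide'/>
</disk>
"""

for fmt, path in disk_chain(sample):
    print(fmt, path)
```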

Comment 21 Peter Krempa 2017-06-20 11:16:34 UTC
Upstream patches posted:

https://www.redhat.com/archives/libvir-list/2017-June/msg00806.html

Comment 24 Peter Krempa 2017-06-20 12:29:51 UTC
Fixed upstream:

commit e20853e1d32ff517e6feec3146066ec433fc39e6
Author: Peter Krempa <pkrempa>
Date:   Tue Jun 20 08:19:02 2017 +0200

    qemu: snapshot: Load data necessary for relative block commit to work
    
    Commit 7456c4f5f introduced a regression by not reloading the backing
    chain of a disk after snapshot. The regression was caused as
    src->relPath was not set and thus the block commit code could not
    determine the relative path.
    
    This patch adds code that will load the backing store string if
    VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT and store it in the correct place
    when a snapshot is successfully completed.
    
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1461303

Comment 28 Han Han 2017-06-22 06:18:32 UTC
Test it on libvirt-3.2.0-14.el7.x86_64 qemu-kvm-rhev-2.9.0-12.el7.x86_64
But the result seems incorrect.

1. Prepare a VM with raw disk
# virsh domblklist n1
Target     Source
------------------------------------------------
vda        /var/lib/libvirt/images/a

# qemu-img info /var/lib/libvirt/images/a
image: /var/lib/libvirt/images/a
file format: raw
virtual size: 10G (10737418240 bytes)
disk size: 1.2G

2. Create backing qcow2 metadata file then start VM
+ qemu-img create -f qcow2 -b a b
Formatting 'b', fmt=qcow2 size=10737418240 backing_file=a encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
+ qemu-img create -f qcow2 -b b c
Formatting 'c', fmt=qcow2 size=10737418240 backing_file=b encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
+ qemu-img create -f qcow2 -b c d
Formatting 'd', fmt=qcow2 size=10737418240 backing_file=c encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
+ virsh start n1
Domain n1 started

3. Reuse the qcow2 metadata file to create external snapshot
+ virsh snapshot-create-as --reuse-external --disk-only --no-metadata n1 --diskspec vda,file=/var/lib/libvirt/images/b
Domain snapshot 1498111914 created
+ for i in '{b..d}'
+ virsh snapshot-create-as --reuse-external --disk-only --no-metadata n1 --diskspec vda,file=/var/lib/libvirt/images/c
Domain snapshot 1498111915 created
+ for i in '{b..d}'
+ virsh snapshot-create-as --reuse-external --disk-only --no-metadata n1 --diskspec vda,file=/var/lib/libvirt/images/d
Domain snapshot 1498111915 created
+ virsh dumpxml n1| awk '/<disk/,/<\/disk/'
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/d'/>
      <backingStore type='file' index='1'>
        <format type='qcow2'/>
        <source file='/var/lib/libvirt/images/c'/>
        <backingStore type='file' index='2'>
          <format type='qcow2'/>
          <source file='/var/lib/libvirt/images/b'/>
          <backingStore type='file' index='3'>
            <format type='raw'/>
            <source file='/var/lib/libvirt/images/a'/>
            <backingStore/>
          </backingStore>
        </backingStore>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </disk>

3. Do blockcommit from the layer near base layer with --keep-relative option
+ virsh blockcommit --keep-relative n1 vda --top /var/lib/libvirt/images/b --wait --verbose
Block commit: [100 %]
Commit complete
+ sleep 2
+ virsh dumpxml n1
+ awk '/<disk/,/<\/disk/'
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/d'/>
      <backingStore type='file' index='1'>
        <format type='raw'/>
        <source file='/var/lib/libvirt/images/c'/>
        <backingStore/>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </disk>
+ virsh blockcommit --keep-relative n1 vda --top /var/lib/libvirt/images/c --wait --verbose
error: invalid argument: top '/var/lib/libvirt/images/c' in chain for 'vda' has no backing file

+ virsh dumpxml n1
+ awk '/<disk/,/<\/disk/'
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/d'/>
      <backingStore type='file' index='1'>
        <format type='raw'/>
        <source file='/var/lib/libvirt/images/c'/>
        <backingStore/>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </disk>
+ virsh blockcommit --keep-relative n1 vda --top /var/lib/libvirt/images/d --pivot --active --wait --verbose
Block commit: [100 %]
Successfully pivoted
+ sleep 2
+ virsh dumpxml n1
+ awk '/<disk/,/<\/disk/'
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/c'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </disk>

The disk xml is not correct after the first blockcommit. It should be like this:
a --> c --> d

Please check the result.
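
The expected chain can be checked against a simple list model of block commit. This is a sketch only, not libvirt's implementation: the chain is listed from base to active layer, and the base is assumed to default to the bottom of the chain when not given. The function name commit is hypothetical.

```python
def commit(chain, top, base=None):
    """Model a block commit on a chain listed from base to active layer.

    Everything above `base` up to and including `top` is merged into `base`
    and drops out of the chain. `base` defaults to the bottom of the chain.
    """
    top_index = chain.index(top)
    if top_index == 0:
        raise ValueError(f"top {top!r} has no backing file")
    base_index = chain.index(base) if base is not None else 0
    if base_index >= top_index:
        raise ValueError("base must be below top in the chain")
    return chain[: base_index + 1] + chain[top_index + 1:]


print(commit(["a", "b", "c", "d"], top="b"))            # ['a', 'c', 'd']
print(commit(["a", "b", "c", "d"], top="c", base="b"))  # ['a', 'b', 'd']
```

In this model, asking to commit the bottom image of a truncated chain raises an error, which corresponds to the "has no backing file" message in the second blockcommit above.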

Comment 29 Han Han 2017-06-22 06:23:13 UTC
Created attachment 1290504 [details]
Scripts in comment28

Comment 30 Han Han 2017-06-22 06:29:44 UTC
However, it works well on RHV.
I created the snapshot chain like this:
base <-- s1 <-- s2 <-- s3
Then I deleted snapshots s1, s2, and s3 in order. All of them were deleted without error, and the disk xml was correct after each deletion.

Comment 31 Peter Krempa 2017-06-22 14:31:37 UTC
(In reply to Han Han from comment #28)

[...]

> 2. Create backing qcow2 metadata file then start VM
> + qemu-img create -f qcow2 -b a b
> Formatting 'b', fmt=qcow2 size=10737418240 backing_file=a encryption=off
> cluster_size=65536 lazy_refcounts=off refcount_bits=16

You did not specify -o backing_fmt=qcow2/raw in those...

[...]


> 3. Reuse the qcow2 metadata file to create external snapshot

[...]

> + virsh dumpxml n1| awk '/<disk/,/<\/disk/'
>     <disk type='file' device='disk'>
>       <driver name='qemu' type='qcow2'/>
>       <source file='/var/lib/libvirt/images/d'/>
>       <backingStore type='file' index='1'>
>         <format type='qcow2'/>
>         <source file='/var/lib/libvirt/images/c'/>
>         <backingStore type='file' index='2'>
>           <format type='qcow2'/>
>           <source file='/var/lib/libvirt/images/b'/>
>           <backingStore type='file' index='3'>
>             <format type='raw'/>
>             <source file='/var/lib/libvirt/images/a'/>
>             <backingStore/>

This is correct since backing store is tracked internally for snapshots.

[...]

> 3. Do blockcommit from the layer near base layer with --keep-relative option
> + virsh blockcommit --keep-relative n1 vda --top /var/lib/libvirt/images/b
> --wait --verbose
> Block commit: [100 %]
> Commit complete
> + sleep 2
> + virsh dumpxml n1
> + awk '/<disk/,/<\/disk/'
>     <disk type='file' device='disk'>
>       <driver name='qemu' type='qcow2'/>
>       <source file='/var/lib/libvirt/images/d'/>
>       <backingStore type='file' index='1'>
>         <format type='raw'/>
>         <source file='/var/lib/libvirt/images/c'/>
>         <backingStore/>

After block commit the backing store is dropped and re-probed, so if you don't have backing store format probing enabled you'll get a truncated backing chain. Backing store format detection is disabled by default.

We know at this point that the overlay image is a qcow2 (since it's set in the XML), but the format of the backing store was not specified in the image and thus is detected as "raw" here.

The result is expected given the security implications of format probing. Specifying the backing store format (as in my example) should avoid this problem.
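
Images created without an explicit backing format can be spotted in "qemu-img info --backing-chain" text before they cause a truncated chain. A minimal sketch, assuming the text layout shown in comment 18 and that the "backing file format:" line is simply absent when no format was recorded; images_missing_backing_format is a hypothetical helper and the sample text is hand-written for illustration:

```python
def images_missing_backing_format(info_text):
    """Return images that name a backing file but record no backing format."""
    missing = []
    # Images are separated by blank lines in the qemu-img output.
    for block in info_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.strip().partition(": ")
            if sep:
                fields[key] = value
        if "backing file" in fields and "backing file format" not in fields:
            missing.append(fields.get("image"))
    return missing


# Illustrative text: d records its backing format, c does not.
sample = """\
image: /var/lib/libvirt/images/d
file format: qcow2
backing file: c (actual path: /var/lib/libvirt/images/c)
backing file format: qcow2

image: /var/lib/libvirt/images/c
file format: qcow2
backing file: b (actual path: /var/lib/libvirt/images/b)

image: /var/lib/libvirt/images/b
file format: qcow2
"""

print(images_missing_backing_format(sample))
```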

Comment 32 Han Han 2017-06-23 02:05:42 UTC
Following comment 31, I added -o backing_fmt=qcow2 when creating the qcow2 files, and the result is now as expected:
+ DOM=n1
+ path=/var/lib/libvirt/images/
+ virsh define n1.xml
Domain n1 defined from n1.xml

+ cd /var/lib/libvirt/images/
+ qemu-img create -f qcow2 -o backing_fmt=qcow2 -b a b
qemu-img: b: Image is not in qcow2 format
+ qemu-img create -f qcow2 -o backing_fmt=qcow2 -b b c
Formatting 'c', fmt=qcow2 size=10737418240 backing_file=b backing_fmt=qcow2 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
+ qemu-img create -f qcow2 -o backing_fmt=qcow2 -b c d
Formatting 'd', fmt=qcow2 size=10737418240 backing_file=c backing_fmt=qcow2 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
+ virsh start n1
Domain n1 started

+ sleep 5
+ for i in '{b..d}'
+ virsh snapshot-create-as --reuse-external --disk-only --no-metadata n1 --diskspec vda,file=/var/lib/libvirt/images/b
Domain snapshot 1498182998 created
+ for i in '{b..d}'
+ virsh snapshot-create-as --reuse-external --disk-only --no-metadata n1 --diskspec vda,file=/var/lib/libvirt/images/c
Domain snapshot 1498182998 created
+ for i in '{b..d}'
+ virsh snapshot-create-as --reuse-external --disk-only --no-metadata n1 --diskspec vda,file=/var/lib/libvirt/images/d
Domain snapshot 1498182999 created
+ virsh dumpxml n1
+ awk '/<disk/,/<\/disk/'
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/d'/>
      <backingStore type='file' index='1'>
        <format type='qcow2'/>
        <source file='/var/lib/libvirt/images/c'/>
        <backingStore type='file' index='2'>
          <format type='qcow2'/>
          <source file='/var/lib/libvirt/images/b'/>
          <backingStore type='file' index='3'>
            <format type='raw'/>
            <source file='/var/lib/libvirt/images/a'/>
            <backingStore/>
          </backingStore>
        </backingStore>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </disk>
+ virsh blockcommit --keep-relative n1 vda --top /var/lib/libvirt/images/b --wait --verbose
Block commit: [100 %]
Commit complete
+ sleep 2
+ virsh dumpxml n1
+ awk '/<disk/,/<\/disk/'
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/d'/>
      <backingStore type='file' index='1'>
        <format type='qcow2'/>
        <source file='/var/lib/libvirt/images/c'/>
        <backingStore type='file' index='2'>
          <format type='raw'/>
          <source file='/var/lib/libvirt/images/a'/>
          <backingStore/>
        </backingStore>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </disk>
+ virsh blockcommit --keep-relative n1 vda --top /var/lib/libvirt/images/c --wait --verbose
Block commit: [100 %]
Commit complete
+ sleep 2
+ virsh dumpxml n1
+ awk '/<disk/,/<\/disk/'
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/d'/>
      <backingStore type='file' index='1'>
        <format type='raw'/>
        <source file='/var/lib/libvirt/images/a'/>
        <backingStore/>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </disk>
+ virsh blockcommit --keep-relative n1 vda --top /var/lib/libvirt/images/d --pivot --active --wait --verbose
Block commit: [100 %]
Successfully pivoted
+ sleep 2
+ virsh dumpxml n1
+ awk '/<disk/,/<\/disk/'
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/a'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </disk>
+ virsh destroy n1
Domain n1 destroyed

+ virsh undefine n1
Domain n1 has been undefined

Comment 33 Xuesong Zhang 2017-06-23 05:22:11 UTC
hi, Raz,

This issue is fixed in the latest libvirt build. Would you please check it in your RHV env with the build libvirt-3.2.0-14.el7.x86_64?
It would be better if RHV QE could test and provide their own result for this BZ, although libvirt QE has checked with our RHV env and the result is PASS (see comment 30). Thanks.

Comment 34 Elad 2017-06-23 12:29:03 UTC
Hi, 
I've executed one of the test cases [1] that was failing consistently in our automation using libvirt-3.2.0-14.el7.x86_64 on oVirt 4.2 latest master. It passed on NFS, iSCSI and gluster.

[1]
https://polarion.engineering.redhat.com/polarion/#/project/RHEVM3/workitem?id=RHEVM3-6038

Comment 35 errata-xmlrpc 2017-08-02 01:34:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846

