Bug 1490826

Summary: Cannot shallow blockcopy a vm disk backing chain to a block device (whose format=raw)
Product: Red Hat Enterprise Linux 7
Component: libvirt
Version: 7.5
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Target Milestone: rc
Target Release: ---
Keywords: Automation, Regression
Reporter: yisun
Assignee: Peter Krempa <pkrempa>
QA Contact: Han Han <hhan>
CC: chhu, hhan, lmen, meili, pkrempa, rbalakri, xuzhang, yisun
Fixed In Version: libvirt-3.8.0-1.el7
Last Closed: 2018-04-10 10:57:19 UTC
Type: Bug

Description yisun 2017-09-12 10:31:47 UTC
Description:
Cannot shallow blockcopy a vm disk backing chain to a block device (whose format=raw)

Versions:
qemu-kvm-rhev-2.9.0-16.el7_4.6.x86_64
libvirt-3.7.0-2.el7.x86_64

REGRESSION: not reproduced with libvirt-3.2.0-14.el7_4.3.x86_64

How reproducible:
100%

Steps:
1. create a transient vm
## virsh create /tmp/avocado-vt-vm1
Domain avocado-vt-vm1 created from /tmp/avocado-vt-vm1

2. create a disk only snapshot for it
## virsh snapshot-create avocado-vt-vm1 --disk-only
Domain snapshot 1505209180 created


3. prepare a block device with format=raw
## dd if=/dev/urandom of=/dev/sde bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 4.52631 s, 232 MB/s

## qemu-img info /dev/sde
image: /dev/sde
file format: raw
virtual size: 20G (21474836480 bytes)
disk size: 0

4. do a shallow blockcopy to /dev/sde
## virsh blockcopy avocado-vt-vm1 vda /dev/sde  --blockdev --shallow
error: internal error: unable to execute QEMU command 'drive-mirror': Backing file not supported for file format 'raw'

Actual result:
blockcopy failed.

Expected result:
blockcopy can be executed as usual

Additional info:
The reason is libvirt commit 703abf1d7 has:
----------------------------------------
@@ -16692,36 +16692,49 @@ qemuDomainBlockCopyValidateMirror(virStorageSourcePtr mirror,
     int desttype = virStorageSourceGetActualType(mirror);
     struct stat st;
...
+        if (S_ISBLK(st.st_mode)) {
+            /* if the target is a block device, assume that we are reusing it,
+             * so there are no attempts to create it */
+            *reuse = true;
----------------------------------------

So for a block device mirror, **reuse** is always set to **true**, and it is later used in:
----------------------------------------
if (!mirror->format) {                                                      
    if (!reuse) {                                                           
        mirror->format = disk->src->format;                                 
    } else {                                                                
        /* If the user passed the REUSE_EXT flag, then either they          
         * can also pass the RAW flag or use XML to tell us the format.     
         * So if we get here, we assume it is safe for us to probe the      
         * format from the file that we will be using.  */                  
        mirror->format = virStorageFileProbeFormat(mirror->path, cfg->user,
                                                   cfg->group);             
    }                                                                       
}   
----------------------------------------
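
To make the interaction of the two snippets above easier to follow, here is a minimal Python model of the decision as described in this report. It is only a sketch, not libvirt code: the function name `pick_mirror_format` and its parameters are hypothetical, and `probe` stands in for virStorageFileProbeFormat().

```python
def pick_mirror_format(explicit_format, disk_format, target_is_block,
                       user_reuse, probe):
    """Model of the mirror format selection in libvirt 3.7.0 (hypothetical names)."""
    reuse = user_reuse
    if target_is_block:
        # Effect of commit 703abf1d7: a block device target is treated as
        # reused so that libvirt does not try to create it.
        reuse = True

    if explicit_format:
        # The user supplied a format, nothing to infer.
        return explicit_format
    if not reuse:
        # Inherit the format of the disk being copied.
        return disk_format
    # Reuse path: probe the destination. This now also runs for every block
    # device target, even when the user did not ask for reuse.
    return probe()


# Block device target filled with random data, no reuse requested by the user.
fmt = pick_mirror_format(None, "qcow2", True, False, lambda: "raw")
print(fmt)
```

With a file target that is not reused, the same call falls through to the disk format (qcow2 in the steps above), which matches the behaviour of the older libvirt.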


and when we do a shallow blockcopy, libvirt sends a QMP command like:
{"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk0","target":"/dev/sde","sync":"top","mode":"absolute-paths","format":"raw"},"id":"libvirt-12"}
which triggers the error message.
In previous libvirt versions the QMP command would have "format": "qcow2" instead, so it succeeded.
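
The difference between the failing and the working QMP command can be sketched as follows. The helper names `drive_mirror_cmd` and `rejected_in_this_report` are hypothetical, and the check only encodes what is observed in this report (sync=top together with format=raw is refused); it is not QEMU's real validation.

```python
import json


def drive_mirror_cmd(device, target, fmt, shallow, cmd_id):
    """Build a drive-mirror command shaped like the ones quoted in this report."""
    return {
        "execute": "drive-mirror",
        "arguments": {
            "device": device,
            "target": target,
            # A shallow copy mirrors only the top layer of the chain.
            "sync": "top" if shallow else "full",
            "mode": "absolute-paths",
            "format": fmt,
        },
        "id": cmd_id,
    }


def rejected_in_this_report(cmd):
    """True for the combination this report shows failing (assumption: only this one)."""
    args = cmd["arguments"]
    # sync=top needs a backing file on the target, which raw cannot record.
    return args["sync"] == "top" and args["format"] == "raw"


bad = drive_mirror_cmd("drive-virtio-disk0", "/dev/sde", "raw", True, "libvirt-12")
good = drive_mirror_cmd("drive-virtio-disk0", "/dev/sde", "qcow2", True, "libvirt-12")
print(json.dumps(bad))
print(rejected_in_this_report(bad), rejected_in_this_report(good))
```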


Additional info:
format=raw doesn't affect a full chain blockcopy; the following virsh/QMP commands don't trigger an error:
----------------------------------------
## virsh blockcopy avocado-vt-vm1 vda /dev/sde  --blockdev
Block Copy started

{"device":"drive-virtio-disk0","target":"/dev/sde","sync":"full","mode":"absolute-paths","format":"raw"},"id":"libvirt-15"}'
----------------------------------------

So it seems that only a shallow copy to a format=raw block device is not supported, and the libvirt commit above just exposes this problem. Please check whether we need a workaround, such as treating the block device as qcow2 format (as before) when we do a shallow blockcopy.

Comment 3 Peter Krempa 2017-09-12 14:05:30 UTC
Fixed upstream:

commit 4fc305125833255de16d32289ae24902d670d166 
Author: Peter Krempa <pkrempa>
Date:   Tue Sep 12 14:53:59 2017 +0200

    qemu: blockcopy: Probe image format only with VIR_DOMAIN_BLOCK_COPY_REUSE_EXT
    
    Commit 703abf1d7 changed the logic so that we don't attempt to re-create
    the image if it's a block device. This was done by modifying the
    'reuse' variable. Unfortunately after modifying it one of the uses was
    to infer whether we should probe the disk format. After changes in the
    commit mentioned above we would attempt the probe if the target of the
    copy is a block device and the format was not provided explicitly rather
    than using the format of the disk.
    
    Fix it by explicitly checking whether the user requested a reuse of the
    disk rather than the modified boolean flag.
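
A rough Python model of the behaviour described in the commit message, for comparison with the regression. This is a sketch under assumptions, not the actual patch: the function name is hypothetical, `probe` stands in for the format probing helper, and the numeric flag value is used for illustration only.

```python
# Flag value assumed for illustration; only the name comes from the commit message.
VIR_DOMAIN_BLOCK_COPY_REUSE_EXT = 1 << 1


def pick_mirror_format_fixed(explicit_format, disk_format, target_is_block,
                             flags, probe):
    """Return (format, reuse) following the logic described in the fix."""
    user_reuse = bool(flags & VIR_DOMAIN_BLOCK_COPY_REUSE_EXT)
    # Block devices are still never re-created (behaviour of 703abf1d7 is kept).
    reuse = user_reuse or target_is_block

    if explicit_format:
        return explicit_format, reuse
    if user_reuse:
        # Probe only when the user explicitly asked to reuse the image.
        return probe(), reuse
    # Otherwise inherit the format of the disk being copied.
    return disk_format, reuse


fmt, reuse = pick_mirror_format_fixed(None, "qcow2", True, 0, lambda: "raw")
print(fmt, reuse)
```

In this model a block device target without the reuse flag gets the disk format again while creation of the target is still skipped.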

Comment 5 Han Han 2017-11-06 03:04:03 UTC
Verified on libvirt-3.9.0-1.el7.x86_64 qemu-kvm-rhev-2.10.0-4.el7.x86_64:
1. Prepare a transient VM:
# virt-install --import --transient -n V-raw --disk path=/var/lib/libvirt/images/V-raw.raw,bus=virtio -r 1024 

2. Create an external snapshot:
# virsh snapshot-create-as V-raw s1 --no-metadata --disk-only                                                 
Domain snapshot s1 created

3. Shallow blockcopy to a raw format block device:
# qemu-img info /dev/rhel/raw
image: /dev/rhel/raw
file format: raw
virtual size: 4.0G (4294967296 bytes)
disk size: 0

# virsh blockcopy V-raw vda /dev/rhel/raw --blockdev --shallow --pivot
Successfully pivoted

4. Check disk xml of VM
# virsh dumpxml V-raw|awk '/<disk/,/<\/disk/'                         
    <disk type='block' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source dev='/dev/rhel/raw'/>
      <backingStore type='file' index='1'>
        <format type='raw'/>
        <source file='/var/lib/libvirt/images/V-raw.raw'/>
        <backingStore/>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>

5. Check block device format
# qemu-img info /dev/rhel/raw                                                                                 
image: /dev/rhel/raw
file format: qcow2
virtual size: 4.0G (4294967296 bytes)
disk size: 0
cluster_size: 65536
backing file: /var/lib/libvirt/images/V-raw.raw
backing file format: raw
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

It works as expected.
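
The check in step 5 could be automated roughly like this. The sketch parses the human-readable `qemu-img info` output exactly as pasted above; the helper names are hypothetical and the sample text is a shortened copy of the output from this comment.

```python
def parse_qemu_img_info(text):
    """Parse top-level 'key: value' lines of human-readable qemu-img info output."""
    info = {}
    for line in text.splitlines():
        # Skip the indented format-specific details and lines without a key.
        if line.startswith(" ") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        info[key.strip()] = value.strip()
    return info


def is_shallow_copy_result(info, expected_backing):
    """The target should be qcow2 and point at the original base image."""
    return (info.get("file format") == "qcow2"
            and info.get("backing file") == expected_backing)


sample = """image: /dev/rhel/raw
file format: qcow2
virtual size: 4.0G (4294967296 bytes)
disk size: 0
cluster_size: 65536
backing file: /var/lib/libvirt/images/V-raw.raw
backing file format: raw
Format specific information:
    compat: 1.1
"""

info = parse_qemu_img_info(sample)
print(is_shallow_copy_result(info, "/var/lib/libvirt/images/V-raw.raw"))
```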

Comment 6 Han Han 2017-11-06 03:41:36 UTC
Covered by tp-libvirt case: virsh.blockcopy.positive_test.non_acl.local_disk.blockdev.shallow.no_option

Comment 10 errata-xmlrpc 2018-04-10 10:57:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0704