Bug 1213786

Summary: qemu-img doesn't check if base image exists when size parameter indicated.
Product: Red Hat Enterprise Linux 7 Reporter: yisun
Component: qemu-kvm-rhevAssignee: John Snow <jsnow>
Status: CLOSED ERRATA QA Contact: Ping Li <pingl>
Severity: high Docs Contact:
Priority: high    
Version: 7.1CC: aliang, areis, bmcclain, cliao, coli, dyuan, eblake, hachen, hhuang, jreznik, jsnow, juzhang, kwolf, michen, mrezanin, mzhan, ngu, nsoffer, pingl, virt-maint, xuwei, xuzhang, ylavi
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.10.0-1.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-11 00:09:32 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1395941, 1439984    

Description yisun 2015-04-21 09:43:58 UTC
Description of problem:
qemu-img doesn't check if base image exists when size parameter indicated.  

Version-Release number of selected component (if applicable):
kernel-tools-3.10.0-240.el7.x86_64
qemu-kvm-common-rhev-2.2.0-8.el7.x86_64
qemu-kvm-rhev-2.2.0-8.el7.x86_64
qemu-kvm-tools-rhev-2.2.0-8.el7.x86_64


How reproducible:
100%

Steps to Reproduce:
1. when create a image based on a non-existing image, error produced. 
[root@localhost ~]# qemu-img create -f qcow2 -F qcow2 -b aaa test.img
qemu-img: test.img: Could not open 'aaa': No such file or directory

2. when creating a image based on a non-existing image, but with size parameter, creation succeeds. 
[root@localhost ~]# qemu-img create -f qcow2 -F qcow2 -b aaa test.img 1M
Formatting 'test.img', fmt=qcow2 size=1048576 backing_file='aaa' backing_fmt='qcow2' encryption=off cluster_size=65536 lazy_refcounts=off 


Actual results:
qemu-img successfully create a image based on non-existing image when size indicated. 

Expected results:
No matter size indicated or not, actions should be consistent.

Comment 4 Eric Blake 2017-05-01 21:19:42 UTC
It seems to me like we should have a '-u' (unsafe) mode that says "store this backing name in the file, even if you can't open it at the moment", when a size is present, and otherwise give a hard error if the requested backing can't be opened. (similar to how rebase -u lets you store any name, but plain rebase fails, if the name doesn't resolve)

Comment 5 Nir Soffer 2017-06-13 20:54:05 UTC
(In reply to Eric Blake from comment #4)
> It seems to me like we should have a '-u' (unsafe) mode that says "store
> this backing name in the file, even if you can't open it at the moment",
> when a size is present, and otherwise give a hard error if the requested
> backing can't be opened. (similar to how rebase -u lets you store any name,
> but plain rebase fails, if the name doesn't resolve)

This behavior will be very good for RHV - there are case when we can use this to 
avoid activating and deactivating the backing chain logical volume, and we have
all the information needed to create the new image (size, backing_fmt).

See bug 1395941 - we currently try to activate and deactivate every volume in 
a qcow chain *used* by a vm. This takes lot of time and fill the log with useless
tracebacks. In other cases when the chain is not used, this works but is just
waste of time because we always know the size and backing format.

Comment 6 Yaniv Lavi 2017-06-14 10:36:34 UTC
I'm raising severity and requesting you reconsider solve this asap as it has a great impact on RHV.

Comment 9 John Snow 2017-06-14 15:33:47 UTC
Keep in mind, Nir, that the current behavior does not verify if a backing file is present at all, so "Don't change the existing behavior, and document this as supported behavior" is also an option that we have.

The last version of the patch to separate the behaviors got rejected upstream because it conflicted with the new image locking patches; when we attempt to open a backing file (even when we don't need to) it breaks existing behavior in some of our iotests because it causes new double opens.

At this point, because this patch
(1) Is in fact a change from previous behavior, and
(2) Is causing some auxiliary problems,

I am rather more inclined now than I was to just leave the behavior of qemu-img alone, unless RHV *wants* the more explicit behavior instead, in which case I will pursue it.

Comment 10 Ademar Reis 2017-06-14 18:58:48 UTC
(In reply to John Snow from comment #9)
> Keep in mind, Nir, that the current behavior does not verify if a backing
> file is present at all, so "Don't change the existing behavior, and document
> this as supported behavior" is also an option that we have.
> 
> The last version of the patch to separate the behaviors got rejected
> upstream because it conflicted with the new image locking patches; when we
> attempt to open a backing file (even when we don't need to) it breaks
> existing behavior in some of our iotests because it causes new double opens.
> 
> At this point, because this patch
> (1) Is in fact a change from previous behavior, and
> (2) Is causing some auxiliary problems,
> 
> I am rather more inclined now than I was to just leave the behavior of
> qemu-img alone, unless RHV *wants* the more explicit behavior instead, in
> which case I will pursue it.

Setting NEEDINFO.

Comment 11 Nir Soffer 2017-06-18 06:17:01 UTC
(In reply to John Snow from comment #9)
> Keep in mind, Nir, that the current behavior does not verify if a backing
> file is present at all, so "Don't change the existing behavior, and document
> this as supported behavior" is also an option that we have.

This is good enough for RHV. We don't want to depend on undocumented behavior.
 
> The last version of the patch to separate the behaviors got rejected
> upstream because it conflicted with the new image locking patches; when we
> attempt to open a backing file (even when we don't need to) it breaks
> existing behavior in some of our iotests because it causes new double opens.
> 
> At this point, because this patch
> (1) Is in fact a change from previous behavior, and
> (2) Is causing some auxiliary problems,
> 
> I am rather more inclined now than I was to just leave the behavior of
> qemu-img alone, unless RHV *wants* the more explicit behavior instead, in
> which case I will pursue it.

Keeping the current behaviour is good enough to solve the problem RHV has now. The more explicit behavior suggested in comment 4 is probably better as longer term
solution.

How about introducing a -u / --unsafe flag so make this behavior explicit, but 
keeping the current behavior without the --unsafe flag, until the locking issue
is resolved?

If you add a new flag, we need an good way to detect this capability. In the 
past we were running "qemu-img xxx -h" and checking the available options, but
this is ugly.

Comment 15 Ping Li 2017-10-09 13:51:26 UTC
Hi John,
According to the comment 9, seems we just give an warning message to this issue.
I think the issue should be handled as what qemu-img rebase do. Please help to check the below result.


Verify the issue with below packages:
kernel-3.10.0-727.el7.x86_64
qemu-kvm-rhev-2.10.0-1.el7

Test steps:
1. Create snapshot on non-existing backing file without specifying the image size.
# qemu-img create -f qcow2 -F qcow2 -b base.qcow2 sn.qcow2
qemu-img: sn.qcow2: Could not open 'base.qcow2': No such file or directory
Could not open backing image to determine size.

2. Create snapshot on non-existing backing file with specifying the image size.
# qemu-img create -f qcow2 -F qcow2 -b base.qcow2 sn.qcow2 1G   ----------------------> Gave error but image is created successfully.
qemu-img: warning: Could not verify backing image. This may become an error in future versions.
Could not open 'base.qcow2': No such file or directory
Formatting 'sn.qcow2', fmt=qcow2 size=1073741824 backing_file=base.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16
# echo $?   --------> should return "1" as rebase
0
# qemu-img info sn.qcow2 
image: sn.qcow2
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 196K
cluster_size: 65536
backing file: base.qcow2
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

3. Use option '-u' to create snapshot on non-existing backing file with specifying the image size.
# qemu-img create -u -f qcow2 -F qcow2 -b base.qcow2 sn_u.qcow2 1G
Formatting 'sn_u.qcow2', fmt=qcow2 size=1073741824 backing_file=base.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16
# echo $?
0
# qemu-img info sn_u.qcow2 
image: sn_u.qcow2
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 196K
cluster_size: 65536
backing file: base.qcow2
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false


Additional info:
1. create base image and snapshot.
# qemu-img create -f qcow2 base.qcow2 1G
Formatting 'base.qcow2', fmt=qcow2 size=1073741824 cluster_size=65536 lazy_refcounts=off refcount_bits=16
# qemu-img create -f qcow2 -F qcow2 -b base.qcow2 sn.qcow2
Formatting 'sn.qcow2', fmt=qcow2 size=1073741824 backing_file=base.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16

2. rebase snapshot to an non-existing backing file
# qemu-img rebase -b new.qcow2 -F qcow2 -f qcow2 sn.qcow2 
qemu-img: Could not open new backing file 'new.qcow2': Could not open 'new.qcow2': No such file or directory
# echo $?    --------> return "1"
1
# qemu-img info sn.qcow2 
image: sn.qcow2
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 196K
cluster_size: 65536
backing file: base.qcow2
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

3. Use option "-u" to rebase snapshot to an non-existing backing file.
# qemu-img rebase -u -b new.qcow2 -F qcow2 -f qcow2 sn.qcow2 
# echo $?
0
# qemu-img info sn.qcow2 
image: sn.qcow2
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 196K
cluster_size: 65536
backing file: new.qcow2
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

Comment 16 John Snow 2017-10-09 20:48:20 UTC
For various reasons, emitting a warning was the safest thing to do to preserve backwards compatibility until a future version where we are able to fail the command automatically.

The upstream QEMU deprecation process gives us a timeframe of about two major releases before I am allowed to fail the command automatically.

As the image is created successfully, '0' is the correct return code here. After all, that we couldn't open the backing image is only a warning -- it didn't actually change how we created the image.

Rebase, however, is actually failing the operation here:
sn.qcow2 uses base.qcow2 as a backing image prior to the operation.
We attempt to have it use new.qcow2, but the operation fails.

Afterwards, you can see it is still using base.qcow2.

I think the error codes here are correct. In two more upstream releases, this warning will become an error, it will return 1 and the image will not be created.

Does this make sense? Please let me know.
John

Comment 17 Ping Li 2017-10-10 01:51:36 UTC
(In reply to John Snow from comment #16)
> For various reasons, emitting a warning was the safest thing to do to
> preserve backwards compatibility until a future version where we are able to
> fail the command automatically.
> 
> The upstream QEMU deprecation process gives us a timeframe of about two
> major releases before I am allowed to fail the command automatically.
> 
> As the image is created successfully, '0' is the correct return code here.
> After all, that we couldn't open the backing image is only a warning -- it
> didn't actually change how we created the image.
> 
> Rebase, however, is actually failing the operation here:
> sn.qcow2 uses base.qcow2 as a backing image prior to the operation.
> We attempt to have it use new.qcow2, but the operation fails.
> 
> Afterwards, you can see it is still using base.qcow2.
> 
> I think the error codes here are correct. In two more upstream releases,
> this warning will become an error, it will return 1 and the image will not
> be created.
> 
> Does this make sense? Please let me know.
> John

In my opinion, we have provide an option "-u" for create snapshot on non-existing backing file. So the behaviour should be different from without specifying option "-u". Now we only give a warning message. But I found you have added documentation in the qemu-img manpage as follow. So I think it is acceptable.

+Note that a given backing file will be opened to check that it is valid. Use
+the @code{-u} option to enable unsafe backing file mode, which means that the
+image will be created even if the associated backing file cannot be opened. A
+matching backing file must be created or additional options be used to make the
+backing file specification valid when you want to use an image created this
+way.

Comment 18 Ping Li 2017-10-10 01:59:19 UTC
Set the bug as verified according to the comment 9, comment 15 and comment 17

Comment 21 errata-xmlrpc 2018-04-11 00:09:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104