Bug 1739149

Summary: cnv-tests: import invalid-qcow-large-size.img should fail for xfs
Product: Container Native Virtualization (CNV) Reporter: Daniel Erez <derez>
Component: Storage Assignee: Maya Rashish <mrashish>
Status: CLOSED ERRATA QA Contact: Qixuan Wang <qixuan.wang>
Severity: low Docs Contact:
Priority: low    
Version: 2.0 CC: alitke, awels, cnv-qe-bugs, danken, mrashish, ncredi, ngavrilo, rgarcia, talayan
Target Milestone: ---   
Target Release: 2.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: virt-cdi-operator-container-v2.3.0-35 hco-bundle-registry-container-v2.2.0-445 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-05-04 19:10:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Daniel Erez 2019-08-08 15:49:25 UTC
Description of problem:
cnv-tests -> test_import_http -> test_import_invalid_qcow:
importing an image ('invalid-qcow-large-size.img'[1]) with a virtual size of 152T
completes successfully on an xfs file system, instead of failing as it does on ext4.

I.e. 
* using ext4:
mount-512-ext4-mount]# qemu-img convert invalid-qcow-large-size.img -O raw invalid-qcow-large-size.raw
qemu-img: invalid-qcow-large-size.raw: error while converting raw: Could not resize file: File too large

* using xfs:
mount-512-xfs-mount]# qemu-img convert invalid-qcow-large-size.img -O raw invalid-qcow-large-size.raw


[1]
qemu-img info invalid-qcow-large-size.img
image: invalid-qcow-large-size.img
file format: qcow
virtual size: 152T (167125767422464 bytes)
disk size: 4.0K
cluster_size: 4096


Steps to Reproduce:
1. Use an ocp 4.1 env with xfs storage.
2. Run 'test_import_invalid_qcow' tests.
3.

Expected results:
Importing 'invalid-qcow-large-size.img' should fail.


Additional info:
Possible solutions:
* Use an image with a larger virtual size that fails also on xfs.
* Determine the file system of the storage before executing the test.
* Remove this specific test as not necessarily relevant for cdi 
  since it's actually testing qemu-img.
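
The second option above (determining the file system before running the test) could be sketched roughly as follows. This is only an illustration, not code from cnv-tests or CDI: the helper names `parse_mounts` and `fs_type_for` are hypothetical, and the sketch assumes the storage path is visible in a /proc/mounts-style table on the host running the check.

```python
import posixpath


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields are: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, mounts_text):
    """Return the fs type of the longest mount point containing an absolute path."""
    path = posixpath.normpath(path)
    best_mount, best_type = "", None
    for mount_point, fs_type in parse_mounts(mounts_text):
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or path.startswith(prefix)
        # ">=" lets a later entry for the same mount point win (over-mounts).
        if inside and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type
```

On a Linux host this could be called as `fs_type_for("/some/storage/path", open("/proc/mounts").read())`, where the path is a placeholder. A test could then expect the import to fail only when the result is `ext4`. Mount points containing spaces are escaped in /proc/mounts and are not handled here.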

Comment 1 Daniel Erez 2019-10-28 16:22:21 UTC
It seems that the failure in tier1 tests[1] is caused by an old qemu-img version on the slave: qemu-img-1.5.3-167.el7_7.1.x86_64
Upgrading to qemu-img-ev-2.12.0-18.el7_6.5.1.x86_64 resolves the issue.

@Tareq - is it possible to install qemu-img-ev-2.12.0-18 on the jenkins slaves, so it can be used by the test-cdi-ocp-4.2-cnv-2.1 job?

[1]
https://cnv-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/test-cdi-ocp-4.2-cnv-2.1/118/testReport/(root)/Tests%20Suite/_vendor_cnv_qe_redhat_com__level_component_DataVolume_tests_Verify_DataVolume_should__rfe_id_1120__crit_high__posneg_negative__test_id_2555_fail_creating_import_dv__invalid_qcow_large_size/

Comment 2 Daniel Erez 2019-10-29 14:42:50 UTC
After some more testing, it seems that the issue is with qemu-img in the cdi-importer container, which is based on ubi8:

Using rhel8 (qemu-img 3.1.0)
$ qemu-img info afl1.img
image: afl1.img
file format: raw
virtual size: 9.5K (9728 bytes)
disk size: 12K
$ qemu-img --version
qemu-img version 3.1.0 (qemu-kvm-3.1.0-30.module+el8.0.1+3755+6782b0ed)

Whereas, using fedora 30 (qemu-img 4.1.0):
$ qemu-img info afl1.img
file format: qcow 
virtual size: 152 TiB (167125767422464 bytes) 
disk size: unavailable 
cluster_size: 4096
$ qemu-img --version
qemu-img version 4.1.0 (qemu-4.1.0-4.fc30)

I.e. qemu-img info erroneously returns 'file format: raw' in the cdi-importer container.
So we should find out whether this issue is specific to ubi8.
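
One way to cross-check what qemu-img reports is to read the image header directly. The sketch below is an assumption-laden illustration, not part of CDI: it relies on the qcow and qcow2 header layouts as I understand them (4-byte magic "QFI\xfb", big-endian 32-bit version at offset 4, big-endian 64-bit virtual size at offset 24), and the function name `probe_qcow_header` is hypothetical.

```python
import struct

# Magic bytes shared by the qcow and qcow2 formats (assumed from the format docs).
QCOW_MAGIC = b"QFI\xfb"


def probe_qcow_header(header_bytes):
    """Return (version, virtual_size) for a qcow-style header, else None."""
    if len(header_bytes) < 32 or header_bytes[:4] != QCOW_MAGIC:
        return None
    # Version is a big-endian uint32 at offset 4.
    (version,) = struct.unpack_from(">I", header_bytes, 4)
    # Virtual size is a big-endian uint64 at offset 24 (assumed layout).
    (virtual_size,) = struct.unpack_from(">Q", header_bytes, 24)
    return version, virtual_size
```

Reading the first 32 bytes of the image (for example `open("afl1.img", "rb").read(32)`) and passing them to this function would show whether the file carries a qcow magic, regardless of which format a given qemu-img build chooses to report.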

Comment 3 Maya Rashish 2019-11-11 10:54:09 UTC
The reason it fails on ext4 is that the output raw file is larger than the largest single file supported by ext4.
It would fail even if the filesystem had enough space.
XFS isn't as limited.
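
As a rough worked example of the size comparison: the virtual size is taken from the qemu-img info output in the description, while the two limits are assumptions (the commonly documented 16 TiB per-file maximum for ext4 with 4 KiB blocks, and 8 EiB for XFS); actual limits depend on how the file system was created.

```python
TIB = 2 ** 40
EIB = 2 ** 60

# Virtual size reported by qemu-img info in the description.
VIRTUAL_SIZE = 167125767422464

# Assumed per-file limits; not taken from this bug report.
EXT4_MAX_FILE_SIZE = 16 * TIB  # ext4 with 4 KiB blocks
XFS_MAX_FILE_SIZE = 8 * EIB


def fits_in_single_file(size, limit):
    """Return True if a file of `size` bytes is within the per-file limit."""
    return size <= limit


print("virtual size in whole TiB:", VIRTUAL_SIZE // TIB)
print("fits on ext4:", fits_in_single_file(VIRTUAL_SIZE, EXT4_MAX_FILE_SIZE))
print("fits on xfs:", fits_in_single_file(VIRTUAL_SIZE, XFS_MAX_FILE_SIZE))
```

Under these assumed limits the raw output exceeds what ext4 accepts for a single file regardless of free space, while XFS accepts it as a sparse file.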

Comment 4 Natalie Gavrielov 2019-11-13 13:55:43 UTC
Daniel, you moved this to post, is there a pr you can link here?

Comment 5 Daniel Erez 2019-11-13 14:45:01 UTC
(In reply to Natalie Gavrielov from comment #4)
> Daniel, you moved this to post, is there a pr you can link here?

https://github.com/kubevirt/containerized-data-importer/pull/1001

Already merged, can move to modified.

Comment 6 Alexander Wels 2019-11-27 14:10:12 UTC
@Nelly,

So this fix is part of the functional test suite, so there is no 'release' to put in the 'fixed in version'. Is that a required field?

Comment 7 Nelly Credi 2019-11-28 14:15:51 UTC
According to comment #5 this is a code fix in CDI,
so you should be able to tell which tag it is in and match the build version.
Or am I missing something?

Comment 8 Alexander Wels 2019-12-02 14:20:19 UTC
The fix is a fix in a functional test, not actual CDI code. So there is no tag or version.

Comment 9 Natalie Gavrielov 2019-12-04 13:26:03 UTC
fixed in version is kind of short here, not the familiar format, can you please add a correct fixed in version?

Comment 10 Natalie Gavrielov 2019-12-18 13:22:32 UTC
Verification for this issue will be running the automation (tier2)

Comment 13 Maya Rashish 2020-02-20 14:46:57 UTC
With https://github.com/kubevirt/containerized-data-importer/pull/1119/files#diff-b27f2bd8770fa661f7a6114f7e8d25a3L139 we have a pretty new qemu-img.
This makes the too-large qcow2 image fail validation, independent of the filesystem used.

Additionally, Alexander has mentioned that we've switched to pre-allocating the space used for the images, so the ability to have a 170TB sparse file on XFS shouldn't cause this test to unexpectedly pass.
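
The sparse-file behaviour mentioned above can be probed with a small sketch like the one below. It is an illustration only, not CDI code; the function name `can_hold_sparse_file` is hypothetical, and it assumes a POSIX system where `os.ftruncate` is available. It extends an empty temporary file to the requested size without writing data, which is roughly what the resize step that failed on ext4 ("Could not resize file: File too large") has to do.

```python
import errno
import os
import tempfile


def can_hold_sparse_file(directory, size):
    """Return True if `directory`'s file system accepts a sparse file of `size` bytes."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Extending an empty file allocates no data blocks on most file systems.
        os.ftruncate(fd, size)
        return True
    except OSError as exc:
        # EFBIG corresponds to "File too large"; the others are treated as refusals too.
        if exc.errno in (errno.EFBIG, errno.EINVAL, errno.ENOSPC):
            return False
        raise
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling it with the virtual size from the description (167125767422464 bytes) on an ext4 and an xfs mount should show the difference discussed in this bug. With pre-allocation the file system must actually provide the space, so a sparse-only check like this one no longer reflects what the import does.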

Comment 14 Maya Rashish 2020-02-20 14:51:53 UTC
Alexander, I've linked your pull request to update to Fedora 31 and re-enable the "large qcow" image to this bug report.
This bug report is targeted to the 2.3.0 version.

I can see that the test passed even before your Fedora 31 update; it had just been disabled.

Do you agree that there's no code that has to be backported to 2.3.0, because the test would already pass on this version?

Comment 15 Alexander Wels 2020-02-24 14:04:31 UTC
The problem was a bug in qemu-img, which has been fixed in RHEL 8.2. No CDI code has changed.

Comment 16 Qixuan Wang 2020-02-28 13:14:40 UTC
Currently, we have qemu-img-4.1.0-23.module+el8.1.1+5467+ba2d821b.x86_64 on the OCP4.4 + CNV2.3 cluster. Waiting for the qemu update.

Comment 17 Adam Litke 2020-03-10 20:41:33 UTC
As indicated by previous comments this bug is resolved.

Comment 20 errata-xmlrpc 2020-05-04 19:10:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:2011