Bug 1712832 - Storage migration of a compressed image is failing with error "No space left on device" in block storage domain
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 4.3.1
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.4.0
Target Release: 4.4.0
Assignee: Benny Zlotnik
QA Contact: Ilan Zuckerman
URL:
Whiteboard:
Depends On:
Blocks: 1547336
Reported: 2019-05-22 10:52 UTC by nijin ashok
Modified: 2023-10-06 18:19 UTC

Fixed In Version: vdsm-4.40.7, ovirt-engine-4.4.0 gitb5b5c99ca2f
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-08-04 13:27:06 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:
lsvaty: testing_plan_complete-



Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1727678 0 unspecified CLOSED Create VM from template from glance fails with 'enospc', engine hits NPE and VM remains locked 2022-01-18 12:20:41 UTC
Red Hat Knowledge Base (Solution) 3186081 0 Troubleshoot None RHV: certain uploaded disks fail to move and copy 2019-07-09 01:28:34 UTC
Red Hat Product Errata RHEA-2020:3246 0 None None None 2020-08-04 13:27:38 UTC
oVirt gerrit 105999 0 master MERGED core: add MeasureVolume API call 2020-10-30 12:03:39 UTC
oVirt gerrit 106000 0 master MERGED core: use Volume.measure to determine size 2020-10-30 12:03:39 UTC
oVirt gerrit 106032 0 master MERGED storage,api: expose Volume.measure 2020-10-30 12:03:39 UTC
oVirt gerrit 106045 0 master MERGED core: use Volume.measure for copy-collapse 2020-10-30 12:03:41 UTC
oVirt gerrit 106055 0 master MERGED core: change return value of tearDownImage 2020-10-30 12:03:41 UTC
oVirt gerrit 106955 0 master MERGED core: add FeatureSupported for measure volume 2020-10-30 12:03:40 UTC
oVirt gerrit 106956 0 master MERGED core: add generic host filter 2020-10-30 12:03:41 UTC

Internal Links: 1727678

Description nijin ashok 2019-05-22 10:52:53 UTC
Description of problem:

vdsm uses "qemu-img convert" to move or copy images from one storage domain to another. The convert runs without the "compress" option, so compressed clusters are written uncompressed and the destination image ends up larger than the source. However, RHV uses the output of "qemu-img info" to size the destination LV, which is too small for the uncompressed image. As a result, "qemu-img convert" fails with the error "No space left on device".

Tested with the cfme-rhevm qcow2 image.

===
qemu-img check cfme-rhevm-5.9.3.4-1.x86_64.qcow2
No errors were found on the image.
51654/655360 = 7.88% allocated, 96.31% fragmented, 95.35% compressed clusters
Image end offset: 1200685056

qemu-img info cfme-rhevm-5.9.3.4-1.x86_64.qcow2
image: cfme-rhevm-5.9.3.4-1.x86_64.qcow2
file format: qcow2
virtual size: 40G (42949672960 bytes)
disk size: 1.1G
cluster_size: 65536
Format specific information:
    compat: 0.10
    refcount bits: 16
===
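
As a rough illustration of the gap, the figures in the "qemu-img check" output above can be turned into a size estimate. This is only a sketch under stated assumptions: it treats every allocated cluster as needing a full 64 KiB once uncompressed and ignores qcow2 metadata. The helper name is made up for this example and is not part of vdsm or qemu.

```python
import re

CLUSTER_SIZE = 65536  # cluster_size reported by "qemu-img info" above


def uncompressed_estimate(check_line, cluster_size=CLUSTER_SIZE):
    """Estimate the bytes needed to store all allocated clusters uncompressed.

    check_line is the allocation summary line printed by "qemu-img check".
    Hypothetical helper for illustration; qcow2 metadata is ignored.
    """
    match = re.match(r"\s*(\d+)/(\d+) = ", check_line)
    if match is None:
        raise ValueError("unrecognized qemu-img check summary line")
    allocated_clusters = int(match.group(1))
    return allocated_clusters * cluster_size


line = "51654/655360 = 7.88% allocated, 96.31% fragmented, 95.35% compressed clusters"
image_end_offset = 1200685056  # "Image end offset" from the output above

estimate = uncompressed_estimate(line)
print(f"compressed image end offset: {image_end_offset / 2**30:.2f} GiB")
print(f"uncompressed data estimate:  {estimate / 2**30:.2f} GiB")
```

With these numbers the estimate is about 3.15 GiB against an image end offset of about 1.12 GiB, so an LV sized from the source image cannot hold the converted data.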

Uploaded the disk to RHV-M and tried both LSM and offline disk migration. Both fail during the "qemu-img convert" phase.

As per bug 1470435, the issue was fixed in "copyCollapsed", which is used for VM clone, template creation, and similar flows. It uses "qemu-img measure" to calculate the destination image size, and the LV is created with that size. However, sdm_copy_data does not use "qemu-img measure"; it still uses "imageInitialSizeInBytes" to create the destination LV, which causes this issue.
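
The sizing approach described above could look roughly like the sketch below. It is not vdsm's actual code. It assumes the JSON printed by "qemu-img measure --output json", which reports "required" and "fully-allocated" byte counts. The function name, the 128 MiB extent size, and the sample numbers are assumptions made for illustration.

```python
import json

EXTENT_SIZE = 128 * 1024**2  # assumed LVM extent size; adjust for the real VG


def lv_size_from_measure(measure_json, extent_size=EXTENT_SIZE):
    """Return an LV size in bytes that can hold the converted image.

    measure_json is the text printed by:
        qemu-img measure --output json -f qcow2 -O qcow2 <source>
    The "required" value is rounded up to a whole number of extents.
    """
    required = json.loads(measure_json)["required"]
    extents = -(-required // extent_size)  # ceiling division
    return extents * extent_size


# Illustrative numbers only, not taken from the image in this report.
sample = '{"required": 3386114048, "fully-allocated": 42956488704}'
print(lv_size_from_measure(sample))
```

Sizing the LV from "required" rather than from the source image's current size accounts for clusters that grow when they are written uncompressed.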


Version-Release number of selected component (if applicable):

rhvm-4.3.3
vdsm-4.30.13-4.el7ev.x86_64


How reproducible:

100%


Steps to Reproduce:

1. Upload a compressed image to a block storage domain.
2. Run LSM or offline disk migration to another block storage domain. Both fail with the error "No space left on device".

Actual results:

Storage migration of a compressed image is failing with error "No space left on device" in the block storage domain

Expected results:

Storage migration of a compressed image should work.

Additional info:

Comment 2 Tal Nisan 2019-05-27 14:19:39 UTC
Not sure we support compressed images, did that scenario used to work in the past?

Comment 3 nijin ashok 2019-05-28 05:31:03 UTC
(In reply to Tal Nisan from comment #2)
> Not sure we support compressed images, did that scenario used to work in the
> past?

I don't think it was working in the past. But we provide appliance images like cloudforms in compressed format and have instructions in the documentation to upload those in RHV. A similar issue was fixed in Bug 1470435 using "qemu-img measure".

Comment 4 Tal Nisan 2019-06-10 14:21:29 UTC
Daniel, what do you think?

Comment 5 Daniel Erez 2019-06-11 10:52:31 UTC
(In reply to Tal Nisan from comment #4)
> Daniel, what do you think?

AFAIK, we don't support it for image chains, i.e. we would need to add support for the compressed format on image chains.

Comment 6 Marina Kalinin 2019-07-08 15:52:48 UTC
(In reply to Tal Nisan from comment #2)
> Not sure we support compressed images, did that scenario used to work in the
> past?

According to BZ 1470435 we do support them, since that BZ was fixed with a note implying we support it.
Also, based on bz 1470435 comment 15 and others, it seems like there is no such thing as a compressed qcow image.

Comment 8 Germano Veit Michel 2019-07-10 03:39:58 UTC
Probably the very same thing here, but using images from glance: BZ1727678

Comment 15 Ilan Zuckerman 2020-03-23 07:56:54 UTC
Verified according to the following steps:

1. Downloaded the "CFME 5.11.3 Red Hat Virtual Appliance (qcow)" image from [1] to a host.
2. Uploaded the qcow image to an iSCSI storage domain using upload_disk.py [2].
3. Attached the disk to a VM created from a rhel8 template.
4. Performed LSM and cold disk migration.

Actual result:
Both migrations succeeded.
Neither failed with the error "No space left on device".

Tested on:
ovirt-engine-4.4.0-0.25.master.el8ev.noarch
vdsm-4.40.5-1.el8ev.x86_64

[1]:
https://access.redhat.com/downloads/content/167/ver=5.0/rhel---8/5.0/x86_64/product-software

[2]:
python3 upload_disk.py cfme-rhevm-5.11.3.1-1.x86_64.qcow2 --engine-url https://storage-ge-09.XXX.XXX --username XXX --disk-format qcow2 --disk-sparse --sd-name iscsi_0 -c /root/ca.pem --insecure
https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/upload_disk.py

Comment 19 errata-xmlrpc 2020-08-04 13:27:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHV RHEL Host (ovirt-host) 4.4), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:3246

Comment 20 Red Hat Bugzilla 2023-09-14 05:28:59 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

