Bug 2034542

Summary: [RFE] reduce size of VM disks that are created from a RAW based template on a block-based storage domain
Product: [oVirt] ovirt-engine
Reporter: Amit Sharir <asharir>
Component: BLL.Storage
Assignee: Albert Esteve <aesteve>
Status: CLOSED DEFERRED
QA Contact: Avihai <aefrat>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.4.10
CC: aefrat, ahadas, bugs, michal.skrivanek, nsoffer
Target Milestone: ---
Keywords: FutureFeature
Target Release: ---
Flags: pm-rhel: ovirt-4.5?
       pm-rhel: planning_ack?
       pm-rhel: devel_ack?
       pm-rhel: testing_ack?
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-05-24 15:12:48 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Amit Sharir 2021-12-21 09:22:40 UTC
Description of problem:
When we clone a VM from a raw template on a block-based storage domain (for example iSCSI), we create a logical volume of the maximum size (virtual size * 1.1).
This behavior wastes a significant amount of storage space for the user.

Version-Release number of selected component (if applicable):
ovirt-engine-4.4.10-0.17.el8ev.noarch

How reproducible:
Create a VM using a RAW disk format on a block-based storage domain.

Steps to Reproduce:
1. Create a template disk in RAW format on a block-based storage domain.
2. Create a VM via UI using the template (when creating the VM enter the resource allocation window and choose: format -> QCOW, target -> ISCSI, Disk profile -> ISCSI)

Actual results:
The logical volume is created at the maximum size (virtual size * 1.1).

Expected results:
The logical volume should be created more efficiently so that less storage is wasted (reduce the disk of the VM cloned from the raw template to its optimal size, as when using "reduce_disk.py").

Additional info:

Template created using RAW format:

<disk href="/ovirt-engine/api/disks/da0ac715-d511-4468-9675-bf3871b2be5b" id="da0ac715-d511-4468-9675-bf3871b2be5b">
  <format>raw</format>
  <provisioned_size>6442450944</provisioned_size>
  <sparse>false</sparse>
  ...
</disk>


Clone from raw template:

Virtual Size: 6 GiB
Actual Size: 6 GiB
disk id: d3c98805-b3c2-4562-9d50-29e46314075d

# lvs -o vg_name,lv_name,size,tags | grep d3c98805-b3c2-4562-9d50-29e46314075d
  feab3738-c158-4d48-8a41-b5a95c057a50 d29f8f4a-4536-4b37-b89d-41739d561343   6.62g IU_d3c98805-b3c2-4562-9d50-29e46314075d,MD_13,PU_00000000-0000-0000-0000-000000000000

# lvchange -ay feab3738-c158-4d48-8a41-b5a95c057a50/d29f8f4a-4536-4b37-b89d-41739d561343

# qemu-img info /dev/feab3738-c158-4d48-8a41-b5a95c057a50/d29f8f4a-4536-4b37-b89d-41739d561343
image: /dev/feab3738-c158-4d48-8a41-b5a95c057a50/d29f8f4a-4536-4b37-b89d-41739d561343
file format: qcow2
virtual size: 6 GiB (6442450944 bytes)
disk size: 0 B
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false

# lvchange -an feab3738-c158-4d48-8a41-b5a95c057a50/d29f8f4a-4536-4b37-b89d-41739d561343

Measuring the template disks:

disk id: da0ac715-d511-4468-9675-bf3871b2be5b

# lvs -o vg_name,lv_name,size,tags | grep da0ac715-d511-4468-9675-bf3871b2be5b
  feab3738-c158-4d48-8a41-b5a95c057a50 be385bd3-3777-4240-95f5-147d56f66c48   6.00g IU_da0ac715-d511-4468-9675-bf3871b2be5b,MD_10,PU_00000000-0000-0000-0000-000000000000

# lvchange -ay feab3738-c158-4d48-8a41-b5a95c057a50/be385bd3-3777-4240-95f5-147d56f66c48

# qemu-img measure -O qcow2 /dev/feab3738-c158-4d48-8a41-b5a95c057a50/be385bd3-3777-4240-95f5-147d56f66c48
required size: 6443696128
fully allocated size: 6443696128


Since the RAW template does not have any metadata, qemu-img measure cannot tell
which areas are allocated and which are not, so it must report that we need
the fully allocated size (6443696128 bytes, 6.001 GiB).
Engine sends this value to vdsm, which always allocates 1.1 * virtual size
for block-based volumes. Allocating 10% more is not needed in this case, but
removing this may break older engines that assume vdsm allocates more.
So we allocate 6.60 GiB, which lvm aligns up to 6.62 GiB.
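The allocation described above can be sketched with a little arithmetic (a minimal illustration, not engine/vdsm code; the 128 MiB VG extent size is assumed here, as the oVirt default for block storage domains):

```python
GiB = 1024**3
EXTENT = 128 * 1024**2  # assumed 128 MiB VG extent size on oVirt block domains

virtual_size = 6442450944            # 6 GiB, from the template disk
requested = int(virtual_size * 1.1)  # vdsm's initial allocation for qcow2 on block storage

# lvm can only allocate whole extents, so the size is rounded up
extents = -(-requested // EXTENT)    # ceiling division
lv_size = extents * EXTENT

print(f"requested: {requested / GiB:.2f} GiB")  # 6.60 GiB
print(f"lv size:   {lv_size / GiB:.3f} GiB")    # 6.625 GiB, shown by lvs as 6.62g
```

This reproduces the 6.62g LV seen in the `lvs` output above.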

Comment 1 Nir Soffer 2021-12-22 09:23:26 UTC
This should be easy to fix - after cloning a disk from raw template on block
storage, use StoragePool.reduceVolume to reduce volume size to optimal size.
We already use this API after cold delete snapshot.
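A hypothetical illustration of what such a reduction would reclaim in this case, assuming the reduce step shrinks the LV to qemu-img measure's "required size" rounded up to whole extents (the 128 MiB extent size is an assumption):

```python
GiB = 1024**3
EXTENT = 128 * 1024**2  # assumed 128 MiB VG extent size

required = 6443696128    # qemu-img measure "required size" for the 6 GiB of data
current_lv = 7113539584  # the 6.62g LV created at clone time (virtual size * 1.1)

optimal_lv = -(-required // EXTENT) * EXTENT  # round up to whole extents
saved = current_lv - optimal_lv

print(f"optimal: {optimal_lv / GiB:.3f} GiB")      # 6.125 GiB
print(f"saved:   {saved / (1024**2):.0f} MiB")     # 512 MiB per cloned disk
```

In this example, roughly half a gibibyte would be reclaimed per disk cloned from the raw template.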

Comment 3 Sandro Bonazzola 2022-03-29 16:16:40 UTC
We are past 4.5.0 feature freeze, please re-target.

Comment 4 Arik 2022-05-11 15:05:28 UTC
(In reply to Nir Soffer from comment #1)
> This should be easy to fix - after cloning a disk from raw template on block
> storage, use StoragePool.reduceVolume to reduce volume size to optimal size.
> We already use this API after cold delete snapshot.

Nir, but why allocating 10% more in the first place? if it's RAW on block storage, wouldn't it mean we should allocate exactly the virtual size (for preallocated volume without metadata)?

Comment 5 Nir Soffer 2022-05-11 15:21:45 UTC
(In reply to Arik from comment #4)
> (In reply to Nir Soffer from comment #1)
> > This should be easy to fix - after cloning a disk from raw template on block
> > storage, use StoragePool.reduceVolume to reduce volume size to optimal size.
> > We already use this API after cold delete snapshot.
> 
> Nir, but why allocating 10% more in the first place? if it's RAW on block
> storage, wouldn't it mean we should allocate exactly the virtual size (for
> preallocated volume without metadata)?

If the target VM disk is raw, vdsm creates a raw image of the right size. But I think
this bug is about creating a qcow2 disk from a raw template, and in this case vdsm creates
a qcow2 image with the maximum allocation (virtual size * 1.1).

When I tested the system Amit worked on, this is what I saw. But this contradicts what
comment 0 says.

I suggest that someone from QE try to reproduce this again and fix the description
to make the issue clearer.

Comment 6 Arik 2022-05-12 10:10:43 UTC
(In reply to Nir Soffer from comment #5)
> When I tested the system Amit worked this is what I saw. But this
> contradicts what comment 0 says.

Yeah, exactly.
I remember we talked about qcow2, but I wasn't sure if it was about this bug when reading comment 0.

Avihai, how would you like to proceed with this one?

Comment 7 Avihai 2022-05-15 07:01:07 UTC
(In reply to Arik from comment #6)
> (In reply to Nir Soffer from comment #5)
> > When I tested the system Amit worked this is what I saw. But this
> > contradicts what comment 0 says.
> 
> Yeah, exactly
> I remember we talked about qcow2 but I wasn't sure if it was on this bug
> when reading comment 0
> 
> Avihai, how would you like to proceed with this one?
I do not want to waste more time retesting it.

The main issue here is:
"2. Create a VM via UI using the template (when creating the VM enter the resource allocation window and choose: format -> RAW, target -> ISCSI, Disk profile -> ISCSI)"

This is a typo/wrong description, and it should be:

2. Create a VM via UI using the template (when creating the VM enter the resource allocation window and choose: format -> COW, target -> ISCSI, Disk profile -> ISCSI)

The additional info in comment#0 shows this is the qcow2 case and not raw.
I have clarified it here; no further action from QE is required, as we know that creating a VM with qcow2 causes the 1.1x increase in storage space.

Comment 8 Arik 2022-05-15 09:53:41 UTC
OK, updated comment 0 accordingly.
And yeah, it's a known issue that would be nice to address in order to optimize storage consumption.

Comment 9 Arik 2022-05-24 15:12:48 UTC
Moved to GitHub: https://github.com/oVirt/ovirt-engine/issues/390