2019265 – Scratch disk created with wrong initial size

Keywords:
Status:	CLOSED DUPLICATE of bug 2018971
Alias:	None
Product:	ovirt-engine
Classification:	oVirt
Component:	BLL.Storage
Sub Component:
Version:	4.4.8.2
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Eyal Shenitzky
QA Contact:	Avihai
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2021-11-02 00:24 UTC by Nir Soffer
Modified:	2021-11-02 00:29 UTC
CC List:	1 user
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2021-11-02 00:29:14 UTC
oVirt Team:	---
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Issue Tracker	RHV-43893	0	None	None	None	2021-11-02 00:26:28 UTC

Description Nir Soffer 2021-11-02 00:24:15 UTC

Description of problem:

When starting a backup, engine should estimate the maximum size needed for
each scratch disk, and create the scratch disk with this initial size.

I started a VM with 3 disks:
- os-disk.qcow2 - new disk based on template with ~2g of data
- data-disk.qcow2 - new empty 2g qcow2 disk (actual size 1g)
- data-disk.raw - new empty 2g raw disk (actual size 2g)

Then started a backup:
$ ./backup_vm.py -c engine-dev start 70431c48-16d5-4f9e-b2cc-67df802e5705
[   0.0 ] Starting full backup for VM '70431c48-16d5-4f9e-b2cc-67df802e5705'
[   0.3 ] Waiting until backup '6b96f61d-ef60-4367-aa41-34b942887327' is ready
[  19.5 ] Created checkpoint '0c4d8eca-e01a-405f-aebb-2f89be4cc35b'
(to use in --from-checkpoint-uuid for the next incremental backup)
[  19.6 ] Backup '6b96f61d-ef60-4367-aa41-34b942887327' is ready

After starting the backup, we got:

scratch disk for os-disk: actual size 1g - should be 2g (allocated size) 
scratch disk for data-disk.qcow2: actual size 1g - correct
scratch disk for data-disk.raw: actual size 1g - should be 2g (virtual size)

Based on testing, qemu writes to the scratch disk only what was allocated
when the backup was started.

For a raw disk, the entire disk is always allocated, so the maximum scratch
disk size is the virtual size of the image.

For a qcow2 image, engine currently uses the size of the top volume. This works
only for a disk with a single volume. For disks with snapshots or based on a
template, engine must consider all the data allocated in the entire chain at
the time of the backup.

To calculate the allocated size, we can use:

    qemu-img measure -O qcow2 /path/to/top/volume

using the existing Volume.measure() vdsm API.

Since the guest may write data after we measure but before we start the backup,
we can add additional space on top of the measured size.
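
The sizing rules above can be sketched as follows. This is only an
illustration of the reasoning in this report, not engine code: the function
name, its parameters and the 1g default for the extra space are hypothetical,
and the measured value is assumed to come from qemu-img measure on the top
volume.

```python
GiB = 1024**3


def estimate_scratch_size(disk_format, virtual_size, measured_required=None,
                          extra=GiB):
    """Return a rough upper bound, in bytes, for a scratch disk's initial size.

    Hypothetical helper. It does not add qcow2 metadata overhead for the
    scratch disk itself.
    """
    if disk_format == "raw":
        # A raw disk is always fully allocated, so qemu may copy up to the
        # entire virtual size into the scratch disk.
        return virtual_size
    if disk_format == "qcow2":
        if measured_required is None:
            raise ValueError("qcow2 disks need the measured size of the chain")
        # Measured size of the whole chain, plus room for guest writes that
        # happen after measuring but before the backup starts.
        return measured_required + extra
    raise ValueError(f"unsupported disk format: {disk_format!r}")
```

For the 2g raw disk in this report the sketch returns 2g, not the 1g that was
actually allocated.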

Version-Release number of selected component (if applicable):
4.5.0-0.0.master.20211012155641.git83f724e492a.el8

How reproducible:
Always, code is wrong.

Steps to Reproduce:

Raw disk:
1. Start a VM with a 10g raw disk
2. Start backup
3. Verify that scratch disk initial size is at least 10g
   (we allocate about 11g to leave room for qcow2 metadata)

Qcow2 disk based on a template:
1. Start VM with qcow2 disk based on template
2. Start backup
3. Verify that scratch disk initial size is at least 1g more than the size
   reported by qemu-img measure.

To measure the disk you can use:

    # virsh -r dumpxml vm-name

Find the disk path in the xml, then measure the disk with qemu-img measure:

    # qemu-img measure -O qcow2 /path/to/disk/from/xml
    ...
    required: 2684354560

(this is only an example, your actual disk may be smaller or larger)
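
If you want to script this step, qemu-img measure can also print JSON with
--output=json. A minimal sketch, assuming qemu-img is installed on the host
and the path is readable; the helper names are made up for this example:

```python
import json
import subprocess


def parse_measure(output):
    """Return the "required" value from qemu-img measure JSON output."""
    return json.loads(output)["required"]


def measure_required(path):
    """Run qemu-img measure on path and return the required size in bytes."""
    cmd = ["qemu-img", "measure", "--output=json", "-O", "qcow2", path]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return parse_measure(result.stdout)
```

For example, measure_required("/path/to/disk/from/xml") would return 2684354560
for the disk in the example output above.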

Add 1g and multiply by 1.1:

    (2684354560 + 1073741824) * 1.1 = 4133906022

Get the backup xml:

    # virsh -r backup-dumpxml vm-name

Find the scratch disk path in the xml:

    /rhev/data-center/mnt/blockSD/domain-id/images/disk-id/volume-id
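
The scratch disk paths can also be pulled out of the backup xml with a few
lines of Python. This is a sketch that assumes the backup xml contains disk
elements with a scratch child carrying a dev attribute (block storage) or a
file attribute (file storage); the sample xml below is hand written for
illustration, not real output from this setup.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, using the placeholder path from this report.
SAMPLE_XML = """
<domainbackup mode='pull'>
  <disks>
    <disk name='sda' backup='yes' type='block'>
      <scratch dev='/rhev/data-center/mnt/blockSD/domain-id/images/disk-id/volume-id'/>
    </disk>
    <disk name='sdb' backup='no'/>
  </disks>
</domainbackup>
"""


def scratch_paths(backup_xml):
    """Map disk name to scratch disk path for disks that have a scratch disk."""
    paths = {}
    for disk in ET.fromstring(backup_xml).iter("disk"):
        scratch = disk.find("scratch")
        if scratch is None:
            continue  # disk is not part of the backup
        path = scratch.get("dev") or scratch.get("file")
        if path:
            paths[disk.get("name")] = path
    return paths


if __name__ == "__main__":
    print(scratch_paths(SAMPLE_XML))
```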

Check the size of the logical volume:

    # lvs domain-id/volume-id

The size should be 3.875g.
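
The expected value can be checked with a short calculation. This sketch
assumes the logical volume is rounded up to a multiple of 128 MiB extents,
which is what makes 4133906022 bytes come out as 3.875g; the extent size and
the helper names are assumptions of this example, not taken from the report.

```python
GiB = 1024**3
EXTENT = 128 * 1024**2  # assumed LVM extent size on the block storage domain


def ceil_div(a, b):
    """Integer division rounding up."""
    return -(-a // b)


def expected_lv_size(required, extra=GiB, extent=EXTENT):
    """Return (required + extra) * 1.1 rounded up to whole extents, in bytes."""
    # Use integer math (11/10) to avoid floating point rounding.
    size = ceil_div((required + extra) * 11, 10)
    return ceil_div(size, extent) * extent


if __name__ == "__main__":
    size = expected_lv_size(2684354560)
    print(size, size / GiB)
```

With the example value from qemu-img measure, this gives 31 extents.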

Comment 1 Nir Soffer 2021-11-02 00:29:14 UTC

I see that Eyal already filed bug 2018971 for this.

Closing as duplicate.

*** This bug has been marked as a duplicate of bug 2018971 ***