Bug 2018971 - [CBT][Veeam] Scratch disks on block-based storage domain created with the wrong initial size.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.4.9
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.4.10
: 4.4.10
Assignee: Arik
QA Contact: Amit Sharir
URL:
Whiteboard:
Duplicates: 2019265
Depends On:
Blocks:
 
Reported: 2021-11-01 11:55 UTC by Eyal Shenitzky
Modified: 2022-02-02 10:34 UTC

Fixed In Version: ovirt-engine-4.4.10
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-01-19 07:00:13 UTC
oVirt Team: Storage
pm-rhel: ovirt-4.4+
asharir: testing_plan_complete+



Links
System                 ID         Private  Priority          Status  Summary                                            Last Updated
Red Hat Issue Tracker  RHV-43891  0        None              None    None                                               2021-11-01 11:58:53 UTC
oVirt gerrit           117173     0        master            MERGED  core: allow configurable backup scratch disk size  2021-11-08 09:21:24 UTC
oVirt gerrit           117507     0        ovirt-engine-4.4  MERGED  core: allow configurable backup scratch disk size  2021-11-09 14:33:46 UTC

Internal Links: 2043175

Description Eyal Shenitzky 2021-11-01 11:55:01 UTC
Description of problem:

When starting a live VM backup, a scratch disk is created for each disk that participates in the backup.

When the backed-up disk resides on a block-based storage domain, the scratch disk
is created with the wrong initial size, which can cause the VM to pause.

For a RAW block-based scratch disk, the initial size is set to 0.
For a COW block-based scratch disk, the initial size is set according to the active volume size.

The initial size should instead be set as follows:

For a RAW block-based scratch disk, the initial size should be the backed-up disk's actual size.
For a COW block-based scratch disk, the initial size should be obtained by measuring the size of the volume chain.

Also, for RAW block-based disks, the scratch disk size should be the backed-up disk's actual size.


Version-Release number of selected component (if applicable):
4.5 - master

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with a RAW / COW block-based disk
2. Start the VM
3. Start live VM backup

Actual results:
The scratch disk is created with the wrong size / initial size

Expected results:
The scratch disk should be created with the sizes described above.

Additional info:
This bug will be fixed together with the option to configure the initial size for block-based scratch disks and can be tested by setting the 'BackupBlockScratchDiskInitialSizePercents' configuration value to 100%.

Comment 1 Eyal Shenitzky 2021-11-01 12:54:04 UTC
> This bug will be fixed together with the option to configure the initial
> size for block-based scratch disks and can be tested by setting the
> 'BackupBlockScratchDiskInitialSizePercents' configuration value to 100%.

bug 2018986

Comment 2 Nir Soffer 2021-11-02 00:29:14 UTC
*** Bug 2019265 has been marked as a duplicate of this bug. ***

Comment 3 Nir Soffer 2021-11-02 00:32:04 UTC
How to reproduce and verify:

Steps to Reproduce:

Raw disk:
1. Start vm with 10g raw disk
2. Start backup
3. Verify that the scratch disk initial size is 11g
   (we allocate about 1g extra for qcow2 metadata)

Qcow2 disk based on a template:
1. Start VM with qcow2 disk based on template
2. Start backup
3. Verify that scratch disk initial size is 1g more than the size reported
   by qemu-img measure.

To measure the disk you can use:

    # virsh -r dumpxml vm-name

Find the disk path in the xml, then measure the disk with qemu-img measure:

    # qemu-img measure -O qcow2 /path/to/disk/from/xml
    ...
    required: 2684354560

(this is only an example, your actual disk may be smaller or larger)

Add 1g and multiply by 1.1:

    (2684354560 + 1073741824) * 1.1 = 4133906022

Get the backup xml:

    # virsh -r backup-dumpxml vm-name

Find the scratch disk path in the xml

    /rhev/data-center/mnt/blockSD/domain-id/images/disk-id/volume-id

Check the size of the logical volume:

    # lvs domain-id/volume-id

The size should be 3.875g (4133906022 bytes rounded up to the next multiple of 128 MiB).
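
The arithmetic in the steps above can be sketched in Python. This is only an illustration of the calculation described in this comment, not the engine's implementation: the helper name `expected_scratch_lv_size` is hypothetical, and the 128 MiB round-up unit is an assumption taken from the verification in comment 10.

```python
GIB = 1024**3
EXTENT = 128 * 1024**2  # assumed round-up unit (128 MiB), see comment 10


def expected_scratch_lv_size(required, extra=GIB, extent=EXTENT):
    """Hypothetical helper: (required + extra) * 1.1, rounded up to extent.

    `required` is the "required" value reported by qemu-img measure.
    Integer arithmetic is used to avoid floating point rounding.
    """
    padded = -(-(required + extra) * 11 // 10)  # ceiling of (required + extra) * 1.1
    return -(-padded // extent) * extent        # round up to a multiple of extent


size = expected_scratch_lv_size(2684354560)
print(size, size / GIB)  # 4160749568 3.875
```

With the example value from qemu-img measure above, this gives 31 units of 128 MiB, which matches the 3.875g expected from lvs.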

Comment 10 Amit Sharir 2021-12-20 07:44:18 UTC
Version:
ovirt-engine-4.4.10-0.17.el8ev.noarch / vdsm-4.40.100.1-1.el8ev.x86_64

For the raw disk flow, everything works as expected.
For the QCOW disk on block storage, we also need to round up the volume size to a multiple of 128 MiB.

Verification flow for qcow2:

1. Set MaxBackupBlockScratchDiskInitialSizePercents to 100% and MinBackupBlockScratchDiskInitialSizeInGB to 1. (via vdsm run - <engine-config -s "MaxBackupBlockScratchDiskInitialSizePercents=100">)
2. Start VM with qcow2 disk based on a template.
3. Start backup - via API
4. Use <virsh -r dumpxml vm-name> on vdsm to find the disk path in the XML. 
5. For the disk of the template I got the following size: 

# qemu-img measure -O qcow2 /rhev/data-center/mnt/blockSD/d106a99f-ed75-4a3f-b50c-6bd002bede3a/images/0e291e06-5089-4321-ad1a-e63200488b8f/3277f5e3-c351-44e0-821f-d452b85eab4d
required size: 3385655296
fully allocated size: 10739318784
bitmaps size: 0
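
To reuse these numbers in a script, the human-readable output above can be parsed with a few lines of Python. This is a minimal sketch: `parse_measure` is a hypothetical helper, and the sample text is copied from the output shown in this comment. qemu-img measure can also emit JSON with --output=json, which may be more convenient than parsing text.

```python
def parse_measure(text):
    """Parse 'name: value' lines of human-readable qemu-img measure output."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # Keep only lines of the form "<name>: <integer>".
        if sep and value.isdigit():
            result[key.strip()] = int(value)
    return result


sample = """\
required size: 3385655296
fully allocated size: 10739318784
bitmaps size: 0
"""

info = parse_measure(sample)
print(info["required size"])  # 3385655296
```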

6. Then I used virsh in order to reach the relevant "scratch disk" path.
7. Used command: <vdsm-client StorageDomain dump sd_id=d106a99f-ed75-4a3f-b50c-6bd002bede3a | grep -A 16 018a92c1-c9bd-480e-91ac-df04450f7e58> via vdsm to find the size of the logical volume


        "018a92c1-c9bd-480e-91ac-df04450f7e58": {
            "apparentsize": 4966055936,
            "capacity": 10737418240,
            "ctime": 1638782825,
            "description": "{\"DiskAlias\":\"VM test1100 backup 31f88ccd-7670-469e-b2dc-eb111e6c6820 scratch disk for latest-rhel-guest-image-8.5-infra\",\"DiskDescription\":\"Backup 31f88ccd-7670-469e-b2dc-eb111e6c6820 scratch disk\"}",
            "disktype": "SCRD",
            "format": "COW",
            "generation": 0,
            "image": "11f63d5c-f276-4a30-8223-2c15090fa116",
            "legality": "LEGAL",
            "mdslot": 10,
            "parent": "00000000-0000-0000-0000-000000000000",
            "status": "OK",
            "truesize": 4966055936,
            "type": "SPARSE",
            "voltype": "LEAF"
        },
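
A quick way to check the alignment of the volume shown above is to load the entry as JSON and divide its apparentsize by 128 MiB. This is a sketch under assumptions: the fragment below is a trimmed copy of the entry wrapped in braces to make it valid JSON (the real dump nests volumes inside a larger document), and treating "disktype": "SCRD" as the marker of a scratch disk is inferred from this output only.

```python
import json

EXTENT = 128 * 1024**2  # 128 MiB

# Trimmed copy of the volume entry above, wrapped to form valid JSON.
fragment = """
{
    "018a92c1-c9bd-480e-91ac-df04450f7e58": {
        "apparentsize": 4966055936,
        "capacity": 10737418240,
        "disktype": "SCRD",
        "format": "COW"
    }
}
"""

volumes = json.loads(fragment)

# Map each scratch volume to (number of 128 MiB units, leftover bytes).
aligned = {
    vol_id: divmod(vol["apparentsize"], EXTENT)
    for vol_id, vol in volumes.items()
    if vol["disktype"] == "SCRD"  # assumption: SCRD marks scratch disks
}
print(aligned)
```

A leftover of 0 means the logical volume size is an exact multiple of 128 MiB.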


Summary of disk sizes and calculations:

required size: 3385655296
expected size: (3385655296 + 1073741824) * 1.1 -> 4905336832
actual size:   4966055936

If you round up 4905336832 to a multiple of 128 MiB:

>>> 4905336832 / (128 * 1024**2)
36.547607421875
>>> 37 * (128 * 1024**2)
4966055936 - as expected

Verification Conclusions:
The sizes of the generated scratch disks were correct.

Bug verified.

Comment 11 Sandro Bonazzola 2022-01-19 07:00:13 UTC
This bugzilla is included in oVirt 4.4.10 release, published on January 18th 2022.

Since the problem described in this bug report should be resolved in oVirt 4.4.10 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

Comment 12 Amit Sharir 2022-02-02 09:50:20 UTC
Added Polarion test plan: RHEVM 27932

