Description of problem:

When creating a backup for a raw disk, engine should estimate the maximum size needed for a scratch disk, and create the scratch disk with this initial size.

I started a VM with 3 disks:

- os-disk.qcow2 - new disk based on template with ~2g of data
- data-disk.qcow2 - new empty 2g qcow2 disk (actual size 1g)
- data-disk.raw - new empty 2g raw disk (actual size 2g)

Then started a backup:

$ ./backup_vm.py -c engine-dev start 70431c48-16d5-4f9e-b2cc-67df802e5705
[   0.0 ] Starting full backup for VM '70431c48-16d5-4f9e-b2cc-67df802e5705'
[   0.3 ] Waiting until backup '6b96f61d-ef60-4367-aa41-34b942887327' is ready
[  19.5 ] Created checkpoint '0c4d8eca-e01a-405f-aebb-2f89be4cc35b' (to use in --from-checkpoint-uuid for the next incremental backup)
[  19.6 ] Backup '6b96f61d-ef60-4367-aa41-34b942887327' is ready

After starting the backup, we got:

- scratch disk for os-disk: actual size 1g - should be 2g (allocated size)
- scratch disk for data-disk.qcow2: actual size 1g - correct
- scratch disk for data-disk.raw: actual size 1g - should be 2g (virtual size)

Based on testing, qemu writes to the scratch disk only what was allocated when the backup was started.

For a raw disk, the entire disk is always allocated, so the maximum scratch disk size is the virtual size of the image.

For a qcow2 image, engine currently uses the size of the top volume. This works only for a disk with a single volume. For disks with snapshots or based on a template, engine must consider all the data allocated in the entire chain at the time of the backup.

To calculate the allocated size, we can use:

    qemu-img measure -O qcow2 /path/to/top/volume

using the existing Volume.measure() vdsm API. Since the guest may write data after we measure but before we start the backup, we can add additional space.

Version-Release number of selected component (if applicable):

4.5.0-0.0.master.20211012155641.git83f724e492a.el8

How reproducible:

Always, code is wrong.

Steps to Reproduce:

Raw disk:

1. Start VM with 10g raw disk
2. Start backup
3. Verify that scratch disk initial size is at least 10g (we allocate about 11g to leave room for qcow2 metadata)

Qcow2 disk based on a template:

1. Start VM with qcow2 disk based on template
2. Start backup
3. Verify that scratch disk initial size is the size reported by qemu-img measure plus 1g, plus 10%.

To measure the disk you can use:

    # virsh -r dumpxml vm-name

Find the disk path in the xml, and measure the disk with qemu-img measure:

    # qemu-img measure -O qcow2 /path/to/disk/from/xml
    ...
    required: 2684354560

(this is only an example, your actual disk may be smaller or larger)

Add 1g, then 10%:

    (2684354560 + 1073741824) * 1.1 = 4133906022

Get the backup xml:

    # virsh -r backup-dumpxml vm-name

Find the scratch disk path in the xml:

    /rhev/data-center/mnt/blockSD/domain-id/images/disk-id/volume-id

Check the size of the logical volume:

    # lvs domain-id/volume-id

The size should be 3.875g (4133906022 bytes is about 3.85g; the logical volume size is rounded up to whole extents).
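To make the verification arithmetic above easier to repeat, here is a minimal Python sketch of the qcow2 case. It is not engine code. The 1g of extra space and the 1.1 factor come from the steps above; the 128m extent size is an assumption about the block storage domain, and the function names are made up for this example. The "required" value can be read from the output of qemu-img measure with --output json.

```python
import json

MiB = 1024**2
GiB = 1024**3

# Values taken from the verification steps in this report.
EXTRA_SPACE = 1 * GiB   # room for guest writes after measuring
FACTOR_PERCENT = 110    # the "* 1.1" step, kept as an integer

# Assumption, not stated in this report: logical volumes on the
# block storage domain are allocated in 128m extents.
EXTENT_SIZE = 128 * MiB


def parse_required(measure_json):
    """Return the "required" value from `qemu-img measure --output json` text."""
    return json.loads(measure_json)["required"]


def padded_size(required):
    """(required + 1g) * 1.1, truncated to whole bytes."""
    return (required + EXTRA_SPACE) * FACTOR_PERCENT // 100


def expected_lv_size(required):
    """Round the padded size up to a whole number of extents."""
    size = padded_size(required)
    return (size + EXTENT_SIZE - 1) // EXTENT_SIZE * EXTENT_SIZE


if __name__ == "__main__":
    # Example value from this report; a real disk will differ.
    required = parse_required('{"required": 2684354560}')
    print(padded_size(required))             # bytes before extent rounding
    print(expected_lv_size(required) / GiB)  # size to compare with lvs output
```

With the example value from the steps above, the padded size is 4133906022 bytes and the rounded size is 3.875g, which is the value to compare with the lvs output. The raw disk case is not covered here.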
I see that Eyal already filed bug 2018971 for this. Closing as duplicate.

*** This bug has been marked as a duplicate of bug 2018971 ***