Bug 1971185

Summary: [RFE] Report zero status in dirty extents response
Product: [oVirt] ovirt-imageio Reporter: Nir Soffer <nsoffer>
Component: Common Assignee: Nir Soffer <nsoffer>
Status: CLOSED CURRENTRELEASE QA Contact: sshmulev
Severity: high Docs Contact:
Priority: unspecified    
Version: 2.1.1 CC: bugs, eshames, eshenitz, mlehrer, sfishbai, sgordon, sshmulev, tnisan
Target Milestone: ovirt-4.4.8 Keywords: FutureFeature, Performance, ZStream
Target Release: 2.2.0 Flags: sbonazzo: ovirt-4.4?
sshmulev: testing_plan_complete?
pm-rhel: planning_ack?
pm-rhel: devel_ack+
pm-rhel: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovirt-imageio-2.2.0-1 Doc Type: Enhancement
Doc Text:
Feature: ovirt-imageio now reports the zero status of blocks that changed since a specified checkpoint. Reason: Having the zero status enables a backup application to optimize incremental backup by skipping the download of zeroed areas and adding those areas to the backup in a more efficient way. Result: Using the new information, incremental backup can be faster, consume less network bandwidth, and require less storage space.
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-09-03 10:08:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Nir Soffer 2021-06-12 19:08:52 UTC
Description of problem:

When getting image extents during incremental backup, imageio does not report
the zero status:

    GET /images/{ticket-id}/extents?context=dirty

    [
        {"start": 0, "length": 1073741824, "dirty": false},
        {"start": 1073741824, "length": 9663676416, "dirty": true},
    ]

In this example, the backup application has to copy 9g of data from offset
1073741824.
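
For illustration, here is a minimal sketch (not imageio's official client
API) of how a backup application might request these extents; the daemon
URL and ticket id are hypothetical placeholders:

    import json
    from urllib.request import urlopen

    # Hypothetical placeholders - use the actual imageio daemon URL and the
    # ticket id of the image transfer. TLS setup is omitted for brevity.
    IMAGEIO_URL = "https://host:54322"
    TICKET_ID = "ticket-id"

    def get_dirty_extents(url=IMAGEIO_URL, ticket=TICKET_ID):
        # GET /images/{ticket-id}/extents?context=dirty returns a JSON list
        # of {"start": ..., "length": ..., "dirty": ...} objects.
        with urlopen(f"{url}/images/{ticket}/extents?context=dirty") as resp:
            return json.loads(resp.read())

    for extent in get_dirty_extents():
        print(extent["start"], extent["length"], extent["dirty"])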

But it is possible that some dirty areas actually read as zeroes. For
example, the actual situation may be:

    [
        {"start": 0, "length": 1073741824, "dirty": false, "zero": false},
        {"start": 1073741824, "length": 1073741824, "dirty": true, "zero": false},
        {"start": 2147483648, "length": 7516192768, "dirty": true, "zero": true},
        {"start": 9663676416, "length": 1073741824, "dirty": true, "zero": false},
    ]

In this case the backup application can copy only the non-zero extents, minimizing
the network bandwidth used for reading data from the host and for writing data to
the backup storage, since zero areas can be recorded in the backup without sending
any zeroes over the wire.

Clients could always get this information by merging the zero and dirty extents
themselves, but this was not easy. Current imageio merges extents internally, a
capability introduced for bug 1971182, so reporting the zero status should be
easy to do now.
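
As a sketch of what this enables (this is not the actual backup_vm.py
implementation; the download_range() helper and backup_file handle are
hypothetical), the copy loop of a backup application could look like:

    def copy_dirty_extents(extents, download_range, backup_file):
        # extents: list of {"start", "length", "dirty", "zero"} dicts as
        # returned by GET .../extents?context=dirty with zero reporting.
        for extent in extents:
            if not extent["dirty"]:
                # Unchanged since the last checkpoint - nothing to back up.
                continue
            if extent.get("zero"):
                # Dirty but reads as zeroes: record a zero range in the
                # backup instead of transferring zeroes over the wire.
                backup_file.zero(extent["start"], extent["length"])
            else:
                # Dirty data: the only case that needs a download.
                data = download_range(extent["start"], extent["length"])
                backup_file.write_at(extent["start"], data)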

Comment 1 Nir Soffer 2021-06-12 19:44:39 UTC
This is a performance enhancement. Incremental backup is correct without this,
but may be much faster with this change.

To test this change, run the existing tests for incremental backup to check
that there are no regressions.

To test the performance improvement, test this flow:

1. Create a VM with a 20g virtio-scsi disk enabled for incremental backup.
   For easier testing, do not create any file system on the disk.

2. Write 10g of data to the disk

   dd if=/dev/urandom bs=1M count=$((10 * 1024)) of=/dev/sda oflag=direct conv=fsync

3. Perform incremental backup 1
   This should create an incremental backup file of ~10g for the modified disk.

4. Zero the same areas written before in the guest

   fallocate --length 10g /dev/sda  

5. Perform incremental backup 2

Downloading incremental backup 2 should take almost no time, and the size of
the backup file should be very small (~1m).

If the backup is performed from another host, the network bandwidth used
during the backup should be minimal - almost no data is copied over the
wire during the backup.

Comparing incremental backup 1 to 2 should make the performance change
very clear.
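
To make the size comparison concrete, one way (a sketch, with placeholder
file names) to compare the two backup files is parsing
"qemu-img info --output json":

    import json
    import subprocess

    def backup_disk_size(path):
        # "actual-size" is the space the qcow2 backup file occupies on disk.
        info = subprocess.run(
            ["qemu-img", "info", "--output", "json", path],
            check=True, capture_output=True, text=True,
        ).stdout
        return json.loads(info)["actual-size"]

    # Placeholder paths - use the files created by backups 1 and 2.
    size1 = backup_disk_size("backups/disk.backup1.incremental.qcow2")
    size2 = backup_disk_size("backups/disk.backup2.incremental.qcow2")
    print(f"backup 1: {size1} bytes, backup 2: {size2} bytes")
    print(f"backup 2 is {size1 / size2:.0f}x smaller")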

Comment 4 Nir Soffer 2021-06-17 17:58:09 UTC
Here is an example flow showing how this change speeds up incremental backup,
minimizes network bandwidth, and decreases the storage needed for backups.


1. In the guest, create a 1g file:

    [root@f32 ~]# dd if=/dev/urandom bs=1M count=1024 of=big-file conv=fsync
    1024+0 records in
    1024+0 records out
    1073741824 bytes (1.1 GB, 1.0 GiB) copied, 32.1303 s, 33.4 MB/s


2. Incremental backup with the new file:

$ ./backup_vm.py -c engine-dev incremental --backup-dir backups --from-checkpoint-uuid dd9c32e3-92a1-4bea-8f31-9590bd319a22 fecfd7ba-2884-4be7-be49-af39c252e1aa
[   0.0 ] Starting incremental backup for VM 'fecfd7ba-2884-4be7-be49-af39c252e1aa'
[   0.2 ] Waiting until backup 'be40e534-99dc-44f9-b87a-8433d042a161' is ready
[  17.4 ] Created checkpoint 'ce40107e-4282-4e61-8926-847ee532bfa9' (to use in --from-checkpoint-uuid for the next incremental backup)
[  17.4 ] Creating image transfer for disk '99374276-9609-49fd-94ad-0e7da6822257'
[  18.5 ] Image transfer '54b37214-e675-4d24-a9ba-9a58b313cee4' is ready
[ 100.00% ] 6.00 GiB, 3.04 seconds, 1.98 GiB/s                                
[  21.6 ] Finalizing image transfer
[  24.6 ] Finalizing backup
[  24.7 ] Waiting until backup is being finalized
[  24.7 ] Incremental backup completed successfully

Backup file:

$ qemu-img info backups/99374276-9609-49fd-94ad-0e7da6822257.202106170431.incremental.qcow2
image: backups/99374276-9609-49fd-94ad-0e7da6822257.202106170431.incremental.qcow2
file format: qcow2
virtual size: 6 GiB (6442450944 bytes)
disk size: 1 GiB
cluster_size: 65536
backing file: 99374276-9609-49fd-94ad-0e7da6822257.202106170426.incremental.qcow2 (actual path: backups/99374276-9609-49fd-94ad-0e7da6822257.202106170426.incremental.qcow2)
backing file format: qcow2
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false


3. In the guest, delete the file and trim:

[root@f32 ~]# rm -f big-file
[root@f32 ~]# fstrim -av
/boot: 785.5 MiB (823635968 bytes) trimmed on /dev/sda2
/: 3 GiB (3151028224 bytes) trimmed on /dev/sda4


4. Incremental backup without the big file:

$ ./backup_vm.py -c engine-dev incremental --backup-dir backups --from-checkpoint-uuid ce40107e-4282-4e61-8926-847ee532bfa9 fecfd7ba-2884-4be7-be49-af39c252e1aa
[   0.0 ] Starting incremental backup for VM 'fecfd7ba-2884-4be7-be49-af39c252e1aa'
[   0.3 ] Waiting until backup '4700d3d7-51c9-4adb-9987-9b4c5f5f9d86' is ready
[  13.4 ] Created checkpoint '67a59bf2-7dda-46e6-8b76-747b7028d24a' (to use in --from-checkpoint-uuid for the next incremental backup)
[  13.5 ] Creating image transfer for disk '99374276-9609-49fd-94ad-0e7da6822257'
[  14.6 ] Image transfer '12fd144a-bccb-4b0f-84ad-344d64d37d31' is ready
[ 100.00% ] 6.00 GiB, 0.72 seconds, 8.38 GiB/s                                
[  15.3 ] Finalizing image transfer
[  17.3 ] Finalizing backup
[  17.4 ] Waiting until backup is being finalized
[  17.4 ] Incremental backup completed successfully

Backup image:

$ qemu-img info backups/99374276-9609-49fd-94ad-0e7da6822257.202106170434.incremental.qcow2 
image: backups/99374276-9609-49fd-94ad-0e7da6822257.202106170434.incremental.qcow2
file format: qcow2
virtual size: 6 GiB (6442450944 bytes)
disk size: 30.1 MiB
cluster_size: 65536
backing file: 99374276-9609-49fd-94ad-0e7da6822257.202106170431.incremental.qcow2 (actual path: backups/99374276-9609-49fd-94ad-0e7da6822257.202106170431.incremental.qcow2)
backing file format: qcow2
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false


Downloading the incremental backup was 4 times faster, and the downloaded image
was 36 times smaller.

Comment 7 sshmulev 2021-08-25 07:56:14 UTC
Verified.

Versions:
vdsm-4.40.80.5-1.el8ev.x86_64
engine-4.4.8.4-0.7.el8ev

The results of the flow:
We can see that after the second incremental backup, the download of the image is much faster.

#./backup_vm.py -c engine incremental --backup-dir backups --from-checkpoint-uuid 1a891b51-5818-4ff7-8c67-79e96c54c30c 28fef3aa-6158-47da-9fd8-ebd5be04739a
[   0.0 ] Starting incremental backup for VM '28fef3aa-6158-47da-9fd8-ebd5be04739a'
[   4.2 ] Waiting until backup 'f8d3b472-f227-4a2e-b172-96fb79735505' is ready
[  21.4 ] Created checkpoint '38e106a4-f82a-4638-b101-f5220682c8c6' (to use in --from-checkpoint-uuid for the next incremental backup)
[  21.9 ] Incremental backup not available for disk '1adc5737-bf61-4f01-b22b-85cc34012f56'
[  21.9 ] Downloading full backup for disk '1adc5737-bf61-4f01-b22b-85cc34012f56'
[  24.3 ] Image transfer 'e8bb1721-8719-4c6c-8ebc-596894500b0a' is ready
[ 100.00% ] 10.00 GiB, 34.25 seconds, 298.97 MiB/s                             
[  58.6 ] Finalizing image transfer
[  67.7 ] Download completed successfully
[  67.7 ] Downloading incremental backup for disk 'f25506d2-bd02-46e7-89b9-830e15e68ca9'
[  68.8 ] Image transfer '102b63bb-2231-4ab6-891d-c7fb65bae69a' is ready
[ 100.00% ] 10.00 GiB, 8.43 seconds, 1.19 GiB/s                                
[  77.2 ] Finalizing image transfer
[  82.3 ] Download completed successfully
[  82.3 ] Finalizing backup
[  88.6 ] Incremental backup 'f8d3b472-f227-4a2e-b172-96fb79735505' completed successfully


# qemu-img info backups/1adc5737-bf61-4f01-b22b-85cc34012f56.202108251016.full.qcow2
image: backups/1adc5737-bf61-4f01-b22b-85cc34012f56.202108251016.full.qcow2
file format: qcow2
virtual size: 10 GiB (10737418240 bytes)
disk size: 4.37 GiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false


# ./backup_vm.py -c engine incremental --backup-dir backups --from-checkpoint-uuid 38e106a4-f82a-4638-b101-f5220682c8c6 28fef3aa-6158-47da-9fd8-ebd5be04739a
[   0.0 ] Starting incremental backup for VM '28fef3aa-6158-47da-9fd8-ebd5be04739a'
[   1.4 ] Waiting until backup '8bf1647f-199a-4b42-9a22-b70e17e84186' is ready
[  29.7 ] Created checkpoint 'a1f4cd04-6f0c-4c8d-bcac-b3961cb468c1' (to use in --from-checkpoint-uuid for the next incremental backup)
[  29.9 ] Downloading incremental backup for disk '1adc5737-bf61-4f01-b22b-85cc34012f56'
[  31.0 ] Image transfer 'e5f04f16-d0b0-4a74-af86-140824058ca0' is ready
[ 100.00% ] 10.00 GiB, 0.44 seconds, 22.76 GiB/s                               
[  31.5 ] Finalizing image transfer
[  32.5 ] Download completed successfully
[  32.5 ] Downloading incremental backup for disk 'f25506d2-bd02-46e7-89b9-830e15e68ca9'
[  33.6 ] Image transfer '5ecc0465-b739-4b82-8eae-122e35cbfe61' is ready
[ 100.00% ] 10.00 GiB, 0.18 seconds, 55.46 GiB/s                               
[  33.8 ] Finalizing image transfer
[  35.8 ] Download completed successfully
[  35.8 ] Finalizing backup
[  43.0 ] Incremental backup '8bf1647f-199a-4b42-9a22-b70e17e84186' completed successfully


# qemu-img info backups/1adc5737-bf61-4f01-b22b-85cc34012f56.202108251020.incremental.qcow2
image: backups/1adc5737-bf61-4f01-b22b-85cc34012f56.202108251020.incremental.qcow2
file format: qcow2
virtual size: 10 GiB (10737418240 bytes)
disk size: 960 KiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false