Bug 1908647 - [incremental_backup] Creating push mode backups (with checkpoints) fails and leaves the qcow2 block-dirty-bitmaps messed up
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 8.4
Assignee: Peter Krempa
QA Contact: yisun
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-12-17 09:20 UTC by yisun
Modified: 2021-05-25 06:47 UTC (History)

Fixed In Version: libvirt-7.0.0-1.el8
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-25 06:46:30 UTC
Type: Bug
Target Upstream Version: 7.0.0
Embargoed:



Description yisun 2020-12-17 09:20:18 UTC
Description:
Creating push mode backups (with checkpoints) fails and leaves the qcow2 block-dirty-bitmaps messed up.

Versions:
qemu-kvm-5.2.0-2.module+el8.4.0+9186+ec44380f.x86_64
libvirt-6.10.0-1.module+el8.4.0+8898+a84e86e1.x86_64
	
Reproduce:
100%

Steps:
1. prepare a running vm with disk vdb pointing to a qcow2 image
(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# qemu-img create -f qcow2 /var/lib/libvirt/images/vdb.qcow2 1G

(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# virsh start avocado-vt-vm1
Domain avocado-vt-vm1 started

(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# virsh domblklist avocado-vt-vm1
 Target   Source
------------------------------------------------------------------------
 vda      /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2
 vdb      /var/lib/libvirt/images/vdb.qcow2

2. prepare PUSH MODE backup and checkpoint xml files for 3 rounds, as follows:
(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# cat bk0.xml 
<domainbackup mode="push"><disks><disk backup="no" name="vda" /><disk backup="yes" name="vdb" type="file"><target file="/tmp/target_file_0" /><driver type="qcow2" /></disk></disks></domainbackup>

(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# cat ck0.xml 
<domaincheckpoint><disks><disk checkpoint="no" name="vda" /><disk checkpoint="bitmap" name="vdb" /></disks><name>checkpoint_0</name><description>desc of cp_0</description></domaincheckpoint>

(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# cat bk1.xml 
<domainbackup mode="push"><incremental>checkpoint_0</incremental><disks><disk backup="no" name="vda" /><disk backup="yes" name="vdb" type="file"><target file="/tmp/target_file_1" /><driver type="qcow2" /></disk></disks></domainbackup>

(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# cat ck1.xml 
<domaincheckpoint><disks><disk checkpoint="no" name="vda" /><disk checkpoint="bitmap" name="vdb" /></disks><name>checkpoint_1</name><description>desc of cp_1</description></domaincheckpoint>

(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# cat bk2.xml 
<domainbackup mode="push"><incremental>checkpoint_1</incremental><disks><disk backup="no" name="vda" /><disk backup="yes" name="vdb" type="file"><target file="/tmp/target_file_2" /><driver type="qcow2" /></disk></disks></domainbackup>

(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# cat ck2.xml 
<domaincheckpoint><disks><disk checkpoint="no" name="vda" /><disk checkpoint="bitmap" name="vdb" /></disks><name>checkpoint_2</name><description>desc of cp_2</description></domaincheckpoint>
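
The six files above follow one pattern, so they can also be generated. The following is a minimal sketch, not part of the original report: the function names (`build_backup_xml`, `build_checkpoint_xml`, `round_documents`) are hypothetical helpers, and the element names, attributes, and target paths are copied from the files shown in step 2.

```python
import xml.etree.ElementTree as ET


def build_backup_xml(index, incremental=None, disk="vdb", skip_disk="vda"):
    """Return push-mode <domainbackup> XML for backup round `index`."""
    root = ET.Element("domainbackup", mode="push")
    if incremental is not None:
        # Incremental rounds name the checkpoint they start from.
        ET.SubElement(root, "incremental").text = incremental
    disks = ET.SubElement(root, "disks")
    ET.SubElement(disks, "disk", backup="no", name=skip_disk)
    target_disk = ET.SubElement(disks, "disk", backup="yes", name=disk, type="file")
    ET.SubElement(target_disk, "target", file=f"/tmp/target_file_{index}")
    ET.SubElement(target_disk, "driver", type="qcow2")
    return ET.tostring(root, encoding="unicode")


def build_checkpoint_xml(index, disk="vdb", skip_disk="vda"):
    """Return <domaincheckpoint> XML creating checkpoint_<index> on `disk`."""
    root = ET.Element("domaincheckpoint")
    disks = ET.SubElement(root, "disks")
    ET.SubElement(disks, "disk", checkpoint="no", name=skip_disk)
    ET.SubElement(disks, "disk", checkpoint="bitmap", name=disk)
    ET.SubElement(root, "name").text = f"checkpoint_{index}"
    ET.SubElement(root, "description").text = f"desc of cp_{index}"
    return ET.tostring(root, encoding="unicode")


def round_documents(rounds=3):
    """Map file names (bk0.xml, ck0.xml, ...) to their XML text."""
    docs = {}
    for index in range(rounds):
        # Round 0 is the full backup; later rounds chain to the previous checkpoint.
        previous = f"checkpoint_{index - 1}" if index > 0 else None
        docs[f"bk{index}.xml"] = build_backup_xml(index, incremental=previous)
        docs[f"ck{index}.xml"] = build_checkpoint_xml(index)
    return docs
```

Writing each entry of `round_documents()` to a file of the same name should give inputs equivalent to the ones used in the steps below.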

3. start the full backup
(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# virsh backup-begin avocado-vt-vm1 bk0.xml ck0.xml 
Backup started

(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# virsh domjobinfo avocado-vt-vm1 --completed
Job type:         Completed   
Operation:        Backup      
Time elapsed:     207          ms
File processed:   1.000 GiB
File remaining:   0.000 B
File total:       1.000 GiB

4. start the first incremental backup
(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# virsh backup-begin avocado-vt-vm1 bk1.xml ck1.xml 
Backup started

(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# virsh domjobinfo avocado-vt-vm1 --completed
Job type:         Completed   
Operation:        Backup      
Time elapsed:     76           ms

5. start the second incremental backup, FAILED HERE
(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# virsh backup-begin avocado-vt-vm1 bk2.xml ck2.xml 
error: internal error: unable to execute QEMU command 'transaction': Bitmap already exists: backup-vdb
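
Steps 3 to 5 can be scripted for repeated runs. This is only a sketch under assumptions: `round_commands` and `run_reproducer` are hypothetical helper names, the virsh invocations are the ones used above, and the script does not wait for a backup job to finish before starting the next round (with the empty 1G image used here the jobs completed within milliseconds, which may not hold for larger disks).

```python
import subprocess


def round_commands(domain, rounds=3):
    """Build the virsh argv lists for `rounds` backup rounds (bk<N>.xml / ck<N>.xml)."""
    commands = []
    for index in range(rounds):
        commands.append(
            ["virsh", "backup-begin", domain, f"bk{index}.xml", f"ck{index}.xml"]
        )
        commands.append(["virsh", "domjobinfo", domain, "--completed"])
    return commands


def run_reproducer(domain, rounds=3):
    """Run all rounds in order; raises CalledProcessError on the first failure."""
    for argv in round_commands(domain, rounds):
        subprocess.run(argv, check=True)
```

On an affected build, `run_reproducer("avocado-vt-vm1")` would be expected to raise at the third `backup-begin`, matching step 5.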

6. check the bitmaps in qcow2 image
(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# qemu-img info /var/lib/libvirt/images/vdb.qcow2
image: /var/lib/libvirt/images/vdb.qcow2
file format: qcow2
virtual size: 1 GiB (1073741824 bytes)
disk size: 324 KiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    bitmaps:
        [0]:
            flags:
                [0]: auto
            name: checkpoint_1
            granularity: 65536
    refcount bits: 16
    corrupt: false
    extended l2: false
<===== only checkpoint_1 is present; checkpoint_0 is missing (lost?) and checkpoint_2 was never created (expected, since step 5 failed)
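
The bitmap check in step 6 can be automated instead of reading the output by eye. A minimal sketch, assuming the JSON output of `qemu-img info --output=json`, where qcow2 bitmaps are listed under `format-specific` / `data` / `bitmaps`. The helper names are hypothetical, and the optional `-U` (force share) flag is included only as an assumption for the case where the image is still opened by a running guest.

```python
import json
import subprocess


def qemu_img_info(image_path, force_share=False):
    """Return the parsed JSON output of `qemu-img info` for `image_path`."""
    argv = ["qemu-img", "info", "--output=json"]
    if force_share:
        # Assumption: -U is needed only if another process holds the image open.
        argv.append("-U")
    argv.append(image_path)
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return json.loads(result.stdout)


def bitmap_names(info):
    """List the bitmap names recorded in a parsed `qemu-img info` document."""
    data = info.get("format-specific", {}).get("data", {})
    return [bitmap["name"] for bitmap in data.get("bitmaps", [])]


def missing_checkpoints(info, expected):
    """Return the expected checkpoint names that have no bitmap in the image."""
    present = set(bitmap_names(info))
    return [name for name in expected if name not in present]
```

For the image above, `missing_checkpoints(info, ["checkpoint_0", "checkpoint_1", "checkpoint_2"])` would report `checkpoint_0` and `checkpoint_2` as missing.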

Expected result:
All block-dirty-bitmaps should be kept unless we delete them, and the incremental backup should succeed.

Actual result:
The second incremental backup failed, and block-dirty-bitmap checkpoint_0 was lost.

Additional info:
Tried the same steps with pull-mode backup xml files, and everything seems ok. The qcow2 file looks like the following after the backups:
(.libvirt-ci-venv-ci-runtest-LN8jxF) [root@dell-per740xd-13 ~]# qemu-img info /var/lib/libvirt/images/vdb.qcow2
image: /var/lib/libvirt/images/vdb.qcow2
file format: qcow2
virtual size: 1 GiB (1073741824 bytes)
disk size: 452 KiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    bitmaps:
        [0]:
            flags:
                [0]: auto
            name: checkpoint_2
            granularity: 65536
        [1]:
            flags:
                [0]: auto
            name: checkpoint_1
            granularity: 65536
        [2]:
            flags:
                [0]: auto
            name: checkpoint_0
            granularity: 65536
    refcount bits: 16
    corrupt: false
    extended l2: false

Comment 2 Peter Krempa 2021-01-06 08:30:55 UTC
Fixed upstream:

commit 6ac2327060ffc0837584bbfc2d4955fd5221c557
Author: Peter Krempa <pkrempa>
Date:   Tue Jan 5 15:43:21 2021 +0100

    qemu: backup: Properly delete temporary bitmap after push-mode incremental backup
    
    Refactor in 0316c28a453ac used incorrect source variable to initialize
    the variable which holds the name of the bitmap which needs to be
    deleted after the backup job finishes. This resulted into deleting the
    source bitmap of the backup rather than the temporary one.
    
    Use 'dd->incrementalBitmap' which holds the temporary bitmap name
    instead of 'dd->backupdisk->incremental' which holds the name of the
    source bitmap which is used by the backup.
    
    Fixes: 0316c28a453ac15f58c61f30359f66ab9a649884

v6.10.0-308-g6ac2327060
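
To illustrate why the wrong variable produces exactly the symptoms in the description, here is a toy model of the cleanup logic. It is not libvirt code: `push_backup` and `run_rounds` are hypothetical, the bitmaps are modelled as a plain set of names, and the temporary bitmap name `backup-vdb` is taken from the error message in step 5. The `buggy` flag stands in for using the source bitmap name instead of the temporary one.

```python
class BitmapExists(Exception):
    """Raised when the temporary bitmap name is already taken."""


def push_backup(bitmaps, disk, checkpoint, incremental=None, buggy=False):
    """Toy model of one push-mode backup round on a set of bitmap names."""
    temporary = None
    if incremental is not None:
        # Incremental rounds need a temporary bitmap for the backup job.
        temporary = f"backup-{disk}"
        if temporary in bitmaps:
            raise BitmapExists(f"Bitmap already exists: {temporary}")
        bitmaps.add(temporary)
    bitmaps.add(checkpoint)
    # Cleanup after the job finishes: the fix deletes the temporary bitmap,
    # the buggy variant deletes the source (incremental) bitmap instead.
    if temporary is not None:
        bitmaps.discard(incremental if buggy else temporary)
    return bitmaps


def run_rounds(buggy):
    """Replay the full backup plus two incremental rounds from the report."""
    bitmaps = set()
    push_backup(bitmaps, "vdb", "checkpoint_0", buggy=buggy)
    push_backup(bitmaps, "vdb", "checkpoint_1", incremental="checkpoint_0", buggy=buggy)
    push_backup(bitmaps, "vdb", "checkpoint_2", incremental="checkpoint_1", buggy=buggy)
    return bitmaps
```

In this model the buggy variant loses checkpoint_0 after the first incremental round and leaves `backup-vdb` behind, so the second incremental round fails with "Bitmap already exists: backup-vdb". The fixed variant ends with all three checkpoints.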

Comment 3 yisun 2021-01-08 03:57:43 UTC
Preverified with upstream libvirt-6.10.0-1.fc34.x86_64

➜  fedora virsh start pc
Domain 'pc' started

➜  fedora virsh domblklist pc
 Target   Source
---------------------------------------------
 vda      /var/lib/libvirt/images/pc.qcow2
 vdb      /var/lib/libvirt/images/vdb.qcow2


➜  fedora cat bk0.xml bk1.xml bk2.xml ck0.xml ck2.xml ck2.xml 
<domainbackup mode="push"><disks><disk backup="no" name="vda" /><disk backup="yes" name="vdb" type="file"><target file="/tmp/target_file_0" /><driver type="qcow2" /></disk></disks></domainbackup>
<domainbackup mode="push"><incremental>checkpoint_0</incremental><disks><disk backup="no" name="vda" /><disk backup="yes" name="vdb" type="file"><target file="/tmp/target_file_1" /><driver type="qcow2" /></disk></disks></domainbackup>
<domainbackup mode="push"><incremental>checkpoint_1</incremental><disks><disk backup="no" name="vda" /><disk backup="yes" name="vdb" type="file"><target file="/tmp/target_file_2" /><driver type="qcow2" /></disk></disks></domainbackup>
<domaincheckpoint><disks><disk checkpoint="no" name="vda" /><disk checkpoint="bitmap" name="vdb" /></disks><name>checkpoint_0</name><description>desc of cp_0</description></domaincheckpoint>
<domaincheckpoint><disks><disk checkpoint="no" name="vda" /><disk checkpoint="bitmap" name="vdb" /></disks><name>checkpoint_2</name><description>desc of cp_2</description></domaincheckpoint>
<domaincheckpoint><disks><disk checkpoint="no" name="vda" /><disk checkpoint="bitmap" name="vdb" /></disks><name>checkpoint_2</name><description>desc of cp_2</description></domaincheckpoint>


➜  fedora virsh domjobinfo pc --completed
Job type:         Completed   
Operation:        Backup      
Time elapsed:     808          ms
File processed:   1.000 GiB
File remaining:   0.000 B
File total:       1.000 GiB

➜  fedora virsh backup-begin pc bk1.xml ck1.xml
Backup started

➜  fedora virsh domjobinfo pc --completed      
Job type:         Completed   
Operation:        Backup      
Time elapsed:     165          ms

➜  fedora virsh backup-begin pc bk2.xml ck2.xml
Backup started

➜  fedora virsh domjobinfo pc --completed      
Job type:         Completed   
Operation:        Backup      
Time elapsed:     177          ms

➜  fedora virsh destroy pc
Domain 'pc' destroyed

➜  fedora qemu-img info /var/lib/libvirt/images/vdb.qcow2 
image: /var/lib/libvirt/images/vdb.qcow2
file format: qcow2
virtual size: 1 GiB (1073741824 bytes)
disk size: 216 KiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    bitmaps:
        [0]:
            flags:
                [0]: auto
            name: checkpoint_2
            granularity: 65536
        [1]:
            flags:
                [0]: auto
            name: checkpoint_1
            granularity: 65536
        [2]:
            flags:
                [0]: auto
            name: checkpoint_0
            granularity: 65536
    refcount bits: 16
    corrupt: false
    extended l2: false

Comment 6 yisun 2021-01-17 07:46:41 UTC
Auto test result is PASS
Version: libvirt-7.0.0-1.module+el8.4.0+9464+3e71831a.x86_64
Case: rhel.incremental_backup.push_mode.original_disk_local.coldplug_disk.backup_to_qcow2.backup_to_file.reuse_target_file.positive_test

Comment 10 errata-xmlrpc 2021-05-25 06:46:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2098

