Description of problem: Currently, when having a checkpoint that needs to be defined for a VM with a backup, the checkpoint should be defined by giving the <domaincheckpoint> XML and the entire VM <domain> XML. Supplying the VM domain XML is redundant. In RHV, it forces us to persists the XML that was created when the backup was taken in the RHV engine database and use it when the checkpoint should be redefined. Moreover, RHV may encounter some errors during the backup (communication issue etc..) and might lose the returned checkpoint and domain XML so we added a recovery mechanism to fetch the entire XML. If Libvirt will provide a way to redefine the checkpoint without the domain XML, it will facilitate the process and reduce many error-prone stages. The call for defining a checkpoint when backup is started using Libvirt python SDK is currently - dom.backupBegin(backup_xml, checkpoint_xml, flags=flags) The call for defining a checkpoint after a backup was taken using Libvirt python SDK is currently - dom.checkpointCreateXML(checkpoint_xml, flags) The solution might be to provide a flag for backupBegin and checkpointCreateXML that will indicate whether the domain XML is needed or not (similar to virDomainCheckpointGetXMLDesc with VIR_DOMAIN_CHECKPOINT_XML_NO_DOMAIN flag) Version-Release number of selected component (if applicable): 6.6.0-6.module+el8.3.0+8125+aefcf088.x86_64 How reproducible: 100% Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info:
Upstream: commit abf12f071b17d8be51ce5fe368175f2f5612eafe Author: Peter Krempa <pkrempa> Date: Wed Dec 2 14:18:40 2020 +0100 conf: checkpoint: Don't require <domain> when redefining checkpoints The domain definition stored with a checkpoint isn't used currently apart from matching disks when creating a new checkpoints. As some users of the incremental backup API want to provide backups in offline mode under their control (obviously while compying with our documentation on how the on-disk state should be handled) and then want to define the checkpoint for live use, supplying a <domain> sub-element is overly complex and not actually needed by the code. Relax the restriction when re-defining a checkpoint so that <domain> is not necessary and add (alibistic) documentation saying that future actions may not work if it's missing. Signed-off-by: Peter Krempa <pkrempa> Reviewed-by: Daniel Henrique Barboza <danielhb413> commit 392eacfeb1c924a7532955d8866677d905fa74fa Author: Peter Krempa <pkrempa> Date: Wed Dec 2 14:13:17 2020 +0100 conf: checkpoint: Prepare internals for missing domain definition Conditionalize code which assumes that the domain definition stored in the checkpoint is present. Signed-off-by: Peter Krempa <pkrempa> Reviewed-by: Daniel Henrique Barboza <danielhb413> commit 9fd8ba3b2d681e5d0c58fc68f99224dbf0a5f810 Author: Peter Krempa <pkrempa> Date: Wed Dec 2 13:35:29 2020 +0100 virDomainCheckpointRedefineCommit: Don't check ABI of definition in checkpoint Checking the definition ABI when redefining checkpoints doesn't make much sense for the following reasons: * the domain definition in the checkpoint is mostly unused (a relic adopted from the snapshot code) * can be very easily overridden by deleting the checkpoint metadata before redefinition Rather than complicating the logic when we'll be taking into account that the domain definition may be missing, let's just remove the check. Signed-off-by: Peter Krempa <pkrempa> Reviewed-by: Daniel Henrique Barboza <danielhb413> commit 9a58f1a53c8d94cf867fd3a76515d222750d5d77 Author: Peter Krempa <pkrempa> Date: Wed Dec 2 14:33:21 2020 +0100 virDomainCheckpointDefParse: Use 'unsigned int' for flags Fix the type for a variable holding flags to the usual 'unsigned int' and change the name to be more appropriate to its use. Signed-off-by: Peter Krempa <pkrempa> Reviewed-by: Daniel Henrique Barboza <danielhb413> commit d1fd4a37555123fbdf9ebb77f3a004320d75b1a3 Author: Peter Krempa <pkrempa> Date: Wed Dec 2 14:29:30 2020 +0100 virDomainCheckpointDefParse: Don't extract unused domain type We can extract './domain' directly and let the parser deal with the type. Signed-off-by: Peter Krempa <pkrempa> Reviewed-by: Daniel Henrique Barboza <danielhb413>
One more upstream commit is related to this bug: commit f40a72a32e47c248d5002a765913fa8f547960de Author: Peter Krempa <pkrempa> Date: Tue Dec 8 16:16:08 2020 +0100 qemuDomainCheckpointLoad: Don't align disks when restoring config from disk The alignment step is not really necessary once we've done it already since we fully populate the definition. In case of checkpoints it was a relic necessary for populating the 'idx' to match checkpoint disk to definition disk, but that was already removed.
Verified with: libvirt-6.6.0-10.module+el8.3.1+9163+37dc0a58.x86_64 Steps: 1. prepare a vm with vdb pointing to a qcow2 image [root@dell-per740xd-12 ~]# qemu-img create -f qcow2 /var/lib/libvirt/images/vdb.qcow2 1G Formatting '/var/lib/libvirt/images/vdb.qcow2', fmt=qcow2 cluster_size=65536 compression_type=zlib size=1073741824 lazy_refcounts=off refcount_bits=16 [root@dell-per740xd-12 ~]# virsh domblklist vm1 Target Source -------------------------------------------------------- vda /var/lib/libvirt/images/jeos-27-x86_64.qcow2 vdb /var/lib/libvirt/images/vdb.qcow2 [root@dell-per740xd-12 ~]# virsh start vm1 Domain vm1 started 2. create a checkpoint for vdb, named cp0 [root@dell-per740xd-12 ~]# cat checkpoint_0.xml <domaincheckpoint> <disks> <disk checkpoint="no" name="vda" /> <disk checkpoint="bitmap" name="vdb" /> </disks> <name>cp0</name> <description>cp0</description> </domaincheckpoint> [root@dell-per740xd-12 ~]# virsh checkpoint-create vm1 checkpoint_0.xml Domain checkpoint cp0 created from 'checkpoint_0.xml' 3. dumpxml the checkponit cp0 without domain info, and save it in test_cp.xml [root@dell-per740xd-12 ~]# virsh checkpoint-dumpxml vm1 cp0 --no-domain <domaincheckpoint> <name>cp0</name> <description>cp0</description> <creationTime>1608105517</creationTime> <disks> <disk name='vda' checkpoint='no'/> <disk name='vdb' checkpoint='bitmap' bitmap='cp0'/> </disks> </domaincheckpoint> [root@dell-per740xd-12 ~]# cat test_cp.xml <domaincheckpoint> <name>cp0</name> <description>cp0</description> <creationTime>1608105517</creationTime> <disks> <disk name='vda' checkpoint='no'/> <disk name='vdb' checkpoint='bitmap' bitmap='cp0'/> </disks> </domaincheckpoint> 4. delete the metadata of the checkpoint cp0 [root@dell-per740xd-12 ~]# virsh checkpoint-delete vm1 cp0 --metadata Domain checkpoint cp0 deleted [root@dell-per740xd-12 ~]# virsh checkpoint-list vm1 Name Creation Time ----------------------- 5. try to create checkpoint with test_cp.xml, failed as expected [root@dell-per740xd-12 ~]# virsh checkpoint-create vm1 test_cp.xml error: internal error: unable to execute QEMU command 'transaction': Bitmap already exists: cp0 6. try to create the checkpoint with test_cp.xml, with --redefine param [root@dell-per740xd-12 ~]# virsh checkpoint-create vm1 test_cp.xml --redefine Domain checkpoint cp0 created from 'test_cp.xml' 7. check the checkpoint metadata is correct [root@dell-per740xd-12 ~]# virsh checkpoint-list vm1 Name Creation Time ----------------------------------- cp0 2020-12-16 02:58:37 -0500 [root@dell-per740xd-12 ~]# virsh checkpoint-dumpxml vm1 cp0 --no-domain <domaincheckpoint> <name>cp0</name> <description>cp0</description> <creationTime>1608105517</creationTime> <disks> <disk name='vda' checkpoint='no'/> <disk name='vdb' checkpoint='bitmap' bitmap='cp0'/> </disks> </domaincheckpoint> 8. check the qcow2 image's bitmap cp0 still exists [root@dell-per740xd-12 ~]# virsh destroy vm1 Domain vm1 destroyed [root@dell-per740xd-12 ~]# virsh start vm1 qemDomain vm1 started [root@dell-per740xd-12 ~]# qemu-img info /var/lib/libvirt/images/vdb.qcow2 -U image: /var/lib/libvirt/images/vdb.qcow2 file format: qcow2 virtual size: 1 GiB (1073741824 bytes) disk size: 328 KiB cluster_size: 65536 Format specific information: compat: 1.1 compression type: zlib lazy refcounts: false bitmaps: [0]: flags: [0]: in-use [1]: auto name: cp0 granularity: 65536 refcount bits: 16 corrupt: false 9. try to delete the checkpoint cp0 without --metadata, [root@dell-per740xd-12 ~]# virsh checkpoint-delete vm1 cp0 Domain checkpoint cp0 deleted 10. check both of the bitmap cp0 in qcow2 image and metadata of libvirt are removed [root@dell-per740xd-12 ~]# qemu-img info /var/lib/libvirt/images/vdb.qcow2 -U image: /var/lib/libvirt/images/vdb.qcow2 file format: qcow2 virtual size: 1 GiB (1073741824 bytes) disk size: 456 KiB cluster_size: 65536 Format specific information: compat: 1.1 compression type: zlib lazy refcounts: false refcount bits: 16 corrupt: false [root@dell-per740xd-12 ~]# virsh checkpoint-list vm1 Name Creation Time -----------------------
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:8.3 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:0639