Description of problem:
Libvirt was recently fixed to prevent undefine of a domain with managed save metadata (bug 697742). However, this only covers part of the metadata; there is also a problem with snapshots.

Version-Release number of selected component (if applicable):
libvirt-0.9.4-7.el6

How reproducible:
I didn't actually test the scenario to see how things fail, but suspect (based on my work in the code to fix the issue, and the similarity to managed save) that the problem will show up as bogus log messages or even wholesale libvirt corruption as it tries to use the stale metadata.

Steps to Reproduce:
1. define a persistent domain with qcow2 disks (doesn't even have to run)
2. 'virsh snapshot-create dom'
3. 'virsh undefine dom'
4. ls /var/lib/libvirt/qemu/snapshot/dom
5. define a new domain with the same name, but different UUID
6. 'virsh snapshot-list dom'

Actual results:
Step 3 succeeded, but step 4 demonstrated that it stranded /var/lib/libvirt/qemu/snapshot/dom/*.xml. Steps 5 and 6 can then get confused by the stale files. It may take other steps, such as an attempt to create a new snapshot by the same name as the stale ones, before the confusion causes visible problems, but it is certainly risky.

Expected results:
Step 3 should be forbidden until the snapshot metadata is removed first, and virsh should be enhanced to make this easier to do ('virsh undefine --snapshots-metadata dom'). Once a domain is no longer present in 'virsh list --all', there should not be any /var/lib/libvirt/qemu/snapshot/dom directory for that domain.

Additional info:
Upstream patch series fixes this and more:
https://www.redhat.com/archives/libvir-list/2011-September/msg00137.html
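The stranded-metadata condition from steps 3 and 4 can be checked mechanically. The following is only a rough sketch, not part of the patch series: the function name find_stale_snapshot_dirs and the "known domains" file (one domain name per line, however you choose to produce it) are hypothetical, and the only thing taken from this report is the /var/lib/libvirt/qemu/snapshot/<domain> layout.

```shell
#!/bin/sh
# Hypothetical helper (not from libvirt): print the names of per-domain
# snapshot metadata directories that have no matching known domain.
#   $1: snapshot metadata root, e.g. /var/lib/libvirt/qemu/snapshot
#   $2: file listing known domain names, one per line (assumed input)
find_stale_snapshot_dirs() {
    root=$1
    known=$2
    for dir in "$root"/*/; do
        # When the root is empty the glob stays unexpanded; skip it.
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        # -x: match the whole line, -F: fixed string, -q: no output
        if ! grep -qxF -- "$name" "$known"; then
            printf '%s\n' "$name"
        fi
    done
}
```

Any name it prints corresponds to a directory that outlived its domain, which is exactly the state that step 4 exposes.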
Getting this fixed is a prereq to bug 638510 support for live snapshots via the snapshot_blkdev qemu monitor command.
Since this patch proposes blocking undefine only if metadata is present, it also becomes important to identify when metadata is present, as well as to delete metadata without affecting snapshot contents. Additionally, it becomes important to be able to redefine metadata to the state that it was in before deletion, so that snapshot hierarchy can be preserved across transient domain restart or migration between machines. I'm lumping all of those fixes into this bug.
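To make the save/delete/redefine cycle concrete, here is a rough sketch of how it might be driven from a script. The function names are hypothetical. The virsh subcommands and flags used (snapshot-dumpxml, snapshot-delete --metadata, snapshot-create --redefine) are the ones I understand upstream virsh to provide; whether a given build exposes all of them is an assumption you should check against its man page.

```shell
#!/bin/sh
# Hypothetical helpers; they only wrap virsh and add no logic of their own.

# Save the XML of each named snapshot, then drop only libvirt's metadata
# for it (the snapshot contents in the qcow2 image are left alone).
#   $1: domain   $2: output directory   $3...: snapshot names
save_and_drop_snapshot_metadata() {
    dom=$1
    outdir=$2
    shift 2
    mkdir -p "$outdir" || return 1
    for snap in "$@"; do
        virsh snapshot-dumpxml "$dom" "$snap" > "$outdir/$snap.xml" || return 1
        virsh snapshot-delete "$dom" "$snap" --metadata || return 1
    done
}

# Recreate the metadata from the saved XML files.
#   $1: domain   $2: directory written by the function above
restore_snapshot_metadata() {
    dom=$1
    outdir=$2
    for xml in "$outdir"/*.xml; do
        # Skip the unexpanded glob when nothing was saved.
        [ -f "$xml" ] || continue
        virsh snapshot-create "$dom" "$xml" --redefine || return 1
    done
}
```

One caveat: the restore loop processes files in glob order. If a domain has a snapshot hierarchy, parents probably need to be redefined before their children, and this sketch does not sort for that.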
Upstream series ending in this commit:

commit e2fb96d92b4b986a2b5732416f7bfd302a848970
Author: Eric Blake <eblake>
Date:   Fri Aug 12 13:23:09 2011 -0600

    snapshot: prevent migration from stranding snapshot data

    Migration is another case of stranding metadata.  And since
    snapshot metadata is arbitrarily large, there's no way to
    shoehorn it into the migration cookie of migration v3.
In POST: http://post-office.corp.redhat.com/archives/rhvirt-patches/2011-September/msg00126.html
Two additional patches make it so that 'virsh snapshot-create dom --no-metadata' will print out the just-generated snapshot name rather than failing (however, directly using the first of these two patches would be an incompatible API change, so it can't be back-ported as-is): https://www.redhat.com/archives/libvir-list/2011-September/msg00390.html
I posted the followup patches for the --no-metadata improvement, although I'm not yet sure whether they belong to this BZ or a new one: http://post-office.corp.redhat.com/archives/rhvirt-patches/2011-September/msg00307.html
Hmm, I realized that -12 fails to remove snapshot metadata for transient domains as documented, since my patches to date focused only on persistent domains. I'm not sure whether to split that into another BZ or move this one back to ASSIGNED.
Pulling back to ASSIGNED while waiting for three more upstream patches to be approved: https://www.redhat.com/archives/libvir-list/2011-September/msg00860.html
In POST: http://post-office.corp.redhat.com/archives/rhvirt-patches/2011-September/msg00768.html
Verified with libvirt-0.9.4-13.el6.x86_64:

1. Define a persistent domain named "snap" with qcow2 disks.

2. Create a snapshot for domain snap:
# virsh snapshot-create snap

3. List the snapshots:
virsh # snapshot-list snap
 Name                 Creation Time             State
------------------------------------------------------------
 1317020538           2011-09-26 15:02:18 +0800 shutoff

4. Try a plain undefine:
virsh # undefine snap
error: Failed to undefine domain snap
error: Requested operation is not valid: cannot delete inactive domain with 1 snapshots

5. Undefine together with the snapshot metadata:
# virsh undefine --snapshots-metadata snap
Domain snap has been undefined

6. Check the snapshot metadata (no metadata remains):
# ls /var/lib/libvirt/qemu/snapshot/snap

7. Define a new domain with the same name, but different UUID.

8. Check the snapshots for domain snap:
# virsh snapshot-list snap
 Name                 Creation Time             State
---------------------------------------------------------
This patch series introduced a typo in the user-visible error message: http://post-office.corp.redhat.com/archives/rhvirt-patches/2011-October/msg00233.html
These issues have been resolved on RHEL 6 beta (2.6.32-193.el6.x86_64) with libvirt-0.9.4-16.el6.x86_64, so I am moving the bug to VERIFIED status. Details follow:

# qemu-img create -f qcow2 /var/lib/libvirt/images/foo.img 10M
Formatting '/var/lib/libvirt/images/foo.img', fmt=qcow2 size=10485760 encryption=off cluster_size=65536

# qemu-img info /var/lib/libvirt/images/foo.img
image: /var/lib/libvirt/images/foo.img
file format: qcow2
virtual size: 10M (10485760 bytes)
disk size: 140K
cluster_size: 65536

$ cat > /root/demo.xml <<EOF
<domain type='qemu'>
  <name>demo</name>
  <memory>219200</memory>
  <vcpu>1</vcpu>
  <os>
    <type arch='x86_64'>hvm</type>
    <boot dev='cdrom'/>
  </os>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/foo.img'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <input type='mouse' bus='ps2'/>
    <graphics type='spice' autoport='yes' listen='0.0.0.0'/>
  </devices>
</domain>
EOF

# virsh define /root/demo.xml
Domain demo defined from /root/demo.xml

# virsh snapshot-list demo
 Name                 Creation Time             State
------------------------------------------------------------

# cat > /root/snap.xml <<EOF
<domainsnapshot>
  <state>shutoff</state>
</domainsnapshot>
EOF

# virsh snapshot-create demo /root/snap.xml
Domain snapshot 1318054922 created from '/root/snap.xml'

# virsh snapshot-list demo
 Name                 Creation Time             State
------------------------------------------------------------
 1318054922           2011-10-08 14:22:02 +0800 shutoff

# virsh snapshot-list demo --parent
 Name                 Creation Time             State           Parent
------------------------------------------------------------
 1318054922           2011-10-08 14:22:02 +0800 shutoff

# virsh snapshot-list demo --roots
 Name                 Creation Time             State
------------------------------------------------------------
 1318054922           2011-10-08 14:22:02 +0800 shutoff

# virsh snapshot-list demo --parent --roots
error: --parent and --roots are mutually exclusive

Note: the typo in the word 'exclusive' is no longer present.

# ls /var/lib/libvirt/qemu/snapshot/demo/1318054922.xml
/var/lib/libvirt/qemu/snapshot/demo/1318054922.xml

# virsh undefine demo
error: Failed to undefine domain demo
error: Requested operation is not valid: cannot delete inactive domain with 1 snapshots

Note: this is the expected behaviour.

# ls /var/lib/libvirt/qemu/snapshot/demo/1318054922.xml
/var/lib/libvirt/qemu/snapshot/demo/1318054922.xml

Note: the snapshot metadata still exists, which is the expected result.

# virsh undefine --snapshots-metadata demo
Domain demo has been undefined

# ls /var/lib/libvirt/qemu/snapshot/demo/1318054922.xml
ls: cannot access /var/lib/libvirt/qemu/snapshot/demo/1318054922.xml: No such file or directory

Note: everything is okay now.
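For repeated verification runs, the manual checks above could be wrapped in a small script. This is only a sketch under stated assumptions: the function name verify_undefine_cleans_metadata is hypothetical, the domain is assumed to be inactive with at least one snapshot, and the default metadata root is the path used throughout this report. The only virsh invocations are the two undefine forms shown in the transcript.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual verification steps.
#   $1: domain name (inactive, with at least one snapshot)
#   $2: snapshot metadata root (optional; defaults to the path in this bug)
verify_undefine_cleans_metadata() {
    dom=$1
    snapdir=${2:-/var/lib/libvirt/qemu/snapshot}/$dom

    # A plain undefine must be refused while snapshot metadata exists.
    if virsh undefine "$dom" >/dev/null 2>&1; then
        echo "FAIL: plain undefine succeeded despite snapshot metadata"
        return 1
    fi

    # Undefine together with the snapshot metadata must succeed.
    if ! virsh undefine --snapshots-metadata "$dom" >/dev/null; then
        echo "FAIL: undefine --snapshots-metadata was refused"
        return 1
    fi

    # No snapshot XML may be left behind for the domain.
    for f in "$snapdir"/*.xml; do
        if [ -e "$f" ]; then
            echo "FAIL: stale snapshot metadata remains: $f"
            return 1
        fi
    done
    echo "PASS"
}
```

Calling it as 'verify_undefine_cleans_metadata demo' after the snapshot-create step would stand in for the undefine and ls commands above. Note that it really does undefine the domain, so it is meant for throwaway test domains only.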
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2011-1513.html