Bug 733143
Summary: | qemu: numeric snapshot names can lead to inadvertently deleting the wrong snapshot | ||
---|---|---|---|
Product: | [Community] Virtualization Tools | Reporter: | Eric Blake <eblake> |
Component: | libvirt | Assignee: | Libvirt Maintainers <libvirt-maint> |
Status: | CLOSED WONTFIX | QA Contact: | |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | unspecified | CC: | berrange, crobinso, cwei, ddumas, dyuan, eblake, fsimonce, jhradile, mzhan, shawn.pagan, vlastimil.holer |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2020-09-11 16:52:02 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 638510 | ||
Bug Blocks: | 740375, 756082, 840699, 840921 |
Description
Eric Blake
2011-08-24 21:49:20 UTC
I need to also test if this same qcow2 ambiguity between snapshot ids vs. numeric tags can affect 'qemu -loadvm' or the loadvm monitor command using live system checkpoint snapshots. That is, I suspect that the wrong image will be loaded in this scenario (although I haven't yet tested): With a running guest, open a terminal in the guest terminal, type 2 virsh snapshot-create-as dom 2 in the guest terminal, erase 2 and type 3 virsh snapshot-create-as dom 3 in the guest terminal, erase 3 and type 1 virsh snapshot-create-as dom 1 virsh snapshot-revert dom 2 inspect the guest terminal - if it states 3 instead of 2, we reverted to the wrong snapshot Fortunately, compared to deletion, reverting the wrong snapshot is not a data loss scenario, just wrong behavior. Getting this fixed is a prereq to bug 638510 support for live snapshots via the snapshot_blkdev qemu monitor command. I just confirmed the test in comment #2. qemu loadvm gives higher priority to snapshot ids than tags, and reverted to the wrong snapshot. However, I also discovered that while 'qemu-img snapshot -c 1' created id 3 with tag 1, 'qemu savevm 1' _reused_ slot 1, without changing its name, silently corrupting the previous contents of that slot. I'm not sure how to go about fixing this short of forbidding integers below a particular threshold (but what threshold?) so that seconds-since-Epoch still works but small numbers like '10' are forbidden. :( Maybe I'll just clone this as a bug against qemu, that it should behave like qemu-img in creating a tagged snapshot with a new id even if the requested name matches an existing id; then fix libvirt to check if the name to revert to or delete is an integer, and if so, do an id lookup to revert or delete by id instead. Thinking about it a bit more: libvirt supports reuse of existing qcow2 images, which may already have internal snapshots by arbitrary names not already present in libvirt's metadata. I think the safest course of action is to change libvirt to forbid creation of a new snapshot _if_ the requested snapshot name already exists as an internal snapshot in one of the qcow2 images, whether or not that new name is integral. Then, for revert and delete, if the snapshot name to be deleted is an integer, then delete by id, otherwise delete by tag. This should be safe for most uses, and forbid only the cases that really would cause problems. I still have to work on the patches for this, but it should not be too difficult to do for both online (qemu savevm) and offline (qemu-img snapshot -c) cases. Eric, can you confirm with the ovirt guys that they don't use the API in a way that would be problematic? VDSM (oVirt) is not using internal qcow2 snapshots. Reading the comments I didn't understand if there's any actual consequence (API/XML change) for external snapshots. External snapshots are immune - either the external snapshot filename exists or it doesn't, but that's exposed in the file system. It's only internal snapshots where libvirt is not (yet) checking where a requested snapshot name conflicts with existing internal snapshot metadata. Yes sorry if I wasn't clear enough, I was asking if fixing the internal snapshots issue will produce any change in the API/XML that is going to affect the external snapshots too. (It shouldn't but I'm asking just to be on the safe side). The fix for this bug would be a change that requesting a snapshot name that conflicts with an existing internal snapshot name would now error out instead of proceeding on with potential data loss. If all you use is external snapshots, you should see no difference in behavior. I can reproduce it with: libvirt-0.9.10-21.el6_3.1.x86_64 virsh # list Id Name State ---------------------------------------------------- 2 qcow running virsh # snapshot-list qcow Name Creation Time State ------------------------------------------------------------ virsh # snapshot-create-as qcow 2 Domain snapshot 2 created virsh # snapshot-list qcow Name Creation Time State ------------------------------------------------------------ 2 2012-06-25 14:52:45 +0800 running [root@zhpeng ~]# qemu-img snapshot -l /var/lib/libvirt/images/qcow.img Snapshot list: ID TAG VM SIZE DATE VM CLOCK 1 2 3.0M 2012-06-25 14:52:45 00:04:44.761 virsh # snapshot-create-as qcow 3 Domain snapshot 3 created virsh # snapshot-list qcow Name Creation Time State ------------------------------------------------------------ 2 2012-06-25 14:52:45 +0800 running 3 2012-06-25 14:53:38 +0800 running [root@zhpeng ~]# qemu-img snapshot -l /var/lib/libvirt/images/qcow.img Snapshot list: ID TAG VM SIZE DATE VM CLOCK 1 2 3.0M 2012-06-25 14:52:45 00:04:44.761 2 3 3.0M 2012-06-25 14:53:38 00:05:37.760 virsh # snapshot-current --name qcow 3 virsh # snapshot-current --name qcow 3 virsh # snapshot-create-as qcow 1 Domain snapshot 1 created virsh # snapshot-list qcow Name Creation Time State ------------------------------------------------------------ 1 2012-06-25 14:55:42 +0800 running 2 2012-06-25 14:52:45 +0800 running 3 2012-06-25 14:53:38 +0800 running [root@zhpeng ~]# qemu-img snapshot -l /var/lib/libvirt/images/qcow.img Snapshot list: ID TAG VM SIZE DATE VM CLOCK 2 3 3.0M 2012-06-25 14:53:38 00:05:37.760 1 2 3.0M 2012-06-25 14:55:42 00:07:39.494 virsh # snapshot-current --name qcow 1 virsh # snapshot-current qcow <domainsnapshot> <name>1</name> <state>running</state> <parent> <name>3</name> </parent> <creationTime>1340607342</creationTime> virsh # snapshot-revert qcow 2 check in the guest, it's snapshot 3 status Upstream qemu may be helping us reach a better solution: https://lists.gnu.org/archive/html/qemu-devel/2012-07/msg01936.html Any update to this? We have just stumbled onto what appears to be this same issue. Is there a workaround or does pulling the upstream patch work? FWIW I suspect this is still relevant. The referenced qemu patch was about converting savevm to QMP which never happened either, so I doubt this changed from the qemu side. Maybe the simplest thing to do is just have libvirt reject all numeric snapshot names, at least for the qemu driver. It's still relevant also for RHEL 7, the use-case with snapshot-create-as and snapshot-revert still misbehaves. QEMU dropped support for using numeric IDs in handling snapshots in commit 6ca080453ea403959ccde661030ca16264acc181 Author: Daniel Henrique Barboza <danielhb413> Date: Wed Nov 7 11:09:58 2018 -0200 block/snapshot.c: eliminate use of ID input in snapshot operations I don't think we need to duplicate this in libvirt at this point in time |