Bug 674537
Summary: | Restarting libvirtd resets current snapshot | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Jiri Denemark <jdenemar> |
Component: | libvirt | Assignee: | Eric Blake <eblake> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | 6.1 | CC: | dallan, dyuan, eblake, nzhang, syeghiay, veillard, whuang, xen-maint, yupzhang |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | libvirt-0.9.4-8.el6 | Doc Type: | Bug Fix |
Doc Text: |
Cause
A number of logic bugs were present in libvirt snapshot (system checkpoint) handling; among these, restarting libvirtd would lose track of the current snapshot, and a change in qemu behavior triggered a latent bug in libvirt's ability to restore certain snapshots.
Consequence
Snapshots were unreliable and hard to manage without tripping up on limitations, contrary to the documentation.
Fix
A number of bug fixes and added flags to existing snapshot management APIs, along with better testing of more snapshot scenarios, allowed libvirt to actually provide all the snapshot features that it had previously documented.
Result
Management applications can use system checkpoint snapshots for better control in rolling back to known stable states of a VM.
|
Story Points: | --- |
Clone Of: | 662026 | Environment: | |
Last Closed: | 2011-12-06 10:54:00 UTC | Type: | --- |
Bug Depends On: | 733529, 733762 | ||
Bug Blocks: | 638510 |
Description
Jiri Denemark
2011-02-02 11:27:52 UTC
I might end up fixing this one as a side effect of implementing disk snapshots on top of the virDomainSnapshotCreateXML command, since disk snapshots have to interact with the hierarchy of checkpoint snapshots.

In fact, I _did_ end up fixing this as a side effect. The upstream patch is still awaiting ACK, but should be trivial to backport once approved:
https://www.redhat.com/archives/libvir-list/2011-August/msg00627.html

    snapshot: track current snapshot across restarts

    Audit all changes to the qemu vm->current_snapshot, and make them
    update the saved xml file for both the previous and the new
    snapshot, so that there is always at most one snapshot with
    <active>1</active> in the xml, and that snapshot is used as the
    current snapshot even across libvirtd restarts.

    * src/conf/domain_conf.h (_virDomainSnapshotDef): Alter member
    type and name.
    * src/conf/domain_conf.c (virDomainSnapshotDefParseString)
    (virDomainSnapshotDefFormat): Update clients.
    * docs/schemas/domainsnapshot.rng: Tighten rng.
    * src/qemu/qemu_driver.c (qemuDomainSnapshotLoad): Reload current
    snapshot.
    (qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot)
    (qemuDomainSnapshotDiscard): Track current snapshot.

Getting this fixed is a prerequisite for bug 638510, support for live snapshots via the snapshot_blkdev qemu monitor command.

In POST:
http://post-office.corp.redhat.com/archives/rhvirt-patches/2011-August/msg00633.html
for all but one corner case ('virsh snapshot-delete dom --children'), which will be fixed by bug 733529.

As committed in libvirt-0.9.4-6.el6, this introduced a regression when reverting to offline snapshots using old qemu. However, bug 733762 documents that this was already broken when using newer qemu, which rejects -loadvm of inactive snapshots. So either way, this bug cannot be fully verified until that bug has been fixed.
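The invariant described in the commit message above (at most one saved snapshot file carries <active>1</active>, and that one becomes the current snapshot on reload) can be illustrated with a small model. This is only a sketch, not libvirt's C implementation: the function names, the dictionary of snapshot name to XML text, and the choice to raise an error when several snapshots claim to be current are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_active(snapshot_xml):
    """Return True if the saved snapshot XML carries <active>1</active>."""
    root = ET.fromstring(snapshot_xml)
    return (root.findtext("active") or "0").strip() == "1"


def set_active(snapshot_xml, active):
    """Return a copy of the snapshot XML with <active> set to 1 or 0."""
    root = ET.fromstring(snapshot_xml)
    node = root.find("active")
    if node is None:
        node = ET.SubElement(root, "active")
    node.text = "1" if active else "0"
    return ET.tostring(root, encoding="unicode")


def switch_current(saved, new_current):
    """Rewrite every saved file so that only new_current is flagged active.

    'saved' is a hypothetical mapping of snapshot name -> XML text that
    stands in for the per-snapshot files kept on disk. Passing None as
    new_current clears the flag everywhere.
    """
    return {name: set_active(xml, name == new_current)
            for name, xml in saved.items()}


def load_current(saved):
    """Model a daemon restart: recover the current snapshot from the files."""
    active = [name for name, xml in saved.items() if is_active(xml)]
    if len(active) > 1:
        # Assumption for this sketch only: treat the state as corrupt.
        raise ValueError("more than one snapshot claims to be current")
    return active[0] if active else None
```

Because the flag lives in the saved XML rather than only in memory, reloading the files after a restart yields the same answer as before the restart, which is the behavior this bug asks for.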
Moving back to ASSIGNED: Philipp Hahn pointed out that SIGHUP also has problems remembering the current snapshot:
https://www.redhat.com/archives/libvir-list/2011-August/msg01444.html

I haven't been able to reproduce Philipp's SIGHUP issues (but did ask him for more details), but I have found a corner case where an OOM condition could leave stale metadata behind:
https://www.redhat.com/archives/libvir-list/2011-September/msg00094.html

Back in POST:
http://post-office.corp.redhat.com/archives/rhvirt-patches/2011-September/msg00049.html

Reproduced this issue with libvirt-0.8.7-18.el6.x86_64:

    # virsh snapshot-current rhel6
    <domainsnapshot>
      <name>1315604441</name>
      <state>running</state>
      <creationTime>1315604441</creationTime>
      <domain>
        <uuid>6df50163-a754-8430-34dc-b8b8e549736e</uuid>
      </domain>
    </domainsnapshot>
    [root@dhcp-93-226 libvirt-0.8.7-18]# service libvirtd restart
    Stopping libvirtd daemon: [ OK ]
    Starting libvirtd daemon: [ OK ]
    [root@dhcp-93-226 libvirt-0.8.7-18]# virsh snapshot-current rhel6

Verified this issue with:
libvirt-0.9.4-11.el6.x86_64
qemu-kvm-0.12.1.2-2.185.el6.x86_64

    # virsh snapshot-current rhel6
    <domainsnapshot>
      <name>1315618546</name>
      <state>shutoff</state>
      <creationTime>1315618546</creationTime>
      <domain type='kvm'>
        <name>rhel6</name>
        <uuid>6df50163-a754-8430-34dc-b8b8e549736e</uuid>
        <memory>1048576</memory>
        <currentMemory>1048576</currentMemory>
    ....
    # service libvirtd restart
    Stopping libvirtd daemon: [ OK ]
    Starting libvirtd daemon: [ OK ]
    # virsh snapshot-current rhel6
    <domainsnapshot>
      <name>1315618546</name>
      <state>shutoff</state>
      <creationTime>1315618546</creationTime>
      <domain type='kvm'>
        <name>rhel6</name>
        <uuid>6df50163-a754-8430-34dc-b8b8e549736e</uuid>
        <memory>1048576</memory>
        <currentMemory>1048576</currentMemory>
        <vcpu>1</vcpu>
    .............

Also tested virsh snapshot-revert followed by a libvirtd restart; virsh snapshot-current still works well. Changing the status to VERIFIED.

Technical note added.
If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause
A number of logic bugs were present in libvirt snapshot (system checkpoint) handling; among these, restarting libvirtd would lose track of the current snapshot, and a change in qemu behavior triggered a latent bug in libvirt's ability to restore certain snapshots.
Consequence
Snapshots were unreliable and hard to manage without tripping up on limitations, contrary to the documentation.
Fix
A number of bug fixes and added flags to existing snapshot management APIs, along with better testing of more snapshot scenarios, allowed libvirt to actually provide all the snapshot features that it had previously documented.
Result
Management applications can use system checkpoint snapshots for better control in rolling back to known stable states of a VM.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2011-1513.html
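For anyone re-checking the fix, the reproduce and verify steps in the comments above boil down to comparing the output of virsh snapshot-current before and after restarting libvirtd. The following is a minimal sketch of that comparison, assuming the two outputs have already been captured as complete XML text; the helper names are hypothetical, and empty output is taken to mean that no current snapshot was reported, as in the libvirt-0.8.7-18 run.

```python
import xml.etree.ElementTree as ET


def current_snapshot_name(virsh_output):
    """Extract <name> from 'virsh snapshot-current' output, or None if empty."""
    text = virsh_output.strip()
    if not text:
        # No XML at all: the daemon did not report a current snapshot.
        return None
    return ET.fromstring(text).findtext("name")


def survives_restart(before, after):
    """True if the same current snapshot is reported before and after."""
    name = current_snapshot_name(before)
    return name is not None and name == current_snapshot_name(after)


# Output captured in the libvirt-0.8.7-18 reproduction above.
BEFORE = """
<domainsnapshot>
  <name>1315604441</name>
  <state>running</state>
  <creationTime>1315604441</creationTime>
  <domain>
    <uuid>6df50163-a754-8430-34dc-b8b8e549736e</uuid>
  </domain>
</domainsnapshot>
"""
```

With the buggy package, the second call printed nothing, so survives_restart(BEFORE, "") is False; with the fixed package the same snapshot name comes back and the check passes.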