Bug 733529 - virsh snapshot-delete --children is broken
Summary: virsh snapshot-delete --children is broken
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.2
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: Eric Blake
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 638510 674537
TreeView+ depends on / blocked
 
Reported: 2011-08-25 22:12 UTC by Eric Blake
Modified: 2011-12-06 11:27 UTC (History)
8 users (show)

Fixed In Version: libvirt-0.9.4-8.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-12-06 11:27:22 UTC


Attachments (Terms of Use)


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:1513 normal SHIPPED_LIVE libvirt bug fix and enhancement update 2011-12-06 01:23:30 UTC

Description Eric Blake 2011-08-25 22:12:00 UTC
Description of problem:
libvirt 0.9.0 changed hash operations to forbid nested hash iterations (commit fba550) because they could lead to memory corruption.  But the implementation for virDomainSnapshotDelete with the VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN flags (aka 'virsh snapshot-delete domain name --children') has always been relying on nested hash iterations since its introduction in 0.8.0.

Prior to 0.9.0, that means that attempts to delete a hierarchy of snapshots would usually appear to succeed, but in a corner case dependent on hash values, could corrupt memory; after that point, this means that this functionality flat out fails to work and definitely leaves behind corrupt snapshot layouts (stranded grandchildren that point to non-existent parents).

Version-Release number of selected component (if applicable):
libvirt-0.9.4-5.el6


How reproducible:
100%

Steps to Reproduce:
1. with an offline domain (online works too, but takes much longer to demonstrate): for name in s1 s2 s3 s4; do virsh snapshot-create-as dom $name; done
2. virsh snapshot-delete dom s2
3. virsh snapshot-current dom --name && virsh snapshot-list dom --parent
  
Actual results:
1. Creates four snapshots; s4 is current
2. error messages are logged about forbidden nested hash iterations
3. snapshot s2 and s3 are gone, but s4 still exists and still claims to be current


Expected results:
1. Creates four snapshots; s4 is current
2. no error messages in the logs
3. snapshot s2, s3, and s4 are gone, and s1 is now current

Additional info:
Step 3 requires some additional snapshot patches in virsh that weren't present in libvirt-0.9.4-5.el6.  For validating the bug in older libvirtd, you can either:
1. Run older libvirtd but upgrade to newer virsh
2. Use 'virsh snapshot-current dom' and inspect the xml to learn the current snapshot name, as well as 'virsh snapshot-list dom' followed by 'virsh snapshot-dumpxml dom name' for each name in that list to see which snapshots still exist and what parent they claim to have

Comment 2 Eric Blake 2011-08-25 22:16:54 UTC
Marking this as a regression - the fact that snapshot children deletion usually worked in RHEL 6.0 but broke during hash-table tightening for upstream libvirt 0.9.0 (picked up by rebase in RHEL 6.2, but also hash-table tightening was backported to RHEL 6.1), is a visible worsening of behavior.

Meanwhile, getting this fixed is a prereq to bug 638510 support for live snapshots via the snapshot_blkdev qemu monitor command.

Comment 3 Eric Blake 2011-08-25 22:19:16 UTC
A chain of upstream patches to fix this ends here:
https://www.redhat.com/archives/libvir-list/2011-August/msg01213.html

snapshot: track current domain across deletion of children

Deleting a snapshot and all its descendants had problems with
tracking the current snapshot.  The deletion does not necessarily
proceed in depth-first order, so a parent could be deleted
before a child, wreaking havoc on passing the notion of the
current snapshot to the parent.  Furthermore, even if traversal
were depth-first, doing multiple file writes to pass current up
the chain one snapshot at a time is wasteful, comparing to a
single update to the current snapshot at the end of the algorithm.

* src/qemu/qemu_driver.c (snap_remove): Add field.
(qemuDomainSnapshotDiscard): Add parameter.
(qemuDomainSnapshotDiscardDescendant): Adjust accordingly.
(qemuDomainSnapshotDelete): Properly reset current.

Comment 7 yuping zhang 2011-09-09 11:13:12 UTC
I can't reproduce this issue with libvirt-0.9.4-5.el6.
# rpm -qa | grep libvirt
libvirt-python-0.9.4-5.el6.x86_64
libvirt-0.9.4-5.el6.x86_64
libvirt-client-0.9.4-5.el6.x86_64

#for name in s1 s2 s3 s4; do virsh snapshot-create-as kvm-rhel6.1-x86_64-qcow2 $name; done
Domain snapshot s1 created

Domain snapshot s2 created

Domain snapshot s3 created

Domain snapshot s4 created

# virsh snapshot-delete kvm-rhel6.1-x86_64-qcow2 s2
Domain snapshot s2 deleted

# virsh snapshot-list kvm-rhel6.1-x86_64-qcow2
 Name                 Creation Time             State
---------------------------------------------------
 s1                   2011-09-10 09:10:59 +0800 shutoff
 s3                   2011-09-10 09:11:10 +0800 shutoff
 s4                   2011-09-10 09:11:16 +0800 shutoff

# virsh snapshot-current kvm-rhel6.1-x86_64-qcow2
<domainsnapshot>
  <name>s4</name>
  <state>shutoff</state>
  <parent>
    <name>s3</name>
  </parent>
  <creationTime>1315617076</creationTime>
  <domain>
    <uuid>1c0c1a0f-f014-b261-fc3a-32168ac5e0a6</uuid>
  </domain>
</domainsnapshot>

And also no --name and --parent options with snapshot-current in libvirt-0.9.4-5.el6

Comment 8 yuping zhang 2011-09-09 12:01:48 UTC
I can't test this issue with libvirt-0.9.4-11.el6 as bug 737010 blocks snapshot-create-as command.

Comment 9 dyuan 2011-09-13 09:16:58 UTC
Reproduced this issue with libvirt-0.9.4-5.el6 and libvirt-client-0.9.4-11.el6.

virsh # snapshot-list rhel6q2 --parent
 Name                 Creation Time             State           Parent
------------------------------------------------------------
 s1                   2011-09-13 17:07:15 +0800 shutoff
 s2                   2011-09-13 17:07:20 +0800 shutoff         s1
 s3                   2011-09-13 17:07:25 +0800 shutoff         s2
 s4                   2011-09-13 17:07:30 +0800 shutoff         s3

virsh # snapshot-current rhel6q2 --name
s4

virsh # snapshot-delete rhel6q2 s2 --children
Domain snapshot s2 deleted

virsh # snapshot-list rhel6q2 --parent
 Name                 Creation Time             State           Parent
------------------------------------------------------------
 s1                   2011-09-13 17:07:15 +0800 shutoff
 s4                   2011-09-13 17:07:30 +0800 shutoff         s3


virsh # snapshot-current rhel6q2 --name
s4

Comment 10 dyuan 2011-09-13 10:46:06 UTC
Tested with online domain, reproduced with libvirt-0.9.4-5.el6 and verified PASS with libvirt-0.9.4-11.el6.

Will re-check it with offline domain once bug 737010 fixed.

==> Reproduced with libvirt-0.9.4-5.el6.
virsh # snapshot-list rhel6q2 --parent
 Name                 Creation Time             State           Parent
------------------------------------------------------------
 s1                   2011-09-13 18:15:57 +0800 running
 s2                   2011-09-13 18:17:16 +0800 running         s1
 s3                   2011-09-13 18:22:39 +0800 running         s2
 s4                   2011-09-13 18:27:31 +0800 running         s3

virsh # snapshot-current rhel6q2 --name
s4
virsh # snapshot-delete rhel6q2 s2 --children
Domain snapshot s2 deleted

virsh # snapshot-list rhel6q2 --parent
 Name                 Creation Time             State           Parent
------------------------------------------------------------
 s1                   2011-09-13 18:15:57 +0800 running
 s4                   2011-09-13 18:27:31 +0800 running         s3

virsh # snapshot-current rhel6q2 --name
s4

==> Verified with libvirt-0.9.4-11.el6.
# for name in s1 s2 s3 s4; do virsh snapshot-create-as rhel6 $name; done
Domain snapshot s1 created
Domain snapshot s2 created
Domain snapshot s3 created
Domain snapshot s4 created

virsh # snapshot-list rhel6 --parent
 Name                 Creation Time             State           Parent
------------------------------------------------------------
 s1                   2011-09-13 17:28:31 +0800 running
 s2                   2011-09-13 17:40:19 +0800 running         s1
 s3                   2011-09-13 17:45:21 +0800 running         s2
 s4                   2011-09-13 17:50:33 +0800 running         s3

virsh # snapshot-current rhel6 --name
s4

virsh # snapshot-delete rhel6 s2 --children
Domain snapshot s2 deleted

virsh # snapshot-list rhel6 --parent
 Name                 Creation Time             State           Parent
------------------------------------------------------------
 s1                   2011-09-13 17:28:31 +0800 running

virsh # snapshot-current rhel6 --name
s1

Comment 11 Huang Wenlong 2011-09-21 08:12:40 UTC
Verified with libvirt-0.9.4-12.el6.
# for name in s1 s2 s3 s4; do virsh snapshot-create-as rhel6 $name; done
Domain snapshot s1 created
Domain snapshot s2 created
Domain snapshot s3 created
Domain snapshot s4 created

virsh # snapshot-list rhel6 --parent
 Name                 Creation Time             State           Parent
------------------------------------------------------------
 s1                   2011-09-13 17:28:31 +0800 running
 s2                   2011-09-13 17:40:19 +0800 running         s1
 s3                   2011-09-13 17:45:21 +0800 running         s2
 s4                   2011-09-13 17:50:33 +0800 running         s3

virsh # snapshot-current rhel6 --name
s4

virsh # snapshot-delete rhel6 s2 --children
Domain snapshot s2 deleted

virsh # snapshot-list rhel6 --parent
 Name                 Creation Time             State           Parent
------------------------------------------------------------
 s1                   2011-09-13 17:28:31 +0800 running

virsh # snapshot-current rhel6 --name
s1

Comment 12 Huang Wenlong 2011-09-26 05:18:02 UTC
Verify it with offline guest  

 for name in s1 s2 s3 s4; do virsh snapshot-create-as snap $name; done
Domain snapshot s1 created
Domain snapshot s2 created
Domain snapshot s3 created
Domain snapshot s4 created
[root@rhel62-wh ~]# virsh 
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit
virsh # snapshot-current snap --name
s4
virsh # snapshot-list snap --parent
 Name                 Creation Time             State           Parent
------------------------------------------------------------
 s1                   2011-09-26 13:13:54 +0800 shutoff
 s2                   2011-09-26 13:13:59 +0800 shutoff         s1
 s3                   2011-09-26 13:14:03 +0800 shutoff         s2
 s4                   2011-09-26 13:14:08 +0800 shutoff         s3
 snap1                2011-08-16 15:46:41 +0800 running

virsh # snapshot-delete snap s2 --children
Domain snapshot s2 deleted

virsh # snapshot-list rhel6 --parent
error: failed to get domain 'rhel6'
error: Domain not found: no domain with matching name 'rhel6'

virsh # snapshot-list snap --parent
 Name                 Creation Time             State           Parent
------------------------------------------------------------
 s1                   2011-09-26 13:13:54 +0800 shutoff
 snap1                2011-08-16 15:46:41 +0800 running

virsh # snapshot-current snap --name 
s1
virsh #

Comment 13 errata-xmlrpc 2011-12-06 11:27:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html


Note You need to log in before you can comment on or make changes to this bug.