Bug 733499

Summary: memory leak and bogus snapshot listing on failed libvirt snapshots
Product: Red Hat Enterprise Linux 6
Reporter: Eric Blake <eblake>
Component: libvirt
Assignee: Eric Blake <eblake>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Priority: unspecified
Version: 6.2
CC: abartlet, ajia, berrange, clalance, crobinso, dyuan, eblake, itamar, jforbes, laine, mzhan, nzhang, redhat, ricardo.arguello, rwu, veillard, virt-maint, whuang
Target Milestone: rc
Hardware: x86_64
OS: Linux
Fixed In Version: libvirt-0.9.4-6.el6
Doc Type: Bug Fix
Clone Of: 727709
Last Closed: 2011-12-06 11:27:16 UTC
Bug Blocks: 638510

Description Eric Blake 2011-08-25 19:57:56 UTC
Cloning to RHEL 6.2 to fix memory leak on failed snapshots (RHEL 6 is immune to the missing qmp savevm of F15, because it has the same hmp fallback code as F16).

+++ This bug was initially created as a clone of Bug #727709 +++

Description of problem:
Running virsh snapshot-create <domain> fails

Version-Release number of selected component (if applicable):
libvirt-0.8.8-7.fc15.x86_64
qemu-kvm-0.14.0-7.fc15.x86_64

How reproducible:
Every time

Steps to Reproduce:
1. Install virtual machine (Windows 2008R2 in this case) with qcow2 disk
2. Start the VM
3. sudo virsh snapshot-create Win2008R2-5
4. sudo virsh snapshot-list Win2008R2-5

  
Actual results:
sudo virsh snapshot-create Win2008R2-5
error: internal error unable to execute QEMU command 'savevm': The command savevm has not been found

[abartlet@obed samba-1]$ sudo virsh snapshot-list Win2008R2-5
 Name                 Creation Time             State
---------------------------------------------------
 1312291332           2011-08-02 23:22:12 +1000 shutoff
 1312328890           2011-08-03 09:48:10 +1000 nostate

Expected results:
The creation of a running snapshot

Additional info:

This did work in Fedora 14.  It fails for all the VMs I've tried. 

This snapshot mechanism was being used to support Samba development via Wintest:
http://blog.tridgell.net/?p=91

--- Additional comment from redhat on 2011-08-12 06:30:20 MDT ---

I think this is a result of libvirt communicating with qemu via the new JSON interface; the savevm command -is- present in qemu, just not through JSON.

--- Additional comment from eblake on 2011-08-12 09:54:49 MDT ---

Ouch - libvirt should have detected the monitor failure, rather than proceeding to create a bogus metadata entry.  I'll take a further look into this today.

--- Additional comment from redhat on 2011-08-12 11:43:26 MDT ---

BTW - it looks like there is a work-around in the libvirtd package in rawhide: it falls back to the previous way of communicating with QEMU instead of the JSON stuff.

Think that individual change could be backported to F15?
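
For readers unfamiliar with the fallback being discussed, here is a minimal sketch of the idea in Python. It is not libvirt's code (libvirt is written in C). It assumes a hypothetical `send` callable that takes a QMP request dict and returns the decoded reply dict. The `CommandNotFound` error class and the `human-monitor-command` passthrough with its `command-line` argument are part of QMP; everything else is illustrative.

```python
def run_savevm(send, name):
    """Try the JSON (QMP) savevm first; fall back to the human monitor (HMP).

    `send` is a hypothetical transport: request dict in, reply dict out.
    """
    reply = send({"execute": "savevm", "arguments": {"name": name}})
    error = reply.get("error")
    if error is None:
        return reply.get("return")
    if error.get("class") != "CommandNotFound":
        # A real failure, not a missing command: report it, do not retry.
        raise RuntimeError(error.get("desc", "savevm failed"))
    # The command is missing from QMP, so wrap the HMP command line in
    # QMP's passthrough command instead.
    reply = send({
        "execute": "human-monitor-command",
        "arguments": {"command-line": "savevm " + name},
    })
    if "error" in reply:
        raise RuntimeError(reply["error"].get("desc", "HMP passthrough failed"))
    # HMP returns free-form text; a caller would still need to inspect it.
    return reply["return"]


def fake_old_qemu(request):
    """Stand-in for a qemu whose QMP lacks savevm (assumption for the demo)."""
    if request["execute"] == "savevm":
        return {"error": {"class": "CommandNotFound",
                          "desc": "The command savevm has not been found"}}
    return {"return": ""}


output = run_savevm(fake_old_qemu, "snap1")
```

The key design point is that only the "command not found" error triggers the fallback; any other error is surfaced to the caller unchanged.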

--- Additional comment from eblake on 2011-08-12 15:02:01 MDT ---

There are two bugs here:

1. libvirt not attempting qmp->hmp fallback with qemu that doesn't support qmp savevm (upstream commit 89241fe0, v0.9.0, although several other related commits would also have to be backported; at least: 266265a, 89241fe, ce81bc5, abdfca0, 24c56ce, c33ac2e, 0ecfa7f)

2. On savevm failure, libvirt leaves a bogus metadata entry in the snapshot list even though snapshot creation failed (I just posted the upstream fix for that):
https://www.redhat.com/archives/libvir-list/2011-August/msg00531.html
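
The second bug is an instance of a general pattern: state recorded before an operation must be undone if the operation fails. The following Python sketch illustrates that pattern only. The class and method names are hypothetical and do not mirror the upstream patch, and `do_savevm` stands in for whatever issues the monitor command.

```python
class SnapshotList:
    """Toy in-memory snapshot metadata store (hypothetical, for illustration)."""

    def __init__(self):
        self._snapshots = {}

    def names(self):
        return sorted(self._snapshots)

    def create(self, name, do_savevm):
        """Record metadata, run savevm, and undo the record if savevm fails."""
        if name in self._snapshots:
            raise ValueError("snapshot %r already exists" % name)
        self._snapshots[name] = {"state": "nostate"}
        try:
            do_savevm(name)
        except Exception:
            # Without this cleanup the failed snapshot stays listed, which
            # is the bogus "nostate" entry seen in the report.
            del self._snapshots[name]
            raise
        self._snapshots[name]["state"] = "running"
        return name


def failing_savevm(name):
    raise RuntimeError("The command savevm has not been found")


snapshots = SnapshotList()
try:
    snapshots.create("1312328890", failing_savevm)
except RuntimeError:
    pass
remaining = snapshots.names()
```

After the failed call, `remaining` is empty, so a later listing shows nothing for the snapshot that was never taken.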

F15 (at libvirt 0.8.8) is affected by both problems; F16 is immune to the first, and the second is less likely to hit.  I'm not sure if F14 also has an issue.

If you are impatient, you can use the virt-preview repo to pick up the libvirt build from F16 compiled for F15, which will solve the first bug, and probably get you to the point of not tickling the second bug.

And there's lots more active work going on for snapshots for upstream 0.9.5; you could always help test libvirt.git.

--- Additional comment from eblake on 2011-08-12 15:07:53 MDT ---

I've checked F14; this is a regression (F14 at 0.8.3 predated the switch to use qmp by default, so it was using snapshots on hmp), so it is definitely a candidate for fixing for F15.

Comment 1 Eric Blake 2011-08-25 20:00:00 UTC
Getting this fixed is a prereq to bug 638510 (support for live snapshots via the snapshot_blkdev qemu monitor command).

Comment 3 Eric Blake 2011-08-25 21:56:28 UTC
*** Bug 626870 has been marked as a duplicate of this bug. ***

Comment 5 Nan Zhang 2011-08-29 08:39:48 UTC
Verified with libvirt-0.9.4-6.el6.x86_64; moving it to VERIFIED.

# virsh start foo
Domain foo started

# virsh list
 Id Name                 State
----------------------------------
  3 foo                  running

# virsh snapshot-list foo
 Name                 Creation Time             State
---------------------------------------------------

# virsh snapshot-create foo
Domain snapshot 1314602860 created

# virsh snapshot-list foo
 Name                 Creation Time             State
---------------------------------------------------
 1314602860           2011-08-29 03:27:40 -0400 running

# qemu-img info /var/lib/libvirt/images/foo.qcow2 
image: /var/lib/libvirt/images/foo.qcow2
file format: qcow2
virtual size: 6.0G (6442450944 bytes)
disk size: 1.5G
cluster_size: 65536
Snapshot list:
ID        TAG                 VM SIZE                DATE       VM CLOCK
1         1314602860             121M 2011-08-29 03:27:40   00:00:11.959
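
The manual check above (compare `virsh snapshot-list` against the `Snapshot list:` section of `qemu-img info`) could be scripted roughly as follows. This is a sketch under assumptions: it parses only the text layouts shown in this report, it assumes internal snapshots stored in a single qcow2 image, and it assumes snapshot names contain no spaces. The function names are made up for illustration.

```python
def parse_snapshot_list(text):
    """Return {name: state} from `virsh snapshot-list` output as shown above."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have five fields: name, date, time, UTC offset, state.
        # The header row has four and the dashed separator has one.
        if len(fields) == 5:
            result[fields[0]] = fields[4]
    return result


def parse_qemu_img_tags(text):
    """Return the set of snapshot tags listed by `qemu-img info`."""
    tags = set()
    in_list = False
    for line in text.splitlines():
        if line.startswith("Snapshot list:"):
            in_list = True
            continue
        fields = line.split()
        # Rows: ID, TAG, VM SIZE, DATE (two fields), VM CLOCK.
        if in_list and len(fields) == 6 and fields[0].isdigit():
            tags.add(fields[1])
    return tags


def bogus_snapshots(list_text, info_text):
    """Names libvirt lists that have no matching tag in the image."""
    on_disk = parse_qemu_img_tags(info_text)
    return sorted(n for n in parse_snapshot_list(list_text) if n not in on_disk)
```

Feeding it the captured output of the two commands returns the listed names that have no matching tag in the image; an empty list means the metadata and the image agree.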

Comment 6 errata-xmlrpc 2011-12-06 11:27:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html