Bug 1097503

Summary: guest will be paused and can't resume when do external system checkpoint snapshot with wrong compression format
Product: Red Hat Enterprise Linux 7
Reporter: Shanzhi Yu <shyu>
Component: libvirt
Assignee: Peter Krempa <pkrempa>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Priority: medium
Version: 7.0
CC: dyuan, juzhang, mzhan, pkrempa, rbalakri, yanyang
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: libvirt-1.2.7-1.el7
Doc Type: Bug Fix
Story Points: ---
Last Closed: 2015-03-05 07:35:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---

Description Shanzhi Yu 2014-05-14 03:28:43 UTC
Description of problem:

The guest is left paused and cannot be resumed after an external system checkpoint snapshot is attempted with an invalid compression format.

Version-Release number of selected component (if applicable):

libvirt-1.1.1-29.el7.x86_64
qemu-kvm-rhev-1.5.3-60.el7ev.x86_64

How reproducible:

100%

Steps to Reproduce:

1. prepare a running guest with a healthy OS installed

# virsh list --state-running
 Id    Name                           State
----------------------------------------------------
 2     rhel6-qcow2                    running


2. set snapshot_image_format in qemu.conf to an invalid value and restart libvirtd

# grep "snapshot_image_format =" /etc/libvirt/qemu.conf
snapshot_image_format = "invalid"

# systemctl restart  libvirtd.service

3. create external system checkpoint snapshot

# virsh snapshot-create-as rhel6-qcow2 s4 --memspec file=/tmp/mem.s4
error: operation failed: Invalid snapshot image format specified in configuration file

4. check guest state and resume it

# virsh list
 Id    Name                           State
----------------------------------------------------
 2     rhel6-qcow2                    paused

# virsh resume rhel6-qcow2
error: Failed to resume domain rhel6-qcow2
error: Timed out during operation: cannot acquire state change lock

5. destroy/start guest

# virsh destroy rhel6-qcow2
Domain rhel6-qcow2 destroyed

# virsh start rhel6-qcow2
error: Failed to start domain rhel6-qcow2
error: Timed out during operation: cannot acquire state change lock

6. restart libvirtd, then start guest

# systemctl restart  libvirtd.service

# virsh start rhel6-qcow2
Domain rhel6-qcow2 started



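Actual results:

The guest is left paused after the snapshot fails, and it cannot be resumed or started again until libvirtd is restarted.
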
Expected results:

The guest should not be left paused when snapshot creation fails, or it should at least be resumable after being paused.

Additional info:

# grep -i "error"  /tmp/libvirtd.log
2014-05-07 09:44:12.451+0000: 31788: error : qemuDomainSnapshotCreateActiveExternal:12974 : operation failed: Invalid snapshot image format specified in configuration file
2014-05-07 09:44:12.451+0000: 31788: debug : virNetServerProgramSendError:151 : prog=536903814 ver=1 proc=185 type=1 serial=5 msg=0x7fe2cdf03680 rerr=0x7fe2bd197c80
2014-05-07 09:44:48.001+0000: 31787: error : qemuDomainObjBeginJobInternal:1068 : Timed out during operation: cannot acquire state change lock

Comment 1 Peter Krempa 2014-05-14 08:23:40 UTC
Fixed upstream with:

commit 71802685ba49a80326d69fd446d2f25844526ba8
Author: Peter Krempa <pkrempa>
Date:   Wed May 14 09:43:52 2014 +0200

    qemu: snapshot: Terminate job when memory compression program isn't found
    
    If the compression program for external snapshot memory image isn't
    found we exitted the function without terminating the domain job. This
    caused the domain to be unusable.
    
    The problem was introduced in commit 7df5093f.
    
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1097503
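
The failure mode described in the commit message can be illustrated with a minimal C sketch. This is not libvirt's code: `struct domain`, `begin_job`, `end_job`, `format_is_valid`, `snapshot_buggy` and `snapshot_fixed` are hypothetical names invented for illustration, and the single boolean flag only stands in for the per-domain job state. The sketch assumes the fix works by routing every error exit through the job-termination step.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a domain object with a single job slot. */
struct domain {
    bool job_active;
};

/* Start a job; fails if an earlier job was never ended.
 * This failure plays the role of "cannot acquire state change lock". */
int begin_job(struct domain *dom)
{
    if (dom->job_active)
        return -1;
    dom->job_active = true;
    return 0;
}

void end_job(struct domain *dom)
{
    dom->job_active = false;
}

/* Hypothetical validity check; the accepted names are an assumption. */
bool format_is_valid(const char *fmt)
{
    return strcmp(fmt, "raw") == 0 || strcmp(fmt, "gzip") == 0;
}

/* Buggy shape: the error path returns without ending the job. */
int snapshot_buggy(struct domain *dom, const char *fmt)
{
    if (begin_job(dom) < 0)
        return -1;
    if (!format_is_valid(fmt))
        return -1;              /* job is leaked here */
    end_job(dom);
    return 0;
}

/* Fixed shape: every exit after begin_job() passes through endjob. */
int snapshot_fixed(struct domain *dom, const char *fmt)
{
    int ret = -1;

    if (begin_job(dom) < 0)
        return -1;
    if (!format_is_valid(fmt))
        goto endjob;
    ret = 0;

 endjob:
    end_job(dom);
    return ret;
}
```

With the buggy shape, a failed snapshot leaves `job_active` set, so every later `begin_job()` on the same domain fails, much like the resume and start attempts in the reproduction steps. With the fixed shape, the snapshot still fails but the domain stays usable.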

Comment 3 Yang Yang 2014-10-28 08:39:14 UTC
Hi Peter,
It seems that the patch in #1 is not pushed 
into the latest libvirt.  However, the bug 
is fixed in the latest libvirt. So I wonder
which patch fixes it. Could you please paste
them here?

qemuDomainSnapshotCreateActiveExternal(virConnectPtr conn,
        if (cfg->snapshotImageFormat) {
            compressed = qemuSaveCompressionTypeFromString(cfg->snapshotImageFormat);
            if (compressed < 0) {
                virReportError(VIR_ERR_OPERATION_FAILED, "%s",
                               _("Invalid snapshot image format specified "
                                 "in configuration file"));
                goto cleanup;
            }

            if (!qemuCompressProgramAvailable(compressed)) {
                virReportError(VIR_ERR_OPERATION_FAILED, "%s",
                               _("Compression program for image format "
                                 "in configuration file isn't available"));
                goto cleanup;
            }
        }
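
For readers unfamiliar with the pattern in the excerpt above, the string-to-enum lookup that yields a negative value for an unknown name might look roughly like the following sketch. The names `save_format`, `save_format_names` and `save_format_from_string` are hypothetical, and the list of accepted format names is an assumption for illustration, not taken from libvirt's source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enum of supported memory image formats. */
enum save_format {
    SAVE_FORMAT_RAW = 0,
    SAVE_FORMAT_GZIP,
    SAVE_FORMAT_BZIP2,
    SAVE_FORMAT_XZ,
    SAVE_FORMAT_LZOP,
    SAVE_FORMAT_LAST            /* number of entries, not a real format */
};

/* Names in the same order as the enum values (assumed list). */
const char *const save_format_names[SAVE_FORMAT_LAST] = {
    "raw", "gzip", "bzip2", "xz", "lzop",
};

/* Return the enum value for a configured name, or -1 if it is unknown. */
int save_format_from_string(const char *name)
{
    size_t i;

    for (i = 0; i < SAVE_FORMAT_LAST; i++) {
        if (strcmp(save_format_names[i], name) == 0)
            return (int)i;
    }
    return -1;
}
```

Under these assumptions, configured values such as "invalid" or "what" map to -1, which is the condition that triggers the "Invalid snapshot image format" error path.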

Thanks 
Yang

The reproduction and verification steps are as follows:

I can reproduce it with libvirt-1.1.1-29.el7.x86_64 and qemu-kvm-rhev-1.5.3-60.el7ev.x86_64.

Verified with libvirt-1.2.8-5.el7.x86_64 and qemu-kvm-rhev-2.1.2-4.el7.x86_64

Steps:
1. start a healthy guest
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 2     qe-con1                        running

2. set an invalid snapshot_image_format in qemu.conf and restart libvirtd
# grep "snapshot_image_format =" /etc/libvirt/qemu.conf
snapshot_image_format = "what"
# service libvirtd restart

3. create external memory only snapshot
# virsh snapshot-create-as qe-con1 s1 --memspec file=/tmp/qe-con1.mem
error: operation failed: Invalid snapshot image format specified in configuration file

4. check vm status

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 2     qe-con1                        running

All the steps got the expected results.

Comment 4 Peter Krempa 2014-10-29 08:14:04 UTC
(In reply to yangyang from comment #3)
> Hi Peter,
> It seems that the patch in #1 is not pushed 
> into the latest libvirt.  However, the bug 
> is fixed in the latest libvirt. So I wonder
> which patch fixes it. Could you please paste
> them here?
> 

The patch has been there for a long time now:
 $ git desc --match v* 71802685ba49a80326d69fd446d2f25844526ba8
v1.2.4-59-g7180268

Comment 5 Yang Yang 2014-10-30 02:13:15 UTC
Thanks Peter.

Moved it to VERIFIED since the steps in comment #3 gave the expected results.

Comment 7 errata-xmlrpc 2015-03-05 07:35:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0323.html