Bug 876829 - create external checkpoint snapshot will change the guest pmsuspended state and guest hang forever
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.0
Hardware: x86_64 Linux
Priority: medium Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Eric Blake
QA Contact: Virtualization Bugs
Keywords: TestOnly, Upstream
Depends On:
Blocks:
Reported: 2012-11-15 00:11 EST by Huang Wenlong
Modified: 2016-04-26 10:45 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-03-05 02:19:48 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2015:0323
Priority: normal
Status: SHIPPED_LIVE
Summary: Low: libvirt security, bug fix, and enhancement update
Last Updated: 2015-03-05 07:10:54 EST

Description Huang Wenlong 2012-11-15 00:11:22 EST
Description of problem:
Creating an external checkpoint snapshot of a guest in the pmsuspended state
changes the reported guest state, and the guest then hangs forever.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-0.12.1.2-2.331.el6.x86_64
libvirt-0.10.2-8.el6.x86_64
seabios-0.6.1.2-25.el6.x86_64


How reproducible:


Steps to Reproduce:
1. Add the following to the guest XML, then restart the guest:
...
<pm>
<suspend-to-mem enabled='yes'/>
<suspend-to-disk enabled='yes'/>
</pm>
...

<channel type='unix'>
<source mode='bind' path='/var/lib/libvirt/qemu/demo2.agent'/>
<target type='virtio' name='org.qemu.guest_agent.0'/>
<address type='virtio-serial' controller='0' bus='0' port='1'/>
</channel>
...

2. Install qemu-guest-agent in the guest, then start the service:
# service qemu-ga start


3. On the host:
# Suspend the guest to memory (S3)
[root@intel-q9400-4-2 ~]# virsh dompmsuspend test --target mem
Domain test successfully suspended

# Check the guest state
[root@intel-q9400-4-2 ~]# virsh list
 Id    Name                           State
----------------------------------------------------
 8     test                           pmsuspended

# Create an external checkpoint snapshot
[root@intel-q9400-4-2 ~]# virsh snapshot-create-as test ex-s4 --diskspec vda --memspec /tmp/ex-s4
Domain snapshot ex-s4 created

# The guest state has changed to running
[root@intel-q9400-4-2 ~]# virsh list
 Id    Name                           State
----------------------------------------------------
 8     test                           running

The guest then hangs forever while still reported as running.



Actual results:
As shown in step 3: the guest state changes to running and the guest hangs.

Expected results:
The guest state should not change, and the guest should not hang.

Additional info:
Comment 2 Dave Allan 2012-12-06 22:11:54 EST
What's hanging, the VM?  It sounds to me like it's in a running state but its processors are stopped or something like that?
Comment 3 Eric Blake 2012-12-21 22:37:10 EST
I reproduced the problem - 'virsh list' says the guest is running, but qemu thinks otherwise:
# virsh qemu-monitor-command dom '{"execute":"query-status"}'
{"return":{"status":"post-migrate","singlestep":false,"running":false},"id":"libvirt-79"}
which matches a state of a paused guest.

It is fairly easy to work around - merely run: 'virsh suspend dom' to get libvirt to match qemu state, then 'virsh resume dom' to resume the guest again.  But yes, libvirt should be getting the state transition correct on its own; I'll look into patching that.

There's no way to do migration-to-file with pmsuspended (S3) state, without help from qemu that won't be happening in 6.4.  Right now, the mere act of migration will wake up a guest out of S3, but we have a choice of whether to wake it up into paused state or into running state.  I inspected the external memory file, and it is defaulting to the paused state.
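
The mismatch and workaround described in this comment can be sketched in Python. This is only an illustration, not part of libvirt: the function names `qemu_is_running` and `workaround_commands` are hypothetical, and the sketch assumes the caller has already obtained the state string shown by 'virsh list' and the JSON text returned by the query-status monitor command. It builds the two virsh commands from the workaround but does not execute them.

```python
import json
from typing import List


def qemu_is_running(query_status_reply: str) -> bool:
    """Return the 'running' flag from a QMP query-status reply (JSON text)."""
    reply = json.loads(query_status_reply)
    return bool(reply["return"]["running"])


def workaround_commands(domain: str, libvirt_state: str,
                        query_status_reply: str) -> List[List[str]]:
    """Hypothetical helper: if libvirt reports 'running' while qemu reports
    that the guest is not running, return the suspend/resume commands from
    the workaround; otherwise return an empty list."""
    if libvirt_state == "running" and not qemu_is_running(query_status_reply):
        return [["virsh", "suspend", domain], ["virsh", "resume", domain]]
    return []


# Reply text copied from the query-status output quoted above.
reply = ('{"return":{"status":"post-migrate","singlestep":false,'
         '"running":false},"id":"libvirt-79"}')
print(workaround_commands("dom", "running", reply))
```

Returning the commands as argument lists instead of running them keeps the sketch free of side effects; a caller could pass each list to `subprocess.run` if that is appropriate in their environment.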
Comment 5 Eric Blake 2013-01-23 18:29:03 EST
Upstream patch proposed:
https://www.redhat.com/archives/libvir-list/2013-January/msg01744.html
Comment 6 Eric Blake 2013-01-24 19:14:46 EST
In POST, since rebasing will pick up this upstream patch:

commit 339bdd99a17eb1420cc5cadf27c36a9637d86f10
Author: Eric Blake <eblake@redhat.com>
Date:   Tue Jan 8 21:54:45 2013 -0700

    snapshot: fix state after external snapshot of S3 domain
    
    https://bugzilla.redhat.com/show_bug.cgi?id=876829 complains that
    if a guest is put into S3 state (such as via virsh dompmsuspend)
    and then an external snapshot is taken, qemu forcefully transitions
    the domain to paused, but libvirt doesn't reflect that change
    internally.  Thus, a user has to use 'virsh suspend' to get libvirt
    back in sync with qemu state, and if the user doesn't know this
    trick, then the guest appears hung.
    
    * src/qemu/qemu_driver.c (qemuDomainSnapshotCreateActiveExternal):
    Track fact that qemu wakes up a suspended domain on migration.
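
The state tracking described in the commit message can be illustrated with a toy model in Python. This is not the libvirt code: `DomainState` and `state_after_external_snapshot` are hypothetical names, and the assumption that other states are reported unchanged after the snapshot is a simplification made for this sketch. It only contrasts the state libvirt reported before the fix (running) with the paused state that, per the commit message, qemu actually transitions the domain to.

```python
from enum import Enum


class DomainState(Enum):
    RUNNING = "running"
    PAUSED = "paused"
    PMSUSPENDED = "pmsuspended"


def state_after_external_snapshot(before: DomainState,
                                  fixed: bool) -> DomainState:
    """Toy model of the state libvirt reports after an external checkpoint
    snapshot. 'fixed' selects the behavior with or without the patch."""
    if before is DomainState.PMSUSPENDED:
        # qemu forcefully transitions the S3 domain to paused.
        # Without the fix libvirt kept reporting 'running' (the bug);
        # with the fix it records the paused state as well.
        return DomainState.PAUSED if fixed else DomainState.RUNNING
    # Assumption for this sketch: other states are reported unchanged.
    return before


print(state_after_external_snapshot(DomainState.PMSUSPENDED, fixed=False))
print(state_after_external_snapshot(DomainState.PMSUSPENDED, fixed=True))
```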
Comment 9 Eric Blake 2013-06-05 18:38:54 EDT
http://post-office.corp.redhat.com/archives/rhvirt-patches/2013-June/msg00027.html if backporting rather than rebasing
Comment 10 Jiri Denemark 2013-06-11 05:56:04 EDT
We decided not to rebase libvirt in RHEL 6.5 to avoid stability issues
we faced in 6.4. This bug has already been fixed upstream but it is
considered unsuitable for backporting to RHEL 6.5 because at least one
of the following conditions is met:

- this bug requires new API(s), which we cannot introduce without
  rebasing libvirt
- the patches required to address this bug are complex or invasive
  causing the backport to be too risky
- this bug is not important enough to justify backporting non-trivial
  patches for it

Thus I'm pushing this bug to RHEL 6.6 (and setting Upstream keyword to
indicate we have patches upstream) for now. If you don't agree with
this resolution, please, give us reasons which you think are strong
enough for us to reevaluate the decision not to backport patches for
this bug.
Comment 11 Eric Blake 2013-07-09 10:05:16 EDT
I'm trying to determine if this bug is related to bug 928762, in which case we may want to pull it back into 6.5.
Comment 12 Eric Blake 2013-07-09 10:05:58 EDT
(In reply to Eric Blake from comment #11)
> I'm trying to determine if this bug is related to bug 928762, in which case
> we may want to pull it back into 6.5.

Typo: meant bug 928672
Comment 16 Jiri Denemark 2014-04-04 17:37:11 EDT
This bug was not selected to be addressed in Red Hat Enterprise Linux 6. We will look at it again within the Red Hat Enterprise Linux 7 product.
Comment 17 Shanzhi Yu 2014-09-01 05:41:09 EDT
On RHEL7 with the latest libvirt-1.2.7-2.el7.x86_64, creating a snapshot of a guest in pmsuspended status is currently not supported.

# virsh list 
 Id    Name                           State
----------------------------------------------------
 50    rhel7-qcow2                    pmsuspended

# virsh snapshot-create-as rhel7-qcow2 ss --diskspec vda --memspec /tmp/ss
error: Operation not supported: qemu doesn't support taking snapshots of PMSUSPENDED guests

Is it reasonable to also forbid creating an external disk-only snapshot of a guest in pmsuspended status?

# virsh snapshot-create-as rhel7-qcow2 ss2  --disk-only 
error: Operation not supported: qemu doesn't support taking snapshots of PMSUSPENDED guests
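
Since both snapshot commands above are refused for a pmsuspended guest, a script that takes snapshots may want to check the state first. The following Python sketch parses text in the 'virsh list' format shown in this comment; `parse_virsh_list` and `can_snapshot` are hypothetical helper names, and the sample text is copied from the output above rather than obtained by running virsh.

```python
from typing import Dict


def parse_virsh_list(output: str) -> Dict[str, str]:
    """Map domain name to state from 'virsh list' style text."""
    domains = {}
    for line in output.splitlines():
        # Split into Id, Name, and the rest, since a state may contain
        # spaces (for example "shut off").
        parts = line.split(None, 2)
        # Data rows start with a numeric Id, or "-" for inactive domains;
        # the header and separator lines fail this check and are skipped.
        if len(parts) == 3 and (parts[0].isdigit() or parts[0] == "-"):
            domains[parts[1]] = parts[2].strip()
    return domains


def can_snapshot(state: str) -> bool:
    """Mirror the restriction shown above: no snapshots while pmsuspended."""
    return state != "pmsuspended"


sample = """\
 Id    Name                           State
----------------------------------------------------
 50    rhel7-qcow2                    pmsuspended
"""

for name, state in parse_virsh_list(sample).items():
    print(name, state, can_snapshot(state))
```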
Comment 18 Shanzhi Yu 2014-12-09 05:20:21 EST
Since creating a snapshot while the guest is in pmsuspended status is now forbidden for consistency, I am changing this bug to VERIFIED status.
Comment 20 errata-xmlrpc 2015-03-05 02:19:48 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0323.html
