Bug 876829
Summary: | create external checkpoint snapshot will change the guest pmsuspended state and guest hang forever | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Huang Wenlong <whuang> |
Component: | libvirt | Assignee: | Eric Blake <eblake> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | medium | Priority: | medium |
Version: | 7.0 | CC: | cwei, dyuan, eblake, mzhan, rbalakri, shyu |
Target Milestone: | rc | Keywords: | TestOnly, Upstream |
Hardware: | x86_64 | OS: | Linux |
Doc Type: | Bug Fix | Type: | Bug |
Last Closed: | 2015-03-05 07:19:48 UTC | | |
Description
Huang Wenlong
2012-11-15 05:11:22 UTC
What's hanging, the VM? It sounds to me like it's in a running state but its processors are stopped or something like that?

I reproduced the problem - 'virsh list' says the guest is running, but qemu thinks otherwise:

# virsh qemu-monitor-command dom '{"execute":"query-status"}'
{"return":{"status":"post-migrate","singlestep":false,"running":false},"id":"libvirt-79"}

which matches a state of a paused guest. It is fairly easy to work around - merely run: 'virsh suspend dom' to get libvirt to match qemu state, then 'virsh resume dom' to resume the guest again. But yes, libvirt should be getting the state transition correct on its own; I'll look into patching that.

There's no way to do migration-to-file with pmsuspended (S3) state, without help from qemu that won't be happening in 6.4. Right now, the mere act of migration will wake up a guest out of S3, but we have a choice of whether to wake it up into paused state or into running state. I inspected the external memory file, and it is defaulting to the paused state.

Upstream patch proposed:
https://www.redhat.com/archives/libvir-list/2013-January/msg01744.html

In POST, since rebasing will pick up this upstream patch:

commit 339bdd99a17eb1420cc5cadf27c36a9637d86f10
Author: Eric Blake <eblake>
Date:   Tue Jan 8 21:54:45 2013 -0700

    snapshot: fix state after external snapshot of S3 domain

    https://bugzilla.redhat.com/show_bug.cgi?id=876829 complains that
    if a guest is put into S3 state (such as via virsh dompmsuspend)
    and then an external snapshot is taken, qemu forcefully transitions
    the domain to paused, but libvirt doesn't reflect that change
    internally. Thus, a user has to use 'virsh suspend' to get libvirt
    back in sync with qemu state, and if the user doesn't know this
    trick, then the guest appears hung.

    * src/qemu/qemu_driver.c (qemuDomainSnapshotCreateActiveExternal):
    Track fact that qemu wakes up a suspended domain on migration.
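The mismatch reported above ('virsh list' says running while qemu's query-status reply says "running":false) can be checked mechanically. The following is only a minimal sketch: the helper name states_disagree is hypothetical, it is not part of libvirt or virsh, and it assumes the caller has already captured the state string from 'virsh list' and the raw JSON reply from 'virsh qemu-monitor-command'.

```python
import json


def states_disagree(libvirt_state: str, query_status_reply: str) -> bool:
    """Hypothetical helper: True when libvirt reports 'running' but the
    qemu query-status reply says the guest is not running."""
    reply = json.loads(query_status_reply)
    # The "running" flag sits under "return" in the reply quoted in this bug.
    qemu_running = reply["return"]["running"]
    return libvirt_state == "running" and not qemu_running


# Reply copied from the reproduction in this bug report.
reply = ('{"return":{"status":"post-migrate","singlestep":false,'
         '"running":false},"id":"libvirt-79"}')

print(states_disagree("running", reply))
```

If the helper returns True, the workaround described in this bug applies: run 'virsh suspend dom' followed by 'virsh resume dom' to bring libvirt back in sync with qemu.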
http://post-office.corp.redhat.com/archives/rhvirt-patches/2013-June/msg00027.html if backporting rather than rebasing

We decided not to rebase libvirt in RHEL 6.5 to avoid stability issues we faced in 6.4. This bug has already been fixed upstream but it is considered unsuitable for backporting to RHEL 6.5 because at least one of the following conditions is met:

- this bug requires new API(s), which we cannot introduce without rebasing libvirt
- the patches required to address this bug are complex or invasive, making the backport too risky
- this bug is not important enough to justify backporting non-trivial patches for it

Thus I'm pushing this bug to RHEL 6.6 (and setting Upstream keyword to indicate we have patches upstream) for now. If you don't agree with this resolution, please give us reasons which you think are strong enough for us to reevaluate the decision not to backport patches for this bug.

I'm trying to determine if this bug is related to bug 928762, in which case we may want to pull it back into 6.5.

(In reply to Eric Blake from comment #11)
> I'm trying to determine if this bug is related to bug 928762, in which case
> we may want to pull it back into 6.5.

Typo: meant bug 928672

This bug was not selected to be addressed in Red Hat Enterprise Linux 6. We will look at it again within the Red Hat Enterprise Linux 7 product.

On RHEL7 with the latest libvirt-1.2.7-2.el7.x86_64, creating a snapshot for a guest in pmsuspended status is currently not supported.

# virsh list
 Id    Name                           State
----------------------------------------------------
 50    rhel7-qcow2                    pmsuspended

# virsh snapshot-create-as rhel7-qcow2 ss --diskspec vda --memspec /tmp/ss
error: Operation not supported: qemu doesn't support taking snapshots of PMSUSPENDED guests

Is it reasonable to forbid creating an external disk-only snapshot for a guest in pmsuspended status?
# virsh snapshot-create-as rhel7-qcow2 ss2 --disk-only
error: Operation not supported: qemu doesn't support taking snapshots of PMSUSPENDED guests

Since creating a snapshot while the guest is in pmsuspended status is forbidden for consistency, I will change this bug to VERIFIED status.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0323.html