Bug 1697997 - libvirt: live VM snapshots can be created, but restoring fails
Summary: libvirt: live VM snapshots can be created, but restoring fails
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: qemu
Version: 30
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Fedora Virtualization Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-04-09 12:20 UTC by Zbigniew Jędrzejewski-Szmek
Modified: 2019-05-05 02:19 UTC

Fixed In Version: qemu-3.1.0-7.fc30
Clone Of:
Environment:
Last Closed: 2019-05-05 02:19:28 UTC
Type: Bug
Embargoed:


Description Zbigniew Jędrzejewski-Szmek 2019-04-09 12:20:49 UTC
Description of problem:
I created a new VM, and made a number of snapshots while running it.

While the machine is running, right-clicking on the snapshot in virt-viewer and selecting "start snapshot" doesn't show any error, but it has no effect afaics.

After selecting "virtual machine → shut down → force off", right-clicking on any of the snapshots and selecting "start snapshot" shows this error:
Error running snapshot 'after system-upgrade download': internal error: process exited while connecting to monitor: 2019-04-09T12:14:10.576468Z qemu-system-x86_64: Device 'drive-virtio-disk0' does not have the requested snapshot 'after system-upgrade download'

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 75, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 111, in tmpcb
    callback(*args, **kwargs)
  File "/usr/share/virt-manager/virtManager/libvirtobject.py", line 66, in newfn
    ret = fn(self, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/domain.py", line 1123, in revert_to_snapshot
    self._backend.revertToSnapshot(snap.get_backend())
  File "/usr/lib64/python3.7/site-packages/libvirt.py", line 2034, in revertToSnapshot
    if ret == -1: raise libvirtError ('virDomainRevertToSnapshot() failed', dom=self)
libvirt.libvirtError: internal error: process exited while connecting to monitor: 2019-04-09T12:14:10.576468Z qemu-system-x86_64: Device 'drive-virtio-disk0' does not have the requested snapshot 'after system-upgrade download'

It is possible to start the machine, but it'll always use the latest state, not any of the snapshots.
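
For triage, the device and snapshot name can be pulled out of the error text above. This is only a minimal sketch: the pattern is copied from the message in this report, other QEMU versions may word it differently, and the helper name parse_missing_snapshot is made up for illustration.

```python
import re

# Pattern for the QEMU message quoted in the traceback above. The wording is
# taken from this report only; it is not a stable interface.
_MISSING_SNAPSHOT = re.compile(
    r"Device '(?P<device>[^']+)' does not have the requested snapshot "
    r"'(?P<snapshot>[^']+)'"
)


def parse_missing_snapshot(message):
    """Return (device, snapshot) if the message matches, else None."""
    match = _MISSING_SNAPSHOT.search(message)
    if match is None:
        return None
    return match.group("device"), match.group("snapshot")
```

Passing str(err) from a caught libvirt.libvirtError should work, since the QEMU text is embedded in the libvirt error string shown in the traceback.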

Version-Release number of selected component (if applicable):
libvirt-daemon-5.1.0-4.fc30.x86_64

Additional info:
I installed the machine using 'virt-install --connect qemu:///system -n workstation --os-variant fedora29 --cdrom Fedora-Server-dvd-x86_64-29-1.2.iso --vcpus=2 --memory=2048 --disk size=12', not sure if that matters.

Comment 1 Cole Robinson 2019-04-09 19:59:16 UTC
Thanks for the report, I can reproduce. Trying to run/revert to a snapshot with running state, from an offline VM, always seems to fail with that error.

However if I start the VM, then try reverting to the running snapshot, it seems to work. Do you see that too?

Upstream is affected as well.

Comment 2 Zbigniew Jędrzejewski-Szmek 2019-04-09 20:21:53 UTC
> However if I start the VM, then try reverting to the running snapshot, it seems to work. Do you see that too?

No, it doesn't seem to work here. In fact, that was the sequence I was describing in the original report.

What does work is taking the snapshot after stopping the machine: I can boot the machine from a snapshot with "VM state: shutoff".

Comment 3 Cole Robinson 2019-04-09 21:56:37 UTC
Yes I missed the 'no effect' part, I see that too. This is a qemu issue. I bisected my case to:

commit d98f26073bebddcd3da0ba1b86c3a34e840c0fb8
Author: Paolo Bonzini <pbonzini>
Date:   Wed Nov 14 10:38:13 2018 +0100

    target/i386: kvm: add VMX migration blocker
    
    Nested VMX does not support live migration yet.  Add a blocker
    until that is worked out.
    
    Nested SVM only does not support it, but unfortunately it is
    enabled by default for -cpu host so we cannot really disable it.
    
    Signed-off-by: Paolo Bonzini <pbonzini>


And indeed if I try a 'Save' operation in virt-manager it throws an explicit error that migration (save) isn't available with nested VMX.

The snapshot case, which uses qemu's 'savevm', must have a bug of its own: it neither reports the issue to the user nor abandons the operation, even though no memory data is saved in the disk image at all (see 'qemu-img snapshot -l $path').

So we need to fix the savevm issue and possibly revert the above commit, at least for Fedora. I can't imagine many Fedora end users depend on accurate VMX state restore across snapshots or save state, and now that nested VMX is on by default, this will basically prevent save and snapshots from working with a VM created with virt-manager/gnome-boxes defaults.
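
The "no memory data" check can also be scripted. A minimal sketch, assuming 'qemu-img info --output=json' lists internal snapshots under a "snapshots" key with a "vm-state-size" field; the function names are hypothetical. A size of 0 is expected for snapshots taken while the VM was shut off, so it is only suspicious for snapshots that libvirt recorded as running.

```python
import json
import subprocess


def snapshots_without_vm_state(info_json):
    """Return names of internal snapshots whose vm-state-size is 0."""
    info = json.loads(info_json)
    return [
        snap["name"]
        for snap in info.get("snapshots", [])
        if snap.get("vm-state-size", 0) == 0
    ]


def read_image_info(path):
    """Run qemu-img on the image and return its JSON output as text."""
    result = subprocess.run(
        ["qemu-img", "info", "--output=json", path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

Usage would be snapshots_without_vm_state(read_image_info(path)), preferably with the VM shut off, since qemu-img may refuse to open an image that is in use.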

Comment 4 Cole Robinson 2019-04-10 00:13:19 UTC
Okay, it's kind of a combo deal. The qemu commit is what's rejecting the snapshot, but libvirt's error detection for the savevm call needs fixing for this case. I'll take care of that. But all that's going to do is fix the error reporting, which, while very important, isn't going to make snapshots actually work.

Paolo, I understand it's 'correct' to reject migration with nested VMX, but this leads to the unfriendly case of save/restore and live snapshots via virt-manager being rejected for VMs on Intel machines created with app defaults (which use host-model, and nested VMX is enabled by default these days). Do you have any suggestions?
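
To see whether a given host is in the affected configuration, the kvm_intel module's 'nested' parameter can be read from sysfs. A minimal sketch with hypothetical helper names; it only reports the host module setting, not whether a particular guest's CPU model actually exposes vmx.

```python
from pathlib import Path

# sysfs file for the kvm_intel 'nested' module parameter (Intel hosts only).
NESTED_PARAM = Path("/sys/module/kvm_intel/parameters/nested")


def nested_enabled(raw):
    """Interpret the file content; kernels print either Y/N or 1/0."""
    return raw.strip() in ("Y", "1")


def host_has_nested_vmx(param=NESTED_PARAM):
    """Return True/False, or None when the file cannot be read
    (for example when kvm_intel is not loaded)."""
    try:
        return nested_enabled(param.read_text())
    except OSError:
        return None
```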

Comment 5 Cole Robinson 2019-04-10 18:31:57 UTC
These libvirt patches will fix the error detection: https://www.redhat.com/archives/libvir-list/2019-April/msg00735.html

I started a qemu-devel thread about nested VMX blocking migration: https://lists.gnu.org/archive/html/qemu-devel/2019-04/msg01702.html

Comment 6 Cole Robinson 2019-04-17 00:21:41 UTC
Paolo says that kernel support to unblock nested VMX migration is queued upstream, so it will be in Fedora eventually. Until then, let's revert the qemu patch.

Comment 7 Fedora Update System 2019-04-17 13:58:47 UTC
qemu-3.1.0-7.fc30 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-daa39d4827

Comment 8 Fedora Update System 2019-04-18 18:24:28 UTC
qemu-3.1.0-7.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-daa39d4827

Comment 9 Fedora Update System 2019-05-05 02:19:28 UTC
qemu-3.1.0-7.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.

