Bug 1837994 - VM gets stuck after previewing memory snapshot - Failed to set time: internal error: unable to execute QEMU agent command 'guest-set-time'
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.3.10
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ovirt-4.3.10
Assignee: Liran Rotenberg
QA Contact: Evelina Shames
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-05-20 10:48 UTC by Evelina Shames
Modified: 2023-10-06 20:09 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, creating a live snapshot with memory while LiveSnapshotPerformFreezeInEngine was set to True resulted in a virtual machine file system that was frozen when previewing or committing the snapshot with memory restore. In this release, LiveSnapshotPerformFreezeInEngine is set to False by default, so the virtual machine runs successfully after previewing a memory snapshot.
Clone Of:
Environment:
Last Closed: 2020-06-09 11:56:01 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:


Attachments
Logs (589.38 KB, application/zip)
2020-05-20 10:48 UTC, Evelina Shames


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | RHV-47845 | 0 | None | None | None | 2022-08-21 08:19:02 UTC
Red Hat Knowledge Base (Solution) | 5219611 | 0 | None | None | None | 2020-07-13 15:51:37 UTC
Red Hat Product Errata | RHBA-2020:2401 | 0 | None | None | None | 2020-06-04 15:25:16 UTC
oVirt gerrit | 109174 | 0 | ovirt-4.3 | ABANDONED | virt: always thaw at the end of live snapshot | 2020-12-10 17:35:04 UTC
oVirt gerrit | 109181 | 0 | ovirt-engine-4.3 | MERGED | packaging: snapshot move freeze in engine to false | 2020-12-10 17:35:04 UTC

Description Evelina Shames 2020-05-20 10:48:30 UTC
Created attachment 1690165
Logs

Description of problem:
VM gets stuck after previewing memory snapshot with the following error:

VDSM:
2020-05-20 13:31:44,172+0300 ERROR (vm/f7c88d3e) [virt.vm] (vmId='f7c88d3e-60bb-4c08-be9f-1b41cb63ea41') Failed to set time: internal error: unable to execute QEMU agent command 'guest-set-time': The command guest-set-time has been disabled for this instance (vm:1621)

Attaching engine, vdsm and qemu logs.
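
For triage, a minimal sketch of how a vdsm.log could be scanned for this error. The regular expression is inferred only from the single log line quoted above, and the function name is hypothetical, so treat it as a starting point rather than a complete parser of VDSM log formats.

```python
import re

# Pattern inferred from the one VDSM error line quoted in this report;
# other VDSM versions may format the message differently (assumption).
GUEST_CMD_DISABLED = re.compile(
    r"\(vmId='(?P<vm_id>[0-9a-f-]{36})'\) Failed to set time: .*"
    r"The command (?P<command>[\w-]+) has been disabled"
)


def find_disabled_agent_commands(lines):
    """Return (vm_id, command) pairs for every log line matching the error."""
    hits = []
    for line in lines:
        match = GUEST_CMD_DISABLED.search(line)
        if match:
            hits.append((match.group("vm_id"), match.group("command")))
    return hits
```

Passing the lines of an open log file to the function returns the VM IDs whose guest agent rejected the command, which may help spot guests left with a frozen file system.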


Version-Release number of selected component (if applicable):
vdsm-4.30.46-1.el7ev.x86_64
qemu-img-rhev-2.12.0-44.el7_8.2.x86_64
ovirt-engine-4.3.10.3-0.1.master.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create VM from template (latest-rhel-guest-image-8.2-infra)
2. Create snapshot
3. Run VM
4. Create memory snapshot
5. Power off VM
6. Preview memory snapshot
7. Run VM

Actual results:
VM gets stuck after powering up.

Expected results:
VM should not get stuck.

Additional info:
Relevant Logs are attached.

Comment 1 Michal Skrivanek 2020-05-20 12:50:58 UTC
please get the exact qemu-guest-agent version from the guest, and the exact arguments it's running with (ps ax output or something)

Comment 2 Avihai 2020-05-20 16:09:01 UTC
(In reply to Michal Skrivanek from comment #1)
> please get the exact qemu-guest-agent version from the guest, and the exact
> arguments it's running with (ps ax output or something)

The root cause, as explained by Liran R., is this patch [1].

To fix:
Set LiveSnapshotPerformFreezeInEngine to false in engine-config.
With that change the same test passed and the VM came up without issues (tested on both the 8.2 and 7.6 templates).

If other info is still necessary, please re-add the NEEDINFO.

[1] https://gerrit.ovirt.org/#/c/108673

Comment 3 RHEL Program Management 2020-05-21 00:40:37 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 6 Liran Rotenberg 2020-05-21 09:11:32 UTC
The timing of the freeze and thaw looks problematic when the engine performs them during a snapshot with memory.
In VDSM we thaw right after the libvirt command finishes, and then execute more actions on the drivers.
When it is done only in the engine, the FS is still frozen at this point. The QEMU 'guest-set-time' error seems to relate exactly to this.

Possible workarounds:
1. Change LiveSnapshotPerformFreezeInEngine to false in the engine config.
2. Shut down the VM and start it again.
3. Preview the snapshot, start the VM (it will be frozen), shut it down, commit the snapshot, and start the VM.
4. Run: # vdsm-client VM thaw vmID=<uuid> on the host where the VM is running with a frozen FS.
5. Create the snapshot without memory.

For now we should switch the default of LiveSnapshotPerformFreezeInEngine back to false.
Follow-up bug: BZ 1838493
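
As an illustration of workarounds 1 and 4, here is a small sketch that only builds the command lines as argument lists. The helper names are hypothetical; the `engine-config -s` and `vdsm-client VM thaw` invocations are the ones quoted in this bug. Nothing is executed here, and the sketch assumes the engine-config command is run on the engine machine and the thaw on the host running the VM.

```python
import uuid

FREEZE_KEY = "LiveSnapshotPerformFreezeInEngine"


def disable_engine_freeze_cmd():
    """Workaround 1: argument list to run on the engine machine."""
    return ["engine-config", "-s", f"{FREEZE_KEY}=false"]


def thaw_vm_cmd(vm_id):
    """Workaround 4: argument list to run on the host where the VM runs."""
    # uuid.UUID raises ValueError for a malformed ID, so a typo is caught
    # before anything is sent to vdsm-client.
    vm_uuid = uuid.UUID(vm_id)
    return ["vdsm-client", "VM", "thaw", f"vmID={vm_uuid}"]
```

Either list can be handed to `subprocess.run(cmd, check=True)` on the appropriate machine. Note that engine-config changes normally take effect only after the ovirt-engine service is restarted.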

Comment 11 Michal Skrivanek 2020-05-21 09:38:06 UTC
This is just to fix an unintended discrepancy between 4.4 and 4.3.

Comment 14 Shir Fishbain 2020-05-24 09:06:50 UTC
Verified. The VM runs successfully after previewing a memory snapshot.

Verified it with the following versions:
ovirt-engine-4.3.10.4-0.1.el7.noarch
qemu-img-rhev-2.12.0-44.el7_8.2.x86_64
vdsm-4.30.46-1.el7ev.x86_64

Comment 15 Eli Marcus 2020-06-01 11:40:25 UTC
Hi Liran, please review this release notes text. I need it approved as soon as possible for the Erratum going into the 4.3.10 release:

Previously, creating a live snapshot with memory while LiveSnapshotPerformFreezeInEngine was set to True resulted in a virtual machine file system that was frozen when previewing or committing the snapshot with memory restore.
In this release, the virtual machine runs successfully after previewing a memory snapshot.

Comment 18 errata-xmlrpc 2020-06-04 15:25:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2401

Comment 19 Evelina Shames 2020-06-09 10:59:13 UTC
On upgrade, the value of LiveSnapshotPerformFreezeInEngine doesn't change.
If a customer has a non-HE environment and updates their system, the value remains as it was, 'false', which is OK because in 4.3.10.4 the value changed back to 'false'.

If a customer installs from scratch, it is still OK because the value is 'false'.

But if the customer installs an HE environment from scratch, the installation works differently: the appliance version is 4.3.10.3, and getting to 4.3.10.4 requires an upgrade. Here is the problem: the value remains 'true', as it was defined in 4.3.10.3.

Moving back to 'Assigned'.

Tested on HE env:
ovirt-engine-4.3.10.4-0.1.el7.noarch after upgrade from 4.3.10.3.

Comment 20 Liran Rotenberg 2020-06-09 11:56:01 UTC
There is nothing to do about it now.

In 4.3.10.3 we had a problem caused by setting LiveSnapshotPerformFreezeInEngine=true.
Users who got this version should change that value with: engine-config -s LiveSnapshotPerformFreezeInEngine=false.
The LiveSnapshotPerformFreezeInEngine value persists through upgrades. 4.3.10.3 had a respin; unfortunately, as I understand from you, the appliance was not respun.

We have BZ 1838493, which is targeted for 4.3.11. In that bug we solved the real problem.
This means LiveSnapshotPerformFreezeInEngine can be set to either 'true' or 'false' and you won't have any problem.

Therefore, it doesn't make sense to change this value to 'false' on each upgrade (forcing the customer to set this value manually after every upgrade), nor to make a one-time change.

Bottom line: upgrading from 4.3.10.3 to 4.3.11 should solve any issue, even if LiveSnapshotPerformFreezeInEngine is 'true'. If a user has 4.3.10.3 and doesn't want to upgrade, or upgrades to a version < 4.3.11, they need to set LiveSnapshotPerformFreezeInEngine to 'false'.

I'm closing the bug.

Marina, FYI.
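
The bottom line above can be sketched as a small check. This is an illustrative reading of the comment, not engine code: the function names are hypothetical, and it assumes only versions from 4.3.10.3 up to (but not including) 4.3.11 are affected, since those are the only versions discussed here.

```python
AFFECTED_FROM = (4, 3, 10, 3)  # version named above as shipping the problem
FIXED_IN = (4, 3, 11)          # release expected to carry the BZ 1838493 fix


def parse_version(text):
    """Turn a dotted version such as '4.3.10.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def needs_manual_fix(engine_version, freeze_in_engine):
    """True if LiveSnapshotPerformFreezeInEngine should be set to 'false' by hand."""
    version = parse_version(engine_version)
    # Tuples compare element by element, so (4, 3, 10, 4) < (4, 3, 11).
    return freeze_in_engine and AFFECTED_FROM <= version < FIXED_IN
```

For example, an HE appliance upgraded from 4.3.10.3 to 4.3.10.4 that still has the value set to 'true' falls in the affected range, while any 4.3.11 build does not.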

