Bug 1985973 - Remove the abort snapshot behavior
Summary: Remove the abort snapshot behavior
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.4.8.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.4.9
Target Release: 4.4.9
Assignee: Liran Rotenberg
QA Contact: Nisim Simsolo
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-07-26 12:06 UTC by Liran Rotenberg
Modified: 2021-11-20 08:04 UTC (History)

Fixed In Version: ovirt-engine-4.4.9-1, vdsm-4.40.90.2
Doc Type: Bug Fix
Doc Text:
Previously, taking a snapshot without memory of a VM set an abort timeout on the overall process. Now there is a new timeout configuration value, `LiveSnapshotFreezeTimeout`, that applies to this flow. It aborts the operation before the VM volumes are switched, which protects the VM from data inconsistency when freezing the file system.
Clone Of:
Environment:
Last Closed: 2021-11-20 08:04:20 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.4+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 116280 | 0 | master | MERGED | snapshot: don't invoke abort to non-memory | 2021-08-31 13:46:42 UTC
oVirt gerrit | 116342 | 0 | master | MERGED | snapshot: introduce non-memory timeout | 2021-10-06 12:45:29 UTC
oVirt gerrit | 116343 | 0 | master | MERGED | core: introduce non-memory snapshot timeout | 2021-10-05 16:49:10 UTC
oVirt gerrit | 116978 | 0 | ovirt-engine-4.4 | MERGED | core: introduce non-memory snapshot timeout | 2021-10-06 13:28:16 UTC
oVirt gerrit | 116987 | 0 | ovirt-4.4.z | MERGED | snapshot: introduce non-memory timeout | 2021-10-06 13:19:30 UTC
oVirt gerrit | 116992 | 0 | ovirt-4.4.z | MERGED | snapshot: don't invoke abort to non-memory | 2021-10-06 14:04:46 UTC

Description Liran Rotenberg 2021-07-26 12:06:01 UTC
After a discussion, it seems that aborting the snapshot job in VDSM doesn't make much sense. The abort is mostly about hitting a timeout while calling libvirt (or while preparing for that call), an operation that should be very fast, whereas the timing issues we actually hit are mostly in the freeze/thaw operations. In those cases we can't really abort, and if we have already passed them we should probably let the snapshot operation finish.

Therefore, the current decision is to remove the abort mechanism.

Comment 1 Liran Rotenberg 2021-07-27 10:51:39 UTC
In further discussion we saw we have 2 main flows:

1. Snapshot with memory - In this case it makes sense to have a timeout that fails the operation and releases the VM.
   Even so, we need to think about the timeout value (currently 30 minutes by default and configurable in engine-config).
   We should also consider a per-snapshot timeout.

2. Snapshot without memory - In this case we usually want the snapshot to complete.
   Here we may consider dropping the timeout.
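
For illustration only, the two flows can be sketched as a small decision function. This is not code from VDSM or ovirt-engine: the names `Phase` and `should_abort` and the parameters `memory_timeout_s` and `freeze_timeout_s` are hypothetical, and the sketch assumes the behavior described in this bug, where a memory snapshot keeps an overall timeout and a non-memory snapshot may only be aborted while the file system freeze is still pending, before the VM volumes are switched.

```python
from enum import Enum


class Phase(Enum):
    """Hypothetical phases of a live snapshot, for this sketch only."""
    FREEZE = "freeze"                  # guest file systems are being frozen
    SWITCH_VOLUMES = "switch_volumes"  # the VM is being moved to new volumes


def should_abort(with_memory: bool, phase: Phase, elapsed_s: float,
                 memory_timeout_s: float, freeze_timeout_s: float) -> bool:
    """Return True if the snapshot job should be aborted.

    Assumed policy (taken from the discussion in this bug, not from the
    actual implementation):
    - with memory: one overall timeout covers the whole operation;
    - without memory: only the freeze phase can time out, and once the
      volumes are being switched the snapshot is left to finish.
    """
    if with_memory:
        return elapsed_s > memory_timeout_s
    if phase is Phase.FREEZE:
        return elapsed_s > freeze_timeout_s
    # Non-memory snapshot past the freeze: never abort.
    return False
```

With an overall timeout of 30 minutes (1800 seconds) and an arbitrary freeze timeout of 8 seconds, `should_abort(False, Phase.FREEZE, 10, 1800, 8)` is true, while `should_abort(False, Phase.SWITCH_VOLUMES, 10000, 1800, 8)` is false because the volumes are already being switched.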

Comment 3 Sandro Bonazzola 2021-11-20 08:04:20 UTC
This bugzilla is included in oVirt 4.4.9 release, published on October 20th 2021.

Since the problem described in this bug report should be resolved in oVirt 4.4.9 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

