Description of problem:
If the connection to the storage domain (SD) fails for a few minutes (less than the timeout configured for the snapshot), the snapshot process is reported as "successful", but there is no option to use the snapshot (clone, review, etc.).

Version-Release number of selected component (if applicable):
- ovirt-engine-4.4.0-0.33.master.el8ev.noarch
- libvirt-6.0.0-17.module+el8.2.0+6257+0d066c28.x86_64
- vdsm-4.40.13-1.el8ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Run a VM and create a snapshot with memory
2. Break the connection to the SD and reconnect it after 5 minutes

Actual results:
The engine reports that the snapshot process completed successfully, but there is no option to use the snapshot (clone, review, etc.).

Expected results:
If the snapshot process cannot be completed after the connection is restored (continuing from the point of failure), the engine should report that the process failed and update accordingly.

Additional info:
Verified with:
- ovirt-engine-4.4.1.7-0.3.el8ev.noarch
- vdsm-4.40.22-1.el8ev.x86_64
- libvirt-6.0.0-25.module+el8.2.1+7154+47ffd890.x86_64

Steps to Reproduce:
1. Run a VM and create a snapshot with memory
2. Break the connection to the SD (iSCSI in my case)

Result:
- The snapshot operation failed after about a minute with "Failed to complete snapshot 'snap1' creation for VM ...."

- engine.log
2020-07-05 17:07:16,326+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-88) [] EVENT_ID: USER_CREATE_SNAPSHOT_FINISHED_FAILURE(69), Failed to complete snapshot 'snap1' creation for VM 'bpelled_test_snapshot_break_storage_connection'.

- vdsm.log
2020-07-05 17:07:12,933+0300 DEBUG (jsonrpc/4) [storage.TaskManager.Task] (Task='f271b5e5-2ef3-4240-b17a-563102540548') stopping in state failed (force False) (task:1265)
2020-07-05 17:07:12,933+0300 DEBUG (jsonrpc/4) [storage.TaskManager.Task] (Task='f271b5e5-2ef3-4240-b17a-563102540548') ref 1 aborting True (task:1008)
2020-07-05 17:07:12,933+0300 INFO (jsonrpc/4) [storage.TaskManager.Task] (Task='f271b5e5-2ef3-4240-b17a-563102540548') aborting: Task is aborted: "value=Storage domain does not exist: ('b28faf24-4a1f-4bed-8830-21b4d1578141',) abortedcode=358" (task:1190)
2020-07-05 17:07:12,933+0300 DEBUG (jsonrpc/4) [storage.TaskManager.Task] (Task='f271b5e5-2ef3-4240-b17a-563102540548') Prepare: aborted: value=Storage domain does not exist: ('b28faf24-4a1f-4bed-8830-21b4d1578141',) abortedcode=358 (task:1195)
2020-07-05 17:07:12,934+0300 DEBUG (jsonrpc/4) [storage.TaskManager.Task] (Task='f271b5e5-2ef3-4240-b17a-563102540548') ref 0 aborting True (task:1008)
2020-07-05 17:07:12,934+0300 DEBUG (jsonrpc/4) [storage.TaskManager.Task] (Task='f271b5e5-2ef3-4240-b17a-563102540548') Task._doAbort: force False (task:944)
2020-07-05 17:07:12,934+0300 DEBUG (jsonrpc/4) [storage.TaskManager.Task] (Task='f271b5e5-2ef3-4240-b17a-563102540548') moving from state failed -> state aborting (task:624)
2020-07-05 17:07:12,934+0300 DEBUG (jsonrpc/4) [storage.TaskManager.Task] (Task='f271b5e5-2ef3-4240-b17a-563102540548') _aborting: recover policy none (task:578)
2020-07-05 17:07:12,934+0300 DEBUG (jsonrpc/4) [storage.TaskManager.Task] (Task='f271b5e5-2ef3-4240-b17a-563102540548') moving from state failed -> state failed (task:624)
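
For anyone re-running this verification, the abort reason can be pulled out of vdsm.log with a small script rather than by eye. The following is only a sketch: the regular expression is written against the single "Task is aborted" line quoted in this comment and may not match other vdsm versions, and the names ABORT_RE, SAMPLE and find_aborted_tasks are made up for illustration and are not part of vdsm.

```python
import re

# Pattern derived from the one "Task is aborted" line quoted in this report.
# Assumption: other vdsm versions may format this message differently.
ABORT_RE = re.compile(
    r"\(Task='(?P<task>[0-9a-f-]+)'\) aborting: Task is aborted: "
    r"\"value=(?P<reason>.*) abortedcode=(?P<code>\d+)\""
)

# The INFO line from the vdsm.log excerpt in this comment, copied verbatim.
SAMPLE = (
    "2020-07-05 17:07:12,933+0300 INFO (jsonrpc/4) [storage.TaskManager.Task] "
    "(Task='f271b5e5-2ef3-4240-b17a-563102540548') aborting: Task is aborted: "
    "\"value=Storage domain does not exist: "
    "('b28faf24-4a1f-4bed-8830-21b4d1578141',) abortedcode=358\" (task:1190)"
)


def find_aborted_tasks(lines):
    """Return a (task_id, reason, aborted_code) tuple for every abort line."""
    results = []
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            results.append(
                (match.group("task"), match.group("reason"), int(match.group("code")))
            )
    return results
```

Calling find_aborted_tasks on the lines of a vdsm.log file (for example, on an open file object) returns one tuple per aborted task, which makes it easy to confirm that the failure was caused by the missing storage domain and not by something else.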
This bugzilla is included in the oVirt 4.4.1 release, published on July 8th 2020.

Since the problem described in this bug report should be resolved in the oVirt 4.4.1 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.