Created attachment 1897890 [details]
engine log files (relevant part)

Description of problem:
Following a live disk migration from one storage domain to another, ovirt-engine is broken (i.e. unavailable) and won't start/restart. Digging in engine.log I can find:

ERROR [org.ovirt.engine.core.utils.serialization.json.JsonObjectDeserializer] (ServerService Thread Pool -- 45) [] Cannot deserialize {
  "@class" : "org.ovirt.engine.core.common.action.CreateSnapshotDiskParameters",
  "commandId" : [ "org.ovirt.engine.core.compat.Guid", {
    "uuid" : "6ae544f6-b608-4d8d-9f99-eabd5d5db0ad"
  } ],
[...cut...]
"domain" : "my.dom.ain"[truncated 5971 chars]; line: 72, column: 89] (through reference chain: org.ovirt.engine.core.common.action.CreateSnapshotDiskParameters["diskImagesMap"])
2022-07-13 09:57:48,315+02 ERROR [org.ovirt.engine.core.bll.InitBackendServicesOnStartupBean] (ServerService Thread Pool -- 45) [] Failed to initialize backend: org.jboss.weld.exceptions.WeldException: WELD-000049: Unable to invoke public void org.ovirt.engine.core.bll.tasks.CommandContextsCacheImpl.initContextsMap() on org.ovirt.engine.core.bll.tasks.CommandContextsCacheImpl@3f52ccce
[...cut...]

Version-Release number of selected component (if applicable):
ovirt-engine-4.5.1.3-1.el8.noarch

How reproducible:
Cannot reproduce; it happened only once over many disk image live migrations.

Actual results:
ovirt-engine (and the admin portal) hangs and won't start/restart.

Expected results:
Disk live migration completes successfully.

Additional info:
In the command_entities table there are many rows (65); 2 of them reference the disk live migration task. No rows in the job table reference their correlationId, so I removed those 2 rows from command_entities, which made the engine start again. Attached is the content of command_entities.
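The check described above (command_entities rows whose correlationId matches no row in the job table) could be sketched roughly as follows. This is only an illustration working on rows already exported from the database: the dictionary keys `command_id` and `command_parameters`, and the assumption that `correlationId` is a top-level key of the serialized parameters, are hypothetical and should be checked against the actual schema before relying on it.

```python
import json


def find_orphaned_commands(command_rows, job_correlation_ids):
    """Return command rows whose correlationId is not known to any job.

    command_rows: iterable of dicts with the (assumed) keys
        'command_id' and 'command_parameters' (a JSON string).
    job_correlation_ids: iterable of correlation IDs present in the job table.
    """
    known = set(job_correlation_ids)
    orphans = []
    for row in command_rows:
        try:
            params = json.loads(row["command_parameters"])
        except (TypeError, ValueError):
            # Parameters that cannot be parsed at all are kept for manual review.
            params = None
        correlation_id = params.get("correlationId") if isinstance(params, dict) else None
        if correlation_id not in known:
            orphans.append(row)
    return orphans


# Made-up sample data, only to show the shape of the input.
sample_rows = [
    {"command_id": "cmd-1", "command_parameters": '{"correlationId": "aaa"}'},
    {"command_id": "cmd-2", "command_parameters": '{"correlationId": "bbb"}'},
    {"command_id": "cmd-3", "command_parameters": "not json"},
]
orphans = find_orphaned_commands(sample_rows, ["aaa"])
orphan_ids = [row["command_id"] for row in orphans]
```

Rows reported this way are only candidates; taking a database backup and inspecting each row before deleting anything seems prudent.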
Created attachment 1897891 [details] Content of command_entities table
The documentation text flag should only be set after 'doc text' field is provided. Please provide the documentation text and set the flag to '?' again.
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
Caused by the fix for bz 1958032.
Verification flow:
1. Start LSM
2. Restart ovirt-engine during LSM
---> Service is up, no 'Cannot deserialize' error in logs.

Version: ovirt-engine-4.5.2-0.3.el8ev

But LSM gets stuck, a new bug was opened for this issue: bug 2110186
(In reply to Evelina Shames from comment #6)
> But LSM gets stuck, a new bug was opened for this issue: bug 2110186

Right, we restarted the engine in order to ensure that we deserialize the parameters. But in light of the issue that was found when doing that (bz 2110186), I guess that's not what the user did and the parameters were deserialized for a different reason (not necessarily when the engine started: we clear the cached parameters from time to time and then deserialize them from the database as well). Therefore it makes sense to verify this bug and consider the issue during engine restart as a separate issue.
This bugzilla is included in oVirt 4.5.2 release, published on August 10th 2022. Since the problem described in this bug report should be resolved in oVirt 4.5.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.