Created attachment 1745035 [details]
reg_logs

Description of problem:
Live merge fails after live disk migration with the following errors, and the VM's disks remain in 'locked' state for more than 40 min.

Engine log:
2021-01-06 20:29:09,991+02 ERROR [org.ovirt.engine.core.bll.MergeStatusCommand] (EE-ManagedExecutorService-commandCoordinator-Thread-10) [snapshots_delete_d7787736-a91b-4aaa] Failed to live merge. Top volume e4caedd9-0da0-4a6f-b107-1c838b9b4d62 is still in qemu chain [4a352470-dfea-4355-951b-b5d23fe62184, e4caedd9-0da0-4a6f-b107-1c838b9b4d62, 32cbe2c0-0c6c-490b-8956-2acb2a3b2218, c61f9339-eb8b-486a-88f7-045880b4ffc8, 7660231c-d004-4064-8186-9608b4a6228c]
2021-01-06 20:29:11,951+02 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-57) [snapshots_delete_d7787736-a91b-4aaa] Command id: 'ec6a66d8-550e-4327-9e86-59f567a647c4 failed child command status for step 'MERGE_STATUS'
2021-01-06 20:29:12,976+02 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-43) [snapshots_delete_d7787736-a91b-4aaa] Merging of snapshot 'c17659bf-4dbe-483a-a10b-e20c5200ea41' images '7660231c-d004-4064-8186-9608b4a6228c'..'e4caedd9-0da0-4a6f-b107-1c838b9b4d62' failed. Images have been marked illegal and can no longer be previewed or reverted to. Please retry Live Merge on the snapshot to complete the operation.
2021-01-06 20:29:12,979+02 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-43) [snapshots_delete_d7787736-a91b-4aaa] Ending command 'org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand' with failure.
2021-01-06 20:30:00,808+02 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-21) [snapshots_delete_d7787736-a91b-4aaa] Ending command 'org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand' with failure.
2021-01-06 20:30:00,819+02 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-21) [snapshots_delete_d7787736-a91b-4aaa] method: get, params: [250b02e5-dc44-41bc-9afe-f2014a96e910], timeElapsed: 2ms
2021-01-06 20:30:00,825+02 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-21) [snapshots_delete_d7787736-a91b-4aaa] method: get, params: [985aa6f3-cf18-4b2d-9ced-af7d91924fac], timeElapsed: 1ms
2021-01-06 20:30:00,829+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-21) [snapshots_delete_d7787736-a91b-4aaa] EVENT_ID: USER_REMOVE_SNAPSHOT_FINISHED_FAILURE(357), Failed to delete snapshot 'snapshot_6057_iscsi_1' for VM 'vm_TestCase6057_0619553496'.

Version-Release number of selected component (if applicable):
ovirt-engine-4.4.4.5-0.10.el8ev.noarch
vdsm-4.40.40-1.el8ev.x86_64

How reproducible:
Reproduces in about 70% of runs of the automation test case; could not reproduce it manually.

Steps to Reproduce:
1. Create a VM with 4 disks on block SD (all permutations) and OS installed
2. Create file_1
3. Create a snapshot of the VM with all disks
4. Create file_2
5. Create snapshot 2 of the VM with all disks
6. Create file_3
7. Create snapshot 3 of the VM with all disks
8. Create file_4
9. Perform Live Disk Migration to another SD of the same type
10. On completion of the Disk Migration, delete snapshot 2

Actual results:
Live merge fails

Expected results:
Live merge should succeed

Additional info:
Attaching regular+debug logs.
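
For anyone triaging similar logs: the MergeStatusCommand error in the engine log names both the top volume and the qemu chain, and the failure condition is simply that the top volume is still listed in that chain. Below is a minimal Python sketch that pulls both out of such a line. The regular expression and the helper names (parse_merge_failure, top_volume_still_in_chain) are hypothetical and assume the message format is exactly the one quoted in this report; this is not the engine's own check.

```python
import re

# Assumption: the message format matches the MergeStatusCommand line quoted
# in this report. The pattern is illustrative, not taken from ovirt-engine.
MERGE_FAILURE_RE = re.compile(
    r"Failed to live merge\. Top volume (?P<top>[0-9a-f-]{36}) "
    r"is still in qemu chain \[(?P<chain>[^\]]*)\]"
)

# The first ERROR line from the engine log in this report.
SAMPLE_LINE = (
    "2021-01-06 20:29:09,991+02 ERROR "
    "[org.ovirt.engine.core.bll.MergeStatusCommand] "
    "(EE-ManagedExecutorService-commandCoordinator-Thread-10) "
    "[snapshots_delete_d7787736-a91b-4aaa] Failed to live merge. "
    "Top volume e4caedd9-0da0-4a6f-b107-1c838b9b4d62 is still in qemu chain "
    "[4a352470-dfea-4355-951b-b5d23fe62184, "
    "e4caedd9-0da0-4a6f-b107-1c838b9b4d62, "
    "32cbe2c0-0c6c-490b-8956-2acb2a3b2218, "
    "c61f9339-eb8b-486a-88f7-045880b4ffc8, "
    "7660231c-d004-4064-8186-9608b4a6228c]"
)


def parse_merge_failure(line):
    """Return (top_volume, chain) for a live-merge failure line, else None."""
    match = MERGE_FAILURE_RE.search(line)
    if match is None:
        return None
    # Split the bracketed list on commas and drop surrounding whitespace.
    chain = [vol.strip() for vol in match.group("chain").split(",") if vol.strip()]
    return match.group("top"), chain


def top_volume_still_in_chain(line):
    """True if the line reports a top volume that is still in the qemu chain."""
    parsed = parse_merge_failure(line)
    if parsed is None:
        return False
    top, chain = parsed
    return top in chain


if __name__ == "__main__":
    top, chain = parse_merge_failure(SAMPLE_LINE)
    print("top volume:", top)
    print("chain length:", len(chain))
    print("still in chain:", top_volume_still_in_chain(SAMPLE_LINE))
```

Run against the line above, this reports e4caedd9-0da0-4a6f-b107-1c838b9b4d62 as the top volume, found as the second of five entries in the chain.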
Created attachment 1745036 [details] logs-debug
The issue seems to be extending the disk while doing live merge, which is the same issue as in BZ #1796415. BZ #1796415 should be fixed and there were lots of changes in live merge code. Could you reproduce with 4.4.6? If yes, could you please upload vdsm logs?
I believe this is basically a duplicate of BZ #1796415. Closing as duplicate. Please reopen if you spot it again.

*** This bug has been marked as a duplicate of bug 1796415 ***
(In reply to Vojtech Juranek from comment #7)
> The issue seems to be extending the disk while doing live merge, which is
> the same issue as in BZ #1796415. BZ #1796415 should be fixed and there were
> lots of changes in live merge code. Could you reproduce with 4.4.6? If yes,
> could you please upload vdsm logs?

Not reproduced :)