Description of problem:

If any of the disks that are part of a snapshot is unattached from the VM, deleting the snapshot with a live merge never finishes and the snapshot remains in locked status. The merge command for the unattached disk fails with the error below.

===
2018-02-14 06:08:52,571-05 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-120) [] EVENT_ID: USER_REMOVE_SNAPSHOT(342), Correlation ID: 692a7af2-46a3-4a13-aa61-ad41ea3db4b5, Job ID: 39da3f19-7854-4350-9c24-0dc199daaccb, Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: Snapshot 'snap' deletion for VM 'test_vm' was initiated by admin@internal-authz.

2018-02-14 06:08:54,753-05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (pool-5-thread-6) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] START, MergeVDSCommand(HostName = 10.74.130.111, MergeVDSCommandParameters:{runAsync='true', hostId='dba8d52f-1524-4f6c-8e30-0115de2990a4', vmId='33778b96-efd8-4eb8-abba-229f1eebdf28', storagePoolId='00000001-0001-0001-0001-000000000311', storageDomainId='92227e03-01c8-4f41-999e-032175f20116', imageGroupId='313aceb7-1871-477d-b1af-294dcc4cf9d4', imageId='73f41734-b3b9-4271-838c-a5f1a8d0c8d6', baseImageId='36a2032a-364d-403f-9b48-1c06ca9843c3', topImageId='73f41734-b3b9-4271-838c-a5f1a8d0c8d6', bandwidth='0'}), log id: 239eb571

2018-02-14 06:08:54,754-05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (pool-5-thread-5) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] Failed in 'MergeVDS' method

2018-02-14 06:08:54,768-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-5) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: VDSM 10.74.130.111 command MergeVDS failed: Drive image file could not be found
===

The merge for the other disk completed successfully.
===
2018-02-14 06:08:54,753-05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (pool-5-thread-6) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] START, MergeVDSCommand(HostName = 10.74.130.111, MergeVDSCommandParameters:{runAsync='true', hostId='dba8d52f-1524-4f6c-8e30-0115de2990a4', vmId='33778b96-efd8-4eb8-abba-229f1eebdf28', storagePoolId='00000001-0001-0001-0001-000000000311', storageDomainId='92227e03-01c8-4f41-999e-032175f20116', imageGroupId='313aceb7-1871-477d-b1af-294dcc4cf9d4', imageId='73f41734-b3b9-4271-838c-a5f1a8d0c8d6', baseImageId='36a2032a-364d-403f-9b48-1c06ca9843c3', topImageId='73f41734-b3b9-4271-838c-a5f1a8d0c8d6', bandwidth='0'}), log id: 239eb571

2018-02-14 06:09:20,437-05 INFO [org.ovirt.engine.core.bll.MergeCommandCallback] (DefaultQuartzScheduler9) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] Merge command (jobId = 35505ae8-1999-4dc5-89c0-5b6fd006a13e) has completed for images '36a2032a-364d-403f-9b48-1c06ca9843c3'..'73f41734-b3b9-4271-838c-a5f1a8d0c8d6'
===

The job still exists in the database.

engine=# select job_id,status from job where job_id='39da3f19-7854-4350-9c24-0dc199daaccb' ;
                job_id                | status
--------------------------------------+---------
 39da3f19-7854-4350-9c24-0dc199daaccb | STARTED
(1 row)

And the snapshot remains in locked status.

engine=# select snapshot_id,status from snapshots where vm_id = '33778b96-efd8-4eb8-abba-229f1eebdf28';
             snapshot_id              | status
--------------------------------------+--------
 eac09eec-cc5f-4161-b2e9-85e3b19bbc53 | OK
 a00edf4d-1f2d-490b-aafd-68e1dfc92dc7 | LOCKED
(2 rows)

The engine log keeps repeating the messages below indefinitely.
===
2018-02-14 06:13:32,330-05 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (DefaultQuartzScheduler1) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] Failed invoking callback end method 'onFailed' for command '8844ec31-1089-415f-9a3c-ca8d42c5728f' with exception 'null', the callback is marked for end method retries

2018-02-14 06:13:34,337-05 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler1) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] Command 'RemoveSnapshot' (id: '57b257bf-1a08-4770-8dd8-571c554e567d') waiting on child command id: '8844ec31-1089-415f-9a3c-ca8d42c5728f' type:'RemoveSnapshotSingleDiskLive' to complete
===

Version-Release number of selected component (if applicable):

rhevm-4.1.8.2-0.1.el7.noarch

How reproducible:

100%

Steps to Reproduce:

1. Create a snapshot that includes two disks of a VM.
2. Detach one of the disks from the VM.
3. Do a live merge.

Actual results:

The merge command is also issued for the unattached disk, and the snapshot removal runs indefinitely.

Expected results:

Do not issue the merge command to vdsm for the unattached disk, since it will fail anyway. Do not leave the snapshot locked, so that the user can attempt the merge again, either online after attaching the disk again or offline.

Additional info:

A warning was added as per bug 1411572 for snapshots that contain an unattached disk, and it is displayed correctly.
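To make the expected results concrete, here is a minimal sketch of the requested behaviour. It is not the engine's actual code: SnapshotDisk, plan_snapshot_removal and the keys of the returned plan are hypothetical names chosen only for illustration, and the disk identifiers are placeholders. It assumes the engine knows, for each disk in the snapshot, whether that disk is currently attached to the VM.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Union


@dataclass(frozen=True)
class SnapshotDisk:
    """Hypothetical stand-in for one disk that is part of the snapshot."""
    disk_id: str
    attached_to_vm: bool


def plan_snapshot_removal(disks: Iterable[SnapshotDisk]) -> Dict[str, Union[List[str], bool]]:
    """Decide which disks get a live merge and whether to unlock the snapshot.

    Only attached disks are merged. A merge for an unattached disk would fail
    on the host anyway, so it is skipped instead of being sent to vdsm.
    """
    disks = list(disks)
    merge = [d.disk_id for d in disks if d.attached_to_vm]
    skip = [d.disk_id for d in disks if not d.attached_to_vm]
    return {
        "merge": merge,
        "skip": skip,
        # If anything was skipped, the snapshot cannot be fully removed yet.
        # It should be unlocked so the user can retry later (online after
        # re-attaching the disk, or offline).
        "unlock_snapshot": bool(skip),
    }


if __name__ == "__main__":
    plan = plan_snapshot_removal([
        SnapshotDisk("disk-attached", True),
        SnapshotDisk("disk-detached", False),
    ])
    print(plan)
```

With one attached and one detached disk, as in the reproduction steps, the plan merges only the attached disk and asks for the snapshot to be unlocked rather than left in LOCKED status.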
Created attachment 1395875: engine log
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed: [Found non-acked flags: '{'rhevm-4.2-ga': '?'}', ] For more info please contact: rhv-devops
Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:1488