Bug 1545193

Summary: Snapshot remains in locked status if one of the disks was inactive while doing live merge
Product: Red Hat Enterprise Virtualization Manager Reporter: nijin ashok <nashok>
Component: ovirt-engine Assignee: Ala Hino <ahino>
Status: CLOSED ERRATA QA Contact: Evelina Shames <eshames>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.1.8 CC: apinnick, bcholler, lsurette, lveyde, michal.skrivanek, ratamir, rbalakri, Rhev-m-bugs, srevivo, ykaul, ylavi
Target Milestone: ovirt-4.2.2 Flags: lsvaty: testing_plan_complete-
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: ovirt-engine-4.2.2.4 Doc Type: Bug Fix
Doc Text:
Previously, if a user tried to perform a live merge of a snapshot that included unattached disks, the live merge did not finish and the snapshot remained locked. In the current release, live merge is blocked if the snapshot includes unattached disks.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-05-15 17:48:28 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
engine log none

Description nijin ashok 2018-02-14 12:04:35 UTC
Description of problem:

If any of the disks which are part of the snapshot is in unattached status, then the merge will run indefinitely and the snapshot will remain in locked status.

The merge command for the unattached disk fails with the error below.

===
2018-02-14 06:08:52,571-05 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-120) [] EVENT_ID: USER_REMOVE_SNAPSHOT(342), Correlation ID: 692a7af2-46a3-4a13-aa61-ad41ea3db4b5, Job ID: 39da3f19-7854-4350-9c24-0dc199daaccb, Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: Snapshot 'snap' deletion for VM 'test_vm' was initiated by admin@internal-authz.


2018-02-14 06:08:54,753-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (pool-5-thread-6) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] START, MergeVDSCommand(HostName = 10.74.130.111, MergeVDSCommandParameters:{runAsync='true', hostId='dba8d52f-1524-4f6c-8e30-0115de2990a4', vmId='33778b96-efd8-4eb8-abba-229f1eebdf28', storagePoolId='00000001-0001-0001-0001-000000000311', storageDomainId='92227e03-01c8-4f41-999e-032175f20116', imageGroupId='313aceb7-1871-477d-b1af-294dcc4cf9d4', imageId='73f41734-b3b9-4271-838c-a5f1a8d0c8d6', baseImageId='36a2032a-364d-403f-9b48-1c06ca9843c3', topImageId='73f41734-b3b9-4271-838c-a5f1a8d0c8d6', bandwidth='0'}), log id: 239eb571
2018-02-14 06:08:54,754-05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (pool-5-thread-5) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] Failed in 'MergeVDS' method
2018-02-14 06:08:54,768-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-5) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: VDSM 10.74.130.111 command MergeVDS failed: Drive image file could not be found
===
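
When triaging a similar case, the failing call can be picked out of engine.log by its correlation ID. The following is a minimal Python sketch, not part of oVirt: the regular expression is an assumption inferred only from the log lines quoted above, and the name `find_failures` is hypothetical.

```python
import re

# Line layout inferred from the engine.log excerpts in this report.
# This is an assumption, not an official log format specification.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})[+-]\d{2}\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<corr>[^\]]*)\]\s+"
    r"(?P<msg>.*)$"
)


def find_failures(lines, correlation_id):
    """Return (timestamp, message) for ERROR lines logged under correlation_id."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("level") == "ERROR" and m.group("corr") == correlation_id:
            hits.append((m.group("ts"), m.group("msg")))
    return hits


CORRELATION_ID = "692a7af2-46a3-4a13-aa61-ad41ea3db4b5"

# Lines taken from the excerpt above; the START line's parameter list is
# abbreviated here as "(...)" to keep the sample short.
sample = [
    "2018-02-14 06:08:54,753-05 INFO  "
    "[org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] "
    "(pool-5-thread-6) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] "
    "START, MergeVDSCommand(...)",
    "2018-02-14 06:08:54,754-05 ERROR "
    "[org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] "
    "(pool-5-thread-5) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] "
    "Failed in 'MergeVDS' method",
    "2018-02-14 06:08:54,768-05 ERROR "
    "[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] "
    "(pool-5-thread-5) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] "
    "EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, "
    "Call Stack: null, Custom ID: null, Custom Event ID: -1, "
    "Message: VDSM 10.74.130.111 command MergeVDS failed: "
    "Drive image file could not be found",
]

failures = find_failures(sample, CORRELATION_ID)
for ts, msg in failures:
    print(ts, msg)
```

Filtering on the bracketed correlation ID rather than on the thread name matters here, because the START and ERROR entries were logged from different pool threads.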

The merge for the other disk completed successfully.


===
2018-02-14 06:08:54,753-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (pool-5-thread-6) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] START, MergeVDSCommand(HostName = 10.74.130.111, MergeVDSCommandParameters:{runAsync='true', hostId='dba8d52f-1524-4f6c-8e30-0115de2990a4', vmId='33778b96-efd8-4eb8-abba-229f1eebdf28', storagePoolId='00000001-0001-0001-0001-000000000311', storageDomainId='92227e03-01c8-4f41-999e-032175f20116', imageGroupId='313aceb7-1871-477d-b1af-294dcc4cf9d4', imageId='73f41734-b3b9-4271-838c-a5f1a8d0c8d6', baseImageId='36a2032a-364d-403f-9b48-1c06ca9843c3', topImageId='73f41734-b3b9-4271-838c-a5f1a8d0c8d6', bandwidth='0'}), log id: 239eb571


2018-02-14 06:09:20,437-05 INFO  [org.ovirt.engine.core.bll.MergeCommandCallback] (DefaultQuartzScheduler9) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] Merge command (jobId = 35505ae8-1999-4dc5-89c0-5b6fd006a13e) has completed for images '36a2032a-364d-403f-9b48-1c06ca9843c3'..'73f41734-b3b9-4271-838c-a5f1a8d0c8d6'
===

The job still exists in the database.

engine=# select job_id,status from job where job_id='39da3f19-7854-4350-9c24-0dc199daaccb' ;
                job_id                | status  
--------------------------------------+---------
 39da3f19-7854-4350-9c24-0dc199daaccb | STARTED
(1 row)

And the snapshot remains in locked status.


engine=# select snapshot_id,status from snapshots where vm_id = '33778b96-efd8-4eb8-abba-229f1eebdf28';
             snapshot_id              | status 
--------------------------------------+--------
 eac09eec-cc5f-4161-b2e9-85e3b19bbc53 | OK
 a00edf4d-1f2d-490b-aafd-68e1dfc92dc7 | LOCKED
(2 rows)



The engine log repeats the entries below indefinitely.


2018-02-14 06:13:32,330-05 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (DefaultQuartzScheduler1) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] Failed invoking callback end method 'onFailed' for command '8844ec31-1089-415f-9a3c-ca8d42c5728f' with exception 'null', the callback is marked for end method retries
2018-02-14 06:13:34,337-05 INFO  [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler1) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] Command 'RemoveSnapshot' (id: '57b257bf-1a08-4770-8dd8-571c554e567d') waiting on child command id: '8844ec31-1089-415f-9a3c-ca8d42c5728f' type:'RemoveSnapshotSingleDiskLive' to complete
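
A retry loop like this can be spotted by counting how often the callback end method fails for the same command ID. The sketch below is an illustrative Python helper, not an oVirt tool: the pattern is assumed from the single ERROR line quoted above, the name `stuck_commands` and the threshold of 3 are hypothetical, and the sample simply repeats the two quoted lines to imitate the loop.

```python
import re
from collections import Counter

# Pattern assumed from the CommandCallbacksPoller message quoted above.
RETRY_RE = re.compile(
    r"Failed invoking callback end method '(?P<method>\w+)' "
    r"for command '(?P<cmd>[0-9a-f-]{36})'"
)


def stuck_commands(lines, threshold=3):
    """Return {command_id: count} for commands whose end method failed
    at least `threshold` times in the given log lines."""
    counts = Counter()
    for line in lines:
        m = RETRY_RE.search(line)
        if m:
            counts[m.group("cmd")] += 1
    return {cmd: n for cmd, n in counts.items() if n >= threshold}


retry_line = (
    "2018-02-14 06:13:32,330-05 ERROR "
    "[org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] "
    "(DefaultQuartzScheduler1) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] "
    "Failed invoking callback end method 'onFailed' for command "
    "'8844ec31-1089-415f-9a3c-ca8d42c5728f' with exception 'null', "
    "the callback is marked for end method retries"
)
waiting_line = (
    "2018-02-14 06:13:34,337-05 INFO  "
    "[org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] "
    "(DefaultQuartzScheduler1) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] "
    "Command 'RemoveSnapshot' (id: '57b257bf-1a08-4770-8dd8-571c554e567d') "
    "waiting on child command id: '8844ec31-1089-415f-9a3c-ca8d42c5728f' "
    "type:'RemoveSnapshotSingleDiskLive' to complete"
)

# The same two lines repeated three times, purely to imitate the loop;
# a real log would carry different timestamps on each repetition.
looping_log = [retry_line, waiting_line] * 3

stuck = stuck_commands(looping_log)
print(stuck)
```

Only the ERROR entry is counted; the INFO "waiting on child command" entry mentions the same ID but does not match the pattern.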


Version-Release number of selected component (if applicable):

rhevm-4.1.8.2-0.1.el7.noarch

How reproducible:

100%

Steps to Reproduce:
1. Create a snapshot of a VM that has two disks.
2. Detach one of the disks from the VM.
3. Do a live merge.

Actual results:

The merge command is issued for the unattached disk as well, and the snapshot removal runs indefinitely.

Expected results:

Don't issue the merge command to vdsm for the unattached disk, since it will fail anyway. Don't keep the snapshot locked, so that the user can attempt the merge again, either online after re-attaching the disk or offline.
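
One way to meet this expectation, and the approach the Doc Text above describes for the fix, is a pre-check that refuses the live merge before any command reaches vdsm. The following Python sketch only illustrates that idea and is not the ovirt-engine (Java) implementation: the `Disk` class, the `attached` field, the function name, and the disk IDs are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Disk:
    """Hypothetical model of a disk that is part of a snapshot."""
    disk_id: str
    attached: bool  # True if the disk is currently attached to the VM


def blocking_disks(snapshot_disks: List[Disk]) -> List[str]:
    """Return IDs of disks that should block a live merge.

    An empty list means the merge may proceed. A non-empty list means the
    request should be rejected up front, so that no merge command is sent
    and the snapshot is never locked.
    """
    return [d.disk_id for d in snapshot_disks if not d.attached]


# Placeholder IDs for illustration only; they are not taken from the logs.
disks = [Disk("disk-a", attached=True), Disk("disk-b", attached=False)]

blockers = blocking_disks(disks)
if blockers:
    print("Live merge blocked; unattached disks:", ", ".join(blockers))
else:
    print("Live merge may proceed")
```

Rejecting the whole request before it starts leaves the snapshot in OK status, so the user can re-attach the disk and retry, or merge offline.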

Additional info:

A warning was added under bug 1411572 for snapshots that contain an unattached disk, and that warning is displayed correctly.

Comment 1 nijin ashok 2018-02-14 12:06:51 UTC
Created attachment 1395875 [details]
engine log

Comment 4 RHV bug bot 2018-03-16 15:03:36 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{'rhevm-4.2-ga': '?'}', ]

For more info please contact: rhv-devops

Comment 5 Evelina Shames 2018-03-18 12:10:34 UTC
Verified.

Comment 9 errata-xmlrpc 2018-05-15 17:48:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1488

Comment 10 Franta Kust 2019-05-16 13:05:06 UTC
BZ<2>Jira Resync

Comment 11 Daniel Gur 2019-08-28 13:12:36 UTC
sync2jira

Comment 12 Daniel Gur 2019-08-28 13:16:49 UTC
sync2jira