Bug 1545193 - Snapshot remains in locked status if one of the disk was inactive while doing live merge
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.1.8
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.2.2
Target Release: ---
Assignee: Ala Hino
QA Contact: Evelina Shames
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-02-14 12:04 UTC by nijin ashok
Modified: 2021-03-11 20:19 UTC (History)

Fixed In Version: ovirt-engine-4.2.2.4
Doc Type: Bug Fix
Doc Text:
Previously, if a user tried to perform a live merge of a snapshot that included unattached disks, the live merge did not finish and the snapshot remained locked. In the current release, live merge is blocked if the snapshot includes unattached disks.
Clone Of:
Environment:
Last Closed: 2018-05-15 17:48:28 UTC
oVirt Team: Storage
Target Upstream Version:
lsvaty: testing_plan_complete-


Attachments
engine log (143.14 KB, text/plain)
2018-02-14 12:06 UTC, nijin ashok


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2018:1488 0 None None None 2018-05-15 17:49:44 UTC
oVirt gerrit 88294 0 master MERGED core: Validate snapshot disks are plugged 2021-02-10 17:23:32 UTC
oVirt gerrit 88312 0 ovirt-engine-4.2 MERGED core: Validate snapshot disks are plugged 2021-02-10 17:23:32 UTC

Description nijin ashok 2018-02-14 12:04:35 UTC
Description of problem:

If any disk that is part of the snapshot is unattached, the live merge runs indefinitely and the snapshot remains in locked status.

The merge will fail with the error below.

===
2018-02-14 06:08:52,571-05 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-120) [] EVENT_ID: USER_REMOVE_SNAPSHOT(342), Correlation ID: 692a7af2-46a3-4a13-aa61-ad41ea3db4b5, Job ID: 39da3f19-7854-4350-9c24-0dc199daaccb, Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: Snapshot 'snap' deletion for VM 'test_vm' was initiated by admin@internal-authz.


2018-02-14 06:08:54,753-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (pool-5-thread-6) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] START, MergeVDSCommand(HostName = 10.74.130.111, MergeVDSCommandParameters:{runAsync='true', hostId='dba8d52f-1524-4f6c-8e30-0115de2990a4', vmId='33778b96-efd8-4eb8-abba-229f1eebdf28', storagePoolId='00000001-0001-0001-0001-000000000311', storageDomainId='92227e03-01c8-4f41-999e-032175f20116', imageGroupId='313aceb7-1871-477d-b1af-294dcc4cf9d4', imageId='73f41734-b3b9-4271-838c-a5f1a8d0c8d6', baseImageId='36a2032a-364d-403f-9b48-1c06ca9843c3', topImageId='73f41734-b3b9-4271-838c-a5f1a8d0c8d6', bandwidth='0'}), log id: 239eb571
2018-02-14 06:08:54,754-05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (pool-5-thread-5) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] Failed in 'MergeVDS' method
2018-02-14 06:08:54,768-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-5) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: VDSM 10.74.130.111 command MergeVDS failed: Drive image file could not be found
===
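
When triaging a similar report, it can help to pull the failed MergeVDS calls out of a large engine.log together with their correlation IDs. The following is a minimal sketch, assuming the line layout seen in the excerpt above (timestamp, level, logger, thread, correlation ID, message). The names `LINE_RE` and `failed_merges` are hypothetical helpers, not part of oVirt, and the START line in the sample is shortened for readability.

```python
import re

# Assumed layout, taken from the excerpt above:
# timestamp LEVEL [logger] (thread) [correlation id] message
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<corr>[^\]]*)\]\s+"
    r"(?P<msg>.*)$"
)


def failed_merges(lines):
    """Return (timestamp, correlation id, thread) for each failed MergeVDS call."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and m["level"] == "ERROR" and "Failed in 'MergeVDS' method" in m["msg"]:
            hits.append((m["ts"], m["corr"], m["thread"]))
    return hits


# Two lines modeled on the excerpt; the START parameters are abbreviated.
sample = [
    "2018-02-14 06:08:54,753-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] "
    "(pool-5-thread-6) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] START, MergeVDSCommand(HostName = 10.74.130.111)",
    "2018-02-14 06:08:54,754-05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] "
    "(pool-5-thread-5) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] Failed in 'MergeVDS' method",
]

for ts, corr, thread in failed_merges(sample):
    print(ts, corr, thread)
```

The correlation ID returned here is the same one that appears on the USER_REMOVE_SNAPSHOT event, so it can be used to tie a failed merge back to the snapshot removal that triggered it.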

The other disk completed successfully.


===
2018-02-14 06:08:54,753-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (pool-5-thread-6) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] START, MergeVDSCommand(HostName = 10.74.130.111, MergeVDSCommandParameters:{runAsync='true', hostId='dba8d52f-1524-4f6c-8e30-0115de2990a4', vmId='33778b96-efd8-4eb8-abba-229f1eebdf28', storagePoolId='00000001-0001-0001-0001-000000000311', storageDomainId='92227e03-01c8-4f41-999e-032175f20116', imageGroupId='313aceb7-1871-477d-b1af-294dcc4cf9d4', imageId='73f41734-b3b9-4271-838c-a5f1a8d0c8d6', baseImageId='36a2032a-364d-403f-9b48-1c06ca9843c3', topImageId='73f41734-b3b9-4271-838c-a5f1a8d0c8d6', bandwidth='0'}), log id: 239eb571


2018-02-14 06:09:20,437-05 INFO  [org.ovirt.engine.core.bll.MergeCommandCallback] (DefaultQuartzScheduler9) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] Merge command (jobId = 35505ae8-1999-4dc5-89c0-5b6fd006a13e) has completed for images '36a2032a-364d-403f-9b48-1c06ca9843c3'..'73f41734-b3b9-4271-838c-a5f1a8d0c8d6'
===

The job still exists in the database.

engine=# select job_id,status from job where job_id='39da3f19-7854-4350-9c24-0dc199daaccb' ;
                job_id                | status  
--------------------------------------+---------
 39da3f19-7854-4350-9c24-0dc199daaccb | STARTED
(1 row)

And the snapshot remains in locked status.


engine=# select snapshot_id,status from snapshots where vm_id = '33778b96-efd8-4eb8-abba-229f1eebdf28';
             snapshot_id              | status 
--------------------------------------+--------
 eac09eec-cc5f-4161-b2e9-85e3b19bbc53 | OK
 a00edf4d-1f2d-490b-aafd-68e1dfc92dc7 | LOCKED
(2 rows)



The engine log keeps repeating the messages below indefinitely.


2018-02-14 06:13:32,330-05 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (DefaultQuartzScheduler1) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] Failed invoking callback end method 'onFailed' for command '8844ec31-1089-415f-9a3c-ca8d42c5728f' with exception 'null', the callback is marked for end method retries
2018-02-14 06:13:34,337-05 INFO  [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler1) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] Command 'RemoveSnapshot' (id: '57b257bf-1a08-4770-8dd8-571c554e567d') waiting on child command id: '8844ec31-1089-415f-9a3c-ca8d42c5728f' type:'RemoveSnapshotSingleDiskLive' to complete
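
To confirm that a command is stuck in this retry loop rather than failing once, the repeated callback errors can be counted per command ID. This is a rough sketch under the assumption that the message text matches the ERROR line above. `RETRY_RE` and `callback_retry_counts` are hypothetical names, and the sample simply repeats the one line from this report to simulate the loop.

```python
import re
from collections import Counter

# Matches the "Failed invoking callback end method" message shown above.
RETRY_RE = re.compile(
    r"Failed invoking callback end method '(?P<method>\w+)' "
    r"for command '(?P<cmd>[0-9a-f-]{36})'"
)


def callback_retry_counts(lines):
    """Count callback end-method failures per command ID."""
    counts = Counter()
    for line in lines:
        m = RETRY_RE.search(line)
        if m:
            counts[m["cmd"]] += 1
    return counts


error_line = (
    "2018-02-14 06:13:32,330-05 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] "
    "(DefaultQuartzScheduler1) [692a7af2-46a3-4a13-aa61-ad41ea3db4b5] Failed invoking callback end "
    "method 'onFailed' for command '8844ec31-1089-415f-9a3c-ca8d42c5728f' with exception 'null', "
    "the callback is marked for end method retries"
)

# Simulated repetition; a real log would carry different timestamps per line.
counts = callback_retry_counts([error_line] * 3)
for cmd, n in counts.items():
    print(cmd, n)
```

A count that keeps growing for the same command ID between two reads of the log is consistent with the behavior described in this report.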


Version-Release number of selected component (if applicable):

rhevm-4.1.8.2-0.1.el7.noarch

How reproducible:

100%

Steps to Reproduce:
1. Create a snapshot of a VM that has two disks.
2. Detach one of the disks from the VM.
3. Do a live merge.

Actual results:

The merge command is also issued for the unattached disk, and the operation runs indefinitely.

Expected results:

Don't issue the merge command to vdsm for the unattached disk, since it will fail anyway. Don't keep the snapshot locked, so that the user can attempt the merge again, either online after attaching the disk again or offline.
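
The kind of pre-merge check requested here can be illustrated with a small sketch. It is not the ovirt-engine implementation (the merged patch, "core: Validate snapshot disks are plugged", is in the Java engine code); `SnapshotDisk` and `validate_live_merge` are hypothetical names used only to show the idea of rejecting the operation up front instead of sending a merge that is bound to fail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotDisk:
    """Hypothetical view of a disk that belongs to a snapshot."""
    alias: str
    plugged: bool


def validate_live_merge(disks):
    """Return (ok, message); refuse the merge if any snapshot disk is not plugged."""
    unplugged = [d.alias for d in disks if not d.plugged]
    if unplugged:
        return False, "Cannot live merge, unplugged disks: " + ", ".join(unplugged)
    return True, ""


disks = [SnapshotDisk("disk1", plugged=True), SnapshotDisk("disk2", plugged=False)]
ok, message = validate_live_merge(disks)
print(ok, message)
```

Because the check runs before any merge command is issued, nothing is started for the other disks either, and the snapshot never has to enter the locked state.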

Additional info:

A warning was added per bug 1411572 for snapshots that contain unattached disks, and it is shown correctly.

Comment 1 nijin ashok 2018-02-14 12:06:51 UTC
Created attachment 1395875
engine log

Comment 4 RHV bug bot 2018-03-16 15:03:36 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{'rhevm-4.2-ga': '?'}', ]

For more info please contact: rhv-devops@redhat.com

Comment 5 Evelina Shames 2018-03-18 12:10:34 UTC
Verified.

Comment 9 errata-xmlrpc 2018-05-15 17:48:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1488

Comment 10 Franta Kust 2019-05-16 13:05:06 UTC
BZ<2>Jira Resync

Comment 11 Daniel Gur 2019-08-28 13:12:36 UTC
sync2jira

Comment 12 Daniel Gur 2019-08-28 13:16:49 UTC
sync2jira

