Bug 1246114

Summary: [RFE][scale] Snapshot deletion of poweredoff VM takes longer time.
Product: [oVirt] ovirt-engine
Reporter: dev-unix-virtualization
Component: General
Assignee: Ala Hino <ahino>
Status: CLOSED CURRENTRELEASE
QA Contact: Raz Tamir <ratamir>
Severity: high
Docs Contact:
Priority: high
Version: ---
CC: ahino, amureini, bugs, dev-unix-virtualization, eheftman, gklein, lpeer, lsurette, mst, nsoffer, ratamir, rbalakri, Rhev-m-bugs, srevivo, tnisan, ykaul, ylavi
Target Milestone: ovirt-4.1.0-beta
Keywords: FutureFeature
Target Release: 4.1.0.2
Flags: rule-engine: ovirt-4.1+
gklein: testing_plan_complete-
ylavi: planning_ack+
amureini: devel_ack+
eberman: testing_ack+
Hardware: All   
OS: Unspecified   
URL: https://github.com/alhino/ovirt-site/blob/29cf794c5c6b246e32ba41227e476a719ee894a8/source/develop/release-management/features/storage/remove-snapshot.html.md
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Previously, when the Virtual Machine was powered down, deleting a snapshot could potentially be a very long process. This was due to the need to copy the data from the base snapshot to the top snapshot, where the base snapshot is usually larger than the top snapshot. Now, when deleting a snapshot when the Virtual Machine is powered down, data is copied from the top snapshot to the base snapshot, which significantly reduces the time required to delete the snapshot.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-02-01 14:54:35 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1408583    
Bug Blocks: 1256500, 1275655, 1369942, 1395146    
Attachments:
Description    Flags
snaphost       none
VDSM Logs      none
Engine Logs    none

Description dev-unix-virtualization 2015-07-23 13:17:56 UTC
Created attachment 1055381 [details]
snaphost

Description of problem:
Snapshot deletion of a powered-off VM takes much longer than snapshot deletion of a running VM.


Version-Release number of selected component (if applicable):
Version 3.5.1-0.4.el6ev

How reproducible: Always


Steps to Reproduce:
1. Create a snapshot of a powered-off VM.
2. Try to delete the snapshot.
3. The snapshot stays in the locked state for more than 10-15 minutes, far longer than the 1-2 minutes that live VM snapshot removal takes.

Actual results:


Expected results:


Additional info:

Comment 1 dev-unix-virtualization 2015-07-23 13:19:42 UTC
Created attachment 1055382 [details]
VDSM Logs

Comment 2 dev-unix-virtualization 2015-07-23 13:21:17 UTC
Created attachment 1055383 [details]
Engine Logs

Comment 3 Allon Mureinik 2015-07-26 10:48:44 UTC
This needs some initial analysis to see what's taking up the time here.

Nir - can you take a look please?

Comment 4 Nir Soffer 2015-07-26 10:58:39 UTC
(In reply to dev-unix-virtualization from comment #0)
> 3.Snapshot will be in locked state for more than 10-15 mins which is huge
> when compared to live VM snapshot removal which is taking 1-2 mins.

Were both tests done on the same storage and with the same wipe-after-delete settings?

Comment 5 dev-unix-virtualization 2015-07-26 12:48:49 UTC
Yes, there was no change with respect to storage or settings. The issue is also reproducible every time on our two test setups: one on an iSCSI storage domain and one on local storage/NFS.

Comment 6 Nir Soffer 2015-07-26 13:36:32 UTC
(In reply to dev-unix-virtualization from comment #5)
> Yes, There was no change w.r.t to storage and settings, Also reproducible
> every time on our 2 test setups. One setup is on iSCSI storage domain and
> one is on local storage/NFS.

Can you provide a detailed description of how to reproduce this issue?

- VM configuration
- Disk configuration (format, virtual size, actual size)
- What OS to install
- How to create and delete the snapshot (using the REST API?)

Comment 7 dev-unix-virtualization 2015-07-26 14:29:32 UTC
RHEV-M: Version 3.5.1-0.4.el6ev
RHEVH: RHEV Hypervisor - 7.1 - 20150420.0.el7ev
Kernel Version: 3.10.0-229.1.2.el7.x86_64
VDSM Version: vdsm-4.16.13.1-1.el7ev

CPU Model: Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
CPU Type: Intel Westmere Family
Model: PowerEdge R410


VM Configuration:

OS: Windows 2012
Virtual size: 32GB
Actual size: 8GB
Interface type: VirtIO
Storage type: iSCSI from Dell Compellent array
Allocation Policy: Thin Provisioned


The REST API was used to create and delete the snapshot.

For Create Snapshot (URL form: /api/vms/<VM ID>/snapshot):
POST REST API: [https://rhevm.devemc.commvault.com/api/vms/8f7eccd9-156e-4790-9ef7-db090e4662cf/snapshot]

Body: [<?xml version="1.0" encoding="UTF-8" standalone="no" ?> <snapshot>

  <description>_GX_BACKUP_vm1_25_20111_392cd700</description>

</snapshot>]


For Delete Snapshot (URL form: /api/vms/<VM ID>/snapshots/<Snapshot Id>):
DELETE REST API: [https://rhevm.devemc.commvault.com/api/vms/8f7eccd9-156e-4790-9ef7-db090e4662cf/snapshots/6195d4a0-7d85-404e-a659-06f7852289e4]

Body: [<?xml version="1.0" encoding="UTF-8" standalone="no" ?> <action>

  <async>false</async>

  <detach>false</detach>

</action>]
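
The two calls above can be sketched in Python as follows. This is only an illustration of the request shapes, not the reporter's actual client: the engine host name and the helper function names are hypothetical, authentication is omitted, and the creation URL uses the `snapshots` collection path, which is an assumption (the comment shows `snapshot` for the create call). The requests are built but not sent.

```python
import urllib.request
from xml.sax.saxutils import escape

# Hypothetical engine address, used only for illustration.
API_BASE = "https://engine.example.com/api"
XML_HEADERS = {"Content-Type": "application/xml"}


def create_snapshot_request(vm_id, description):
    """Build (without sending) the POST that creates a snapshot of a VM."""
    body = f"<snapshot><description>{escape(description)}</description></snapshot>"
    return urllib.request.Request(
        f"{API_BASE}/vms/{vm_id}/snapshots",  # assumed collection path
        data=body.encode("utf-8"),
        headers=XML_HEADERS,
        method="POST",
    )


def delete_snapshot_request(vm_id, snapshot_id):
    """Build (without sending) the DELETE that removes one snapshot."""
    # Same action body as in the comment above.
    body = "<action><async>false</async><detach>false</detach></action>"
    return urllib.request.Request(
        f"{API_BASE}/vms/{vm_id}/snapshots/{snapshot_id}",
        data=body.encode("utf-8"),
        headers=XML_HEADERS,
        method="DELETE",
    )


if __name__ == "__main__":
    # IDs taken from the comment above.
    req = delete_snapshot_request(
        "8f7eccd9-156e-4790-9ef7-db090e4662cf",
        "6195d4a0-7d85-404e-a659-06f7852289e4",
    )
    print(req.get_method(), req.full_url)
```

To time the locked state described in comment 0, one could send the DELETE request with `urllib.request.urlopen` (after adding credentials) and record the wall-clock time until the snapshot no longer appears in the VM's snapshot list.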

Comment 8 dev-unix-virtualization 2015-09-01 22:00:30 UTC
Any update on this issue?

Comment 9 dev-unix-virtualization 2015-09-09 14:53:37 UTC
Can you please elaborate on what info is being asked for? We provided the requested information in comment #7 as a response to comment #6.

Comment 10 Yaniv Lavi 2015-09-09 15:39:21 UTC
The needinfo was on the assignee; restoring it.

Comment 11 Nir Soffer 2015-09-14 14:30:16 UTC
(In reply to dev-unix-virtualization from comment #8)
> Any update on this issue ?

Cold snapshot delete is implemented differently and is less efficient
than live snapshot delete, since "qemu-img commit" was not available when
cold snapshot was developed.

We still need to investigate further to see if the timing info in
comment 0 is normal.
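
As a rough illustration of why the merge direction matters (see the Doc Text at the top of this bug), here is a toy model in Python. It is not VDSM or qemu code: each volume is modeled as a dict mapping block index to data, the function names are hypothetical, and "cost" is simply the number of blocks copied. The first function mimics the older cold-merge direction (base copied into top); the second mimics the "qemu-img commit" direction (top copied into base).

```python
def merge_base_into_top(base, top):
    """Old direction: copy every base block that top does not override into top."""
    result = dict(top)
    copied = 0
    for block, data in base.items():
        if block not in result:
            result[block] = data
            copied += 1
    return result, copied


def commit_top_into_base(base, top):
    """Commit direction: copy every block allocated in top down into base."""
    result = dict(base)
    copied = 0
    for block, data in top.items():
        result[block] = data
        copied += 1
    return result, copied


if __name__ == "__main__":
    # Hypothetical sizes: a large base and a small top layer.
    base = {i: f"base-{i}" for i in range(8)}
    top = {0: "top-0", 1: "top-1"}

    merged_old, copied_old = merge_base_into_top(base, top)
    merged_new, copied_new = commit_top_into_base(base, top)

    # Both directions yield the same merged contents,
    # but the amount of data copied differs.
    print(merged_old == merged_new, copied_old, copied_new)
```

In this model the work in the old direction grows with the size of the base, while the work in the commit direction grows with the size of the top layer, which is usually the smaller of the two right after a snapshot is taken.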

Comment 12 Sandro Bonazzola 2015-10-26 12:48:41 UTC
This is an automated message. oVirt 3.6.0 RC3 has been released and GA is targeted for next week, Nov 4th 2015.
Please review this bug and, if it is not a blocker, postpone it to a later release.
All bugs not postponed on GA release will be automatically re-targeted to

- 3.6.1 if severity >= high
- 4.0 if severity < high

Comment 13 Yaniv Lavi 2015-10-29 09:52:16 UTC
*** Bug 1206328 has been marked as a duplicate of this bug. ***

Comment 14 Maor 2015-11-11 16:40:18 UTC
*** Bug 1275652 has been marked as a duplicate of this bug. ***

Comment 15 Yaniv Lavi 2015-11-16 12:54:24 UTC
We will be rewriting this flow in RHEV 4 to make it faster and more similar to live merge. This is not feasible for RHEV 3.6.

Comment 16 Red Hat Bugzilla Rules Engine 2015-11-30 19:12:59 UTC
Target release should be set once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.

Comment 17 Nikolai Sednev 2015-12-09 12:52:54 UTC
This also happens on an FC storage domain.

Comment 18 Aharon Canan 2016-01-19 15:16:36 UTC
I just realized that this is a regression (following comment #2 on bug 1275652, which was closed as a duplicate of this one).

As it affects our automation and is a regression, I would like to raise it and try to get a fix in 3.6.

Comment 19 Red Hat Bugzilla Rules Engine 2016-01-19 15:16:38 UTC
This bug is marked for z-stream, yet the milestone is for a major version, therefore the milestone has been reset.
Please set the correct milestone or drop the z stream flag.

Comment 20 Red Hat Bugzilla Rules Engine 2016-01-19 15:16:38 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 21 Allon Mureinik 2016-01-19 15:22:01 UTC
(In reply to Aharon Canan from comment #18)
> I just realized that this is a regression (following comment #2 on bug
> 1275652 which was closed as duplicate of this one)
Cold merge is slower than live merge, even in 3.5 (as comment 2 states).
This is not a regression.

> As it affects our automation and as this is a regression , I would like to
> raise and try to get a fix on 3.6.
The solution is to rewrite the flow. Hardly a 3.6 item.

Comment 22 Nir Soffer 2016-11-22 21:55:39 UTC
We merged the new API, but it is not implemented yet.

Comment 23 Tal Nisan 2017-01-12 13:21:04 UTC
Ala, please update the doc text with the feature page or other relevant text.