Bug 1255221

Summary: oVirt 3.6 using Cinder as external store does not remove cloned disk image - ceph backend
Product: [oVirt] ovirt-engine
Reporter: Darryl Bond <darryl.bond>
Component: General
Assignee: Maor <mlipchuk>
Status: CLOSED CURRENTRELEASE
QA Contact: Ori Gofen <ogofen>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.6.0
CC: acanan, amureini, bazulay, berrange, bugs, darryl.bond, dasmith, ecohen, fdeutsch, gklein, lsurette, mgoldboi, mlipchuk, ndipanov, pbrady, rbalakri, rbarry, rbryant, sbauza, sferdjao, sgordon, vromanso, ycui, yeylon
Target Milestone: ovirt-3.6.0-rc
Flags: rule-engine: ovirt-3.6.0+
       ylavi: planning_ack+
       rule-engine: devel_ack+
       rule-engine: testing_ack+
Target Release: 3.6.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: storage
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-27 07:56:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1264691
Bug Blocks: 1135132
Attachments: engine.log (flags: none)

Description Darryl Bond 2015-08-20 03:02:56 UTC
Description of problem: When a cloned VM is created using the Cinder integration, the ceph RBD is cloned from a locked (protected) snapshot, as expected.

When the VM is deleted, the disk that was cloned from the snapshot remains in ceph, and the snapshot cannot be deleted.
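For context, this is the standard RBD copy-on-write clone dependency that Cinder sets up: a clone keeps its parent snapshot protected, so the snapshot cannot be removed while the clone exists. A minimal sketch with illustrative names (not taken from this report):

  # illustrative pool/volume names, not from this environment
  rbd snap protect openstack-volumes/volume-PARENT@snapshot-ID      # done before cloning
  rbd clone openstack-volumes/volume-PARENT@snapshot-ID openstack-volumes/volume-CLONE
  rbd children openstack-volumes/volume-PARENT@snapshot-ID          # lists volume-CLONE
  rbd snap unprotect openstack-volumes/volume-PARENT@snapshot-ID    # refused while volume-CLONE exists

Until the clone is deleted (or flattened with 'rbd flatten'), the snapshot stays locked, which is the behaviour described above.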


Version-Release number of selected component (if applicable): 3.6 beta


How reproducible:


Steps to Reproduce:
1. Create a VM using Cinder with Ceph as the backing store
2. Snapshot the disk and clone the VM. The locked ceph RBD snapshot and the clone are created as expected:
volume-4f4ec5a3-c5b8-45f0-b462-7be7ae8dce2c  10240M  2
volume-4f4ec5a3-c5b8-45f0-b462-7be7ae8dce2c@bkupsnap2015-08-19  10240M  2
volume-4f4ec5a3-c5b8-45f0-b462-7be7ae8dce2c@snapshot-6a8d3670-ad85-48e1-a077-8f6129094df4  10240M  2  yes
volume-a3992a89-9d1f-475a-a0c1-01867444edd7  10240M  openstack-volumes/volume-4f4ec5a3-c5b8-45f0-b462-7be7ae8dce2c@snapshot-6a8d3670-ad85-48e1-a077-8f6129094df4  2
3. Delete the VM with 'Remove disk' checked. The operation completes, but the clone remains in ceph.
4. Delete the snapshot. In the UI the snapshot goes into a greyed-out state, but it remains in ceph because it is still locked by the clone.

Actual results:
The clone remains in ceph after the disk is deleted. The clone can be deleted through the OpenStack UI; once that is done, the snapshot eventually disappears from both ceph and the UI.

If the clone is not removed on the OpenStack side, an error is eventually displayed: "Error while executing action: A Request to the Server failed with the following Status Code: 500".
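For reference, the manual cleanup currently has to be done outside oVirt. A hedged sketch of the equivalent CLI steps, using the IDs from the listing above (the report only mentions the OpenStack UI, and the Cinder snapshot ID is inferred from the rbd snapshot name snapshot-<id>):

  # delete the orphaned clone volume, then the now-unreferenced snapshot
  cinder delete a3992a89-9d1f-475a-a0c1-01867444edd7
  cinder snapshot-delete 6a8d3670-ad85-48e1-a077-8f6129094df4   # snapshot ID inferred from the rbd snapshot name
  # verify on the ceph side that both are gone
  rbd ls -l openstack-volumes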

Expected results:
The clone should be removed on deletion of the VM when the 'Remove disk' flag is checked. It should then be possible to delete the snapshot.



Additional info:
The 3.6 engine connects directly to an existing OpenStack Cinder instance rather than using the Docker images.

Comment 1 Ryan Barry 2015-08-20 03:06:20 UTC
Reassigning to something close to the correct component for investigation.

Comment 2 Darryl Bond 2015-08-20 03:14:16 UTC
Sorry I didn't make it clearer; this is for oVirt 3.6 beta.

There was no option to choose the correct plugin (ovirt-engine-setup-plugin-dockerc) under oVirt, so I had to pick something I thought was close.

This is definitely not about RDO or Openstack.

I am testing oVirt 3.6 against our currently operational OpenStack Cinder storage, hence the references to OpenStack.

Comment 3 Allon Mureinik 2015-09-06 08:55:58 UTC
Moving to the engine core component - the engine orchestrates the work against Cinder, and VDSM is used "only" to run VMs against the created RBDs.

Maor - this seems like your area of expertise. Can you take a look please?

Comment 4 Allon Mureinik 2015-09-06 08:57:18 UTC
Darryl, can you please attach the oVirt engine's logs?

Comment 5 Darryl Bond 2015-09-09 06:10:33 UTC
Same again: 
1. Create a VM with a disk in Cinder
2. Snapshot the VM
3. Clone the snapshot
4. Remove the cloned VM
5. The engine appears to think it successfully removed the clone
6. The clone is still in ceph; the original snapshot cannot be removed because of the clone.

volume-1a0e89ca-4c7c-49b8-bbf4-9be76611018f  12288M  2
volume-1a0e89ca-4c7c-49b8-bbf4-9be76611018f@snapshot-a37cf782-f531-493a-b14c-959e2d6ee3b2  12288M  2  yes
volume-e32a72ba-8a7f-483e-b5dc-897a30589e83  12288M  openstack-spin/volume-1a0e89ca-4c7c-49b8-bbf4-9be76611018f@snapshot-a37cf782-f531-493a-b14c-959e2d6ee3b2  2
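A quick ceph-side check that the leftover image is still a clone of that snapshot (a sketch against the listing above; pool and image names are taken from it):

  # the "parent:" line in the output points at volume-1a0e89ca-...@snapshot-a37cf782-...
  rbd info openstack-spin/volume-e32a72ba-8a7f-483e-b5dc-897a30589e83
  # the snapshot still has a child, which is why it cannot be removed
  rbd children openstack-spin/volume-1a0e89ca-4c7c-49b8-bbf4-9be76611018f@snapshot-a37cf782-f531-493a-b14c-959e2d6ee3b2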

2015-09-09 16:04:14,577 INFO  [org.ovirt.engine.core.bll.RemoveVmCommand] (default task-17) [1ba58bc9] Lock Acquired to object 'EngineLock:{exclusiveLocks='[c6fa27e4-5017-479b-bcbf-7dc653461f16=<VM, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2015-09-09 16:04:14,769 INFO  [org.ovirt.engine.core.bll.RemoveVmCommand] (org.ovirt.thread.pool-8-thread-2) [1ba58bc9] Running command: RemoveVmCommand internal: false. Entities affected :  ID: c6fa27e4-5017-479b-bcbf-7dc653461f16 Type: VMAction group DELETE_VM with role type USER
2015-09-09 16:04:14,792 INFO  [org.ovirt.engine.core.vdsbroker.SetVmStatusVDSCommand] (org.ovirt.thread.pool-8-thread-2) [1ba58bc9] START, SetVmStatusVDSCommand( SetVmStatusVDSCommandParameters:{runAsync='true', vmId='c6fa27e4-5017-479b-bcbf-7dc653461f16', status='ImageLocked', exitStatus='Normal'}), log id: fbec7b5
2015-09-09 16:04:14,797 INFO  [org.ovirt.engine.core.vdsbroker.SetVmStatusVDSCommand] (org.ovirt.thread.pool-8-thread-2) [1ba58bc9] FINISH, SetVmStatusVDSCommand, log id: fbec7b5
2015-09-09 16:04:14,843 INFO  [org.ovirt.engine.core.bll.RemoveVmCommand] (org.ovirt.thread.pool-8-thread-2) [1ba58bc9] Lock freed to object 'EngineLock:{exclusiveLocks='[c6fa27e4-5017-479b-bcbf-7dc653461f16=<VM, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2015-09-09 16:04:15,128 INFO  [org.ovirt.engine.core.bll.RemoveAllVmCinderDisksCommand] (pool-7-thread-7) [2492ac2] Running command: RemoveAllVmCinderDisksCommand internal: true. Entities affected :  ID: c6fa27e4-5017-479b-bcbf-7dc653461f16 Type: VM
2015-09-09 16:04:15,232 INFO  [org.ovirt.engine.core.bll.storage.RemoveCinderDiskCommand] (pool-7-thread-8) [2e5cd5af] Running command: RemoveCinderDiskCommand internal: true. Entities affected :  ID: 00000000-0000-0000-0000-000000000000 Type: Storage
2015-09-09 16:04:15,968 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-8-thread-2) [] Correlation ID: 1ba58bc9, Job ID: 7d589080-6583-4c0b-949b-266f1610f3a8, Call Stack: null, Custom Event ID: -1, Message: VM centos-on-cinder-clone was successfully removed.
2015-09-09 16:04:17,530 INFO  [org.ovirt.engine.core.bll.RemoveAllCinderDisksCommandCallBack] (DefaultQuartzScheduler_Worker-18) [2987e5c2] Waiting for child commands to complete
2015-09-09 16:04:17,887 INFO  [org.ovirt.engine.core.bll.storage.CinderBroker] (DefaultQuartzScheduler_Worker-18) [2e5cd5af] Snapshot does not exists
2015-09-09 16:04:18,907 INFO  [org.ovirt.engine.core.bll.storage.AbstractCinderDiskCommandCallback] (DefaultQuartzScheduler_Worker-40) [634dfa14] Volume/Snapshot has been successfully deleted from Cinder. ID: e32a72ba-8a7f-483e-b5dc-897a30589e83
2015-09-09 16:04:18,907 INFO  [org.ovirt.engine.core.bll.storage.RemoveCinderDiskCommand] (DefaultQuartzScheduler_Worker-40) [634dfa14] Ending command 'org.ovirt.engine.core.bll.storage.RemoveCinderDiskCommand' successfully.
2015-09-09 16:04:18,935 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-40) [634dfa14] Correlation ID: 2e5cd5af, Call Stack: null, Custom Event ID: -1, Message: Disk Centos-on-Cinder_Disk1 was successfully removed from domain cinder-repository (User admin@internal).
2015-09-09 16:04:21,998 INFO  [org.ovirt.engine.core.bll.RemoveAllCinderDisksCommandCallBack] (DefaultQuartzScheduler_Worker-15) [2492ac2] All commands have completed, status 'SUCCEEDED'
2015-09-09 16:04:23,000 INFO  [org.ovirt.engine.core.bll.RemoveAllVmCinderDisksCommand] (DefaultQuartzScheduler_Worker-12) [2492ac2] Ending command 'org.ovirt.engine.core.bll.RemoveAllVmCinderDisksCommand' successfully.
2015-09-09 16:04:23,005 INFO  [org.ovirt.engine.core.bll.RemoveVmCommand] (DefaultQuartzScheduler_Worker-12) [2492ac2] Ending command 'org.ovirt.engine.core.bll.RemoveVmCommand' successfully.
2015-09-09 16:04:23,010 WARN  [org.ovirt.engine.core.bll.RemoveAllVmCinderDisksCommand] (DefaultQuartzScheduler_Worker-12) [] VmCommand::EndVmCommand: Vm is null - not performing endAction on Vm

Comment 6 Maor 2015-09-09 06:36:54 UTC
Thanks for the logs, Darryl. Can you please attach the full engine log?

Comment 7 Maor 2015-09-09 06:50:47 UTC
I managed to reproduce this in my env.
It looks like the engine tries to delete the disk as a snapshot.
Working on a fix...
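At the Cinder API level the distinction looks roughly like this (a sketch using the clone's volume ID from comment 5; it illustrates the symptom, not the exact engine code path):

  # removing a cloned disk should issue a volume delete
  cinder delete e32a72ba-8a7f-483e-b5dc-897a30589e83
  # treating the same ID as a snapshot finds nothing to delete (compare the
  # "Snapshot does not exists" message in the engine log), so the RBD clone
  # is left behind even though the engine command ends "successfully"
  cinder snapshot-delete e32a72ba-8a7f-483e-b5dc-897a30589e83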

Comment 8 Maor 2015-09-09 11:51:06 UTC
Darryl, which openstack version are you using with your Cinder?
Is it Kilo or Juno?

Comment 9 Darryl Bond 2015-09-09 21:22:40 UTC
Created attachment 1071936 [details]
engine.log

Engine.log
1. Connect to cinder on existing Openstack system (Icehouse)
2. Create VM with disk in Cinder
3. Install Centos & boot
4. Delete VM
5. Observe successful removal of RBD in ceph
6. Create Vm
7. Install Centos & boot
8. Clone VM
9. Boot clone
10. Delete clone VM
11. Observe failure to remove Cloned RBD

Comment 10 Maor 2015-09-10 05:24:37 UTC
(In reply to Darryl Bond from comment #9)
> Created attachment 1071936 [details]
> engine.log
> 
> Engine.log
> 1. Connect to cinder on existing Openstack system (Icehouse)
> 2. Create VM with disk in Cinder
> 3. Install Centos & boot
> 4. Delete VM
> 5. Observe successful removal of RBD in ceph
> 6. Create Vm
> 7. Install Centos & boot

I assume you created a snapshot here before the clone, and cloned the VM from this snapshot (based on comment 5)?

> 8. Clone VM
> 9. Boot clone
> 10. Delete clone VM
> 11. Observe failure to remove Cloned RBD


Thanks Darryl,

I've uploaded a patch which should fix the delete operation for a VM cloned from a snapshot.
I was just worried about the locked snapshot you mentioned before, because of the existing cloned volumes left in Cinder.
This is an issue I also encountered with openstack-Juno, but it was fixed in openstack-Kilo (that is why I asked).
I've also opened a bugzilla on this issue so it will be easier to track:
https://bugzilla.redhat.com/1261733

Comment 11 Darryl Bond 2015-09-10 06:05:45 UTC
Yes, I created a snapshot of the VM before cloning it. I checked ceph and an RBD snapshot was created as expected. Once the clone was made, the snapshot was correctly protected so that the RBD clone could be created.

I look forward to the Docker images being fixed so I can test this properly against Kilo rather than hooking it up to our existing OpenStack Cinder.

Comment 12 Allon Mureinik 2015-09-13 10:49:39 UTC
Maor, do we have a doctext about the Kilo requirement?

Comment 13 Maor 2015-10-06 14:05:07 UTC
(In reply to Allon Mureinik from comment #12)
> Maor, do we have a doctext about the Kilo requirement?

Added a doctext as part of the RFE https://bugzilla.redhat.com/1185826

Comment 14 Allon Mureinik 2015-10-06 16:45:57 UTC
(In reply to Maor from comment #13)
> (In reply to Allon Mureinik from comment #12)
> > Maor, do we have a doctext about the Kilo requirement?
> 
> Added a doctext as part of the RFE https://bugzilla.redhat.com/1185826
Ack, Setting requires-doctext-. Thanks!

Comment 15 Ori Gofen 2015-11-17 13:36:51 UTC
Verified on rhevm-3.6-0.2

Comment 16 Sandro Bonazzola 2015-11-27 07:56:54 UTC
Since oVirt 3.6.0 has been released, moving from VERIFIED to CLOSED CURRENTRELEASE.