Created attachment 732338 [details]
engine.log

* iSCSI/FCP DC with a template and a VM based on that template.
* Open the child VM's device with python to cause VM removal to fail:

[root@orange-vdse ~]# python
Python 2.6.6 (r266:84292, Oct 12 2012, 14:23:48)
[GCC 4.4.6 20120305 (Red Hat 4.4.6-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> v = open('/dev/<vg name>/<lv name>', "r")

* Try to remove the VM.
* Removal fails with:

Thread-2140::ERROR::2013-04-07 15:36:04,046::dispatcher::67::Storage.Dispatcher.Protect::(run) {'status': {'message': 'Cannot remove Logical Volume: (\'d38426a3-fe2d-42ab-800e-0ce7c5ffb95a\', "{\'58d60d63-393e-4b5b-b688-a11e5ccf59b7\': ImgsPar(imgs=[\'750bfe61-34f3-454d-92f8-80fce9419e13\'], parent=\'ae0f4f90-b15c-4f78-8d3a-7f9b048d2ff9\')}")', 'code': 551}}

* There is no roll-forward in the engine, so the VM continues to exist.
* Unlock the device and try to remove the VM again.
* Removal fails with:

Thread-2187::ERROR::2013-04-07 15:37:23,671::hsm::1450::Storage.HSM::(deleteImage) Empty or not found image 750bfe61-34f3-454d-92f8-80fce9419e13 in SD d38426a3-fe2d-42ab-800e-0ce7c5ffb95a. {'ae0f4f90-b15c-4f78-8d3a-7f9b048d2ff9': ImgsPar(imgs=['659a325e-bef7-4cad-a859-0549ea04dcad'], parent='00000000-0000-0000-0000-000000000000'), '32d2282e-50ec-4a6e-964a-be518b0abec6': ImgsPar(imgs=['13eb9b31-6a97-48c6-8a1c-d18182bee867'], parent='00000000-0000-0000-0000-000000000000')}

* Since this image now exists in the engine but not in VDSM, it is impossible to remove the VM, the template it is based on, or its domain.
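The dead-end state in the last step can be sketched as a toy model (all names and the deleteImage helper below are illustrative, not the actual engine/VDSM code): after the first removal attempt fails, the image record survives on the engine side but is already gone from VDSM, so every later delete attempt hits the "Empty or not found image" path and removal can never complete.

```python
# Toy model of the engine/VDSM state divergence (hypothetical, for
# illustration only - not the real engine or VDSM implementation).
IMAGE = "750bfe61-34f3-454d-92f8-80fce9419e13"

engine_images = {IMAGE}   # the engine DB still lists the disk image
vdsm_images = set()       # VDSM no longer knows about the image

def delete_image(image):
    """Sketch of the VDSM-side lookup that produces the second error."""
    if image not in vdsm_images:
        raise LookupError("Empty or not found image %s in SD" % image)
    vdsm_images.discard(image)

stuck = False
try:
    delete_image(IMAGE)
except LookupError:
    # The engine cannot roll forward: it keeps the image, VDSM has lost it,
    # so the VM, its template, and the domain all become unremovable.
    stuck = IMAGE in engine_images
```

Because neither side converges (the engine never drops its record, and VDSM can never find the image to delete), the loop is permanent until the record is fixed manually.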
Created attachment 732339 [details]
vdsm.log
I suspect this is a scenario similar to BZ916554, which is a duplicate of BZ884635. The following error can be seen in the log:

IRSGenericException: IRSErrorException: Image does not exist in domain: 'image=750bfe61-34f3-454d-92f8-80fce9419e13, domain=d38426a3-fe2d-42ab-800e-0ce7c5ffb95a'

Once BZ884635 is fixed, this should be solved as well.
Should we close as duplicate?
I added bug 884635 as a blocker. IMHO, this bug should be left open since it describes a different scenario, which may pass or fail QA independently of the original bug.
The patch was reverted because it breaks QE automation tests. Needs to be revisited once the automation tests are fixed.
Moved back to POST since the patch was reverted on sf14. Please move to MODIFIED after syncing with QE on the fixed tests.
Moved back to ON_DEV per development request.
Moving to MODIFIED based on the fix described in the external tracker, which was pushed for another bug. Liron, please document the behavior change.
Removal of a running VM succeeded. The disk remains in the system and can be removed manually.

Verified on RHEVM-3.2-SF17.1:
vdsm-4.10.2-21.0.el6ev.x86_64
rhevm-3.2.0-11.28.el6ev.noarch
3.2 has been released