Cause:
When attempting to delete a VM, if creating the deletion task for the "first" image failed, no roll-forward was performed: the disk was marked as illegal and the VM was not deleted.
Consequence:
The VM remains and the user may be unable to remove it; the disk may also be impossible to remove if it doesn't exist on the storage domain.
Fix:
When a disk doesn't exist on the storage domain, it is removed from the engine when its deletion is attempted.
When removing a VM, the VM itself is removed as the first step; any of its disks that fail to be removed are left floating in illegal status and can be removed afterwards.
Result:
When attempting to remove a VM, the VM is removed; any of its disks that could not be removed remain floating with illegal status.
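The behaviour described in the Fix and Result fields can be illustrated with a small model. This is only a sketch: the engine is not written in Python, and every name here (`remove_vm`, `FakeStorage`, `StorageError`, `ImageNotFound`, the dictionary layout) is hypothetical and chosen for illustration, not taken from the engine code.

```python
# Illustrative model only. All names are hypothetical; this is not the
# engine's actual implementation.


class StorageError(Exception):
    """Any failure reported by storage while deleting an image."""


class ImageNotFound(StorageError):
    """The image does not exist on the storage domain."""


class FakeStorage:
    """Stand-in for the storage side, used only to exercise the sketch."""

    def __init__(self, busy=(), missing=()):
        self.busy = set(busy)        # images whose deletion fails
        self.missing = set(missing)  # images absent from the storage domain

    def delete_image(self, disk_id):
        if disk_id in self.missing:
            raise ImageNotFound(disk_id)
        if disk_id in self.busy:
            raise StorageError(disk_id)


def remove_vm(engine_db, storage, vm_name):
    """Remove the VM record first, then try to delete each of its disks."""
    disks = engine_db["vms"].pop(vm_name)  # step 1: the VM is always removed
    for disk_id in disks:
        try:
            storage.delete_image(disk_id)
        except ImageNotFound:
            # Already gone from the storage domain: drop the engine record.
            engine_db["disks"].pop(disk_id, None)
        except StorageError:
            # Deletion failed: leave the disk floating, marked illegal.
            engine_db["disks"][disk_id] = "ILLEGAL"
        else:
            engine_db["disks"].pop(disk_id, None)
    return engine_db
```

In this model a VM with one deletable disk, one busy disk, and one disk missing from storage ends up removed, with only the busy disk left behind in illegal status for later manual removal.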
Created attachment 732338
engine.log
* iSCSI/FCP DC with template and VM based on template.
* Open the child VM's device with python to cause VM removal to fail:
[root@orange-vdse ~]# python
Python 2.6.6 (r266:84292, Oct 12 2012, 14:23:48)
[GCC 4.4.6 20120305 (Red Hat 4.4.6-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> v = open('/dev/<vg name>/<lv name>', "r")
* Try to remove the VM.
* Removal fails for:
Thread-2140::ERROR::2013-04-07 15:36:04,046::dispatcher::67::Storage.Dispatcher.Protect::(run) {'status': {'message': 'Cannot remove Logical Volume: (\'d38426a3-fe2d-42ab-800e-0ce7c5ffb95a\', "{\'58d60d63-393e-4b5b-b688-a11e5ccf59b7\': ImgsPar(imgs=[\'750bfe61-34f3-454d-92f8-80fce9419e13\'], parent=\'ae0f4f90-b15c-4f78-8d3a-7f9b048d2ff9\')}")', 'code': 551}}
* There is no roll-forward in engine, the VM continues to exist.
* Unlock the device and try to remove the VM again.
* Removal fails for:
Thread-2187::ERROR::2013-04-07 15:37:23,671::hsm::1450::Storage.HSM::(deleteImage) Empty or not found image 750bfe61-34f3-454d-92f8-80fce9419e13 in SD d38426a3-fe2d-42ab-800e-0ce7c5ffb95a. {'ae0f4f90-b15c-4f78-8d3a-7f9b048d2ff9': ImgsPar(imgs=['659a325e-bef7-4cad-a859-0549ea04dcad'], parent='00000000-0000-0000-0000-000000000000'), '32d2282e-50ec-4a6e-964a-be518b0abec6': ImgsPar(imgs=['13eb9b31-6a97-48c6-8a1c-d18182bee867'], parent='00000000-0000-0000-0000-000000000000')}
* Since this image now exists in engine and not in VDSM, it is impossible to remove the VM, or the template it's based on, or its domain.
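The "open the device" and "unlock the device" steps above can also be scripted instead of typed into an interactive session. The following is a minimal sketch; the helper name `hold_open` is hypothetical, and the device path is left as the same placeholder used in the steps.

```python
import contextlib


@contextlib.contextmanager
def hold_open(path):
    """Keep `path` open for reading until the with-block exits."""
    handle = open(path, "r")
    try:
        yield handle
    finally:
        handle.close()  # corresponds to the "unlock the device" step


# Usage (placeholder path, as in the steps above):
#
#     with hold_open('/dev/<vg name>/<lv name>'):
#         ...  # try to remove the VM while the device is held open
```

Leaving the `with` block closes the handle, after which the second removal attempt can be made.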
I suspect this is a similar scenario to BZ916554, which is a duplicate of BZ884635.
In the log it can be noticed that we get the following error:
IRSGenericException: IRSErrorException: Image does not exist in domain: 'image=750bfe61-34f3-454d-92f8-80fce9419e13, domain=d38426a3-fe2d-42ab-800e-0ce7c5ffb95a'
Once BZ884635 is fixed, this should be resolved as well.
I added bug 884635 as a blocker - IMHO, this should be left open since it describes a different scenario, which may pass/fail QA independently of the original bug.
Removal of a running VM succeeded. The disk remains in the system and can be removed manually.
Verified on RHEVM-3.2-SF17.1:
vdsm-4.10.2-21.0.el6ev.x86_64
rhevm-3.2.0-11.28.el6ev.noarch