Description of problem:
Under some conditions, if a delete operation fails, the task_state remains "deleting" while the vm_state stays in its original state. This is confusing and, at first glance, is not distinguishable from a "wedged in progress" type of error.
How reproducible:
Difficult/unknown. Without instrumenting the code to synthesize failure conditions, a load/stress-test environment may aid in reproducing.
Steps to Reproduce:
See https://bugzilla.redhat.com/show_bug.cgi?id=957267 for one scenario in which this occurs.
Other approaches might be to initiate a delete on an instance whose remote compute node is unavailable or whose libvirt service is disabled.
Actual results:
VMs with a variety of vm_state values but task_state = deleting.
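One way to spot instances in this condition is to filter server records on task_state and on how long ago they were last updated. The following is only a minimal sketch, not part of Nova: the function name `find_stuck_deletes` and the 30-minute threshold are hypothetical, and it assumes each record is a dict shaped like a Nova API server detail entry, with `OS-EXT-STS:task_state`, `OS-EXT-STS:vm_state`, and an `updated` timestamp in `YYYY-MM-DDTHH:MM:SSZ` form.

```python
from datetime import datetime, timedelta, timezone


def find_stuck_deletes(servers, now, max_age=timedelta(minutes=30)):
    """Return (id, vm_state) pairs for servers that have been in
    task_state 'deleting' for longer than max_age.

    `servers` is assumed to be a list of dicts resembling Nova server
    detail records; the threshold is an arbitrary illustrative value.
    """
    stuck = []
    for server in servers:
        if server.get("OS-EXT-STS:task_state") != "deleting":
            continue
        # Assumed timestamp format: 2013-05-01T12:00:00Z (UTC).
        updated = datetime.strptime(
            server["updated"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        if now - updated > max_age:
            stuck.append((server["id"], server.get("OS-EXT-STS:vm_state")))
    return stuck
```

A record that has only recently entered "deleting" is skipped, which is a rough way to separate a delete still in progress from one that failed and left the task_state behind.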
Expected results:
Not prescribed. Perhaps a vm_state that is clearly consistent with the task_state?
Additional info:
This issue should be evaluated carefully to ensure that subsequent attempts to delete the instance succeed, unless extenuating circumstances make deletion invalid.
Some of the remaining scenarios involve services that are unavailable at the time of deletion. Whether it is correct to force deletion of the database record in those cases is debatable, since doing so can leave orphaned libvirt domains, images, etc. In any case, the VM being in task_state "deleting" was by design, so closing as "not a bug" seems reasonable. Subsequent issues should be reported against the specific cause of the error state.