Description of problem:
OSP 10 is unable to remove an instance; the RBD logs show "ondisk = -16 ((16) Device or resource busy)" and "-1 librbd: cannot obtain exclusive lock - not removing".

Version-Release number of selected component (if applicable):
OSP 10
Red Hat Ceph Storage 2.1 - 10.2.3-17.el7cp

This looks to me like an OSP 10 Nova RBD driver issue rather than a librbd issue, because the customer is able to delete the RBD image with the rbd command. I will add the logs and more information in the next comment. Most probably we need to send this bug to the OSP team, but I am filing it with us first to cross-verify that I am not missing anything.
It looks like the Nova RBD driver is holding the exclusive lock, so the delete thread cannot acquire it and fails to remove the image.
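To cross-check this theory, the lock holder and watchers can be inspected directly with the rbd CLI. A minimal sketch; the pool and image names are placeholders, not the customer's actual names, and the stub is only there so the snippet runs on a host without ceph-common:

```shell
# Placeholders -- substitute the affected pool and instance disk image.
POOL="vms"
IMAGE="instance-disk"

# Stub so the commands below do not abort on a host without ceph-common installed.
command -v rbd >/dev/null 2>&1 || rbd() { echo "(stub: rbd not installed)"; }

# Who currently holds a lock on the image (locker id, cookie, address)?
rbd lock ls "$POOL/$IMAGE"

# Which clients are watching the image header? A live qemu process would
# show up here; zero watchers combined with a live lock owner is exactly
# the suspicious state described in this bug.
rbd status "$POOL/$IMAGE"
```

If the locker turns out to be a stale client rather than a live qemu process, the lock can be broken manually with `rbd lock rm`, but that should only be done once it is certain no writer is alive.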
Please confirm whether or not the rbd-mirror daemon is active on a secondary cluster and actively mirroring the images that cannot be deleted. The fact that "rbd status" shows zero watchers on the image is suspicious, since the exclusive-lock code showed that the lock owner was alive. Perhaps the qemu process was not shut down cleanly, so the image kept a watcher for up to 30 seconds until the watch timed out. If you can re-create this scenario, try to collect a series of "rbd status" dumps (with timestamps) so they can be aligned with the Nova logs.
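The requested collection could be scripted roughly as follows. This is only a sketch: the pool name, image name, sample count, and interval are assumptions to be adjusted for the affected instance disk, and the stub exists only so the loop itself runs on a host without ceph-common:

```shell
# Placeholders -- substitute the affected pool and instance disk image.
POOL="vms"
IMAGE="instance-disk"

# Stub so the loop does not abort on a host without ceph-common installed.
command -v rbd >/dev/null 2>&1 || rbd() { echo "(stub: rbd not installed)"; }

# Print $1 timestamped "rbd status" dumps, $2 seconds apart, so the
# output can later be lined up against the Nova log timestamps.
collect_rbd_status() {
    n=$1; interval=$2; i=0
    while [ "$i" -lt "$n" ]; do
        printf '=== %s ===\n' "$(date '+%Y-%m-%dT%H:%M:%S')"
        rbd status "$POOL/$IMAGE"
        sleep "$interval"
        i=$((i + 1))
    done
}

# E.g. sample every 5 seconds across the ~30 s watch-timeout window.
collect_rbd_status 8 5
```

Running this while reproducing the failed delete should show whether a watcher lingers on the image and exactly when it disappears relative to the Nova delete attempt.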
*** This bug has been marked as a duplicate of bug 1636190 ***