Created attachment 853217
the cinder & compute logs

Description of problem:
While reproducing Bug 1033652, Cinder wasn't able to delete the snapshots of the volume attached to the instance.
The system was installed with the GlusterFS back end configured in the Packstack answer file. Both the Cinder and the Nova Compute servers had fuse installed:
fuse-libs-2.8.3-4.el6.x86_64
glusterfs-fuse-3.4.0.57rhs-1.el6_5.x86_64
fuse-2.8.3-4.el6.x86_64

And SELinux was configured:
# getsebool virt_use_fusefs
virt_use_fusefs --> on

According to the steps:
1. Created a volume from an image:
# cinder create --image-id 52572739-a5e7-4232-a184-e267934cdd15 30
+---------------------+--------------------------------------+
| Property            | Value                                |
+---------------------+--------------------------------------+
| attachments         | []                                   |
| availability_zone   | nova                                 |
| bootable            | false                                |
| created_at          | 2014-01-21T12:29:52.175178           |
| display_description | None                                 |
| display_name        | None                                 |
| id                  | 83fc7617-7a95-4c6b-b631-28bbf991c120 |
| image_id            | 52572739-a5e7-4232-a184-e267934cdd15 |
| metadata            | {}                                   |
| size                | 30                                   |
| snapshot_id         | None                                 |
| source_volid        | None                                 |
| status              | creating                             |
| volume_type         | None                                 |
+---------------------+--------------------------------------+

2. Launched an instance from the volume, named 'verify_bug'.

3. Created a snapshot from the instance, named 'verify_bug_snap':
# cinder snapshot-list
+--------------------------------------+--------------------------------------+----------------+------------------------------+------+
| ID                                   | Volume ID                            | Status         | Display Name                 | Size |
+--------------------------------------+--------------------------------------+----------------+------------------------------+------+
| 84c59525-63a9-4ebb-9125-e26e97bc1f51 | 83fc7617-7a95-4c6b-b631-28bbf991c120 | available      | snapshot for verify_bug_snap | 30   |
+--------------------------------------+--------------------------------------+----------------+------------------------------+------+

From the nova compute server:
# ll /var/lib/nova/mnt/600bd85f165b39eac20b9779f0281317
-rw-rw-rw-. 1 qemu qemu 32212254720 Jan 21 14:35 volume-83fc7617-7a95-4c6b-b631-28bbf991c120
-rw-r--r--. 1 qemu qemu 7602176 Jan 21 2014 volume-83fc7617-7a95-4c6b-b631-28bbf991c120.84c59525-63a9-4ebb-9125-e26e97bc1f51
-rw-r--r--. 1 165 165 223 Jan 21 14:36 volume-83fc7617-7a95-4c6b-b631-28bbf991c120.info

The content of the info file is:
# cat /var/lib/nova/mnt/600bd85f165b39eac20b9779f0281317/volume-83fc7617-7a95-4c6b-b631-28bbf991c120.info
{
 "84c59525-63a9-4ebb-9125-e26e97bc1f51": "volume-83fc7617-7a95-4c6b-b631-28bbf991c120.84c59525-63a9-4ebb-9125-e26e97bc1f51",
 "active": "volume-83fc7617-7a95-4c6b-b631-28bbf991c120.84c59525-63a9-4ebb-9125-e26e97bc1f51"
}

4. Deleted the snapshot:
# cinder snapshot-delete 84c59525-63a9-4ebb-9125-e26e97bc1f51
# cinder snapshot-list
+--------------------------------------+--------------------------------------+----------------+------------------------------+------+
| ID                                   | Volume ID                            | Status         | Display Name                 | Size |
+--------------------------------------+--------------------------------------+----------------+------------------------------+------+
| 84c59525-63a9-4ebb-9125-e26e97bc1f51 | 83fc7617-7a95-4c6b-b631-28bbf991c120 | deleting       | snapshot for verify_bug_snap | 30   |
+--------------------------------------+--------------------------------------+----------------+------------------------------+------+

Version-Release number of selected component (if applicable):
python-novaclient-2.15.0-2.el6ost.noarch
python-nova-2013.2.1-2.el6ost.noarch
openstack-nova-compute-2013.2.1-2.el6ost.noarch
openstack-nova-common-2013.2.1-2.el6ost.noarch
libvirt-client-0.10.2-29.el6_5.2.x86_64
libvirt-0.10.2-29.el6_5.2.x86_64
libvirt-python-0.10.2-29.el6_5.2.x86_64
python-cinderclient-1.0.7-2.el6ost.noarch
openstack-cinder-2013.2.1-5.el6ost.noarch
python-cinder-2013.2.1-5.el6ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create a volume from an image.
2. Boot an instance from the volume.
3. Create a snapshot from the instance.
4. Delete the snapshot.

Actual results:
The snapshot deletion gets stuck in the 'deleting' status, and if interrupted it moves to error, so the user can't delete the volume either.

Expected results:
The user can delete the snapshot.

Additional info:
The cinder & compute logs are attached.
This bug blocks the following bugs: https://bugzilla.redhat.com/show_bug.cgi?id=1033652 https://bugzilla.redhat.com/show_bug.cgi?id=1040711
The basic problem here is that Cinder has a fixed timeout when waiting for snapshot_delete operations on the Nova side to complete. If they take too long (even when things are functioning correctly), Cinder will prematurely fail the operation.

To fix this, we need to have Nova send back updates of the job's percent complete while the block job is in progress. Cinder can then reset its timeout window based on these updates. (This should be doable without changing how the APIs work between Cinder and Nova today.)

For testing in the meantime: the longest operations occur when deleting the only snapshot that exists, because in that case the whole base disk image has to be copied into the snapshot file. Deleting a snapshot while other snapshots exist should be much quicker, which will let you avoid this bug while testing other pieces of this feature.
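
The difference between a fixed timeout and a timeout window that is reset on progress can be sketched as follows. This is only an illustration of the idea, not Cinder's or Nova's actual code: wait_for_block_job, poll_progress, window_seconds and SnapshotDeleteTimeout are hypothetical names, and the percent-complete value is assumed to arrive through some callback rather than through any specific API.

```python
import time
from typing import Callable, Optional


class SnapshotDeleteTimeout(Exception):
    """Raised when no new progress is seen within one timeout window."""


def wait_for_block_job(
    poll_progress: Callable[[], Optional[int]],
    window_seconds: float,
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Wait for a block job, extending the deadline whenever it advances.

    poll_progress is a hypothetical callback returning the last reported
    percent complete (0-100), or None if nothing has been reported yet.
    Returns the number of polls made before completion.
    """
    last_seen = -1
    deadline = clock() + window_seconds
    polls = 0
    while True:
        percent = poll_progress()
        polls += 1
        if percent is not None:
            if percent >= 100:
                return polls
            if percent > last_seen:
                # The job moved forward: start a fresh timeout window.
                last_seen = percent
                deadline = clock() + window_seconds
        if clock() >= deadline:
            raise SnapshotDeleteTimeout(
                "no progress for %.0f s (last seen: %d%%)"
                % (window_seconds, max(last_seen, 0))
            )
        sleep(poll_interval)
```

With this shape, a long-running but healthy job (such as deleting the only snapshot, where the whole base image is copied) is allowed to run past window_seconds in total, while a job that stops reporting progress still fails after one window. The clock and sleep parameters are injectable only so the loop can be exercised without real waiting.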
*** Bug 1101504 has been marked as a duplicate of this bug. ***
This likely indicates use of a libvirt version that had known bugs in this area. Closing pending further info on reproduction.