Created attachment 809435
logs

Description of problem:
I configured cinder to work with gluster. Creating a snapshot fails (#1236966), and when I then try to delete the snapshot, the delete fails with an error.

Version-Release number of selected component (if applicable):
openstack-cinder-2013.2-0.9.b3.el6ost.noarch
glusterfs-fuse-3.4.0.33rhs-1.el6rhs.x86_64

How reproducible:
100%

Steps to Reproduce:
1. configure cinder to work with gluster
2. try to create a snapshot (this fails)
3. try to delete the snapshot

Actual results:
Deleting the snapshot fails.

Expected results:
We should be able to delete the snapshot.

Additional info:
2013-10-08 19:54:38.955 7297 ERROR cinder.openstack.common.rpc.amqp [req-da5b9f8c-fca9-4615-a5e4-0caec2012bd7 c02995f25ba44cfab1a3cbd419f045a1 c77235c29fd0431a8e6628ef6d18e07f] Exception during message handling
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp Traceback (most recent call last):
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/amqp.py", line 441, in _process_data
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     **args)
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/dispatcher.py", line 148, in dispatch
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     return getattr(proxyobj, method)(ctxt, **kwargs)
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/cinder/volume/manager.py", line 377, in delete_snapshot
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     {'status': 'error_deleting'})
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     self.gen.next()
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/cinder/volume/manager.py", line 365, in delete_snapshot
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     self.driver.delete_snapshot(snapshot_ref)
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/glusterfs.py", line 524, in delete_snapshot
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     snap_info = self._read_info_file(info_path)
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/glusterfs.py", line 489, in _read_info_file
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     return json.loads(self._read_file(info_path))
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/glusterfs.py", line 479, in _read_file
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     with open(filename, 'r') as f:
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp IOError: [Errno 2] No such file or directory: u'/var/lib/cinder/mnt/4a31bc6e5fb9244971075aa23d364364/volume-e78978af-0f46-4caf-948b-218afb5de6ef.info'
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp

https://bugs.launchpad.net/cinder/+bug/1236975
Probably the same issue as bug 1016798.
I'm not sure this is the same bug, even though I had "cinder/cinder" as owner of the remote gluster share, the snapshots (and only the snapshots, not the volumes) appear to be created by root:

[root@gfidente-rhos-on_qa a3082d8779198266d36d261ff40ac12b(keystone_admin)]# ls -la
drwxr-xr-x. 3 cinder cinder       4096 Oct 10 16:57 .
drwxr-xr-x. 3 cinder cinder       4096 Oct 10 14:19 ..
-rw-rw-rw-. 1 cinder cinder 1073741824 Sep 23 14:34 volume-1be1a352-e847-4991-8b5f-c2adde86f6ba
-rw-rw-rw-. 1 cinder cinder 1073741824 Sep 23 16:45 volume-3ef968fa-5027-464b-b234-cd8ffc474f3c
[...]
-rw-rw-rw-. 1 root   root   1073741824 Oct 10 16:48 volume-f7e752e2-8134-4714-a12b-feeaff2b25d4
-rw-r--r--. 1 root   root       197120 Oct 10 16:57 volume-f7e752e2-8134-4714-a12b-feeaff2b25d4.04afa002-238b-4f4d-8b30-927607f5e05e
-rw-r--r--. 1 root   root       197120 Oct 10 16:49 volume-f7e752e2-8134-4714-a12b-feeaff2b25d4.866d2eed-d9bb-4504-9be3-3ab027e4a1f4
-rw-r--r--. 1 cinder cinder        349 Oct 10 16:57 volume-f7e752e2-8134-4714-a12b-feeaff2b25d4.info
[root@gfidente-rhos-on_qa a3082d8779198266d36d261ff40ac12b(keystone_admin)]#
(In reply to Giulio Fidente from comment #2)
> I'm not sure this is the same bug, even though I had "cinder/cinder" as
> owner of the remote gluster share, the snapshots (and only the snapshots,
> not the volumes) appear to be created by root:

The snapshot files are created as root, but it looks like the .info file maybe was not created at all (since it couldn't be written due to perms), which causes the error at deletion time. This is why I'm thinking it is caused by the same permissions issue.
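
One quick way to test the permissions hypothesis is to list which files on the mounted share are not owned by the expected service user. The following is only a minimal Python sketch: the helper name `files_not_owned_by` is hypothetical and is not part of cinder.

```python
import os


def files_not_owned_by(directory, uid):
    """Return the sorted names of entries in `directory` whose owner is not `uid`.

    Hypothetical diagnostic helper, not part of cinder.
    """
    mismatched = []
    for name in sorted(os.listdir(directory)):
        # lstat() reports the entry itself rather than following symlinks.
        st = os.lstat(os.path.join(directory, name))
        if st.st_uid != uid:
            mismatched.append(name)
    return mismatched
```

On the cinder host this could be called with the share's mount directory and `pwd.getpwnam('cinder').pw_uid`; any root-owned snapshot files like the ones in the listing above would then show up in the result.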
(In reply to Eric Harney from comment #3)
> (In reply to Giulio Fidente from comment #2)
> > I'm not sure this is the same bug, even though I had "cinder/cinder" as
> > owner of the remote gluster share, the snapshots (and only the snapshots,
> > not the volumes) appear to be created by root:
>
> The snapshot files are created as root, but it looks like the .info file
> maybe was not created at all (since it couldn't be written due to perms),
> which causes the error at deletion time. This is why I'm thinking it is
> caused by the same permissions issue.

I still see a problem here :)

I think this should be fixed regardless, with one of two possible solutions.
1. If the .info file is needed for the delete process, then we should roll back on the create, because otherwise we are left with a snapshot that we cannot delete without manual intervention in the storage and locally (this applies whenever the .info file cannot be created, for any reason).
2. If we do not actually need it, then the code should continue regardless of whether the .info file exists, delete the snapshot anyway, and log a WARN.
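
To make option 2 concrete, here is a minimal Python sketch. It is not the actual driver code in cinder/volume/drivers/glusterfs.py: the names `read_info_file` and `delete_snapshot_file` and the `empty_if_missing` parameter are hypothetical, and it assumes that the .info file maps snapshot IDs to file names and that snapshot files are named "<volume file>.<snapshot id>" as in the listing above. It also ignores any snapshot-chain handling that a real delete may need.

```python
import errno
import json
import logging
import os

LOG = logging.getLogger(__name__)


def read_info_file(info_path, empty_if_missing=False):
    """Return the parsed .info JSON, or {} if the file is absent and that is allowed."""
    try:
        with open(info_path, 'r') as f:
            return json.loads(f.read())
    except IOError as exc:
        # errno.ENOENT is the "[Errno 2] No such file or directory" case.
        if empty_if_missing and exc.errno == errno.ENOENT:
            LOG.warning("Snapshot info file %s is missing", info_path)
            return {}
        raise


def delete_snapshot_file(volume_path, snapshot_id):
    """Hypothetical delete path that does not require the .info file.

    Returns True if a snapshot file was removed, False if none was found.
    """
    snap_info = read_info_file(volume_path + '.info', empty_if_missing=True)
    # Assumption: with no record in the info file, fall back to the
    # "<volume file>.<snapshot id>" naming convention.
    default_name = '%s.%s' % (os.path.basename(volume_path), snapshot_id)
    snap_file = os.path.join(os.path.dirname(volume_path),
                             snap_info.get(snapshot_id, default_name))
    if os.path.exists(snap_file):
        os.remove(snap_file)
        return True
    LOG.warning("No snapshot file found for snapshot %s", snapshot_id)
    return False
```

With this shape, a snapshot whose create failed before the .info file was written can still be cleaned up, and the missing file is reported as a warning rather than raising the IOError seen in the traceback.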
(In reply to Dafna Ron from comment #4)
> (In reply to Eric Harney from comment #3)
> > (In reply to Giulio Fidente from comment #2)
> I still see a problem here :)
>
> I think this should be fixed regardless with two possible solutions.
> 1. if the .info file is needed for the delete process then we should roll
> back on the create because else we are left with a snapshot that we cannot
> delete without a manual intervention in the storage and locally (this is for
> any reason the .info file cannot be created).

Cinder does not support the notion of rollback for the create failing in this case. What should happen is that the snapshot ends up in the "error" state and can then be removed. But this is probably not possible currently, since we require the info file to be present at delete time.
Yes. Hence, I still think it's a bug: if the .info file was not created for any reason, we cannot delete the snapshot even if it's in an error status. I see that as a bug.
It is now possible to delete the snapshot if creation fails in this manner.
Verified the bug on:
python-cinderclient-1.0.7-2.el6ost.noarch
python-cinder-2013.2.1-4.el6ost.noarch
openstack-cinder-2013.2.1-4.el6ost.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2014-0046.html