Bug 1016806 - cinder: can't delete snapshots when cinder is configured to work with gluster
Summary: cinder: can't delete snapshots when cinder is configured to work with gluster
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-cinder
Version: 4.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: z1
Target Release: 4.0
Assignee: Eric Harney
QA Contact: Yogev Rabl
URL:
Whiteboard: storage
Depends On: 1016798
Blocks: 1045196
Reported: 2013-10-08 17:45 UTC by Dafna Ron
Modified: 2016-04-26 14:18 UTC (History)

Fixed In Version: openstack-cinder-2013.2.1-2.el6ost
Doc Type: Known Issue
Doc Text:
Previously, the Block Storage service did not first check whether it had the required permissions to write to a GlusterFS share before deleting a snapshot. As a result, if the Block Storage service did not have write permissions to a GlusterFS share, any attempt to delete a snapshot on the share would fail. The user was given no indication of why the attempt failed, and the volume/snapshot data could be left in an inconsistent state. With this fix, the Block Storage service checks whether it has write permissions to a GlusterFS share before deleting a snapshot. If it does not have write permissions to the share, the attempt to delete the snapshot fails with the correct notification, before any data is modified.
Clone Of:
Environment:
Last Closed: 2014-01-23 14:22:42 UTC
Target Upstream Version:
Embargoed:


Attachments
logs (177.27 KB, text/plain)
2013-10-08 17:45 UTC, Dafna Ron


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1236975 | 0 | None | None | None | Never
OpenStack gerrit | 57335 | 0 | None | None | None | Never
Red Hat Product Errata | RHBA-2014:0046 | 0 | normal | SHIPPED_LIVE | Red Hat Enterprise Linux OpenStack Platform 4 Bug Fix and Enhancement Advisory | 2014-01-23 00:51:59 UTC

Description Dafna Ron 2013-10-08 17:45:49 UTC
Created attachment 809435: logs

Description of problem:

I configured cinder to work with gluster. Creating a snapshot fails (#1236966), and when I then try to delete that snapshot, the delete also fails with an error.

Version-Release number of selected component (if applicable):

openstack-cinder-2013.2-0.9.b3.el6ost.noarch
glusterfs-fuse-3.4.0.33rhs-1.el6rhs.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Configure cinder to work with gluster.
2. Try to create a snapshot (this fails).
3. Try to delete the snapshot.

Actual results:

The snapshot cannot be deleted.

Expected results:

The snapshot should be deleted successfully.

Additional info:


2013-10-08 19:54:38.955 7297 ERROR cinder.openstack.common.rpc.amqp [req-da5b9f8c-fca9-4615-a5e4-0caec2012bd7 c02995f25ba44cfab1a3cbd419f045a1 c77235c29fd0431a8e6628ef6d18e07f] Exception during message handling
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp Traceback (most recent call last):
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/amqp.py", line 441, in _process_data
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     **args)
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/dispatcher.py", line 148, in dispatch
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     return getattr(proxyobj, method)(ctxt, **kwargs)
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/cinder/volume/manager.py", line 377, in delete_snapshot
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     {'status': 'error_deleting'})
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     self.gen.next()
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/cinder/volume/manager.py", line 365, in delete_snapshot
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     self.driver.delete_snapshot(snapshot_ref)
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/glusterfs.py", line 524, in delete_snapshot
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     snap_info = self._read_info_file(info_path)
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/glusterfs.py", line 489, in _read_info_file
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     return json.loads(self._read_file(info_path))
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/glusterfs.py", line 479, in _read_file
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp     with open(filename, 'r') as f:
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp IOError: [Errno 2] No such file or directory: u'/var/lib/cinder/mnt/4a31bc6e5fb9244971075aa23d364364/volume-e78978af-0f46-4caf-948b-218afb5de6ef.info'
2013-10-08 19:54:38.955 7297 TRACE cinder.openstack.common.rpc.amqp 
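
The traceback comes down to opening a snapshot info file that was never written. Below is a minimal sketch of that failure mode. It is not the driver's code: `read_info_file` and `try_read` are hypothetical helpers, and a temporary directory stands in for the GlusterFS mount.

```python
import errno
import json
import os
import tempfile


def read_info_file(info_path):
    """Load snapshot metadata from a JSON .info file (hypothetical helper)."""
    with open(info_path, "r") as f:
        return json.load(f)


def try_read(info_path):
    """Return (data, errno) so the caller can see why the read failed."""
    try:
        return read_info_file(info_path), None
    except (IOError, OSError) as exc:
        return None, exc.errno


# A temporary directory stands in for the mounted share (assumption).
mount_dir = tempfile.mkdtemp()
missing = os.path.join(mount_dir, "volume-example.info")

data, err = try_read(missing)
print(data, err == errno.ENOENT)
```

With no .info file present, the read fails with ENOENT, which matches the "[Errno 2] No such file or directory" in the log.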


https://bugs.launchpad.net/cinder/+bug/1236975

Comment 1 Eric Harney 2013-10-08 17:50:34 UTC
Probably the same issue as bug 1016798.

Comment 2 Giulio Fidente 2013-10-10 14:00:47 UTC
I'm not sure this is the same bug. Even though I had "cinder/cinder" as owner of the remote gluster share, the snapshots (and only the snapshots, not the volumes) appear to be created by root:

  [root@gfidente-rhos-on_qa a3082d8779198266d36d261ff40ac12b(keystone_admin)]# ls -la

  drwxr-xr-x. 3 cinder cinder       4096 Oct 10 16:57 .
  drwxr-xr-x. 3 cinder cinder       4096 Oct 10 14:19 ..
  -rw-rw-rw-. 1 cinder cinder 1073741824 Sep 23 14:34 volume-1be1a352-e847-4991-8b5f-c2adde86f6ba
  -rw-rw-rw-. 1 cinder cinder 1073741824 Sep 23 16:45 volume-3ef968fa-5027-464b-b234-cd8ffc474f3c
  
  [...]
  
  -rw-rw-rw-. 1 root   root   1073741824 Oct 10 16:48 volume-f7e752e2-8134-4714-a12b-feeaff2b25d4
  -rw-r--r--. 1 root   root       197120 Oct 10 16:57 volume-f7e752e2-8134-4714-a12b-feeaff2b25d4.04afa002-238b-4f4d-8b30-927607f5e05e
  -rw-r--r--. 1 root   root       197120 Oct 10 16:49 volume-f7e752e2-8134-4714-a12b-feeaff2b25d4.866d2eed-d9bb-4504-9be3-3ab027e4a1f4
  -rw-r--r--. 1 cinder cinder        349 Oct 10 16:57 volume-f7e752e2-8134-4714-a12b-feeaff2b25d4.info
  
  [root@gfidente-rhos-on_qa a3082d8779198266d36d261ff40ac12b(keystone_admin)]#
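
A quick way to spot this kind of ownership mismatch without reading `ls` output by eye is to compare each file's owner against the expected service user. This is only a sketch: `files_not_owned_by` is a hypothetical helper, and the demo uses a temporary directory and the current user instead of the real mount and the cinder UID.

```python
import os
import tempfile


def files_not_owned_by(directory, expected_uid):
    """Return sorted names of regular files whose owner differs from expected_uid."""
    mismatched = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.stat(path).st_uid != expected_uid:
            mismatched.append(name)
    return mismatched


# Demo on a scratch directory (assumption: stands in for the share mount point).
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "volume-example"), "w").close()

# Files created by this process belong to the effective user, so nothing is flagged.
print(files_not_owned_by(demo_dir, os.geteuid()))
```

On a real host the expected UID could be looked up with `pwd.getpwnam("cinder").pw_uid`, assuming the service runs as a user named cinder.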

Comment 3 Eric Harney 2013-10-10 14:28:33 UTC
(In reply to Giulio Fidente from comment #2)
> I'm not sure this is the same bug, even though I had "cinder/cinder" as
> owner of the remote gluster share, the snapshots (and only the snapshots,
> not the volumes) appear to be created by root:

The snapshot files are created as root, but it looks like the .info file may not have been created at all (it couldn't be written due to permissions), which causes the error at deletion time. This is why I think it is caused by the same permissions issue.

Comment 4 Dafna Ron 2013-10-10 15:04:58 UTC
(In reply to Eric Harney from comment #3)
> (In reply to Giulio Fidente from comment #2)
> > I'm not sure this is the same bug, even though I had "cinder/cinder" as
> > owner of the remote gluster share, the snapshots (and only the snapshots,
> > not the volumes) appear to be created by root:
> 
> The snapshot files are created as root, but it looks like the .info file
> maybe was not created at all (since it couldn't be written due to perms),
> which causes the error at deletion time.  This is why I'm thinking it is
> caused by the same permissions issue.

I still see a problem here :) 

I think this should be fixed regardless, and I see two possible solutions.
1. If the .info file is needed for the delete process, then we should roll back on the create; otherwise we are left with a snapshot that we cannot delete without manual intervention in the storage and locally (this applies whenever the .info file cannot be created, for any reason).
2. If we do not actually need it, then the code should continue regardless of whether the .info file exists, delete the snapshot anyway, and log a WARN.
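
Option 2 could look roughly like the sketch below. It is not the driver's code and not the fix that shipped. `read_info_or_empty` and `delete_snapshot_files` are hypothetical names, the info file is assumed to be a JSON mapping of snapshot ID to file name, and the fallback file name `<volume file>.<snapshot id>` is inferred from the directory listing in comment 2. The sketch also ignores any handling of the snapshot's backing chain.

```python
import errno
import json
import logging
import os
import tempfile

LOG = logging.getLogger(__name__)


def read_info_or_empty(info_path):
    """Return parsed .info contents, or {} with a warning if the file is missing."""
    try:
        with open(info_path, "r") as f:
            return json.load(f)
    except (IOError, OSError) as exc:
        if exc.errno != errno.ENOENT:
            raise  # Permission errors and similar should still surface.
        LOG.warning("Info file %s not found; continuing without it", info_path)
        return {}


def delete_snapshot_files(volume_path, snapshot_id):
    """Best-effort removal of one snapshot file; True if a file was removed."""
    info = read_info_or_empty(volume_path + ".info")
    # Assumed naming convention, used when the info file has no entry.
    default_name = "%s.%s" % (os.path.basename(volume_path), snapshot_id)
    snap_name = info.get(snapshot_id, default_name)
    snap_path = os.path.join(os.path.dirname(volume_path), snap_name)
    if not os.path.exists(snap_path):
        LOG.warning("Snapshot file %s not found; nothing to remove", snap_path)
        return False
    os.remove(snap_path)
    return True


# Demo: a snapshot file exists but its .info file was never written.
share = tempfile.mkdtemp()
volume = os.path.join(share, "volume-example")
open(volume, "w").close()
open(volume + ".snap-1", "w").close()
print(delete_snapshot_files(volume, "snap-1"))
```

Only a missing file is tolerated here; any other I/O error is re-raised so that a real permissions problem is not silently hidden.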

Comment 5 Eric Harney 2013-10-10 16:31:33 UTC
(In reply to Dafna Ron from comment #4)
> (In reply to Eric Harney from comment #3)
> > (In reply to Giulio Fidente from comment #2)
> I still see a problem here :) 
> 
> I think this should be fixed regardless with two possible solutions. 
> 1. if the .info file is needed for the delete process then we should roll
> back on the create because else we are left with a snapshot that we cannot
> delete without a manual intervention in the storage and locally (this is for
> any reason the .info file cannot be created). 

Cinder does not support the notion of rollback when the create fails in this case. What should happen is that the snapshot ends up in "error" state and can then be removed. However, this is probably not possible currently, since we require the info file to be present at delete time.

Comment 6 Dafna Ron 2013-10-10 16:34:18 UTC
Yes. Hence, I still think it's a bug.
If the .info file was not created for any reason, we cannot delete the snapshot even if it's in an error status. I see that as a bug.

Comment 11 Eric Harney 2014-01-09 22:24:52 UTC
It is now possible to delete the snapshot if creation fails in this manner.
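
The Doc Text for this bug describes the fixed behavior as checking write access to the share before any data is modified. A minimal sketch of such a pre-check follows. `ShareNotWritable`, `ensure_share_writable`, and this `delete_snapshot` are hypothetical and not taken from the cinder source; a temporary directory stands in for the share.

```python
import os
import tempfile


class ShareNotWritable(Exception):
    """Raised when the service cannot write to the share (hypothetical)."""


def ensure_share_writable(mount_path):
    """Fail early with a clear error, before any snapshot data is touched."""
    if not os.access(mount_path, os.W_OK | os.X_OK):
        raise ShareNotWritable("No write access to share at %s" % mount_path)


def delete_snapshot(mount_path, snapshot_file):
    """Remove a snapshot file only after the share passes the write check."""
    ensure_share_writable(mount_path)
    os.remove(os.path.join(mount_path, snapshot_file))


# Demo on a scratch directory (assumption: stands in for the mounted share).
share_dir = tempfile.mkdtemp()
open(os.path.join(share_dir, "snapshot-example"), "w").close()
delete_snapshot(share_dir, "snapshot-example")
print(os.listdir(share_dir))
```

Note that `os.access` tests against the real user ID and can race with later operations, so a check like this mainly serves to produce a clear error message up front.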

Comment 13 Yogev Rabl 2014-01-14 14:26:46 UTC
Verified the bug on:
python-cinderclient-1.0.7-2.el6ost.noarch
python-cinder-2013.2.1-4.el6ost.noarch
openstack-cinder-2013.2.1-4.el6ost.noarch

Comment 16 Lon Hohberger 2014-02-04 17:19:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2014-0046.html

