888987 – Volume stuck in error_deleting

Summary: Volume stuck in error_deleting

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	python-cinderclient
Sub Component:
Version:	2.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	snapshot4
Target Release:	2.1
Assignee:	Eric Harney
QA Contact:	Attila Fazekas
Docs Contact:
URL:
Whiteboard:
Depends On:	896153
Blocks:

Reported:	2012-12-19 23:35 UTC by Graeme Gillies
Modified:	2016-04-26 23:13 UTC
CC List:	2 users
Fixed In Version:	1.0.0.20-3
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2013-03-21 19:03:58 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
OpenStack gerrit	20064	0	None	MERGED	Add ability to call force_delete from cinderclient	2021-01-12 06:31:14 UTC
Red Hat Product Errata	RHBA-2013:0672	0	normal	SHIPPED_LIVE	Red Hat OpenStack 2.0 (Folsom) Preview bug fix and enhancement update	2013-03-21 23:02:46 UTC

Description Graeme Gillies 2012-12-19 23:35:18 UTC

Hi,

Unfortunately, somewhere along the line one of our volumes failed to delete because of a permissions error. This threw some errors into the OpenStack log, and now when I run cinder list --all-tenants I see:

+--------------------------------------+----------------+-------------------------+------+-------------+--------------------------------------+
|                  ID                  |     Status     |       Display Name      | Size | Volume Type |             Attached to              |
+--------------------------------------+----------------+-------------------------+------+-------------+--------------------------------------+
| 3dc53676-81e7-4b5e-9b3c-f4603d2b2846 | error_deleting |        testcinder       |  1   |     None    |                                      |
+--------------------------------------+----------------+-------------------------+------+-------------+--------------------------------------+

The volume is sitting there in error_deleting. After fixing the problem I tried to use cinder delete to get rid of it, but it gives me a 400 error saying the volume is in a bad state.

What is the best path to recover from this?

Regards,

Graeme

Comment 2 Eric Harney 2012-12-20 17:30:30 UTC

Cinder only allows deletion when the volume is in certain states -- not including error_deleting.

When this happens, you can remove the volume by logging into the database, removing it from the volumes table (which will involve updating reservations and quota_usages as well), and also removing the LV created for it.  The Cinder volume service will need to be restarted as well.

It may be easier to update the database's state for the volume to "error" instead of "error_deleting" and try again (possibly after restarting services) if you think the delete will succeed.

I can look into how to improve this -- I think in these cases the ability to retry deletion is needed.  (One concern though, is that we can't do so in such a way that allows volumes to be deleted w/o secure delete succeeding.)
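The state check described in this comment can be illustrated with a minimal sketch. This is a hypothetical model, not Cinder's actual code: the names `DELETABLE_STATES`, `InvalidVolumeState`, and `request_delete` are invented for illustration, and the exact set of states that permit deletion is an assumption.

```python
# Hypothetical model of the state check described above; not Cinder's code.
# ASSUMPTION: the set of states that permit deletion is illustrative.
DELETABLE_STATES = {"available", "error"}


class InvalidVolumeState(Exception):
    """Raised when a delete is requested from a state that does not permit it."""


def request_delete(volume):
    """Move a volume to 'deleting' only if its current status allows it."""
    if volume["status"] not in DELETABLE_STATES:
        raise InvalidVolumeState(
            "cannot delete volume %s in status %r"
            % (volume["id"], volume["status"])
        )
    volume["status"] = "deleting"
    return volume


# The volume from the report is rejected because of its status.
stuck = {"id": "3dc53676-81e7-4b5e-9b3c-f4603d2b2846", "status": "error_deleting"}
try:
    request_delete(stuck)
except InvalidVolumeState as exc:
    print("rejected:", exc)
```

Under this model, a volume in error_deleting has no way back through the API alone, which is why the workarounds above go through the database.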

Comment 3 Graeme Gillies 2012-12-20 23:25:14 UTC

I ran the following sql statement

mysql> update volumes set status = 'error' where id = '3dc53676-81e7-4b5e-9b3c-f4603d2b2846';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

Then did

cinder delete 3dc53676-81e7-4b5e-9b3c-f4603d2b2846

And that seems to have fixed it.
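The status reset above can be tried safely against a throwaway database first. The following is a minimal sketch under stated assumptions: it uses an in-memory SQLite database rather than MySQL, the one-table schema is a simplified stand-in for the real volumes table, and `reset_stuck_volume` is a hypothetical helper name.

```python
import sqlite3

# ASSUMPTION: a one-table stand-in for the real Cinder schema, using in-memory
# SQLite instead of the MySQL database from the comment above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE volumes (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
conn.execute(
    "INSERT INTO volumes (id, status) VALUES (?, ?)",
    ("3dc53676-81e7-4b5e-9b3c-f4603d2b2846", "error_deleting"),
)


def reset_stuck_volume(conn, volume_id):
    """Set status to 'error' only if the volume is currently 'error_deleting'.

    Returns the number of rows changed (0 or 1).
    """
    cur = conn.execute(
        "UPDATE volumes SET status = 'error' "
        "WHERE id = ? AND status = 'error_deleting'",
        (volume_id,),
    )
    conn.commit()
    return cur.rowcount


changed = reset_stuck_volume(conn, "3dc53676-81e7-4b5e-9b3c-f4603d2b2846")
print("rows changed:", changed)
```

The extra `status = 'error_deleting'` condition is an addition to the statement in the comment. It keeps the update from touching a volume that has moved to some other state in the meantime.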

Thanks for the help. It's up to you whether to close this ticket or leave it open as a reminder to add a section to the docs on what to do when this happens.

Regards,

Graeme

Comment 4 Flavio Percoco 2013-01-14 23:30:41 UTC

(In reply to comment #2)
> I can look into how to improve this -- I think in these cases the ability to
> retry deletion is needed.  (One concern though, is that we can't do so in
> such a way that allows volumes to be deleted w/o secure delete succeeding.)

What about a --force option that _forces_ certain commands like this one?

Comment 5 Eric Harney 2013-01-17 21:47:39 UTC

(In reply to comment #4)
> (In reply to comment #2)
> > I can look into how to improve this -- I think in these cases the ability to
> > retry deletion is needed.  (One concern though, is that we can't do so in
> > such a way that allows volumes to be deleted w/o secure delete succeeding.)
> 
> What about a --force option that _forces_ certain commands like this one?

I've considered this, but I'm not sure it's that straightforward.  A --force option would let you force a delete from a different state -- but it shouldn't always force a delete through, since whatever is causing the failure may cause a leak of some storage resource that doesn't get cleaned up, etc.  (And secure delete must be ensured.)  Manual intervention may be preferred.

So, I think a better first step would be to add error_deleting to the list of states that a delete operation is allowed from, which would at least help in a case like this one, as the user could take some action, then retry.  I'll try this out.
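The proposal in this comment, allowing a delete to be retried from error_deleting, can be sketched as follows. Everything here is hypothetical: the state sets, `delete_volume`, and the two backend callables are illustrative names, not Cinder internals.

```python
# Hypothetical sketch of the proposal above; names and states are illustrative.
ALLOWED_BEFORE = {"available", "error"}
ALLOWED_AFTER = ALLOWED_BEFORE | {"error_deleting"}


def delete_volume(volume, backend_delete, allowed_states):
    """Attempt a delete; return True on success, False if rejected or failed.

    backend_delete is a callable standing in for the storage driver's
    (secure) delete; it raises OSError on failure.
    """
    if volume["status"] not in allowed_states:
        return False  # request rejected, status unchanged
    volume["status"] = "deleting"
    try:
        backend_delete(volume)
    except OSError:
        volume["status"] = "error_deleting"
        return False
    volume["status"] = "deleted"
    return True


def failing_backend(volume):
    raise OSError("permission denied")


def working_backend(volume):
    return None


vol = {"id": "vol-1", "status": "available"}
delete_volume(vol, failing_backend, ALLOWED_AFTER)  # backend delete fails
first_status = vol["status"]
retried = delete_volume(vol, working_backend, ALLOWED_AFTER)  # retry after a fix
print(first_status, "->", vol["status"], "retried ok:", retried)
```

In this sketch the volume only reaches "deleted" when the backend delete itself succeeds, so a retry never bypasses the secure delete step. That is the property the comment is concerned about with a blanket --force option.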

Comment 6 Eric Harney 2013-01-18 15:54:32 UTC

Looks like some movement upstream is happening with this.

Comment 8 Eric Harney 2013-02-08 20:07:36 UTC

The current plan here is to backport the "force_delete" operation from Grizzly which will allow users a method to clean up in situations like this, where "delete" is not allowed.

Comment 9 Eric Harney 2013-02-19 17:03:41 UTC

You can now run "cinder force-delete <vol-id>" as an admin user.
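For scripting, the same operation can be expressed as an HTTP request. The sketch below only builds the request and sends nothing. It assumes the command maps to a volume action named "os-force_delete" posted to the volume's action URL, with the path relative to the tenant-scoped volume endpoint; treat both as assumptions to verify against your deployment. `build_force_delete_request` is a hypothetical helper.

```python
import json


def build_force_delete_request(volume_id):
    """Return (method, path, body) for a force-delete volume action.

    ASSUMPTION: the action name and the path layout (relative to the
    tenant-scoped volume API endpoint) are not confirmed by this report.
    """
    path = "/volumes/%s/action" % volume_id
    body = json.dumps({"os-force_delete": None})
    return "POST", path, body


method, path, body = build_force_delete_request(
    "3dc53676-81e7-4b5e-9b3c-f4603d2b2846"
)
print(method, path, body)
```

As noted in the comment, the operation requires an admin user, so a real request would also need an admin-scoped authentication token.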

Comment 11 Attila Fazekas 2013-03-02 13:34:08 UTC

It looks like very few things can cause the error_deleting state nowadays.
I renamed /bin/dd in order to reach that state.

Comment 13 errata-xmlrpc 2013-03-21 19:03:58 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0672.html
