Bug 1655510 - Attachment should return to in-use if detach fails [NEEDINFO]
Summary: Attachment should return to in-use if detach fails
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: nova-maint
QA Contact: nova-maint
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-12-03 10:40 UTC by Eduard Barrera
Modified: 2019-09-09 16:54 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-17 15:34:15 UTC
Target Upstream Version:
mbooth: needinfo? (ebarrera)



Description Eduard Barrera 2018-12-03 10:40:53 UTC
Description of problem:

Sometimes, when a disk detach operation doesn't complete, the volume status stays in detaching. It should change to a failed state to give information to the operator.

  
Version-Release number of selected component (if applicable):
OSP 13

How reproducible:
always

Steps to Reproduce:
- When the volume is deleted manually from the Hitachi backend
- Perhaps others


Actual results:
Status stays in detaching...

Expected results:
It should fail after some time

Additional info:

Comment 3 Matthew Booth 2018-12-07 14:24:31 UTC
Firstly I think this is a bug not an RFE, and the bug report should be:

Volume remains in detaching state after failing to detach

I believe the correct behaviour should probably be to set the volume state back to in-use, and add an instance fault. I'll need to understand exactly how to reproduce the problem, though.
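
To make the proposed behaviour concrete, here is a minimal sketch, not nova code. `Volume`, `Instance`, and `detach_with_rollback` are hypothetical stand-ins invented for illustration; the only behaviour taken from this report is "return to in-use and record an instance fault when the detach fails".

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    # Hypothetical stand-in for the volume record; only the status matters here.
    status: str = "in-use"


@dataclass
class Instance:
    # Hypothetical stand-in for the instance; faults are kept as plain strings.
    faults: list = field(default_factory=list)


def detach_with_rollback(volume, instance, do_detach):
    """Attempt a detach; on any failure, restore in-use and record a fault."""
    volume.status = "detaching"
    try:
        do_detach()
    except Exception as exc:
        # Proposed behaviour: do not leave the volume stuck in detaching.
        volume.status = "in-use"
        instance.faults.append(f"{type(exc).__name__}: {exc}")
        return False
    volume.status = "available"
    return True


def failing_detach():
    # Simulates a backend that can no longer find the volume.
    raise RuntimeError("backend volume is gone")


vol, inst = Volume(), Instance()
ok = detach_with_rollback(vol, inst, failing_detach)
print(ok, vol.status, inst.faults)
```

With this shape the operator sees the failure in the instance fault list, and the volume can be retried or cleaned up because it is no longer stuck in detaching.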

Please could you provide clear steps to reproduce the initial detaching state, and the corresponding DEBUG logs for cinder api, cinder volume, nova api, and nova compute?

Comment 4 Matthew Booth 2018-12-07 15:45:15 UTC
Investigate https://review.openstack.org/#/c/590439/3/nova/virt/block_device.py

Comment 5 melanie witt 2018-12-07 17:31:58 UTC
(In reply to Matthew Booth from comment #4)
> Investigate
> https://review.openstack.org/#/c/590439/3/nova/virt/block_device.py

Adding a note here based on discussion in #rhos-compute: I realized that this bug fix ^ doesn't apply to Newton as the code is quite different. The bug fix above was restoring a roll_detaching call that was erroneously removed during a different, previous bug fix. But the erroneous removal happened _after_ Newton (OSP 10).

In Newton, the roll_detaching calls during detach failures are in the compute manager. Looking at the code, I noticed that roll_detaching is _not_ called when driver.detach_volume raises DiskNotFound (which seems like it would be raised if the volume was previously deleted manually from the storage backend):

https://github.com/openstack/nova/blob/newton-eol/nova/compute/manager.py#L4744

Here we see that roll_detaching is called when Exception is caught, but it is not called when DiskNotFound is caught. I think this might be the bug.
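
The pattern described above can be sketched in a simplified, self-contained form. This is an illustration under assumptions, not the Newton source: `FakeVolumeAPI` and `driver_detach` are hypothetical names, and `DiskNotFound` here is a local stand-in for the nova exception. The sketch covers only the exception handling and does not model whatever the detach flow does afterwards.

```python
class DiskNotFound(Exception):
    """Local stand-in for the nova DiskNotFound exception."""


class FakeVolumeAPI:
    """Hypothetical volume API that tracks a single volume's status."""

    def __init__(self):
        self.status = "detaching"

    def roll_detaching(self, volume_id):
        # Rolling back a detach returns the volume to in-use.
        self.status = "in-use"


def driver_detach(detach_volume, volume_api, volume_id):
    """Call the driver detach with the two handlers described above."""
    try:
        detach_volume()
    except DiskNotFound:
        # Swallowed without calling roll_detaching: the suspected gap.
        pass
    except Exception:
        # The generic handler rolls the volume back, then re-raises.
        volume_api.roll_detaching(volume_id)
        raise


def raise_disk_not_found():
    raise DiskNotFound("disk missing on the backend")


api = FakeVolumeAPI()
driver_detach(raise_disk_not_found, api, "vol-1")
print(api.status)  # no rollback happened on this path
```

In this model a DiskNotFound leaves the status untouched, while any other exception goes through roll_detaching before propagating.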

Comment 15 Matthew Booth 2019-01-17 15:34:15 UTC
I have closed this bug as it has been waiting for more info for at least 4 weeks. We only do this to ensure that we don't accumulate stale bugs which can't be addressed. If you are able to provide the requested information, please feel free to re-open this bug.

