Bug 1570314

Summary:	When creating a stack with not enough resource, volumes remain in attaching
Product:	Red Hat OpenStack	Reporter:	David Vallee Delisle <dvd>
Component:	openstack-nova	Assignee:	Lee Yarwood <lyarwood>
Status:	CLOSED ERRATA	QA Contact:	OSP DFG:Compute <osp-dfg-compute>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	12.0 (Pike)	CC:	berrange, dasmith, dvd, eglynn, geguileo, jhakimra, kchamart, lyarwood, marjones, mburns, nova-maint, pgrist, sbaker, sbauza, sferdjao, sgordon, shardy, slinaber, srevivo, vromanso
Target Milestone:	z3	Keywords:	Triaged, ZStream
Target Release:	12.0 (Pike)
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	openstack-nova-16.1.4-2.el7ost	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2018-08-20 12:55:30 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description David Vallee Delisle 2018-04-21 20:55:02 UTC

Description of problem:
When we create a stack too large for the number of resources available, cinder leaves the volume in attaching

~~~
2018-03-29 18:16:57Z [dvd-iperf.resourcegroup]: CREATE_FAILED  BadRequest: resources.resourcegroup.resources[0].resources.instance_1: Invalid volume: Invalid input received: Invalid volume: Volume status must be available to reserve, but the status is attaching. (HTTP 400) (Request-ID: req-bf16393c-d4cc-4cb1-bedc-cc8
2018-03-29 18:16:57Z [dvd-iperf]: CREATE_FAILED  Resource CREATE failed: BadRequest: resources.resourcegroup.resources[0].resources.instance_1: Invalid volume: Invalid input received: Invalid volume: Volume status must be available to reserve, but the status is attaching. (HTTP 400)
~~~

I'm not sure if it's a heat problem or a cinder problem.


Version-Release number of selected component (if applicable):
openstack-cinder-11.0.1-2

How reproducible:
All the time

Steps to Reproduce:
1. stack create a stack too big for the available resource
2. wait

Actual results:
2018-03-29 18:16:57Z [dvd-iperf.resourcegroup]: CREATE_FAILED  BadRequest: resources.resourcegroup.resources[0].resources.instance_1: Invalid volume: Invalid input received: Invalid volume: Volume status must be available to reserve, but the status is attaching. (HTTP 400) (Request-ID: req-bf16393c-d4cc-4cb1-bedc-cc8
2018-03-29 18:16:57Z [dvd-iperf]: CREATE_FAILED  Resource CREATE failed: BadRequest: resources.resourcegroup.resources[0].resources.instance_1: Invalid volume: Invalid input received: Invalid volume: Volume status must be available to reserve, but the status is attaching. (HTTP 400)

Expected results:
Stack creation should fail with not enough resource available and the volumes should either be deleted or be available.

Additional info:
~~~
2018-03-28 18:04:46.978 47 INFO heat.engine.resource [req-dec914ce-7f07-4384-ace9-b61dc3efc25f - admin - default default] CREATE: Server "instance_1" [45347f28-571a-4375-bacc-277b140caedb] Stack "iperf-resourceg
roup-wup7ms3m5snx-0-aiao3t3ahq4j" [cce02bae-a737-4a31-b96d-f761f1e4a095]
2018-03-28 18:04:46.978 47 ERROR heat.engine.resource Traceback (most recent call last):
2018-03-28 18:04:46.978 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 831, in _action_recorder
2018-03-28 18:04:46.978 47 ERROR heat.engine.resource     yield
2018-03-28 18:04:46.978 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 939, in _do_action
2018-03-28 18:04:46.978 47 ERROR heat.engine.resource     yield self.action_handler_task(action, args=handler_args)
2018-03-28 18:04:46.978 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/scheduler.py", line 351, in wrapper
2018-03-28 18:04:46.978 47 ERROR heat.engine.resource     step = next(subtask)
2018-03-28 18:04:46.978 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 890, in action_handler_task
2018-03-28 18:04:46.978 47 ERROR heat.engine.resource     done = check(handler_data)
2018-03-28 18:04:46.978 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/openstack/nova/server.py", line 869, in check_create_complete
2018-03-28 18:04:46.978 47 ERROR heat.engine.resource     check = self.client_plugin()._check_active(server_id)
2018-03-28 18:04:46.978 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/clients/os/nova.py", line 238, in _check_active
2018-03-28 18:04:46.978 47 ERROR heat.engine.resource     'code': fault.get('code', _('Unknown'))
2018-03-28 18:04:46.978 47 ERROR heat.engine.resource ResourceInError: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2018-03-28 18:04:46.978 47 ERROR heat.engine.resource 
2018-03-28 18:04:53.175 47 INFO heat.engine.scheduler [req-dec914ce-7f07-4384-ace9-b61dc3efc25f - admin - default default] Task pause timed out
2018-03-28 18:04:59.890 47 INFO heat.engine.resource [req-dec914ce-7f07-4384-ace9-b61dc3efc25f - admin - default default] CREATE: Server "instance_1" [45347f28-571a-4375-bacc-277b140caedb] Stack "iperf-resourcegroup-wup7ms3m5snx-0-aiao3t3ahq4j" [cce02bae-a737-4a31-b96d-f761f1e4a095]
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource Traceback (most recent call last):
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 831, in _action_recorder
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource     yield
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 939, in _do_action
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource     yield self.action_handler_task(action, args=handler_args)
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/scheduler.py", line 334, in wrapper
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource     step = next(subtask)
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 884, in action_handler_task
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource     handler_data = handler(*args)
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/openstack/nova/server.py", line 859, in handle_create
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource     admin_pass=admin_pass)
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/novaclient/v2/servers.py", line 1403, in create
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource     **boot_kwargs)
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/novaclient/v2/servers.py", line 802, in _boot
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource     return_raw=return_raw, **kwargs)
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/novaclient/base.py", line 361, in _create
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource     resp, body = self.api.client.post(url, body=body)
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 294, in post
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource     return self.request(url, 'POST', **kwargs)
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/novaclient/client.py", line 83, in request
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource     raise exceptions.from_response(resp, body, url, method)
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource BadRequest: Invalid volume: Invalid input received: Invalid volume: Volume status must be available to reserve, but the status is attaching. (HTTP 400) (Request-ID: req-a0c71ed3-8f06-4d2e-bac0-da94797fd664) (HTTP 400) (Request-ID: req-3681d1e3-4c90-4ed8-9434-9dbc3650a04c)
2018-03-28 18:04:59.890 47 ERROR heat.engine.resource
~~~

Comment 1 Gorka Eguileor 2018-04-23 08:36:34 UTC

The exception doesn't seem to indicate a Cinder issue, as it looks like someone is trying to reserve a volume for attaching twice, and not failing gracefully when Cinder complains that the volume is already reserved on the second call.

Cinder is mostly just following orders, so the RCA analysis should focus on the Heat template first, then on Nova, and as the last possible culprit look into Cinder.

Comment 2 Zane Bitter 2018-04-24 20:47:34 UTC

When Heat sees a Nova server go into an ERROR state while trying to create it (in this case because of "Message: No valid host was found. There are not enough hosts available., Code: 500"), it deletes the server and tries again.

Obviously it's failing the second time because the server is gone but Cinder still thinks it's attaching. Arguably this is a bug in Cinder/Nova - the server shouldn't disappear until everything it was doing has been cleaned up.

Heat could do better by returning the first failure - which it already does if things always fail by going into an ERROR state, but not when a non-retryable error (BadRequest in this case) occurs later.

Comment 3 Gorka Eguileor 2018-04-25 09:39:46 UTC

Reassigning to Nova, since according to Zane's investigation Nova is proceeding without error on instance deletion when the volume is not left in the right state.

Comment 4 David Vallee Delisle 2018-04-27 16:55:47 UTC

It's pretty easily reproducible, let me know if you want to see this in a live session.

Comment 23 errata-xmlrpc 2018-08-20 12:55:30 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2332