Bug 1845480 - Stack update fails due to mistral resource conflict
Summary: Stack update fails due to mistral resource conflict
Keywords:
Status: CLOSED DUPLICATE of bug 1805507
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-heat
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Zane Bitter
QA Contact: David Rosenfeld
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-06-09 11:06 UTC by Ketan Mehta
Modified: 2023-10-06 20:30 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-06-10 04:30:33 UTC
Target Upstream Version:
Embargoed:


Description Ketan Mehta 2020-06-09 11:06:39 UTC
Description of problem:

To reuse two compute nodes for another role, the user deleted them with 'openstack server delete <uuid>' instead of 'openstack overcloud node delete'.

They then enrolled the nodes with a different role and ran a stack update, which failed at the Mistral workflow step.

~~~
Stack update fails at workflow step 2 (Mistral workflow):
 
overcloud.AllNodesDeploySteps.WorkflowTasks_Step2:
  resource_type: OS::Mistral::Workflow
  physical_resource_id:
  status: CREATE_FAILED
  status_reason: |
    Conflict: resources.WorkflowTasks_Step2: Conflict (HTTP 409)
Heat Stack update failed.
Heat Stack update failed.
~~~

After that, the nodes that were to be scaled out with the new role were removed and a normal stack update was run. It failed with the same error.

On further checking, this looks like a database conflict on a Mistral workflow definition, which causes Heat to fail with HTTP 409.

~~~
2020-05-26 17:15:10.376 4116 ERROR heat.engine.resource Traceback (most recent call last):
2020-05-26 17:15:10.376 4116 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 921, in _action_recorder
2020-05-26 17:15:10.376 4116 ERROR heat.engine.resource     yield
2020-05-26 17:15:10.376 4116 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 1034, in _do_action
2020-05-26 17:15:10.376 4116 ERROR heat.engine.resource     yield self.action_handler_task(action, args=handler_args)
2020-05-26 17:15:10.376 4116 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/scheduler.py", line 329, in wrapper
2020-05-26 17:15:10.376 4116 ERROR heat.engine.resource     step = next(subtask)
2020-05-26 17:15:10.376 4116 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 975, in action_handler_task
2020-05-26 17:15:10.376 4116 ERROR heat.engine.resource     handler_data = handler(*args)
2020-05-26 17:15:10.376 4116 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/openstack/mistral/workflow.py", line 568, in handle_create
2020-05-26 17:15:10.376 4116 ERROR heat.engine.resource     raise exception.ResourceFailure(ex, self)
2020-05-26 17:15:10.376 4116 ERROR heat.engine.resource ResourceFailure: Conflict: resources.WorkflowTasks_Step2: Conflict (HTTP 409)
2020-05-26 17:15:10.376 4116 ERROR heat.engine.resource
2020-05-26 17:15:10.389 4116 INFO heat.engine.stack [req-077da6fc-f109-49da-8a57-ac28d23fd4e1 - admin - default default] Stack CREATE FAILED (overcloud-AllNodesDeploySteps-kx3slplpjekb): Resource CREATE failed: Conflict: resources.WorkflowTasks_Step2: Conflict (HTTP 409)

2020-05-26 17:15:10.670 4120 INFO heat.engine.resource [req-e7bafcf5-0e9b-427a-b16f-259ec71066f4 - - - - -] CREATE: TemplateResource "AllNodesDeploySteps" [2664afaa-831c-4311-a5e2-8faf04f225f5] Stack "overcloud" [9f0e3bef-46eb-4668-85ff-f25e57a6d96e]
2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource Traceback (most recent call last):
2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 921, in _action_recorder
2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource     yield
2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 1034, in _do_action
2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource     yield self.action_handler_task(action, args=handler_args)
2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/scheduler.py", line 346, in wrapper
2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource     step = next(subtask)
2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 983, in action_handler_task
2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource     done = check(handler_data)
2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/stack_resource.py", line 404, in check_create_complete
2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource     return self._check_status_complete(self.CREATE)
2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/stack_resource.py", line 458, in _check_status_complete
2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource     action=action)
2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource ResourceFailure: Conflict: resources.AllNodesDeploySteps.resources.WorkflowTasks_Step2: Conflict (HTTP 409)
2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource
2020-05-26 17:15:10.681 4120 INFO heat.engine.stack [req-e7bafcf5-0e9b-427a-b16f-259ec71066f4 - - - - -] Stack UPDATE FAILED (overcloud): Resource CREATE failed: Conflict: resources.AllNodesDeploySteps.resources.WorkflowTasks_Step2: Conflict (HTTP 409)
~~~
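To pull the failing resource path out of a heat-engine log like the excerpt above, a small filter along these lines may help. This is only a sketch: the regular expression is an assumption based on the line layout shown in this report, and `find_resource_failures` is a hypothetical helper name, not part of Heat.

```python
import re

# Assumed layout of the last traceback line, taken from the excerpt above:
# "... ERROR heat.engine.resource ResourceFailure: <Error>: <resource path>: <message>"
FAILURE_RE = re.compile(
    r"ERROR heat\.engine\.resource ResourceFailure: "
    r"(?P<error>\w+): (?P<path>resources\.[\w.]+): (?P<message>.+)$"
)


def find_resource_failures(lines):
    """Return (resource_path, message) pairs for each ResourceFailure line."""
    failures = []
    for line in lines:
        match = FAILURE_RE.search(line.rstrip("\n"))
        if match:
            failures.append((match.group("path"), match.group("message")))
    return failures


# Three lines copied from the log excerpt in this report.
sample = [
    "2020-05-26 17:15:10.376 4116 ERROR heat.engine.resource     raise exception.ResourceFailure(ex, self)",
    "2020-05-26 17:15:10.376 4116 ERROR heat.engine.resource ResourceFailure: Conflict: resources.WorkflowTasks_Step2: Conflict (HTTP 409)",
    "2020-05-26 17:15:10.670 4120 ERROR heat.engine.resource ResourceFailure: Conflict: resources.AllNodesDeploySteps.resources.WorkflowTasks_Step2: Conflict (HTTP 409)",
]

for path, message in find_resource_failures(sample):
    print(path, "->", message)
```

The nested path in the second match shows that the failure starts in the AllNodesDeploySteps child stack and is then reported again by the parent overcloud stack.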
 
Mistral API:

~~~
2020-05-26 17:15:10.374 4075 ERROR mistral.utils.rest_utils [req-a072e0d1-0752-4a47-9fe1-55fdb9ede8ed 89c7f3354c2f431f8f9698f2956d32d5 6daaca1798af4e2b8d25fe0d465a8c17 - default default] Error during API call: Duplicate entry for WorkflowDefinition: ['name']: DBDuplicateEntryError: Duplicate entry for WorkflowDefinition: ['name']
~~~
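For illustration only, the following sketch shows how a leftover row with the same name can turn a create request into a conflict. It uses an in-memory SQLite table with a unique `name` column. The table, the `create_workflow` helper and the workflow name are all hypothetical and are not Mistral's actual schema or code.

```python
import sqlite3

# Hypothetical, simplified stand-in for a workflow-definition table.
# This is NOT Mistral's real schema; it only models a unique name column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE workflow_definitions ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)


def create_workflow(name):
    """Insert a definition and return an HTTP-style status code."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO workflow_definitions (name) VALUES (?)", (name,)
            )
        return 201
    except sqlite3.IntegrityError:
        # A row with the same name already exists, so the insert is rejected.
        # An API layer would typically report this as 409 Conflict.
        return 409


# Illustrative name only; the real workflow name is not shown in the logs above.
first = create_workflow("example_workflow_step2")
second = create_workflow("example_workflow_step2")
print(first, second)
```

If the situation here is analogous, a workflow definition left behind by an earlier run would explain why Heat's CREATE of WorkflowTasks_Step2 is rejected.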

I also noticed that two executions of the workflow tripleo.scale.v1.delete_node are in ERROR state.

~~~
(undercloud) [stack@undercloud ~]$ mistral execution-get 053468e0-fc66-4bb0-933b-8f08cfdd98a5

+--------------------+--------------------------------------+
| Field              | Value                                |
+--------------------+--------------------------------------+
| ID                 | 053468e0-fc66-4bb0-933b-8f08cfdd98a5 |
| Workflow ID        | b5c628d9-e1c8-4438-8295-8bcdc7e8f323 |
| Workflow name      | tripleo.scale.v1.delete_node         |
| Workflow namespace |                                      |
| Description        |                                      |
| Task Execution ID  | <none>                               |
| State              | ERROR                                |
| State info         | None                                 |
| Created at         | 2020-06-08 05:34:31                  |
| Updated at         | 2020-06-08 05:34:59                  |
+--------------------+--------------------------------------+

(undercloud) [stack@undercloud ~]$ mistral execution-get d1117cdc-6d71-4b34-a203-b9fbcdcac3c9

+--------------------+--------------------------------------+
| Field              | Value                                |
+--------------------+--------------------------------------+
| ID                 | d1117cdc-6d71-4b34-a203-b9fbcdcac3c9 |
| Workflow ID        | b5c628d9-e1c8-4438-8295-8bcdc7e8f323 |
| Workflow name      | tripleo.scale.v1.delete_node         |
| Workflow namespace |                                      |
| Description        |                                      |
| Task Execution ID  | <none>                               |
| State              | ERROR                                |
| State info         | None                                 |
| Created at         | 2020-06-08 05:35:30                  |
| Updated at         | 2020-06-08 05:35:56                  |
+--------------------+--------------------------------------+
~~~
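When several executions need checking, the two-column table printed by `mistral execution-get` can be turned into a dictionary with a few lines of Python. This is a minimal sketch that assumes the `| Field | Value |` layout shown above; `parse_cli_table` is a hypothetical helper, not part of python-mistralclient.

```python
def parse_cli_table(text):
    """Parse a two-column '| Field | Value |' table into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip '+---+' borders, prompts and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 2 and cells[0] != "Field":
            fields[cells[0]] = cells[1]
    return fields


# Shortened copy of the first execution shown in this report.
sample_output = """
+--------------------+--------------------------------------+
| Field              | Value                                |
+--------------------+--------------------------------------+
| ID                 | 053468e0-fc66-4bb0-933b-8f08cfdd98a5 |
| Workflow name      | tripleo.scale.v1.delete_node         |
| Workflow namespace |                                      |
| State              | ERROR                                |
| State info         | None                                 |
+--------------------+--------------------------------------+
"""

parsed = parse_cli_table(sample_output)
if parsed.get("State") == "ERROR":
    print(parsed["ID"], parsed["Workflow name"], "is in ERROR state")
```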

The user did not take a database backup before running 'openstack server delete', so there is no healthy backup. The only backup we have already contains the present failures.

Looking for suggestions on next steps.

Thanks.

Version-Release number of selected component (if applicable):

RHOSP-13, director

installed-rpms |grep -i heat

heat-cfntools-1.3.0-2.el7ost.noarch                         Tue Sep 10 11:52:11 2019
openstack-heat-api-10.0.3-5.el7ost.noarch                   Sun Sep 15 18:40:40 2019
openstack-heat-api-cfn-10.0.3-5.el7ost.noarch               Sun Sep 15 18:40:43 2019
openstack-heat-common-10.0.3-5.el7ost.noarch                Sun Sep 15 18:40:37 2019
openstack-heat-engine-10.0.3-5.el7ost.noarch                Sun Sep 15 18:40:46 2019
openstack-tripleo-heat-templates-8.3.1-54.el7ost.noarch     Sun Sep 15 19:35:43 2019
puppet-heat-12.4.1-0.20190214021237.a7ed720.el7ost.noarch   Tue Sep 10 11:52:07 2019
python2-heatclient-1.14.1-1.el7ost.noarch                   Tue Sep 10 11:52:47 2019
python-heat-agent-1.5.4-1.el7ost.noarch                     Tue Sep 10 11:52:47 2019

installed-rpms |grep -i mistral

openstack-mistral-api-6.0.6-2.el7ost.noarch                 Sun Sep 15 18:41:42 2019
openstack-mistral-common-6.0.6-2.el7ost.noarch              Sun Sep 15 18:41:39 2019
openstack-mistral-engine-6.0.6-2.el7ost.noarch              Sun Sep 15 18:41:45 2019
openstack-mistral-executor-6.0.6-2.el7ost.noarch            Sun Sep 15 18:41:48 2019
puppet-mistral-12.4.0-2.el7ost.noarch                       Tue Sep 10 11:52:08 2019
python2-mistralclient-3.3.0-1.el7ost.noarch                 Tue Sep 10 11:52:55 2019
python2-mistral-lib-0.4.0-1.el7ost.noarch                   Tue Sep 10 11:52:12 2019
python-mistral-6.0.6-2.el7ost.noarch                        Sun Sep 15 18:41:39 2019

