Description of problem:
openstack overcloud node delete exits only after 60 minutes, even if the stack update operation completed earlier:

(undercloud) [stack@undercloud-0 ~]$ time openstack overcloud node delete 4c199f44-d5c7-4733-bf14-8c2c38141f12
Deleting the following nodes from stack overcloud:
- 4c199f44-d5c7-4733-bf14-8c2c38141f12
Waiting for messages on queue 'tripleo' with no timeout.
Connection is already closed.

real 60m14.550s
user 0m1.023s
sys 0m0.221s

The stack update operation finished in less than 60 minutes:

(undercloud) [stack@undercloud-0 ~]$ openstack stack list
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| ID                                   | Stack Name | Project                          | Stack Status    | Creation Time        | Updated Time         |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| edecc270-20ff-4ac3-b08d-fe8c0877cd41 | overcloud  | 3c2b3888141742bd8fe464c163b3ca08 | UPDATE_COMPLETE | 2018-10-03T00:38:16Z | 2018-10-04T15:06:45Z |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+

Version-Release number of selected component (if applicable):
python-tripleoclient-heat-installer-10.5.1-0.20180906012842.el7ost.noarch
python-tripleoclient-10.5.1-0.20180906012842.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy overcloud with 1 controller + 2 computes
2. Remove one compute node: openstack overcloud node delete $node_uuid

Actual results:
The command appears to exit only after 60 minutes, even if the stack update finished earlier.

Expected results:
The openstack overcloud node delete command exits once the stack update has finished.

Additional info:
I'm wondering if this is related to the log rotation code downstream. We noticed that upstream heat/mistral do not play nicely when they are SIGHUP'd. Since I do not believe we've landed the copytruncate change for 14, it might be related to that.
Actually no, the process is completing but it seems like the messaging is getting lost.
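The symptom described so far (the stack update completes, yet the client keeps waiting on the 'tripleo' queue) can be illustrated with a minimal Python sketch. This is not tripleoclient code: the function name wait_for_final_message, the dict-shaped messages, and the use of the standard-library queue module are all assumptions made for illustration. It only shows why a wait with no timeout never returns when the final message is lost, and how a bounded wait behaves instead.

```python
import queue


def wait_for_final_message(messages, timeout=None):
    """Block until a message carrying a 'status' key arrives.

    Hypothetical helper, for illustration only. `timeout` bounds each
    individual wait on the queue; with timeout=None the call blocks
    forever if the final message never shows up.
    Returns the final message, or None if a wait timed out.
    """
    while True:
        try:
            message = messages.get(timeout=timeout)
        except queue.Empty:
            # Nothing arrived in time: give up instead of hanging.
            return None
        if "status" in message:
            return message
        # Intermediate (progress) messages are skipped.


# Healthy case: the final message is delivered.
delivered = queue.Queue()
delivered.put({"progress": "updating stack"})
delivered.put({"status": "SUCCESS"})
print(wait_for_final_message(delivered, timeout=1))

# Lost-message case: the wait ends only because a timeout was given.
print(wait_for_final_message(queue.Queue(), timeout=0.5))
```

In the second call, passing timeout=None instead would block indefinitely, which matches the "with no timeout" wording in the client output above.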
Ah ha, yaql error.

(undercloud) [cloud-user@undercloud heat]$ openstack workflow execution show 3cd8fdfe-8b5b-4be6-9922-163ed11b5110 -f yaml
/usr/lib/python2.7/site-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.21.1) or chardet (2.2.1) doesn't match a supported version!
  RequestsDependencyWarning)
ID: 3cd8fdfe-8b5b-4be6-9922-163ed11b5110
Workflow ID: cadaa0b6-b4a4-443f-bf04-7c5d67be0778
Workflow name: tripleo.scale.v1.delete_node
Workflow namespace: ''
Description: ''
Task Execution ID: <none>
Root Execution ID: <none>
State: ERROR
State info: "Failed to run task [error=Can not evaluate YAQL expression [expression=$.status,\
  \ error=u'status', data={}], wf=tripleo.scale.v1.delete_node, task=send_message]:\n\
  Traceback (most recent call last):\n File \"/usr/lib/python2.7/site-packages/mistral/engine/task_handler.py\"\
  , line 63, in run_task\n task.run()\n File \"/usr/lib/python2.7/site-packages/osprofiler/profiler.py\"\
  , line 159, in wrapper\n result = f(*args, **kwargs)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\
  , line 390, in run\n self._run_new()\n File \"/usr/lib/python2.7/site-packages/osprofiler/profiler.py\"\
  , line 159, in wrapper\n result = f(*args, **kwargs)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\
  , line 419, in _run_new\n self._schedule_actions()\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\
  , line 483, in _schedule_actions\n input_dict = self._get_action_input()\n File\
  \ \"/usr/lib/python2.7/site-packages/osprofiler/profiler.py\", line 159, in wrapper\n\
  \ result = f(*args, **kwargs)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\
  , line 514, in _get_action_input\n input_dict = self._evaluate_expression(self.task_spec.get_input(),\
  \ ctx)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\", line\
  \ 540, in _evaluate_expression\n ctx_view\n File \"/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py\"\
  , line 100, in evaluate_recursively\n data[key] = _evaluate_item(data[key], context)\n\
  \ File \"/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py\", line\
  \ 79, in _evaluate_item\n return evaluate(item, context)\n File \"/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py\"\
  , line 71, in evaluate\n return evaluator.evaluate(expression, context)\n File\
  \ \"/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py\", line\
  \ 159, in evaluate\n cls).evaluate(trim_expr, data_context)\n File \"/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py\"\
  , line 113, in evaluate\n \", data=%s]\" % (expression, str(e), data_context)\n\
  YaqlEvaluationException: Can not evaluate YAQL expression [expression=$.status,\
  \ error=u'status', data={}]\n"
Created at: '2018-10-08 22:24:40'
Updated at: '2018-10-08 22:32:54'
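The State info above says the send_message task tried to evaluate $.status against an empty context (data={}). The failure mode can be sketched in plain Python. This is not Mistral or YAQL code, and it is not the fix that was shipped: strict_lookup and lookup_with_default are hypothetical names, and a dict stands in for the task context. The sketch only contrasts a strict key lookup, which raises when the key is absent, with a lookup that falls back to a default.

```python
def strict_lookup(context, key):
    """Mimic a strict `$.<key>` style lookup: fail if the key is absent."""
    try:
        return context[key]
    except KeyError as exc:
        # Wrap the KeyError in a message shaped like the one in the report.
        raise ValueError(
            "Can not evaluate expression [expression=$.%s, error=%s, data=%s]"
            % (key, exc, context)
        )


def lookup_with_default(context, key, default):
    """Tolerant lookup: return `default` when the key is absent."""
    return context.get(key, default)


# An empty context, as in the report (data={}).
empty_context = {}

try:
    strict_lookup(empty_context, "status")
except ValueError as err:
    print("strict lookup failed:", err)

# The tolerant variant returns the fallback instead of raising.
print(lookup_with_default(empty_context, "status", "UNKNOWN"))
```

Whether a default value is appropriate depends on what the workflow should report when no status was published, so the fallback string here is purely a placeholder.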
VERIFIED

openstack-tripleo-common-9.4.1-0.20181012010866.67bab16.el7ost.noarch

+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks               |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| 8d7f1352-a261-482b-a366-bc1b8b36085f | compute-0    | ACTIVE | -          | Running     | ctlplane=192.168.24.15 |
| a320cf5d-ed11-4af7-9aad-829f9f82204f | controller-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.13 |
| 0674db7c-6023-4a08-a5dc-5448d16f9459 | controller-1 | ACTIVE | -          | Running     | ctlplane=192.168.24.7  |
| bd5135bb-a946-440f-a540-ff4e758b1830 | controller-2 | ACTIVE | -          | Running     | ctlplane=192.168.24.11 |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+

(undercloud) [stack@undercloud-0 ~]$ time openstack overcloud node delete 8d7f1352-a261-482b-a366-bc1b8b36085f
Deleting the following nodes from stack overcloud:
- 8d7f1352-a261-482b-a366-bc1b8b36085f
Waiting for messages on queue 'tripleo' with no timeout.

real 19m1.223s
user 0m0.981s
sys 0m0.233s

(undercloud) [stack@undercloud-0 ~]$ nova list
/usr/lib/python2.7/site-packages/urllib3/connection.py:344: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
/usr/lib/python2.7/site-packages/urllib3/connection.py:344: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks               |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| a320cf5d-ed11-4af7-9aad-829f9f82204f | controller-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.13 |
| 0674db7c-6023-4a08-a5dc-5448d16f9459 | controller-1 | ACTIVE | -          | Running     | ctlplane=192.168.24.7  |
| bd5135bb-a946-440f-a540-ff4e758b1830 | controller-2 | ACTIVE | -          | Running     | ctlplane=192.168.24.11 |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
Regarding comment #12 - I attempted to install using a later version of the openstack-tripleo-common rpm and did not achieve the result shown in comment #12. The stack trace can still be seen when performing a scale-in:

Command: openstack overcloud node delete <uuid>

Symptoms: The command hangs indefinitely. openstack stack list shows the stack update was successful after 10 minutes, but the command never returns.

[stack@undercloud (stackrc) ~]$ openstack stack list
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| ID                                   | Stack Name | Project                          | Stack Status    | Creation Time        | Updated Time         |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| 00462a04-88ff-4558-a8e0-d86f16f39241 | overcloud  | da55dbc940c54f8ca2f069b31563e0b4 | UPDATE_COMPLETE | 2018-11-12T04:31:08Z | 2018-11-12T17:56:57Z |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+

[stack@undercloud (stackrc) ~]$ openstack workflow execution show 39b85ffd-de65-499f-ab53-9c4677c72f7d -f yaml
ID: 39b85ffd-de65-499f-ab53-9c4677c72f7d
Workflow ID: 4ff89508-2f97-41c8-92fb-6f1490e1ec0e
Workflow name: tripleo.scale.v1.delete_node
Workflow namespace: ''
Description: ''
Task Execution ID: <none>
Root Execution ID: <none>
State: ERROR
State info: "Failed to run task [error=Can not evaluate YAQL expression [expression=$.status,\
  \ error=u'status', data={}], wf=tripleo.scale.v1.delete_node, task=send_message]:\n\
  Traceback (most recent call last):\n File \"/usr/lib/python2.7/site-packages/mistral/engine/task_handler.py\"\
  , line 63, in run_task\n task.run()\n File \"/usr/lib/python2.7/site-packages/osprofiler/profiler.py\"\
  , line 159, in wrapper\n result = f(*args, **kwargs)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\
  , line 390, in run\n self._run_new()\n File \"/usr/lib/python2.7/site-packages/osprofiler/profiler.py\"\
  , line 159, in wrapper\n result = f(*args, **kwargs)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\
  , line 419, in _run_new\n self._schedule_actions()\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\
  , line 483, in _schedule_actions\n input_dict = self._get_action_input()\n File\
  \ \"/usr/lib/python2.7/site-packages/osprofiler/profiler.py\", line 159, in wrapper\n\
  \ result = f(*args, **kwargs)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\
  , line 514, in _get_action_input\n input_dict = self._evaluate_expression(self.task_spec.get_input(),\
  \ ctx)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\", line\
  \ 540, in _evaluate_expression\n ctx_view\n File \"/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py\"\
  , line 100, in evaluate_recursively\n data[key] = _evaluate_item(data[key], context)\n\
  \ File \"/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py\", line\
  \ 79, in _evaluate_item\n return evaluate(item, context)\n File \"/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py\"\
  , line 71, in evaluate\n return evaluator.evaluate(expression, context)\n File\
  \ \"/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py\", line\
  \ 159, in evaluate\n cls).evaluate(trim_expr, data_context)\n File \"/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py\"\
  , line 113, in evaluate\n \", data=%s]\" % (expression, str(e), data_context)\n\
  YaqlEvaluationException: Can not evaluate YAQL expression [expression=$.status,\
  \ error=u'status', data={}]\n"
Created at: '2018-11-12 17:54:00'
Updated at: '2018-11-12 18:04:08'

rpms installed:

[stack@undercloud (stackrc) ~]$ rpm -qa | grep openstack-tripleo
openstack-tripleo-puppet-elements-9.0.0-0.20181007201103.daf9069.el7.noarch
openstack-tripleo-image-elements-9.0.1-0.20181007200834.2dc678a.el7.noarch
openstack-tripleo-common-containers-10.0.1-0.20181112071049.b8bfff8.el7.noarch
openstack-tripleo-validations-9.3.1-0.20181008110747.4064fb7.el7.noarch
openstack-tripleo-heat-templates-9.0.1-0.20181013060858.ffbe879.el7.noarch
openstack-tripleo-ui-9.3.1-0.20180921180340.df30b55.el7.noarch
openstack-tripleo-common-10.0.1-0.20181112071049.b8bfff8.el7.noarch
(In reply to Jim Bagwell from comment #15)
> Regarding comment #12 - I attempted to install using a later version of the
> openstack-tripleo-common rpm and did not achieve the result shown in comment
> #12.
>

Jim, this bug has already been verified by the QE team. Please open a new BZ providing the details of the failure you're seeing.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045