Bug 1570050 - tripleo.storage.v1.ceph-install fails with external ceph on yaql $.ansible_output.get('plays')[0].get('tasks')[0].get('hosts') - index out of bounds
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: beta
Target Release: 13.0 (Queens)
Assignee: Giulio Fidente
QA Contact: Yogev Rabl
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-20 14:02 UTC by Pavel Sedlák
Modified: 2018-10-10 16:17 UTC

Fixed In Version: openstack-tripleo-heat-templates-8.0.2-0.20180416194361.29a5ad5.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-27 13:52:02 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 563193 | 0 | None | MERGED | Revert "Fixes ceph-external docker service name" | 2020-08-18 08:10:02 UTC
OpenStack gerrit | 563195 | 0 | None | MERGED | Revert "Fixes ceph-external docker service name" | 2020-08-18 08:10:02 UTC
Red Hat Product Errata | RHEA-2018:2086 | 0 | None | None | None | 2018-06-27 13:53:11 UTC

Internal Links: 1730921

Description Pavel Sedlák 2018-04-20 14:02:10 UTC
OSP13 director deployment (3 controller + 2 compute nodes, external Ceph) fails:


overcloud.AllNodesDeploySteps.WorkflowTasks_Step2_Execution:
  resource_type: OS::Mistral::ExternalResource
  physical_resource_id: 08ff1eee-1e54-413a-8780-d70fe03bcbc1
  status: CREATE_FAILED
  status_reason: |
    resources.WorkflowTasks_Step2_Execution: ERROR


This looks like an issue with Ceph/Ansible or something around it:

> | 08ff1eee-1e54-413a-8780-d70fe03bcbc1 | 440409a6-d0ab-484c-8a09-cfb6013be73a | tripleo.overcloud.workflow_tasks.step2                                 |                    | Heat managed           | <none>                               | ERROR   | Failure caused by error i... | 2018-04-20 02:51:05 | 2018-04-20 02:51:14 |
> | 1821197b-4482-4b38-8db0-fa6e01c04e62 | 708340a2-a98d-4a7d-8913-4532731ae94b | tripleo.storage.v1.ceph-install                                        |                    | sub-workflow execution | e74e88d1-5dd3-409a-90f5-267d4a3184c4 | ERROR   | Failed to handle action c... | 2018-04-20 02:51:06 | 2018-04-20 02:51:13 |

The log file for the ceph-ansible workflow is missing.

mistral/engine.log shows an error raised when YAQL evaluates an expression in the set_ip_uuids task:

> 2018-04-19 22:51:13.379 17093 DEBUG mistral.expressions.yaql_expression [req-11719e3c-a88e-4895-8d14-6d78d816dd3a 6f854d02e4b846ef9e176e584fc0b4e7 e2a823f4ce31493f9a5b1c2fd84279f4 - default default] Start to evaluate YAQL expression. [expression='<% let(root => $.ansible_output.get('plays')[0].get('tasks')[0].get('hosts')) -> $.ips_list.toDict($, $root.get($).get('stdout')) %>', context={}] evaluate /usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py:109
> 2018-04-19 22:51:13.384 17093 ERROR mistral.engine.task_handler [req-11719e3c-a88e-4895-8d14-6d78d816dd3a 6f854d02e4b846ef9e176e584fc0b4e7 e2a823f4ce31493f9a5b1c2fd84279f4 - default default] Failed to handle action completion [error=Can not evaluate YAQL expression [expression=let(root => $.ansible_output.get('plays')[0].get('tasks')[0].get('hosts')) -> $.ips_list.toDict($, $root.get($).get('stdout')), error=list index out of range, data={}], wf=tripleo.storage.v1.ceph-install, task=set_ip_uuids, action=std.noop]:
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/mistral/engine/task_handler.py", line 110, in _on_action_complete
>     task.on_action_complete(action_ex)
>   File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 158, in wrapper
>     result = f(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 321, in on_action_complete
>     self.complete(state, state_info)
>   File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 158, in wrapper
>     result = f(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 191, in complete
>     data_flow.publish_variables(self.task_ex, self.task_spec)
>   File "/usr/lib/python2.7/site-packages/mistral/workflow/data_flow.py", line 210, in publish_variables
>     task_ex.published = expr.evaluate_recursively(branch_vars, expr_ctx)
>   File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 100, in evaluate_recursively
>     data[key] = _evaluate_item(data[key], context)
>   File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 79, in _evaluate_item
>     return evaluate(item, context)
>   File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 71, in evaluate
>     return evaluator.evaluate(expression, context)
>   File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 119, in evaluate
>     cls).evaluate(trim_expr, data_context)
>   File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 73, in evaluate
>     ", data=%s]" % (expression, str(e), data_context)
> YaqlEvaluationException: Can not evaluate YAQL expression [expression=let(root => $.ansible_output.get('plays')[0].get('tasks')[0].get('hosts')) -> $.ips_list.toDict($, $root.get($).get('stdout')), error=list index out of range, data={}]
> : YaqlEvaluationException: Can not evaluate YAQL expression [expression=let(root => $.ansible_output.get('plays')[0].get('tasks')[0].get('hosts')) -> $.ips_list.toDict($, $root.get($).get('stdout')), error=list index out of range, data={}]
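
To make the failing expression easier to read, here is a rough Python equivalent of what it computes: it takes the `hosts` mapping of the first task of the first play and builds a dict from each entry of `ips_list` to that host's `stdout`. This is only a reading aid, not Mistral or TripleO code; the function name `map_ips_to_stdout`, the sample addresses, and the `uuid-a`/`uuid-b` values are invented for illustration.

```python
def map_ips_to_stdout(ansible_output, ips_list):
    """Mirror the YAQL expression from the log in plain Python.

    let(root => $.ansible_output.get('plays')[0].get('tasks')[0].get('hosts'))
      -> $.ips_list.toDict($, $root.get($).get('stdout'))
    """
    # 'root' is the hosts mapping of the first task of the first play.
    root = ansible_output.get("plays")[0].get("tasks")[0].get("hosts")
    # toDict($, ...) keys the result by each IP and stores that host's stdout.
    return {ip: root.get(ip).get("stdout") for ip in ips_list}


# Hypothetical input shaped the way the expression expects it.
sample_output = {
    "plays": [
        {
            "tasks": [
                {
                    "hosts": {
                        "192.0.2.10": {"stdout": "uuid-a"},
                        "192.0.2.11": {"stdout": "uuid-b"},
                    }
                }
            ]
        }
    ]
}
result = map_ips_to_stdout(sample_output, ["192.0.2.10", "192.0.2.11"])
```

The two `[0]` lookups are the only places where a "list index out of range" error can come from, so either `plays` or `tasks` must have been empty when the expression ran.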

A little above that, the log shows an Ansible output structure that appears to correspond to the one used by the expression above.
It seems the list of tasks is empty:

> 2018-04-19 22:51:13.249 17093 INFO mistral.engine.engine_server [req-11719e3c-a88e-4895-8d14-6d78d816dd3a 6f854d02e4b846ef9e176e584fc0b4e7 e2a823f4ce31493f9a5b1c2fd84279f4 - default default] Received RPC request 'on_action_complete'[action_ex_id=0a52be7b-de6c-4c90-b64f-c500f2565469, result=Result [data={log_path: /tmp/ansible-mistral-actionLnEQBH/ansible.log, stderr: {
>     "plays": [
>         {
>             "play": {
>                 "id": "52540016-d429-d93f-90df-000000000005", 
>                 "name": "overcloud"
>             }, 
>             "tasks": []
>         }
>     ], 
>     "stats": {}
> }
> ..., error=None, cancel=False]]
> 2018-04-19 22:51:13.259 17093 INFO workflow_trace [req-11719e3c-a88e-4895-8d14-6d78d816dd3a 6f854d02e4b846ef9e176e584fc0b4e7 e2a823f4ce31493f9a5b1c2fd84279f4 - default default] Action 'tripleo.ansible-playbook' (0a52be7b-de6c-4c90-b64f-c500f2565469)(task=collect_nodes_uuid) [RUNNING -> SUCCESS, result = {log_path: /tmp/ansible-mistral-actionLnEQBH/ansible.log, stderr: {
>     "plays": [
>         {
>     ...]
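
The failure can be reproduced outside Mistral with the JSON shown in the log. The sketch below loads that structure, applies the same index chain as the workflow, and captures the resulting error. The `first_task_hosts` helper is hypothetical and only shows how a guarded lookup would behave on this input; it is not the fix for this bug (the linked gerrit changes are reverts).

```python
import json

# Ansible output as logged for the collect_nodes_uuid task: one play, no tasks.
LOGGED_OUTPUT = """
{
    "plays": [
        {
            "play": {
                "id": "52540016-d429-d93f-90df-000000000005",
                "name": "overcloud"
            },
            "tasks": []
        }
    ],
    "stats": {}
}
"""
ansible_output = json.loads(LOGGED_OUTPUT)

# Same lookup chain as the workflow expression; tasks[0] fails on the empty list.
try:
    ansible_output.get("plays")[0].get("tasks")[0].get("hosts")
    error = None
except IndexError as exc:
    error = str(exc)


def first_task_hosts(output):
    """Hypothetical guarded lookup: return the hosts mapping or None."""
    plays = output.get("plays") or []
    if not plays:
        return None
    tasks = plays[0].get("tasks") or []
    if not tasks:
        return None
    return tasks[0].get("hosts")


hosts = first_task_hosts(ansible_output)
print(error, hosts)
```

With this input the captured message matches the `error=list index out of range` text in the engine log, which supports the reading that the play ran no tasks at all.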



Installed ceph-ansible package: ceph-ansible.noarch 3.1.0-0.1.beta7.el7cp @rhelosp-13.0-puddle



---
This may be unrelated, but compared with the last run where this issue did not occur, I see only these four packages changed:
> ceph-ansible.noarch
> < 3.1.0-0.1.beta6.el7cp @rhelosp-13.0-puddle
> > 3.1.0-0.1.beta7.el7cp @rhelosp-13.0-puddle
> 
> puppet-tripleo.noarch
> < 8.3.2-0.20180411174305.1f69cc5.el7ost
> > 8.3.2-0.20180411174307.el7ost
> 
> python-oslo-db-lang.noarch
> < 4.33.0-1.el7ost
> > 4.33.0-2.el7ost
> 
> python2-oslo-db.noarch
> < 4.33.0-1.el7ost
> > 4.33.0-2.el7ost

Comment 1 Pavel Sedlák 2018-04-20 14:23:00 UTC
Actually, ignore the last part about only four packages changing; it seems this could have been happening a few builds ago as well.

So many more things could have been updated (e.g. the whole of TripleO) since this setup was last seen working. I recommend approaching this as a fresh issue, not connected to any other open one.

Comment 9 Yogev Rabl 2018-05-16 14:08:18 UTC
Verified on openstack-tripleo-heat-templates-8.0.2-17.el7ost.noarch

Comment 11 errata-xmlrpc 2018-06-27 13:52:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086

