Description of problem:

Trying to deploy an overcloud with the following command:

time openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e ~/templates/network-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-ha.yaml -e ~/containers-prepare-parameters.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml -e ~/templates/osp15.yaml --ntp-server clock.redhat.com

After about 20 minutes it fails with the following error:

Creating Swift container to store the plan
Creating plan from template files in: /tmp/tripleoclient-9lci373i/tripleo-heat-templates
Timed out waiting for messages from Execution (ID: 75f7f917-67ab-4b9c-8ad8-210f16660c99, State: ERROR). The Workflow errored and no messages were received.
Exception occured while running the command
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/websocket/_socket.py", line 81, in recv
    bytes_ = sock.recv(bufsize)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tripleoclient/plugin.py", line 153, in wait_for_messages
    message = self.recv()
  File "/usr/lib/python3.6/site-packages/tripleoclient/plugin.py", line 131, in recv
    return json.loads(self._ws.recv())
  File "/usr/lib/python3.6/site-packages/websocket/_core.py", line 310, in recv
    opcode, data = self.recv_data()
  File "/usr/lib/python3.6/site-packages/websocket/_core.py", line 327, in recv_data
    opcode, frame = self.recv_data_frame(control_frame)
  File "/usr/lib/python3.6/site-packages/websocket/_core.py", line 340, in recv_data_frame
    frame = self.recv_frame()
  File "/usr/lib/python3.6/site-packages/websocket/_core.py", line 374, in recv_frame
    return self.frame_buffer.recv_frame()
  File "/usr/lib/python3.6/site-packages/websocket/_abnf.py", line 361, in recv_frame
    self.recv_header()
  File "/usr/lib/python3.6/site-packages/websocket/_abnf.py", line 309, in recv_header
    header = self.recv_strict(2)
  File "/usr/lib/python3.6/site-packages/websocket/_abnf.py", line 396, in recv_strict
    bytes_ = self.recv(min(16384, shortage))
  File "/usr/lib/python3.6/site-packages/websocket/_core.py", line 449, in _recv
    return recv(self.sock, bufsize)
  File "/usr/lib/python3.6/site-packages/websocket/_socket.py", line 84, in recv
    raise WebSocketTimeoutException(message)
websocket._exceptions.WebSocketTimeoutException: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tripleoclient/command.py", line 30, in run
    super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/osc_lib/command/command.py", line 41, in run
    return super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/cliff/command.py", line 184, in run
    return_code = self.take_action(parsed_args) or 0
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/overcloud_deploy.py", line 919, in take_action
    self._deploy_tripleo_heat_templates_tmpdir(stack, parsed_args)
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/overcloud_deploy.py", line 374, in _deploy_tripleo_heat_templates_tmpdir
    new_tht_root, tht_root)
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/overcloud_deploy.py", line 407, in _deploy_tripleo_heat_templates
    validate_stack=False)
  File "/usr/lib/python3.6/site-packages/tripleoclient/workflows/plan_management.py", line 174, in create_plan_from_templates
    validate_stack=validate_stack)
  File "/usr/lib/python3.6/site-packages/tripleoclient/workflows/plan_management.py", line 87, in create_deployment_plan
    **workflow_input)
  File "/usr/lib/python3.6/site-packages/tripleoclient/workflows/plan_management.py", line 77, in _create_update_deployment_plan
    _WORKFLOW_TIMEOUT):
  File "/usr/lib/python3.6/site-packages/tripleoclient/workflows/base.py", line 61, in wait_for_messages
    for payload in websocket.wait_for_messages(timeout=timeout):
  File "/usr/lib/python3.6/site-packages/tripleoclient/plugin.py", line 158, in wait_for_messages
    raise exceptions.WebSocketTimeout()
tripleoclient.exceptions.WebSocketTimeout

Version-Release number of selected component (if applicable):
OSP15

(undercloud) [stack@f16-h10-000-1029p ~]$ sudo rpm -qa | grep tripleo
openstack-tripleo-puppet-elements-10.3.1-0.20190420090433.9ba1438.el8ost.noarch
openstack-tripleo-image-elements-10.4.1-0.20190420043237.7d6edd9.el8ost.noarch
python3-tripleoclient-heat-installer-11.4.1-0.20190423085110.290ac95.el8ost.noarch
openstack-tripleo-validations-10.4.1-0.20190420030347.9d08e89.el8ost.noarch
python3-tripleo-common-10.7.1-0.20190423083511.2199eeb.el8ost.noarch
python3-tripleoclient-11.4.1-0.20190423085110.290ac95.el8ost.noarch
ansible-tripleo-ipsec-9.1.1-0.20190422122014.8c1fdab.el8ost.noarch
ansible-role-tripleo-modify-image-1.0.1-0.20190422122515.f1dfdc6.el8ost.noarch
openstack-tripleo-common-10.7.1-0.20190423083511.2199eeb.el8ost.noarch
openstack-tripleo-heat-templates-10.5.1-0.20190423085106.3f148c4.el8ost.noarch
openstack-tripleo-common-containers-10.7.1-0.20190423083511.2199eeb.el8ost.noarch
puppet-tripleo-10.4.1-0.20190420063733.7fc5500.el8ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy undercloud and introspect overcloud nodes
2. Run overcloud deploy command
3.

Actual results:
Command exits with failure

Expected results:
Deploy should succeed

Additional info:
Looking at the mistral engine logs on the undercloud, I see:

2019-05-02 18:40:15.028 1 ERROR mistral.engine.task_handler [req-6a5a1e26-0287-4424-b5b0-9485fc25152e a76551fbe21c42dd8ea80ac74eeedd76 5018fa8b4e8144dc901c4e04cd0a624b - default default] Failed to run task [error=Invalid input [name=tripleo.parameters.update, class=tripleo_common.actions.parameters.UpdateParametersAction, unexpected=['validate']], wf=tripleo.plan_management.v1.create_deployment_plan, task=add_root_stack_name]:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/mistral/engine/task_handler.py", line 63, in run_task
    task.run()
  File "/usr/lib/python3.6/site-packages/osprofiler/profiler.py", line 160, in wrapper
    result = f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/mistral/engine/tasks.py", line 453, in run
    self._run_new()
  File "/usr/lib/python3.6/site-packages/osprofiler/profiler.py", line 160, in wrapper
    result = f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/mistral/engine/tasks.py", line 485, in _run_new
    self._schedule_actions()
  File "/usr/lib/python3.6/site-packages/mistral/engine/tasks.py", line 563, in _schedule_actions
    action.validate_input(input_dict)
  File "/usr/lib/python3.6/site-packages/mistral/engine/actions.py", line 336, in validate_input
    self.action_def.action_class
  File "/usr/lib/python3.6/site-packages/mistral/engine/utils.py", line 66, in validate_input
    raise exc.InputException(msg % tuple(msg_props))
mistral.exceptions.InputException: Invalid input [name=tripleo.parameters.update, class=tripleo_common.actions.parameters.UpdateParametersAction, unexpected=['validate']]
: mistral.exceptions.InputException: Invalid input [name=tripleo.parameters.update, class=tripleo_common.actions.parameters.UpdateParametersAction, unexpected=['validate']]
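The "unexpected=['validate']" part of the log suggests that the workflow passes an input named validate which the loaded UpdateParametersAction class does not accept. The sketch below only illustrates that kind of check. It is not mistral's implementation, and the constructor arguments of the stand-in class are hypothetical, since the real signature is not shown in this report.

```python
import inspect


class UpdateParametersAction:
    """Stand-in for an action class whose constructor has no 'validate' argument.

    The argument names here are hypothetical and chosen only for illustration.
    """

    def __init__(self, parameters, container="overcloud"):
        self.parameters = parameters
        self.container = container


def unexpected_inputs(action_class, input_dict):
    """Return the input names that the action's __init__ does not accept."""
    params = inspect.signature(action_class.__init__).parameters
    accepted = {name for name in params if name != "self"}
    return sorted(set(input_dict) - accepted)


# A workflow definition that sends 'validate' to an action that does not know it.
workflow_input = {"parameters": {}, "container": "overcloud", "validate": False}
print(unexpected_inputs(UpdateParametersAction, workflow_input))  # ['validate']
```

If the workflow definitions and the action code come from different tripleo-common builds, a check of this kind would reject the extra input before the action ever runs.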
I believe this is a duplicate of Bug 1700044. Please let us know if it's still occurring after the fix for 1700044 has been applied. *** This bug has been marked as a duplicate of bug 1700044 ***
Hi Alex, To apply the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1700044, please advise whether the following two steps are enough:
1. On the undercloud, install python3-oslo-rootwrap using dnf install python3-oslo-rootwrap
2. Patch tripleo-common on the undercloud at /usr/lib/python3.6/site-packages/tripleo_common/actions/ansible.py
No, you have to patch the mistral container. The fix needs to be applied inside the mistral-engine container, and then the container needs to be restarted.
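As a rough sketch of that procedure, the helper below only builds and prints the two podman commands and does not run them. It assumes the container is named mistral_engine, that the file sits at the same path inside the container as on the host, and that the patched file is available locally as ./ansible.py (a hypothetical location).

```python
import shlex

CONTAINER = "mistral_engine"
# Assumption: same path inside the container as the host path quoted above.
TARGET = "/usr/lib/python3.6/site-packages/tripleo_common/actions/ansible.py"


def patch_commands(patched_file, container=CONTAINER, target=TARGET):
    """Build the commands that copy a patched file into the container and restart it."""
    return [
        ["podman", "cp", patched_file, f"{container}:{target}"],
        ["podman", "restart", container],
    ]


# Print the commands for review instead of executing them.
for cmd in patch_commands("./ansible.py"):
    print(" ".join(shlex.quote(part) for part in cmd))
```

Printing first makes it easy to check the container name and target path before running anything on the undercloud.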
Hi Alex. So I patched the mistral container with https://review.opendev.org/#/c/657090/1/tripleo_common/actions/ansible.py and ran podman restart mistral_engine. The overcloud deploy still fails, but now it fails much faster:

(undercloud) [stack@f16-h10-000-1029p ~]$ time openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e ~/templates/network-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-ha.yaml -e ~/containers-prepare-parameters.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml -e ~/templates/osp15.yaml --ntp-server clock.redhat.com
Removing the current plan files
Uploading new plan files
{'result': 'Failed to run task [error=Invalid input [name=tripleo.parameters.update, class=tripleo_common.actions.parameters.UpdateParametersAction, unexpected=[\'validate\']], wf=tripleo.swift_backup.v1.create_swift_backup_container_plan, task=set_tempurl]:\nTraceback (most recent call last):\n File "/usr/lib/python3.6/site-packages/mistral/engine/task_handler.py", line 63, in run_task\n task.run()\n File "/usr/lib/python3.6/site-packages/osprofiler/profiler.py", line 160, in wrapper\n result = f(*args, **kwargs)\n File "/usr/lib/python3.6/site-packages/mistral/engine/tasks.py", line 453, in run\n self._run_new()\n File "/usr/lib/python3.6/site-packages/osprofiler/profiler.py", line 160, in wrapper\n result = f(*args, **kwargs)\n File "/usr/lib/python3.6/site-packages/mistral/engine/tasks.py", line 485, in _run_new\n self._schedule_actions()\n File "/usr/lib/python3.6/site-packages/mistral/engine/tasks.py", line 563, in _schedule_actions\n action.validate_input(input_dict)\n File "/usr/lib/python3.6/site-packages/mistral/engine/actions.py", line 336, in validate_input\n self.action_def.action_class\n File "/usr/lib/python3.6/site-packages/mistral/engine/utils.py", line 66, in validate_input\n raise exc.InputException(msg % tuple(msg_props))\nmistral.exceptions.InputException: Invalid input [name=tripleo.parameters.update, class=tripleo_common.actions.parameters.UpdateParametersAction, unexpected=[\'validate\']]\n'}
Exception occured while running the command
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tripleoclient/command.py", line 30, in run
    super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/osc_lib/command/command.py", line 41, in run
    return super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/cliff/command.py", line 184, in run
    return_code = self.take_action(parsed_args) or 0
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/overcloud_deploy.py", line 919, in take_action
    self._deploy_tripleo_heat_templates_tmpdir(stack, parsed_args)
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/overcloud_deploy.py", line 374, in _deploy_tripleo_heat_templates_tmpdir
    new_tht_root, tht_root)
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/overcloud_deploy.py", line 400, in _deploy_tripleo_heat_templates
    validate_stack=False)
  File "/usr/lib/python3.6/site-packages/tripleoclient/workflows/plan_management.py", line 238, in update_plan_from_templates
    validate_stack=validate_stack)
  File "/usr/lib/python3.6/site-packages/tripleoclient/workflows/plan_management.py", line 122, in update_deployment_plan
    'Exception updating plan: {}'.format(payload['message']))
tripleoclient.exceptions.WorkflowServiceError: Exception updating plan: {'result': 'Failed to run task [error=Invalid input [name=tripleo.parameters.update, class=tripleo_common.actions.parameters.UpdateParametersAction, unexpected=[\'validate\']], wf=tripleo.swift_backup.v1.create_swift_backup_container_plan, task=set_tempurl]:\nTraceback (most recent call last):\n File "/usr/lib/python3.6/site-packages/mistral/engine/task_handler.py", line 63, in run_task\n task.run()\n File "/usr/lib/python3.6/site-packages/osprofiler/profiler.py", line 160, in wrapper\n result = f(*args, **kwargs)\n File "/usr/lib/python3.6/site-packages/mistral/engine/tasks.py", line 453, in run\n self._run_new()\n File "/usr/lib/python3.6/site-packages/osprofiler/profiler.py", line 160, in wrapper\n result = f(*args, **kwargs)\n File "/usr/lib/python3.6/site-packages/mistral/engine/tasks.py", line 485, in _run_new\n self._schedule_actions()\n File "/usr/lib/python3.6/site-packages/mistral/engine/tasks.py", line 563, in _schedule_actions\n action.validate_input(input_dict)\n File "/usr/lib/python3.6/site-packages/mistral/engine/actions.py", line 336, in validate_input\n self.action_def.action_class\n File "/usr/lib/python3.6/site-packages/mistral/engine/utils.py", line 66, in validate_input\n raise exc.InputException(msg % tuple(msg_props))\nmistral.exceptions.InputException: Invalid input [name=tripleo.parameters.update, class=tripleo_common.actions.parameters.UpdateParametersAction, unexpected=[\'validate\']]\n'}
Exception updating plan: {'result': 'Failed to run task [error=Invalid input [name=tripleo.parameters.update, class=tripleo_common.actions.parameters.UpdateParametersAction, unexpected=[\'validate\']], wf=tripleo.swift_backup.v1.create_swift_backup_container_plan, task=set_tempurl]:\nTraceback (most recent call last):\n File "/usr/lib/python3.6/site-packages/mistral/engine/task_handler.py", line 63, in run_task\n task.run()\n File "/usr/lib/python3.6/site-packages/osprofiler/profiler.py", line 160, in wrapper\n result = f(*args, **kwargs)\n File "/usr/lib/python3.6/site-packages/mistral/engine/tasks.py", line 453, in run\n self._run_new()\n File "/usr/lib/python3.6/site-packages/osprofiler/profiler.py", line 160, in wrapper\n result = f(*args, **kwargs)\n File "/usr/lib/python3.6/site-packages/mistral/engine/tasks.py", line 485, in _run_new\n self._schedule_actions()\n File "/usr/lib/python3.6/site-packages/mistral/engine/tasks.py", line 563, in _schedule_actions\n action.validate_input(input_dict)\n File "/usr/lib/python3.6/site-packages/mistral/engine/actions.py", line 336, in validate_input\n self.action_def.action_class\n File "/usr/lib/python3.6/site-packages/mistral/engine/utils.py", line 66, in validate_input\n raise exc.InputException(msg % tuple(msg_props))\nmistral.exceptions.InputException: Invalid input [name=tripleo.parameters.update, class=tripleo_common.actions.parameters.UpdateParametersAction, unexpected=[\'validate\']]\n'}

real 0m24.002s
user 0m4.088s
sys 0m6.133s
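The payload in the CLI output above is printed as a Python dict repr, so the embedded traceback appears on one line with literal \n sequences. A small sketch for making it readable, using a shortened sample with the same shape (the sample text is abbreviated, not the full payload):

```python
import ast

# Shortened sample in the same shape as the payload printed by the CLI.
raw = r"""{'result': 'Failed to run task [error=Invalid input [name=tripleo.parameters.update, unexpected=[\'validate\']]]:\nTraceback (most recent call last):\n  File "/usr/lib/python3.6/site-packages/mistral/engine/utils.py", line 66, in validate_input\n'}"""

# literal_eval safely parses the dict literal without executing code.
payload = ast.literal_eval(raw)

# Printing the value turns the escaped \n sequences into real line breaks.
print(payload["result"])
```

ast.literal_eval is used here rather than json.loads because the payload uses single-quoted Python strings, which are not valid JSON.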
That error points to a mismatch in containers & tripleo-common on the undercloud. What containers are you using? See Bug 1700096 *** This bug has been marked as a duplicate of bug 1700096 ***
Tag is 20190306.1 (passed_phase1)
That's way old. You need to use a newer version of the containers that matches the tripleo-common you have installed. Containers from at least May 9th (the most recent pass of phase1) should be available.
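To make the gap concrete, here is a minimal sketch that compares the date stamp in the container tag with the one in the installed python3-tripleo-common package from this report. It assumes the first YYYYMMDD stamp in each string is a build date and that the two stamps are comparable, which this thread does not confirm.

```python
import re
from datetime import datetime


def build_date(version_string):
    """Extract the first YYYYMMDD stamp (years 2000-2099) found in the string."""
    match = re.search(r"20\d{6}", version_string)
    if match is None:
        raise ValueError(f"no date stamp in {version_string!r}")
    return datetime.strptime(match.group(0), "%Y%m%d").date()


# Values taken from this report.
package = "python3-tripleo-common-10.7.1-0.20190423083511.2199eeb.el8ost.noarch"
container_tag = "20190306.1"

lag = build_date(package) - build_date(container_tag)
print(f"containers are {lag.days} days older than tripleo-common")  # 48 days
```

A gap of this size is consistent with the containers carrying older action code than the workflows shipped by the installed tripleo-common.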