Description of problem:
The mistral workflow for introspection has a task "wait_for_introspection_to_finish". This task sometimes fails with keystoneclient exceptions when mistral tries to reach ironic-inspector:

+++
2021-08-03 11:56:28.873 7 WARNING mistral.executors.default_executor [req-73f60ebb-67fb-4efd-84e5-536efe91a907 ce6b006f672e4edbbeb477fc6c54b6cf bfd60fcc63ef4f578fdedfc77d38ef38 - default default] The action raised an exception [action_ex_id=c1eeb6f7-2270-496b-b887-910e95b9d773, msg='BaremetalIntrospectionAction.wait_for_finish failed: Unable to establish connection to https://XX.XX.XXX.XX:13050/v1/introspection/caf7b1e8-e5de-454d-88bb-646f3d422062: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))', action_cls='<class 'mistral.actions.action_factory.BaremetalIntrospectionAction'>', attributes='{'client_method_name': 'wait_for_finish'}', params='{'uuids': ['caf7b1e8-e5de-454d-88bb-646f3d422062'], 'max_retries': 120, 'retry_interval': 10}']: mistral.exceptions.ActionException: BaremetalIntrospectionAction.wait_for_finish failed: Unable to establish connection to https://XX.XX.XXX.XXX:13050/v1/introspection/caf7b1e8-e5de-454d-88bb-646f3d422062: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
2021-08-03 11:56:28.873 7 ERROR mistral.executors.default_executor Traceback (most recent call last):
2021-08-03 11:56:28.873 7 ERROR mistral.executors.default_executor   File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
2021-08-03 11:56:28.873 7 ERROR mistral.executors.default_executor     chunked=chunked)
2021-08-03 11:56:28.873 7 ERROR mistral.executors.default_executor   File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 384, in _make_request
2021-08-03 11:56:28.873 7 ERROR mistral.executors.default_executor     six.raise_from(e, None)
2021-08-03 11:56:28.873 7 ERROR mistral.executors.default_executor   File "<string>", line 3, in raise_from
2021-08-03 11:56:28.873 7 ERROR mistral.executors.default_executor   File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 380, in _make_request
2021-08-03 11:56:28.873 7 ERROR mistral.executors.default_executor     httplib_response = conn.getresponse()
2021-08-03 11:56:28.873 7 ERROR mistral.executors.default_executor   File "/usr/lib64/python3.6/http/client.py", line 1346, in getresponse
2021-08-03 11:56:28.873 7 ERROR mistral.executors.default_executor     response.begin()
2021-08-03 11:56:28.873 7 ERROR mistral.executors.default_executor   File "/usr/lib64/python3.6/http/client.py", line 307, in begin
2021-08-03 11:56:28.873 7 ERROR mistral.executors.default_executor     version, status, reason = self._read_status()
2021-08-03 11:56:28.873 7 ERROR mistral.executors.default_executor   File "/usr/lib64/python3.6/http/client.py", line 276, in _read_status
2021-08-03 11:56:28.873 7 ERROR mistral.executors.default_executor     raise RemoteDisconnected("Remote end closed connection without"
2021-08-03 11:56:28.873 7 ERROR mistral.executors.default_executor http.client.RemoteDisconnected: Remote end closed connection without response
+++

This is the same as the upstream bug https://bugs.launchpad.net/tripleo/+bug/1836976

The fix is to add retry and delay parameters to the workflow task. The same change was provided to the customer and it fixed the issue.

Version-Release number of selected component (if applicable):
cat installed-rpms | grep -i tripleo-common
openstack-tripleo-common-11.4.1-1.20210104173607.el8ost.noarch              Thu Jul 29 13:42:47 2021
openstack-tripleo-common-containers-11.4.1-1.20210104173607.el8ost.noarch   Thu Jul 29 13:41:07 2021
python3-tripleo-common-11.4.1-1.20210104173607.el8ost.noarch                Thu Jul 29 13:41:28 2021

How reproducible:
Always for the customer

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
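For illustration only, the retry-with-delay idea behind the fix can be sketched in plain Python. This is not the tripleo-common patch and not mistral's own retry mechanism: the helper name call_with_retry and its count, delay, retry_on and sleep parameters are hypothetical, and the values used are placeholders rather than the ones applied to the workflow task.

```python
import time


def call_with_retry(action, count=3, delay=5.0,
                    retry_on=(ConnectionError,), sleep=time.sleep):
    """Call action(); on a transient error, wait and try again.

    Hypothetical helper: `count` is the number of extra attempts after the
    first failure and `delay` is the pause in seconds between attempts.
    http.client.RemoteDisconnected is a ConnectionError subclass, so the
    default `retry_on` covers the exception at the bottom of the traceback.
    """
    retries_used = 0
    while True:
        try:
            return action()
        except retry_on:
            if retries_used >= count:
                # Retries exhausted: let the last error propagate.
                raise
            retries_used += 1
            sleep(delay)
```

A caller would wrap the single flaky request, for example call_with_retry(lambda: check_status(uuid), count=5, delay=10), where check_status is likewise a hypothetical stand-in for the call that reaches ironic-inspector.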
Created a patch for Victoria to address this issue. Thanks Punit for the clear debugging and fix strategy! Great work!
https://review.opendev.org/c/openstack/tripleo-common/+/807249
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 16.1.8 bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:0986