Description of problem:

We are performing a minor update of our OpenStack cloud. It fails with the following error:

Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]: ERROR:nova_wait_for_placement_service:Retry - Failed to get placement service endpoint:
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]: Traceback (most recent call last):
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/docker-config-scripts/nova_wait_for_placement_service.py", line 68, in <module>
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     name='placement')[0].id
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneclient/v3/services.py", line 93, in list
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     **kwargs)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneclient/base.py", line 75, in func
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     return f(*args, **new_kwargs)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneclient/base.py", line 397, in list
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     self.collection_key)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneclient/base.py", line 125, in _list
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     resp, body = self.client.get(url, **kwargs)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 304, in get
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     return self.request(url, 'GET', **kwargs)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 463, in request
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     resp = super(LegacyJsonAdapter, self).request(*args, **kwargs)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 189, in request
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     return self.session.request(url, method, **kwargs)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 698, in request
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     resp = send(**kwargs)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 772, in _send_request
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     raise exceptions.ConnectFailure(msg)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]: ConnectFailure: Unable to establish connection to https://mycloud.mydomain:13000/services?name=placement: HTTPSConnectionPool(host='mycloud.mydomain', port=13000): Max retries exceeded with url: /services?name=placement (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7f7cb8c83950>: Failed to establish a new connection: [Errno 110] Connection timed out',))
Mar 16 15:52:33 myserver.mydomain dockerd-current[15522]: DEBUG:keystoneauth.session:REQ: curl -g -i --insecure -X GET https://mycloud.mydomain:13000 -H "Accept: application/json" -H "User-Agent: nova_wait_for_placement_service.py keystoneauth1/3.4.1 python-requests/2.14.2 CPython/2.7.5"

We have three placement service endpoints defined: the first is on our public network, the second is the admin endpoint, and the third is the internal endpoint. The error suggests that the code is attempting to connect to the public endpoint, but traffic from the compute servers to the public API network is blocked for security reasons.
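To confirm from a compute node which endpoint URLs are actually reachable, a small TCP check along the following lines may help. This is only a sketch, written for Python 3 rather than the Python 2.7 shown in the traceback. The helper name `can_connect` is hypothetical, and it only tests that a TCP connection can be opened. It does not validate TLS or the API behind the port.

```python
import socket
from urllib.parse import urlparse


def can_connect(url, timeout=5.0):
    """Return True if a TCP connection to the URL's host and port succeeds.

    Hypothetical diagnostic helper: it only opens and closes a socket, so a
    True result says nothing about TLS certificates or the service itself.
    """
    parsed = urlparse(url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name resolution failures.
        return False
```

For example, `can_connect("https://mycloud.mydomain:13000")` run on a compute node would be expected to return False if the public API network is blocked as described. The internal and admin URLs are not given in this report, so they would need to be taken from the endpoint list of the deployment.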
Looking at openstack-tripleo-heat-templates/docker_config_scripts/nova_wait_for_placement_service.py:

    try:
        # get placement service id
        placement_service_id = keystone.services.list(
            name='placement')[0].id

        # get placement endpoint (os_interface)
        placement_endpoint_url = keystone.endpoints.list(
            service=placement_service_id,
            interface=config.get('placement', 'os_interface'))[0].url
        if not placement_endpoint_url:
            LOG.error('Failed to get placement service endpoint!')
        else:
            break
    except Exception as e:
        LOG.exception('Retry - Failed to get placement service endpoint:')
        time.sleep(timeout)

This seems to always return the first record (if my understanding of "[0].id" is correct). That would explain why the code is attempting to connect on the public network. I think the more correct approach is to filter on the internal API. Having looked through the Python examples for keystone.services.list, I suspect that the placement_service_id line should read something like this (apologies, I am not a Python coder, so this may be incorrect, but it should give you the idea):

    placement_service_id = keystone.services.list(
        name='placement', interface='internal')[0].id

Could you confirm whether changing the code in this way would work as a temporary fix, and assist in having the endpoint lookup filtered to the "internal" endpoint instead of the public endpoint in the tripleo RPMs for OSP13?

If the above is completely wrong, help in troubleshooting the initial error and determining why this is connecting to the public endpoint would also be appreciated.

Thanks
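To illustrate the question about "[0]", here is a minimal sketch using plain dictionaries as stand-ins for endpoint records. It does not call keystoneclient. The URLs and both helper names are hypothetical; only the three interface names come from this report. It contrasts taking the first record with selecting a record by interface.

```python
# Hypothetical stand-ins for endpoint records. The interface names match the
# report; the URLs are invented purely for illustration.
ENDPOINTS = [
    {"interface": "public", "url": "https://public.example/placement"},
    {"interface": "admin", "url": "http://admin.example/placement"},
    {"interface": "internal", "url": "http://internal.example/placement"},
]


def first_endpoint_url(endpoints):
    """Mimic `[0].url`: return whichever record happens to be listed first."""
    return endpoints[0]["url"]


def endpoint_url_for(endpoints, interface):
    """Return the URL of the first record matching `interface`, else None."""
    for endpoint in endpoints:
        if endpoint["interface"] == interface:
            return endpoint["url"]
    return None


print(first_endpoint_url(ENDPOINTS))            # first record, here the public one
print(endpoint_url_for(ENDPOINTS, "internal"))  # record selected by interface
```

Note that in the quoted script the endpoint lookup already passes an interface filter taken from the configuration, while the traceback shows the timeout occurring earlier, on the keystone.services.list request to https://mycloud.mydomain:13000/services?name=placement. So the interface the client uses to reach the identity service may matter as much as which endpoint record is selected.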
This was changed in a later release and has now been cherry-picked to Queens via [1].

[1] https://review.opendev.org/#/c/714159/1
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2718