Bug 1813999 - problem with nova placement service endpoint during a director run
Summary: problem with nova placement service endpoint during a director run
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: All
OS: All
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Rajesh Tailor
QA Contact: David Rosenfeld
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-03-16 17:55 UTC by Elf Lewis
Modified: 2023-12-15 17:31 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-8.4.1-56.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-06-24 11:33:20 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 714159 0 None MERGED Use internal interface for keystone in "wait for placement" script 2021-02-16 06:45:10 UTC
Red Hat Product Errata RHBA-2020:2718 0 None None None 2020-06-24 11:33:46 UTC

Description Elf Lewis 2020-03-16 17:55:47 UTC
Description of problem:
We are performing a minor update of our openstack cloud.  It fails with the following error:

Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]: ERROR:nova_wait_for_placement_service:Retry - Failed to get placement service endpoint:
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]: Traceback (most recent call last):
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/docker-config-scripts/nova_wait_for_placement_service.py", line 68, in <module>
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     name='placement')[0].id
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneclient/v3/services.py", line 93, in list
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     **kwargs)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneclient/base.py", line 75, in func
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     return f(*args, **new_kwargs)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneclient/base.py", line 397, in list
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     self.collection_key)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneclient/base.py", line 125, in _list
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     resp, body = self.client.get(url, **kwargs)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 304, in get
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     return self.request(url, 'GET', **kwargs)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 463, in request
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     resp = super(LegacyJsonAdapter, self).request(*args, **kwargs)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 189, in request
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     return self.session.request(url, method, **kwargs)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 698, in request
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     resp = send(**kwargs)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:   File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 772, in _send_request
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]:     raise exceptions.ConnectFailure(msg)
Mar 16 15:52:23 myserver.mydomain dockerd-current[15522]: ConnectFailure: Unable to establish connection to https://mycloud.mydomain:13000/services?name=placement: HTTPSConnectionPool(host='mycloud.mydomain', port=13000): Max retries exceeded with url: /services?name=placement (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7f7cb8c83950>: Failed to establish a new connection: [Errno 110] Connection timed out',))
Mar 16 15:52:33 myserver.mydomain dockerd-current[15522]: DEBUG:keystoneauth.session:REQ: curl -g -i --insecure -X GET https://mycloud.mydomain:13000 -H "Accept: application/json" -H "User-Agent: nova_wait_for_placement_service.py keystoneauth1/3.4.1 python-requests/2.14.2 CPython/2.7.5"

We have three placement service endpoints defined: the first is on our public network, the second is the admin endpoint, and the third is the internal endpoint. The error suggests that the code is attempting to connect to the public endpoint, but traffic from the compute servers to the public API network is blocked for security reasons.

Looking at openstack-tripleo-heat-templates/docker_config_scripts/nova_wait_for_placement_service.py:

        try:
            # get placement service id
            placement_service_id = keystone.services.list(
                name='placement')[0].id

            # get placement endpoint (os_interface)
            placement_endpoint_url = keystone.endpoints.list(
                service=placement_service_id,
                interface=config.get('placement', 'os_interface'))[0].url
            if not placement_endpoint_url:
                LOG.error('Failed to get placement service endpoint!')
            else:
                break
        except Exception as e:
            LOG.exception('Retry - Failed to get placement service endpoint:')
        time.sleep(timeout)

This seems to always return the first record (if my understanding of "[0].id" is correct), which would explain why the code is attempting to connect over the public network. I think the more correct approach is to filter on the internal API interface. Having looked through the Python examples for keystone.services.list, I suspect that the placement_service_id line should read something like this (apologies, I am not a Python coder, so this may be incorrect, but it should give you the idea):

            placement_service_id = keystone.services.list(
                name='placement',interface='internal')[0].id
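For context on the snippet above: in the Keystone v3 data model the interface (public, admin, internal) is an attribute of each endpoint record, not of the service record, so "[0]" on the services list selects a service rather than an interface. The traceback also fails inside services.list itself, on a request to port 13000, so the connection that times out may be the client's own connection to Keystone rather than the placement endpoint lookup; the title of the linked gerrit change points the same way. The following is a minimal in-memory sketch of the two-step lookup, not the script's code. It does not call keystoneclient, and the IDs, hostnames, ports, and the helper names find_service_id and find_endpoint_url are hypothetical.

```python
# Illustrative records only: IDs, hostnames and ports are hypothetical,
# not taken from the affected cloud.
SERVICES = [
    {"id": "svc-nova", "name": "nova", "type": "compute"},
    {"id": "svc-placement", "name": "placement", "type": "placement"},
]

ENDPOINTS = [
    {"service_id": "svc-placement", "interface": "public",
     "url": "https://public.example.test:13778/placement"},
    {"service_id": "svc-placement", "interface": "admin",
     "url": "http://admin.example.test:8778/placement"},
    {"service_id": "svc-placement", "interface": "internal",
     "url": "http://internal.example.test:8778/placement"},
]


def find_service_id(services, name):
    """Return the id of the first service with the given name, or None."""
    for service in services:
        if service["name"] == name:
            return service["id"]
    return None


def find_endpoint_url(endpoints, service_id, interface):
    """Return the URL of the first endpoint matching service and interface."""
    for endpoint in endpoints:
        if (endpoint["service_id"] == service_id
                and endpoint["interface"] == interface):
            return endpoint["url"]
    return None


# Step 1 picks the service; step 2 is where the interface filter applies.
service_id = find_service_id(SERVICES, "placement")
internal_url = find_endpoint_url(ENDPOINTS, service_id, "internal")
print(internal_url)
```

In this model the interface filter only affects which endpoint URL is returned in step 2. It does not change which Keystone URL the client itself uses to issue the two queries.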

Could you confirm whether changing the code in this way would work as a temporary fix, and help get the lookup filtered to the "internal" endpoint instead of the public endpoint in the tripleo RPMs for OSP13?

If the above is completely wrong, help in troubleshooting the initial error and determining why this is connecting to the public endpoint would also be appreciated!
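As one possible troubleshooting step for the "[Errno 110] Connection timed out" in the log, a plain TCP check run from the compute node can show which endpoint hosts and ports are reachable at all. This is a sketch using only the Python standard library, written for Python 3 (the containers in the traceback run Python 2.7), and the function name can_connect is hypothetical.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within timeout.

    Refused connections, timeouts and name-resolution failures all
    surface as OSError subclasses and are reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, can_connect("mycloud.mydomain", 13000) tests the public URL from the traceback. Running the same call against each host and port shown by "openstack endpoint list" would indicate which interfaces the compute node is allowed to reach.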

Thanks

Comment 3 Martin Schuppert 2020-03-20 16:29:06 UTC
This was changed in a later release and is now being cherry-picked to Queens via [1].

[1] https://review.opendev.org/#/c/714159/1

Comment 13 errata-xmlrpc 2020-06-24 11:33:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2718

