Bug 1918408

Summary: mistral_executor container fails to properly set no_proxy environment parameter
Product: Red Hat OpenStack
Component: python-paunch
Version: 16.1 (Train)
Status: CLOSED DUPLICATE
Severity: high
Priority: medium
Hardware: All
OS: All
Type: Bug
Keywords: Triaged
Reporter: Alex Stupnikov <astupnik>
Assignee: Steve Baker <sbaker>
QA Contact: nlevinki <nlevinki>
CC: aschultz, kecarter
Last Closed: 2021-01-22 00:07:44 UTC

Description Alex Stupnikov 2021-01-20 16:22:40 UTC
Description of problem:

A customer reported a strange problem: the "openstack overcloud node import" command failed with error [1], generated by mistral-executor. It could not connect to the Zaqar endpoint because of a proxy-related error, even though the /etc/environment file contained a proper no_proxy definition that covered the affected address.

I was able to reproduce this problem using a python3 shell inside the mistral_executor container [2]. The same commands ran properly on the host OS and inside other containers (I used nova_api, but the choice is irrelevant). Interestingly, HTTP connections were not affected by the same problem.

The root cause of this issue is that the "no_proxy" environment variable inside the mistral_executor container was defined with literal double quotes around its value. The output of the env command looked like this:

no_proxy="hostnames,CIDRs,IPaddr"

"podman inspect mistral_executor" output had the following line:
            "Env": [
            ....
                "no_proxy=\"hostnames,CIDRs,IPaddr\"",

All other containers have a no_proxy Env definition without the escaped quotes (\") and do not show the reported problem.
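A quick way to spot this condition is to scan a container's Env entries for values wrapped in literal quotes. The following is a minimal sketch, assuming the Env list has already been extracted from the "podman inspect" output into a Python list of NAME=VALUE strings; the function name find_quoted_env and the sample entries are hypothetical.

```python
def find_quoted_env(env_entries):
    """Return the names of variables whose value is wrapped in literal quotes."""
    quoted = []
    for entry in env_entries:
        name, sep, value = entry.partition("=")
        if not sep:
            # Not a NAME=VALUE entry; nothing to check.
            continue
        # A value such as "a,b,c" (quotes included) starts and ends
        # with the same quote character.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            quoted.append(name)
    return quoted


# Hypothetical Env list modelled on the inspect output shown above.
env = [
    "PATH=/usr/bin:/bin",
    'no_proxy="hostnames,CIDRs,IPaddr"',
]
print(find_quoted_env(env))  # ['no_proxy']
```

Running the same check against a container that behaves correctly should return an empty list, which gives a simple comparison between mistral_executor and, for example, nova_api.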

I removed the line "env_file": "/etc/environment", from /var/lib/tripleo-config/container-startup-config/step_4/mistral_executor.json, then removed and re-created the mistral_executor container using paunch. This solved the problem.
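One plausible explanation for why dropping env_file helps, offered here as an assumption rather than a confirmed diagnosis: a shell that reads a quoted assignment strips the quotes, whereas a reader that treats the file as plain NAME=VALUE lines keeps them as part of the value. The sketch below contrasts the two behaviours; the helper names and the sample CIDR are hypothetical, and the code does not reproduce how paunch or podman actually parse the file.

```python
import shlex

# Sample line in the style of /etc/environment; the CIDR is made up.
line = 'no_proxy="192.168.1.7,192.168.1.0/24"'


def parse_literal(text):
    """Treat everything after the first '=' as the value, quotes included."""
    name, _, value = text.partition("=")
    return name, value


def parse_shell_style(text):
    """Apply POSIX shell quoting rules, which remove the surrounding quotes."""
    token = shlex.split(text)[0]
    name, _, value = token.partition("=")
    return name, value


print(parse_literal(line))      # ('no_proxy', '"192.168.1.7,192.168.1.0/24"')
print(parse_shell_style(line))  # ('no_proxy', '192.168.1.7,192.168.1.0/24')
```

The literal result matches the quoted form seen in the env and "podman inspect" output, while the shell-style result matches what a login shell on the host would see.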

However, I am not sure how to report this issue properly. It looks like a bug in the requests library that affects SSL/TLS traffic (a no_proxy definition with quotes works fine for HTTP traffic), but I do not know whether it would be fixed there, or whether it is a real bug at all and environment variables may legitimately be defined this way.
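To illustrate why stray quotes can defeat a no_proxy match, here is a simplified matcher. It is a sketch under stated assumptions and not the code used by requests or urllib3: it splits the value on commas and compares each entry to the target host as an IP address, a CIDR, or an exact name. The function name bypasses_proxy and the 10.0.0.0/8 entry are hypothetical.

```python
import ipaddress


def bypasses_proxy(host, no_proxy):
    """Return True if host matches any comma-separated no_proxy entry."""
    for entry in no_proxy.split(","):
        entry = entry.strip()
        if not entry:
            continue
        try:
            # Handles both single addresses and CIDR ranges.
            network = ipaddress.ip_network(entry, strict=False)
            if ipaddress.ip_address(host) in network:
                return True
        except ValueError:
            # Entry or host is not an IP/CIDR; fall back to exact match.
            if host == entry:
                return True
    return False


clean = "192.168.1.7,10.0.0.0/8"
quoted = '"192.168.1.7,10.0.0.0/8"'

print(bypasses_proxy("192.168.1.7", clean))   # True
print(bypasses_proxy("192.168.1.7", quoted))  # False
```

With the quoted value, the first entry becomes "192.168.1.7 with a leading quote and the last becomes 10.0.0.0/8" with a trailing quote, so neither parses as an address or matches the host. Entries in the middle of a longer list would still match in this sketch, and it does not explain the difference between HTTP and HTTPS observed above.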

I selected the python-paunch component, but I hope developers will re-route the bug as appropriate. A sosreport from the director node is attached to the support case.

[1]
> The action raised an exception [action_ex_id=7a63a683-82ef-47cc-b556-4b16f0ede738, msg='ZaqarAction.queue_post failed: HTTPSConnectionPool(host='192.168.1.7', port=13888): Max retries exceeded with url: /v2/queues/tripleo/messages (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f18569b2cc0>: Failed to establish a new connection: [Errno -2] No address found',)))

[2]
>>> import requests
>>> r = requests.get('https://192.168.1.7:13888')
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 159, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw)
  File "/usr/lib/python3.6/site-packages/urllib3/util/connection.py", line 57, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/lib64/python3.6/socket.py", line 745, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 594, in urlopen
    self._prepare_proxy(conn)
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 805, in _prepare_proxy
    conn.connect()
  File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 301, in connect
    conn = self._new_conn()
  File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 168, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x7fb0dc2e0470>: Failed to establish a new connection: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/lib/python3.6/site-packages/urllib3/util/retry.py", line 399, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='192.168.1.7', port=13888): Max retries exceeded with url: / (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fb0dc2e0470>: Failed to establish a new connection: [Errno -2] Name or service not known',)))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.6/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 510, in send
    raise ProxyError(e, request=request)
requests.exceptions.ProxyError: HTTPSConnectionPool(host='192.168.1.7', port=13888): Max retries exceeded with url: / (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fb0dc2e0470>: Failed to establish a new connection: [Errno -2] Name or service not known',)))

Comment 1 Alex Stupnikov 2021-01-20 16:23:41 UTC
I have set High severity on purpose: although the problem has a workaround, a large number of customers are potentially affected.

Comment 2 Alex Schultz 2021-01-22 00:07:44 UTC

*** This bug has been marked as a duplicate of bug 1916070 ***