Bug 1504464 - docker-registry pod does not uniformly use hostnames - docker push fails with proxy config
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer   
Version: 3.6.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.9.0
Assignee: Fabian von Feilitzsch
QA Contact: Gan Huang
URL:
Whiteboard:
Keywords: Reopened
Depends On:
Blocks:
 
Reported: 2017-10-20 00:59 UTC by Paul Armstrong
Modified: 2018-05-03 20:12 UTC (History)
7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The Kubernetes service IP was not added to the no_proxy list for the docker-registry.
Consequence: Internal registry requests were forced through the proxy, preventing logins and pushes to the internal registry.
Fix: The Kubernetes service IP was added to the no_proxy list.
Result: Internal registry requests are no longer proxied, and logins and pushes to the internal registry succeed as expected.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-05-03 20:12:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:0489 None None None 2018-03-28 14:08 UTC
Red Hat Bugzilla 1527210 None CLOSED Installer does not configure Kubernetes service IP for no_proxy for the docker-registry. 2019-04-10 07:40 UTC

Internal Trackers: 1527210

Description Paul Armstrong 2017-10-20 00:59:33 UTC
Description of problem:
docker-registry pod searches for api svc at 172.30.0.1 instead of kubernetes.default.svc.cluster.local.

This occurs when the system is deployed in an environment that requires a proxy. The installer properly configures the NO_PROXY environment variable in the pod with the required domain suffixes and FQDNs. However, some code paths still use an IP address in their calls, and these IP addresses are not in the NO_PROXY list, so requests to them are sent through the proxy. This particular instance manifests itself as:

- timeouts in the builder pod log
- authentication failure messages in the registry pod log.
- other issues at the command-line...
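The mismatch can be reproduced outside the cluster with Python's stdlib no_proxy matcher (a sketch with illustrative values, not taken from a real inventory; the registry itself is a Go client, but its suffix-matching semantics are similar): hostname and domain-suffix entries never match a bare service IP, so the request to https://172.30.0.1:443 still goes through the proxy.

```python
from urllib.request import proxy_bypass_environment

# NO_PROXY as the installer generates it: domain suffixes and FQDNs only.
# These values are illustrative, not taken from a real inventory.
no_proxy = ".cluster.local,.svc,docker-registry.default.svc,master.example.com"

# The registry contacts the API at the kube service IP, not a hostname,
# so no suffix entry matches and the bypass check fails.
print(proxy_bypass_environment("172.30.0.1", {"no": no_proxy}))               # False
print(proxy_bypass_environment("docker-registry.default.svc", {"no": no_proxy}))  # True
```

A hostname would have been exempted; the IP literal is not, which is exactly the failure mode described above.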

The workaround was suggested by Gerald Nunn of Red Hat Toronto.

Version-Release number of selected component (if applicable):
3.6.x

How reproducible:
Always

Steps to Reproduce:
1. Configure environment behind a proxy
2. Run the advanced install, ensuring these values are set:
openshift_http_proxy=http://x.x.x.x:port
openshift_https_proxy=https://x.x.x.x:port
openshift_generate_no_proxy_hosts=true

3. Check environment with an application that uses an S2I build

Actual results:
Build fails at docker push with a timeout.

Troubleshooting attempted:
- Edited /etc/sysconfig/docker on all nodes
- Bounced the docker service on all nodes
- Bounced master and node services on all nodes
- Scaled the docker-registry pod down and back up

Then launched a build.

From the log of project default, pod docker-registry-1-lcq98

10.131.0.1 - - [19/Oct/2017:17:08:29 +0000] "GET /healthz HTTP/2.0" 200 0 "" "Go-http-client/2.0"

time="2017-10-19T17:08:37.651602967Z" level=debug msg="invalid token: Get https://172.30.0.1:443/oapi/v1/users/~: Unable to connect" go.version=go1.7.6 http.request.host="docker-registry.default.svc:5000" http.request.id=af111c4b-19ed-4bf5-b7e9-d2458db29d2b http.request.method=GET http.request.remoteaddr="10.129.0.1:53402" http.request.uri="/openshift/token?account=serviceaccount&scope=repository%3Aproduct-catalog-test%2Fproduct-catalog%3Apush%2Cpull" http.request.useragent="docker/1.12.6 go/go1.8.3 kernel/3.10.0-693.2.2.el7.x86_64 os/linux arch/amd64 UpstreamClient(go-dockerclient)" instance.id=db50a99b-c3a0-4337-81a9-ea42fce79d1f openshift.logger=registry 

10.129.0.1 - - [19/Oct/2017:17:06:30 +0000] "GET /openshift/token?account=serviceaccount&scope=repository%3Aproduct-catalog-test%2Fproduct-catalog%3Apush%2Cpull HTTP/1.1" 401 0 "" "docker/1.12.6 go/go1.8.3 kernel/3.10.0-693.2.2.el7.x86_64 os/linux arch/amd64 UpstreamClient(go-dockerclient)"


Expected results:
Build succeeds

Additional info:

Workaround - add 172.30.0.1 to the pod's NO_PROXY environment variable.
Boom! Builds succeed.
Took several hours to diagnose though...
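The effect of the workaround can be checked with the same stdlib matcher (again a sketch with illustrative NO_PROXY values): once the literal service IP is present, the bypass check succeeds and the token request no longer goes through the proxy.

```python
from urllib.request import proxy_bypass_environment

before = ".cluster.local,.svc,master.example.com"  # installer-generated (illustrative)
after = "172.30.0.1," + before                     # workaround: prepend the kube service IP

print(proxy_bypass_environment("172.30.0.1", {"no": before}))  # False - proxied, push times out
print(proxy_bypass_environment("172.30.0.1", {"no": after}))   # True  - direct, push succeeds
```

In-cluster, the equivalent change can be made with `oc set env dc/docker-registry NO_PROXY=...` (appending the service IP to the existing value) and letting the deployment roll out.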

Comment 1 Ben Parees 2017-10-20 08:12:44 UTC
You can set KUBERNETES_MASTER on the registry pod to tell the registry how to reach the master, but we will continue to initialize our client with the kube service IP (172.30.0.1) in this case, since that is the value k8s has told us to use to reach the master.

Adding 172.30.0.1 to NO_PROXY is the correct solution.

It looks like this is the default behavior for the installer:
https://docs.openshift.org/latest/install_config/http_proxies.html#configuring-no-proxy

https://docs.openshift.org/latest/install_config/install/advanced_install.html#advanced-install-configuring-global-proxy

Given that you have openshift_generate_no_proxy_hosts=true set, it seems like this should have happened automatically, so I'm going to transfer this to the installer.

Comment 2 Ben Parees 2017-10-20 08:13:49 UTC
(It's possible the installer did not add the no_proxy env to the registry pod.)

Comment 3 Scott Dodson 2017-10-23 12:54:09 UTC
Ben,

Has the registry always communicated with the API via IP rather than hostname?

Paul,

Can you get the NO_PROXY environment variable for the registry? We've

Comment 4 Ben Parees 2017-10-26 11:54:24 UTC
As far as I know, yes, it's always used the k8s service host variable.

Comment 7 Ben Parees 2017-10-30 20:33:43 UTC
Is that before or after you manually added NO_PROXY to the registry pod?

Comment 9 Ben Parees 2017-10-30 21:30:49 UTC
I don't know why you're blaming the build pod when you were able to fix the problem by editing your registry pod.

The fundamental issue here is that the ansible installer did not configure the system to add the NO_PROXY env variable to the registry pod (but apparently did add the HTTP_PROXY/HTTPS_PROXY env variables to the registry pod)

Comment 10 Paul Armstrong 2017-10-31 04:37:19 UTC
Does the builder pod use a hard-coded ip address for the registry? Yes, or no?

Comment 13 Scott Dodson 2018-01-25 15:18:55 UTC
In https://bugzilla.redhat.com/show_bug.cgi?id=1527210 we're adding the kube service ip address to the list of NO_PROXY entries which should resolve this issue as well.

https://github.com/openshift/openshift-ansible/pull/6215
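The service IP being added is not arbitrary: Kubernetes assigns the `kubernetes` API service the first usable address of the configured service network (openshift_portal_net, which defaults to 172.30.0.0/16 in OpenShift), which is why 172.30.0.1 shows up throughout this bug. A minimal sketch of that convention:

```python
import ipaddress

# Default OpenShift service network; clusters can override it via
# the openshift_portal_net inventory variable.
portal_net = ipaddress.ip_network("172.30.0.0/16")

# The kubernetes service always gets the first usable host address.
kube_svc_ip = next(portal_net.hosts())
print(kube_svc_ip)  # 172.30.0.1
```

Deriving the IP from the configured network (rather than hard-coding 172.30.0.1) is what lets the installer fix work for clusters with a non-default service network.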

Comment 15 Gan Huang 2018-02-01 09:42:45 UTC
Verified in openshift-ansible-3.9.0-0.34.0.git.0.c7d9585.el7.noarch.rpm

172.30.0.1 is added to the docker-registry NO_PROXY env variable successfully.

And S2I build succeeded.

Comment 18 errata-xmlrpc 2018-03-28 14:08:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489

