Description of problem: spec.proxy doesn't work after setting in imageregisgtry operator Version-Release number of selected component (if applicable): $ oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.0.0-0.nightly-2019-02-13-204401 True False 26h Error while reconciling 4.0.0-0.nightly-2019-02-13-204401: the cluster operator openshift-samples is failing how reproducible: always Steps to Reproduce: 1.Set invalid proxy in imageregistry operator oc edit configs.imageregistry.operator.openshift.io/instance -n openshift-image-registry proxy: http: http://invalid.rdu.redhat.com:3128 https: http://invalid.rdu.redhat.com:3128 2. Trigger a build 3. Deploy a pod Actual results: build and pod could be running with invalid proxy. Expected results: build and pod shouldn't be running Additional info:
The proxy field affects how the registry interacts with the world. Basically it's for pullthrough (reference-policy=local) and for storages that use default HTTP transport (for example, S3). What storage was used for this test?
My imagestream is pullthrough policy, and the storage is S3. In latest cluster, configure invalid proxy and valid proxy, both met below error: For deployed pod: Failed to pull image "image-registry.openshift-image-registry.svc:5000/openshift/mongodb@sha256:ee70413940eba29389647bb352bcf5041da973de137f260ac3f95745d714a077": rpc error: code = Unknown desc = unable to retrieve auth token: invalid username/password For builds: $ oc build-logs ruby-hello-world-2 -n xiu Command "build-logs" is deprecated, use oc logs build/<build-name> Cloning "https://github.com/openshift/ruby-hello-world.git" ... Commit: 787f1beae9956c959c6af62ee43bfdda73769cf7 (Merge pull request #78 from bparees/v22) Author: Ben Parees <bparees.github.com> Date: Thu Jan 17 17:21:03 2019 -0500 Caching blobs under "/var/cache/blobs". Pulling docker://image-registry.openshift-image-registry.svc:5000/openshift/ruby@sha256:86b9e4119c4fff8ca5cf4b726249d1533602b6aca83b813fe10193cfbd222580 error: build error: 1 error occurred: * Error determining manifest MIME type for docker://image-registry.openshift-image-registry.svc:5000/openshift/ruby@sha256:86b9e4119c4fff8ca5cf4b726249d1533602b6aca83b813fe10193cfbd222580: unable to retrieve auth token: invalid username/password My valid proxy: Proxy: Http: http://file.rdu.redhat.com:3128 Https: http://file.rdu.redhat.com:3128 The invalid proxy: Proxy: Http: http://invalid.rdu.redhat.com:3128 Https: http://invalid.rdu.redhat.com:3128 $ oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.0.0-0.nightly-2019-02-17-024922 True False 179m 4.0.0-0.nightly-2019-02-17-024922
Will you get the same error if you remove the proxy configuration from the operator's config? Can you attach logs from the registry pods?
After remove proxy configuration from the imageregistry operator's config, the builds and deploy pod are running well. Here is the registry pod logs after set proxy time="2019-02-18T08:14:58.631395869Z" level=error msg="Get user failed with error: Get https://172.30.0.1:443/apis/user.openshift.io/v1/users/~: proxyconnect tcp: dial tcp: lookup file.rdu.redhat.com on 172.30.0.10:53: no such host" go.version=go1.10.8 http.request.host="10.129.0.85:5000" http.request.id=b4b85125-06c7-4e16-b01f-a71ad69f5e29 http.request.method=GET http.request.remoteaddr="10.131.0.33:43174" http.request.uri=/extensions/v2/metrics http.request.useragent=Prometheus/2.7.1 time="2019-02-18T08:14:58.631452416Z" level=error msg="error checking authorization: Get https://172.30.0.1:443/apis/user.openshift.io/v1/users/~: proxyconnect tcp: dial tcp: lookup file.rdu.redhat.com on 172.30.0.10:53: no such host" go.version=go1.10.8 http.request.host="10.129.0.85:5000" http.request.id=b4b85125-06c7-4e16-b01f-a71ad69f5e29 http.request.method=GET http.request.remoteaddr="10.131.0.33:43174" http.request.uri=/extensions/v2/metrics http.request.useragent=Prometheus/2.7.1 time="2019-02-18T08:14:58.631493702Z" level=warning msg="error authorizing context: Get https://172.30.0.1:443/apis/user.openshift.io/v1/users/~: proxyconnect tcp: dial tcp: lookup file.rdu.redhat.com on 172.30.0.10:53: no such host" go.version=go1.10.8 http.request.host="10.129.0.85:5000" http.request.id=b4b85125-06c7-4e16-b01f-a71ad69f5e29 http.request.method=GET http.request.remoteaddr="10.131.0.33:43174" http.request.uri=/extensions/v2/metrics http.request.useragent=Prometheus/2.7.1 time="2019-02-18T08:14:58.631550734Z" level=info msg=response go.version=go1.10.8 http.request.host="10.129.0.85:5000" http.request.id=6ac6c74c-1dfe-4a2b-a826-5c16c421e9cc http.request.method=GET http.request.remoteaddr="10.131.0.33:43174" http.request.uri=/extensions/v2/metrics http.request.useragent=Prometheus/2.7.1 http.response.duration=39.801847ms http.response.status=400 http.response.written=0 time="2019-02-18T08:14:58.631391382Z" level=error msg="Get user failed with error: Get https://172.30.0.1:443/apis/user.openshift.io/v1/users/~: proxyconnect tcp: dial tcp: lookup file.rdu.redhat.com on 172.30.0.10:53: no such host" go.version=go1.10.8 http.request.host="10.129.0.85:5000" http.request.id=ed32c376-70e2-47ba-b9d3-8d75ab485caa http.request.method=GET http.request.remoteaddr="10.129.2.34:36960" http.request.uri=/extensions/v2/metrics http.request.useragent=Prometheus/2.7.1 time="2019-02-18T08:14:58.631729776Z" level=error msg="error checking authorization: Get https://172.30.0.1:443/apis/user.openshift.io/v1/users/~: proxyconnect tcp: dial tcp: lookup file.rdu.redhat.com on 172.30.0.10:53: no such host" go.version=go1.10.8 http.request.host="10.129.0.85:5000" http.request.id=ed32c376-70e2-47ba-b9d3-8d75ab485caa http.request.method=GET http.request.remoteaddr="10.129.2.34:36960" http.request.uri=/extensions/v2/metrics http.request.useragent=Prometheus/2.7.1 time="2019-02-18T08:14:58.631766833Z" level=warning msg="error authorizing context: Get https://172.30.0.1:443/apis/user.openshift.io/v1/users/~: proxyconnect tcp: dial tcp: lookup file.rdu.redhat.com on 172.30.0.10:53: no such host" go.version=go1.10.8 http.request.host="10.129.0.85:5000" http.request.id=ed32c376-70e2-47ba-b9d3-8d75ab485caa http.request.method=GET http.request.remoteaddr="10.129.2.34:36960" http.request.uri=/extensions/v2/metrics http.request.useragent=Prometheus/2.7.1 time="2019-02-18T08:14:58.63180308Z" level=info msg=response go.version=go1.10.8 http.request.host="10.129.0.85:5000" http.request.id=6d31b759-6c4b-47d6-98ed-b36a00b2ef83 http.request.method=GET http.request.remoteaddr="10.129.2.34:36960" http.request.uri=/extensions/v2/metrics http.request.useragent=Prometheus/2.7.1 http.response.duration=38.650089ms http.response.status=400 http.response.written=0 time="2019-02-18T08:15:00.067780796Z" level=info msg=response go.version=go1.10.8 http.request.host="10.129.0.85:5000" http.request.id=a1999c79-eabb-422f-a12c-08cf9bea5b06 http.request.method=GET http.request.remoteaddr="10.129.0.1:60918" http.request.uri=/healthz http.request.useragent=kube-probe/1.12+ http.response.duration="65.181µs" http.response.status=200 http.response.written=0 time="2019-02-18T08:15:02.81390424Z" level=info msg=response go.version=go1.10.8 http.request.host="10.129.0.85:5000" http.request.id=479f4da4-287f-4bc1-bf0d-522d9690922e http.request.method=GET http.request.remoteaddr="10.129.0.1:60934" http.request.uri=/healthz http.request.useragent=kube-probe/1.12+ http.response.duration="88.186µs" http.response.status=200 http.response.written=0
There should be Proxy: NoProxy: master.hostname.example.com,10.1.0.0/16,172.30.0.0/16 so that the registry wouldn't use proxy when it tried to contact the master API.
Setting back to on_qa since it looks like this was a test case definition/execution problem.
Oleg,I want to registry talk to master via proxy, so I only set [1] without noProxy. Also I tried adding noProxy [2], but the build logs error change to [3], and the registry logs change to [4] [1] proxy: http: http://file.rdu.redhat.com:3128 https: http://file.rdu.redhat.com:3128 [2] proxy: http: http://file.rdu.redhat.com:3128 https: http://file.rdu.redhat.com:3128 noProxy: image-registry.openshift-image-registry.svc:5000,10.129.0.0/16,172.30.0.0/16 [3] $ oc build-logs ruby-ex-4 Command "build-logs" is deprecated, use oc logs build/<build-name> Cloning "https://github.com/sclorg/ruby-ex.git" ... Commit: c00ecd7c762590f1d52c316c7d00141a745ede18 (Merge pull request #25 from pvalena/master) Author: Honza Horak <hhorak> Date: Thu Dec 13 15:35:54 2018 +0100 Caching blobs under "/var/cache/blobs". Pulling docker://image-registry.openshift-image-registry.svc:5000/openshift/ruby@sha256:86b9e4119c4fff8ca5cf4b726249d1533602b6aca83b813fe10193cfbd222580 error: build error: 1 error occurred: * Error determining manifest MIME type for docker://image-registry.openshift-image-registry.svc:5000/openshift/ruby@sha256:86b9e4119c4fff8ca5cf4b726249d1533602b6aca83b813fe10193cfbd222580: Error reading manifest sha256:86b9e4119c4fff8ca5cf4b726249d1533602b6aca83b813fe10193cfbd222580 in image-registry.openshift-image-registry.svc:5000/openshift/ruby: received unexpected HTTP status: 500 Internal Server Error [4] time="2019-02-19T02:36:57.858955187Z" level=info msg="authorized request" go.version=go1.10.8 http.request.host="10.129.0.233:5000" http.request.id=b262a32a-87c2-4aca-a0e7-54a1c5b5908e http.request.method=GET http.request.remoteaddr="10.131.0.41:54096" http.request.uri=/extensions/v2/metrics http.request.useragent=Prometheus/2.7.1 openshift.auth.user="system:serviceaccount:openshift-monitoring:prometheus-k8s" time="2019-02-19T02:36:57.864811105Z" level=info msg="response completed" go.version=go1.10.8 http.request.host="10.129.0.233:5000" http.request.id=b262a32a-87c2-4aca-a0e7-54a1c5b5908e http.request.method=GET http.request.remoteaddr="10.131.0.41:54096" http.request.uri=/extensions/v2/metrics http.request.useragent=Prometheus/2.7.1 http.response.contenttype="text/plain; version=0.0.4" http.response.duration=24.927084ms http.response.status=200 http.response.written=3113 time="2019-02-19T02:36:57.864966109Z" level=info msg=response go.version=go1.10.8 http.request.host="10.129.0.233:5000" http.request.id=36fa074c-02fc-4a77-9804-c86aa1ba1ea7 http.request.method=GET http.request.remoteaddr="10.131.0.41:54096" http.request.uri=/extensions/v2/metrics http.request.useragent=Prometheus/2.7.1 http.response.contenttype="text/plain; version=0.0.4" http.response.duration=25.123414ms http.response.status=200 http.response.written=3113 time="2019-02-19T02:36:57.93629839Z" level=info msg="authorized request" go.version=go1.10.8 http.request.host="10.129.0.233:5000" http.request.id=7338885c-56d0-4373-89bd-0eb6bdaf8cf5 http.request.method=GET http.request.remoteaddr="10.129.2.41:35926" http.request.uri=/extensions/v2/metrics http.request.useragent=Prometheus/2.7.1 openshift.auth.user="system:serviceaccount:openshift-monitoring:prometheus-k8s" time="2019-02-19T02:36:57.942620951Z" level=info msg="response completed" go.version=go1.10.8 http.request.host="10.129.0.233:5000" http.request.id=7338885c-56d0-4373-89bd-0eb6bdaf8cf5 http.request.method=GET http.request.remoteaddr="10.129.2.41:35926" http.request.uri=/extensions/v2/metrics http.request.useragent=Prometheus/2.7.1 http.response.contenttype="text/plain; version=0.0.4" http.response.duration=86.525487ms http.response.status=200 http.response.written=3115 time="2019-02-19T02:36:57.942758493Z" level=info msg=response go.version=go1.10.8 http.request.host="10.129.0.233:5000" http.request.id=d5591e96-5a34-41b6-903b-7261b69e1414 http.request.method=GET http.request.remoteaddr="10.129.2.41:35926" http.request.uri=/extensions/v2/metrics http.request.useragent=Prometheus/2.7.1 http.response.contenttype="text/plain; version=0.0.4" http.response.duration=86.70635ms http.response.status=200 http.response.written=3115 time="2019-02-19T02:37:00.922262852Z" level=info msg=response go.version=go1.10.8 http.request.host="10.129.0.233:5000" http.request.id=7cd178a8-0b92-4ff2-9090-f7bcd7365e21 http.request.method=GET http.request.remoteaddr="10.129.0.1:45038" http.request.uri=/healthz http.request.useragent=kube-probe/1.12+ http.response.duration="48.773µs" http.response.status=200 http.response.written=0 time="2019-02-19T02:37:03.970172267Z" level=info msg=response go.version=go1.10.8 http.request.host="10.129.0.233:5000" http.request.id=f5ec9ab7-9f99-4a0c-8a1d-2fb03baeba35 http.request.method=GET http.request.remoteaddr="10.129.0.1:45094" http.request.uri=/healthz http.request.useragent=kube-probe/1.12+ http.response.duration="65.812µs" http.response.status=200 http.response.written=0 time="2019-02-19T02:37:10.922321844Z" level=info msg=response go.version=go1.10.8 http.request.host="10.129.0.233:5000" http.request.id=94936950-42b9-45bb-ab0a-5815ef9ff848 http.request.method=GET http.request.remoteaddr="10.129.0.1:45172" http.request.uri=/healthz http.request.useragent=kube-probe/1.12+ http.response.duration="64.259µs" http.response.status=200 http.response.written=0 time="2019-02-19T02:37:13.986080758Z" level=info msg=response go.version=go1.10.8 http.request.host="10.129.0.233:5000" http.request.id=cfc6f9a1-70c4-4ad3-b74f-90c6b74d398b http.request.method=GET http.request.remoteaddr="10.129.0.1:45220" http.request.uri=/healthz http.request.useragent=kube-probe/1.12+ http.response.duration="49.498µs" http.response.status=200 http.response.written=0 time="2019-02-19T02:37:20.922109127Z" level=info msg=response go.version=go1.10.8 http.request.host="10.129.0.233:5000" http.request.id=5d8419e5-a2c5-4452-9c5f-4437cbaf1b62 http.request.method=GET http.request.remoteaddr="10.129.0.1:45298" http.request.uri=/healthz http.request.useragent=kube-probe/1.12+ http.response.duration="63.377µs" http.response.status=200 http.response.written=0 time="2019-02-19T02:37:23.970517974Z" level=info msg=response go.version=go1.10.8 http.request.host="10.129.0.233:5000" http.request.id=df02b04b-0d82-4cbd-8929-61b8213f5f19 http.request.method=GET http.request.remoteaddr="10.129.0.1:45346" http.request.uri=/healthz http.request.useragent=kube-probe/1.12+ http.response.duration="75.278µs" http.response.status=200 http.response.written=0
In the registry log [4] there is no requests from the builder, only requests to /extensions/v2/metrics and /healthz (i.e. from monitoring and load balancer). Is it the full log? BTW, the error proxyconnect tcp: dial tcp: lookup file.rdu.redhat.com on 172.30.0.10:53: no such host means that file.rdu.redhat.com is not resolvable from within the pod. Can you rsh into the pod and check that the proxy is reachable using wget/curl/...? And I don't think we have an option to talk with the master API through the external IP address. In other words, the registry will always use something like 172.30.0.1, which is not accessible to external proxies.
Here is the full log of registry pod http://pastebin.test.redhat.com/717379 . My proxy setting: proxy: http: http://file.rdu.redhat.com:3128 https: http://file.rdu.redhat.com:3128 noProxy: ip-10-0-154-218.us-east-2.compute.internal,10.1.0.0/16,172.30.0.0/16 Indeed, in registry pod can't access my proxy file.rdu.redhat.com, so the operator treat file.rdu.redhat.com as invalid configure. If the promption message is accessable in registry pod and builds, I could verify this bug as workable. $oc rsh image-registry-74bf564f9d-4slrx curl: (6) Could not resolve host: file.rdu.redhat.com; Unknown error sh-4.2$ curl squid.redhat.com:3128 curl: (6) Could not resolve host: squid.redhat.com; Unknown error sh-4.2$ curl file.rdu.redhat.com:3128 curl: (6) Could not resolve host: file.rdu.redhat.com; Unknown error
You can try with another proxy. But at least we know that the registry tries to use a proxy server if spec.proxy is set.
After set invalid proxy, the registry could recognize spec.proxy setting, meantime the build and pod aren't running due to not resolve host. So mark this bug as verified. Thanks Oleg $ oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.0.0-0.nightly-2019-02-18-224151 True False 17h Cluster version is 4.0.0-0.nightly-2019-02-18-224151
Oleg, when I set proxy with correct noProxy option, if my build and pod could be running? But now you can see the build error in comment #7 [3]. proxy: http: http://invalid.rdu.redhat.com:3128 https: http://invalid.rdu.redhat.com:3128 noProxy: ip-10-0-154-218.us-east-2.compute.internal,10.1.0.0/16,172.30.0.0/16
Oleg, Thanks, I forget to remove the needinfo tag after get feedback from slack.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758