Bug 1367610

Summary: Registry token auth redirects to service IP
Product: OpenShift Container Platform
Reporter: Jordan Liggitt <jliggitt>
Component: Image Registry
Assignee: Jordan Liggitt <jliggitt>
Status: CLOSED NEXTRELEASE
QA Contact: Wei Sun <wsun>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.2.0
CC: aos-bugs, jliggitt, yinzhou, zzhao
Target Milestone: ---
Target Release: 3.2.1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-12-14 19:40:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jordan Liggitt 2016-08-17 00:51:56 UTC
Cloned from https://github.com/openshift/origin/issues/10193

Registry token auth redirects clients to a token endpoint on the same server. Currently, it uses the injected service host and port envvars to determine the redirected hostname.

That breaks in cases where the registry is reached via a route or other network method, and the client does not have visibility to the registry service IP.

To recreate:

1. Set up an OpenShift server and docker registry

2. Expose the registry via a route

3. From a machine that does not have visibility to the service IP, attempt to `docker login` to the route

The login fails with an error about an i/o timeout on the service IP (172.30....), since the service IP cannot be reached.
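
To illustrate why the client ends up on the service IP, here is a minimal sketch of the pre-fix behavior as described above. It is not the registry's actual code: the function name is hypothetical, the `https` scheme is assumed, and the variable names `DOCKER_REGISTRY_SERVICE_HOST` and `DOCKER_REGISTRY_SERVICE_PORT` assume a service named `docker-registry` and the usual Kubernetes service environment variable convention.

```python
import os


def realm_from_service_env(env=None):
    """Build the token URL from injected service env vars (hypothetical helper).

    Assumes the service is named docker-registry, so Kubernetes injects
    DOCKER_REGISTRY_SERVICE_HOST and DOCKER_REGISTRY_SERVICE_PORT.
    """
    env = os.environ if env is None else env
    host = env["DOCKER_REGISTRY_SERVICE_HOST"]
    port = env["DOCKER_REGISTRY_SERVICE_PORT"]
    # The scheme is assumed to be https for this illustration.
    return "https://{}:{}/openshift/token".format(host, port)
```

Whatever hostname the client used to reach the registry, a URL built this way always points at the service IP, which a client outside the cluster network cannot reach.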

Comment 1 Jordan Liggitt 2016-08-17 00:54:57 UTC
Fixed in https://github.com/openshift/origin/pull/10418

Redirect detects scheme and host from the incoming request.

For example:
  docker login https://$SERVICE_IP:5000
    would redirect to
  https://$SERVICE_IP:5000/openshift/token

  docker login https://$ROUTE_HOST
    would redirect to
  https://$ROUTE_HOST/openshift/token
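
A rough sketch of the idea, not the code from the pull request: the function name, the `is_tls` flag, and the way the scheme is chosen are assumptions for illustration. Only the `/openshift/token` path and the use of the incoming request's scheme and host come from the description above.

```python
def realm_from_request(host_header, is_tls, token_path="/openshift/token"):
    """Build the token URL from the incoming request (hypothetical helper).

    host_header: value of the request's Host header, port included if present.
    is_tls: whether the request arrived over TLS (assumed to be known to the caller).
    """
    scheme = "https" if is_tls else "http"
    return "{}://{}{}".format(scheme, host_header, token_path)
```

With this approach, a request sent to the route hostname gets a token URL on that same hostname, so the client never needs to reach the service IP.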

Comment 2 zhou ying 2016-08-17 10:33:43 UTC
I can reproduce this issue on AMI devenv-rhel7_4849; please see:

docker login -u zhouy -p 0xMxWLj8o2vsys1MXe6l2xl6edONvXSzRSIJSFEOXag -e dajkl registry-default.router.default.svc.cluster.local
Error response from daemon: no successful auth challenge for https://registry-default.router.default.svc.cluster.local/v2/ - errors: [Get https://172.30.62.42:5000/openshift/token?account=zhouy: dial tcp 172.30.62.42:5000: i/o timeout]

Comment 3 Jordan Liggitt 2016-08-17 12:01:31 UTC
How are you setting up the registry? Can you verify you are using the latest tag or the one built on that devenv, not the v1.3.0-alpha.3 tagged image of the registry?

Comment 4 zhou ying 2016-08-18 03:16:43 UTC
Checking with AMI devenv-rhel7_4858 and the latest images, the service IP no longer appears, but the login still does not succeed:

docker login -u zhouy -p ceGwphyZiKkrcXIip_t1bIF5Zl8THpXCW3kiAgGYDZs -e dald registry-default.router.default.svc.cluster.local:5000
Error response from daemon: invalid registry endpoint "http://registry-default.router.default.svc.cluster.local:5000/v0/". HTTPS attempt: unable to ping registry endpoint https://registry-default.router.default.svc.cluster.local:5000/v0/
v2 ping attempt failed with error: Get https://registry-default.router.default.svc.cluster.local:5000/v2/: dial tcp 52.91.180.161:5000: i/o timeout
 v1 ping attempt failed with error: Get https://registry-default.router.default.svc.cluster.local:5000/v1/_ping: dial tcp 52.91.180.161:5000: i/o timeout. HTTP attempt: unable to ping registry endpoint http://registry-default.router.default.svc.cluster.local:5000/v0/
v2 ping attempt failed with error: Get http://registry-default.router.default.svc.cluster.local:5000/v2/: dial tcp 52.91.180.161:5000: i/o timeout
 v1 ping attempt failed with error: Get http://registry-default.router.default.svc.cluster.local:5000/v1/_ping: dial tcp 52.91.180.161:5000: i/o timeout

Comment 5 Jordan Liggitt 2016-08-18 06:28:20 UTC
Debugged this on IRC some; I have a few thoughts:

1. The default router only exposes ports 80 and 443, so don't include `:5000` when doing `docker login` to a route hostname

2. After debugging, it looks like the IP that the route hostname resolves to cannot be connected to (by curl or by docker). We're never reaching the token redirect this BZ addressed.
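
A quick way to separate plain connectivity problems from the redirect issue is to test whether the route hostname accepts TCP connections on the router ports at all. The sketch below uses only the Python standard library; the function name and the timeout value are arbitrary choices for illustration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    DNS failures, refused connections, and timeouts all raise OSError
    subclasses, so they are all reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_connect(route_host, 443)` returning False for the route hostname would mean the login is failing before the registry's token redirect is ever reached.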

Comment 6 Jordan Liggitt 2016-08-19 14:44:35 UTC
On a cluster with a working router, I see this:

curl --cacert openshift.local.config/master/ca.crt https://docker-registry-default.router.default.svc.cluster.local/v2/ -v
* About to connect() to docker-registry-default.router.default.svc.cluster.local port 443 (#0)
*   Trying 172.30.172.105...
* Connected to docker-registry-default.router.default.svc.cluster.local (172.30.172.105) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: openshift.local.config/master/ca.crt
  CApath: none
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
* 	subject: CN=172.17.0.3
* 	start date: Aug 19 14:38:28 2016 GMT
* 	expire date: Aug 19 14:38:29 2018 GMT
* 	common name: 172.17.0.3
* 	issuer: CN=openshift-signer@1471614332
> GET /v2/ HTTP/1.1
> User-Agent: curl/7.29.0
> Host: docker-registry-default.router.default.svc.cluster.local
> Accept: */*
> 
< HTTP/1.1 401 Unauthorized
< Content-Type: application/json; charset=utf-8
< Docker-Distribution-Api-Version: registry/2.0
< Www-Authenticate: Bearer realm="https://docker-registry-default.router.default.svc.cluster.local/openshift/token"
< Date: Fri, 19 Aug 2016 14:43:02 GMT
< Content-Length: 87
< 
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":null}]}


The redirect uses the hostname, as expected.
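
The same check can be scripted. This is a minimal sketch that pulls the realm out of a `Www-Authenticate: Bearer` header like the one in the curl output above and compares its hostname with the hostname that was requested. The helper names are hypothetical and the parsing only handles a quoted `realm="..."` parameter.

```python
import re
from urllib.parse import urlsplit


def bearer_realm(header_value):
    """Return the realm URL from a Bearer challenge, or None if absent."""
    scheme, _, params = header_value.partition(" ")
    if scheme.lower() != "bearer":
        return None
    match = re.search(r'realm="([^"]*)"', params)
    return match.group(1) if match else None


def realm_matches_host(header_value, requested_host):
    """True if the realm's hostname equals the hostname the client requested.

    requested_host is expected without a port.
    """
    realm = bearer_realm(header_value)
    if realm is None:
        return False
    return urlsplit(realm).hostname == requested_host.lower()
```

A realm pointing at a 172.30.x.x address while the request went to a route hostname would indicate the pre-fix behavior.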

Can you verify that DNS resolution of the router hostnames is working in your environment using curl, then re-verify the docker login scenario?

Comment 7 Jordan Liggitt 2016-08-19 14:47:33 UTC
Also, the default router does not listen on port 5000; it only listens on port 80 (for non-TLS) and port 443 (for TLS), so don't add :5000 to the route hostname when doing docker login.

Comment 8 zhou ying 2016-08-22 09:32:25 UTC
Hi Jordan:

   How did you create the route for docker-registry?
I still get a timeout; please see:

curl --cacert /etc/origin/master/ca.crt https://docker-registry-default.0822-8ea.qe.rhcloud.com/v2/ -v
* About to connect() to docker-registry-default.0822-8ea.qe.rhcloud.com port 443 (#0)
*   Trying 52.207.230.77...
* Connection timed out
* Failed connect to docker-registry-default.0822-8ea.qe.rhcloud.com:443; Connection timed out
* Closing connection 0
curl: (7) Failed connect to docker-registry-default.0822-8ea.qe.rhcloud.com:443; Connection timed out



curl --cacert /etc/origin/master/ca.crt https://docker-registry-default.0822-8ea.qe.rhcloud.com:5000/v2/ -v
* About to connect() to docker-registry-default.0822-8ea.qe.rhcloud.com port 5000 (#0)
*   Trying 52.207.230.77...
* Connection timed out
* Failed connect to docker-registry-default.0822-8ea.qe.rhcloud.com:5000; Connection timed out
* Closing connection 0
curl: (7) Failed connect to docker-registry-default.0822-8ea.qe.rhcloud.com:5000; Connection timed out

Comment 9 zhou ying 2016-08-22 10:18:21 UTC
Hi Jordan:
   Please ignore my last comment. The route now works well, and I can 'docker login' through the route:

docker login -u zhouy -p VHS5lSc_ge5otslcA6akUrJKVL774xG2l_-fv4IXHYo -e dalda registry-default.0822-8ea.qe.rhcloud.com
WARNING: login credentials saved in /root/.docker/config.json
Login Succeeded

Comment 10 zhou ying 2016-09-01 08:57:51 UTC
Confirmed with fork_ami_openshift3_miminar_295; I cannot reproduce this issue.

Comment 12 zhou ying 2016-10-10 00:58:36 UTC
blocked by Bug 1381532

Comment 13 zhou ying 2016-10-12 07:50:43 UTC
Confirmed with OCP v3.2.1.16; the issue has been fixed:

openshift version
openshift v3.2.1.16
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5

[root@ip-172-18-11-91 origin]# oc get route
NAME       HOST/PORT                                  PATH      SERVICE           TERMINATION   LABELS
registry   registry-default.1012-e-n.qe.rhcloud.com             docker-registry   passthrough 

[root@dhcp-141-156 certs.d]# curl -u zhouy:oZBbeayD3ipPg9fWDgd1PMnXypWndJMWOCzXgKbzqh4   -k https://registry-default.1012-e-n.qe.rhcloud.com/v2/ 
{}

Comment 16 Scott Dodson 2016-12-14 19:40:28 UTC
This bug has been fixed in OCP 3.3; however, the fix will not be backported to OSE 3.2.