Bug 1403908

Summary: docker pull cannot use registries with authentication
Product: [Fedora] Fedora
Reporter: Sara Cavallari <sara.c>
Component: docker
Assignee: Antonio Murdaca <amurdaca>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 25
CC: adimania, admiller, amurdaca, dwalsh, ichavero, jcajka, jchaloup, lsm5, marianne, mark, matt, miminar, nalin, riek, scot, vbatts
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: docker-1.12.5-3.git079fbe3.fc25
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-12-31 06:49:51 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Sara Cavallari 2016-12-12 15:41:43 UTC
Description of problem:

"docker pull" cannot use registries with authentication, it always fails.

Version-Release number of selected component (if applicable):
docker-2:1.12.3-12.git97974ae.fc25

How reproducible:
always

Steps to Reproduce:
1. docker login [some registry which requires authentication] 
this succeeds (Login successful)
2. docker pull [some image in the registry]

Actual results:
unauthorized: The client does not have permission for manifest

Expected results:
docker fetches images successfully

Additional info:
Both a hosted registry (using Artifactory) and Google Container Registry did not work with Fedora docker package, but worked when using testing packages from 
http://yum.dockerproject.org/repo/testing/fedora/25/
(see https://docs.docker.com/engine/installation/linux/fedora/ )

Comment 1 Antonio Murdaca 2016-12-12 16:13:45 UTC
Do you have a registry on GCR or Artifactory already up and running that I can test Docker against? I need a registry I can reproduce this against, and so far I'm not able to reproduce it.
Also, please update your Docker package from the official Fedora updates-testing repository and re-test this.

Comment 2 Antonio Murdaca 2016-12-12 17:16:21 UTC
Another thing to test would be to test if GCR and Artifactory work w/o authentication. I fear it's not fully related to authentication.

Comment 3 Antonio Murdaca 2016-12-15 10:12:29 UTC
GCR works fine:

$ docker push gcr.io/oceanic-granite-152609/busyboxtest
The push refers to a repository [gcr.io/oceanic-granite-152609/busyboxtest]
e88b3f82283b: Pushed
latest: digest: sha256:1ded4559e2aab2ab3464aae5170ef64afc15bea324a0861b543e2c56a3f29711 size: 527

$ docker push gcr.io/oceanic-granite-152609/alpine  
The push refers to a repository [gcr.io/oceanic-granite-152609/alpine]
011b303988d2: Pushed
latest: digest: sha256:1b6c543cc889c8f5f8d7061ddd3941b04568f725960e95896a2fbc06311fa4c0 size: 528


Did you log in correctly to GCR, i.e. `gcloud auth login` and `docker login -e 1234 -u oauth2accesstoken -p "$(gcloud auth print-access-token)" https://gcr.io`? Are you sure you're not using the wrong project-id to push the image?

Can't say anything about Artifactory though. I don't have any test registry to try with, if you can provide one I'll test it out.

For now, this issue is either fixed in latest docker-1.12.x in F25 or you did something wrong.

Closing, feel free to comment if you have a test registry for Artifactory, I would be happy to test.

$ rpm -qa | grep docker
docker-1.12.4-2.git1b5971a.fc25.x86_64
docker-novolume-plugin-1.12.4-2.git1b5971a.fc25.x86_64
docker-common-1.12.4-2.git1b5971a.fc25.x86_64
python3-docker-py-1.10.6-1.fc25.noarch
python2-dockerfile-parse-0.0.5-7.fc25.noarch
python-docker-py-1.10.6-1.fc25.noarch
python2-docker-pycreds-0.2.1-2.fc25.noarch
python3-docker-pycreds-0.2.1-2.fc25.noarch
docker-v1.10-migrator-1.12.4-2.git1b5971a.fc25.x86_64

Comment 4 Scot Loach 2016-12-22 03:25:20 UTC
I am having a similar problem pulling images from a google cloud registry after upgrading to Fedora 25.

I get the following result:
Trying to pull repository gcr.io/<repo>/<image> ... 
unauthorized: Permission denied for "latest" from request "/v2/<repo>/<image>/manifests/latest". 


Versions:

Client:
 Version:         1.12.5
 API version:     1.24
 Package version: docker-common-1.12.5-1.git6009905.fc25.x86_64
 Go version:      go1.7.4
 Git commit:      6009905/1.12.5
 Built:           Fri Dec 16 09:26:10 2016
 OS/Arch:         linux/amd64

Server:
 Version:         1.12.5
 API version:     1.24
 Package version: docker-common-1.12.5-1.git6009905.fc25.x86_64
 Go version:      go1.7.4
 Git commit:      6009905/1.12.5
 Built:           Fri Dec 16 09:26:10 2016
 OS/Arch:         linux/amd64



This worked fine before the upgrade.

Comment 5 matt@cobe.io 2016-12-22 13:15:00 UTC
I'd just like to add my experience to this.

As above, but I wanted to emphasize that only PULLING fails; I can push images fine to GCR. I've been through and set up gcloud and docker on a new Ubuntu and Fedora VM; and Ubuntu works fine, Fedora fails.

Comment 6 Antonio Murdaca 2016-12-22 13:21:08 UTC
Can you all please describe step by step how to reproduce this? Otherwise, none of us will be able to understand and reproduce your issue.

I've been testing this with a GCR registry and everything worked fine for me with the latest docker release. So, either you made some mistakes setting up GCR authentication or there's a bug. But I can't find it if I cannot reproduce and stating that "it's not working" doesn't help either :)

So please, post a step-by-step reproducer.

Comment 7 matt@cobe.io 2016-12-22 13:42:53 UTC
Sorry, I'm afraid there's nothing more to reproduce than as you've said (except pulling, not pushing).

The `gcloud auth login` and `docker login -u ...` steps work fine for me; and pushing images works fine.

If I do figure anything more out I'll update you.

Comment 8 Antonio Murdaca 2016-12-22 13:48:26 UTC
(In reply to matt from comment #7)
> Sorry, I'm afraid there's nothing more to reproduce than as you've said
> (except pulling, not pushing).
> 
> The `gcloud auth login` and `docker login -u ...` steps work fine for me;
> and pushing images works fine.
> 
> If I do figure anything more out I'll update you.

Does it happen for you _all the time_? Can you check whether the image on GCR is a V1 or a V2 docker image?

Last thing, if you can _consistently_ reproduce this against your GCR repo, please add "--signature-verification=false" to $OPTIONS in /etc/sysconfig/docker, restart the daemon, try again to pull, and let me know what happens.
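
A minimal sketch of that edit, assuming /etc/sysconfig/docker contains a single-line, single-quoted `OPTIONS='...'` assignment (that layout is an assumption; check your own file first). The helper name `add_daemon_option` is hypothetical, and it only transforms text: writing the file back and restarting the daemon are left to you.

```python
import re

FLAG = "--signature-verification=false"


def add_daemon_option(sysconfig_text, flag=FLAG):
    """Return sysconfig_text with `flag` appended to the OPTIONS='...' line.

    Assumes a single-line, single-quoted OPTIONS assignment. The text is
    returned unchanged if the flag is already present or no such line exists.
    """
    if flag in sysconfig_text:
        return sysconfig_text
    # Match only the OPTIONS line; trailing spaces/tabs are dropped,
    # but the newline itself is left untouched.
    pattern = re.compile(r"^(OPTIONS=')([^']*)(')[ \t]*$", re.MULTILINE)

    def _append(match):
        existing = match.group(2).strip()
        joined = f"{existing} {flag}" if existing else flag
        return f"{match.group(1)}{joined}{match.group(3)}"

    return pattern.sub(_append, sysconfig_text, count=1)
```

After saving the result back to /etc/sysconfig/docker, restart the daemon (for example with `systemctl restart docker`) before retrying the pull.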

Comment 9 Scot Loach 2016-12-22 13:52:31 UTC
Happens all the time for me and for a colleague who also upgraded to 25.
Pushing works fine, pulling never works.

Make sure you are using a private gcr repo.

I believe it's a v2 but not exactly sure how to tell.

I tried your new option and that seems to have worked around the problem for me, I am now able to pull from the repo.

Comment 10 Antonio Murdaca 2016-12-22 13:55:57 UTC
(In reply to Scot Loach from comment #9)
> Happens all the time for me and for a colleague who also upgraded to 25.
> Pushing works fine, pulling never works.
> 
> Make sure you are using a private gcr repo.
> 
> I believe it's a v2 but not exactly sure how to tell.
> 
> I tried your new option and that seems to have worked around the problem for
> me, I am now able to pull from the repo.

Great. Now I need to know more about your registries on GCR, because if I set up a brand new GCR registry I cannot reproduce or understand the issue. I _may_ know what is causing this (hence that option to the docker daemon), but to fully fix it I need to know more about the GCR setup. Could either of you provide a shell on a system I can test this on?

Comment 11 Antonio Murdaca 2016-12-22 14:51:47 UTC
Found the bug, going to rebuild docker in F25, stay tuned.

Comment 12 Antonio Murdaca 2016-12-22 14:54:38 UTC
For the record, the fix is here https://github.com/containers/image/pull/191

Comment 13 Fedora Update System 2016-12-22 15:33:45 UTC
docker-1.12.5-2.gite330732.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-d2249ce42f

Comment 14 Fedora Update System 2016-12-22 16:43:42 UTC
docker-1.12.5-3.git079fbe3.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-2f28826f76

Comment 15 Fedora Update System 2016-12-23 14:52:53 UTC
docker-1.12.5-3.git079fbe3.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-2f28826f76

Comment 16 Fedora Update System 2016-12-31 06:49:51 UTC
docker-1.12.5-3.git079fbe3.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

Comment 17 Mark Mielke 2017-01-17 12:31:57 UTC
I hit this problem with Fedora 25 as well. It did not happen in Fedora 24, and it does not happen on any of our RHEL 7 machines. I came to the conclusion that it was related to Docker 1.12, as I have seen it work on many versions of Docker including 1.8, 1.9, 1.10, and I believe 1.11. The Fedora 24 machine is using 1.10.3-55 and working. The RHEL 7 machines are mostly 1.10.3 or 1.9.1.

The machines giving me trouble are the Fedora 25 clients running 1.12.5 and now 1.12.6-4.gitf499e8b.fc25.x86_64. The symptom is that a "docker pull" against a private Artifactory repository results in this output:

unauthorized: The client does not have permission for manifest

On the server-side (Nginx as transparent reverse proxy), the problem scenario looks like this:

10.179.128.15 - - [17/Jan/2017:07:12:52 -0500] "GET /v2/ HTTP/1.1" 401 77 "-" "docker/1.12.6 go/go1.7.4 kernel/4.9.3-200.fc25.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.12.6 \x5C(linux\x5C))"
10.179.128.15 - - [17/Jan/2017:07:12:52 -0500] "GET /v2/ HTTP/1.1" 401 77 "-" "docker/1.12.6 go/go1.7.4 kernel/4.9.3-200.fc25.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.12.6 \x5C(linux\x5C))"
10.179.128.15 - - [17/Jan/2017:07:12:52 -0500] "GET /v2/nginx/manifests/nginx-1.11.8-20161229 HTTP/1.1" 403 169 "-" "docker/1.12.6 go/go1.7.4 kernel/4.9.3-200.fc25.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.12.6 \x5C(linux\x5C))"
10.179.128.15 - - [17/Jan/2017:07:12:52 -0500] "GET /v2/nginx/manifests/nginx-1.11.8-20161229 HTTP/1.1" 403 169 "-" "docker/1.12.6 go/go1.7.4 kernel/4.9.3-200.fc25.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.12.6 \x5C(linux\x5C))"


The server-side for a working scenario, with Fedora 24 in this case, looks like:

10.179.128.11 - - [17/Jan/2017:07:20:35 -0500] "GET /v2/ HTTP/1.1" 401 77 "-" "docker/1.10.3 go/go1.6.3 kernel/4.8.16-200.fc24.x86_64 os/linux arch/amd64"
10.179.128.11 - mmielke [17/Jan/2017:07:20:36 -0500] "GET /v2/token?account=<snip>&scope=<snip> HTTP/1.1" 200 102 "-" "docker/1.10.3 go/go1.6.3 kernel/4.8.16-200.fc24.x86_64 os/linux arch/amd64"
10.179.128.11 - - [17/Jan/2017:07:20:36 -0500] "GET /v2/nginx/manifests/nginx-1.11.8-20161229 HTTP/1.1" 200 1138 "-" "docker/1.10.3 go/go1.6.3 kernel/4.8.16-200.fc24.x86_64 os/linux arch/amd64"
10.179.128.11 - - [17/Jan/2017:07:20:36 -0500] "GET /v2/nginx/blobs/sha256:bfff7d7419abb257acbe1971a4614c8ea6a0103e328570c2ed0ca460c0346861 HTTP/1.1" 200 7544 "-" "docker/1.10.3 go/go1.6.3 kernel/4.8.16-200.fc24.x86_64 os/linux arch/amd64"
10.179.128.11 - - [17/Jan/2017:07:20:36 -0500] "GET /v2/nginx/blobs/sha256:21b0401f7da7f362458bf1c30d6096d49c9c11bf267a8dce9266e2e24aa0baec HTTP/1.1" 200 41174332 "-" "docker/1.10.3 go/go1.6.3 kernel/4.8.16-200.fc24.x86_64 os/linux arch/amd64"

The main difference appears to be that in the "working" case, it handles the first /v2/ ping result of HTTP 401 with an attempt to fetch a token and login as me (= mmielke). Once it has this token, it must present it for future requests and these then succeed with HTTP 200 and all is well.

In the problem case, it seems to be retrying the /v2/ ping without basic auth credentials and getting the same HTTP 401 result. It then proceeds to fetch the manifests file and fails with HTTP 403 because, without credentials, Artifactory is not going to share the manifests file or the image layers.
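
To make the difference between the two cases concrete, here is a rough sketch of the exchange the working client appears to perform: ping /v2/, get 401, fetch a token with Basic auth, then present it as a Bearer token. This is only an illustration, not Docker's code. `pull_manifest`, `make_fake_registry`, the `send` callable and the token value are all made-up names, and the token endpoint is assumed to return the raw token as its body (a real one returns a structured response).

```python
import base64


def pull_manifest(send, registry, repo, tag, token_url, user, password):
    """Fetch a manifest, doing the token exchange when /v2/ answers 401.

    `send(url, headers)` is a caller-supplied transport that returns
    (status, body); it stands in for a real HTTP client.
    """
    headers = {}
    status, _ = send(f"https://{registry}/v2/", headers)
    if status == 401:
        # Exchange the stored login for a bearer token using Basic auth.
        basic = base64.b64encode(f"{user}:{password}".encode()).decode()
        status, token = send(token_url, {"Authorization": f"Basic {basic}"})
        if status != 200:
            raise PermissionError("token request rejected")
        # Simplification: the body is treated as the raw token string.
        headers = {"Authorization": f"Bearer {token}"}
    return send(f"https://{registry}/v2/{repo}/manifests/{tag}", headers)


def make_fake_registry(log):
    """Build an in-memory stand-in for a registry that requires a token."""
    def send(url, headers):
        auth = headers.get("Authorization", "-")
        log.append((url, auth.split(" ")[0]))  # record only the auth scheme
        if url.endswith("/v2/"):
            return 401, ""
        if url.endswith("/v2/token"):
            return (200, "tok123") if auth.startswith("Basic ") else (401, "")
        return (200, "manifest") if auth == "Bearer tok123" else (403, "")
    return send
```

Against the fake registry, a manifest request without an Authorization header gets 403, which is the pattern in the Fedora 25 log; going through `pull_manifest` produces the 401, token, 200 sequence seen in the Fedora 24 log.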

I remember it briefly working for a window - but I believe the window may be related to "docker login", and it may be timing related. I remember coming to the conclusion that the Docker daemon might persist the connection after "docker login" and pass the same cookies through again which might have allowed it to work briefly but not stay working.

I presumed this must be a wider problem affecting other people, so I moved on to another problem, hoping it would be gone when I came back to this. Unfortunately, it is not gone, and this led me to this issue.

I think the problem is not fixed. Does anybody else have results for or against here?

Is there anything particular I can check to help diagnose the issue? Unfortunately, our Artifactory Enterprise instance is behind a firewall, so I cannot just let you try it out yourself...

Comment 18 Mark Mielke 2017-01-17 13:14:12 UTC
Some more detailed logging, showing "Authorization:" and "WWW-Authenticate:"...

Fedora 25 / Docker 1.12.6-4:

10.179.128.4 - - [17/Jan/2017:08:06:20 -0500] TLSv1.2/ECDHE-RSA-AES128-GCM-SHA256 "Authorization: -" "WWW-Authenticate: Bearer realm=\x22https://prism.ciena.com/v2/token\x22,service=\x22prism.ciena.com\x22,scope=\x22repository:docker-prism.ciena.com-v2-local:pull,push\x22" 0.001s "GET /v2/ HTTP/1.1" 401 512 "-" "docker/1.12.6 go/go1.7.4 kernel/4.9.3-200.fc25.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.12.6 \x5C(linux\x5C))"
10.179.128.4 - - [17/Jan/2017:08:06:20 -0500] TLSv1.2/ECDHE-RSA-AES128-GCM-SHA256 "Authorization: -" "WWW-Authenticate: Bearer realm=\x22https://prism.ciena.com/v2/token\x22,service=\x22prism.ciena.com\x22,scope=\x22repository:docker-prism.ciena.com-v2-local:pull,push\x22" 0.001s "GET /v2/ HTTP/1.1" 401 512 "-" "docker/1.12.6 go/go1.7.4 kernel/4.9.3-200.fc25.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.12.6 \x5C(linux\x5C))"
10.179.128.4 - - [17/Jan/2017:08:06:20 -0500] TLSv1.2/ECDHE-RSA-AES128-GCM-SHA256 "Authorization: -" "WWW-Authenticate: -" 0.003s "GET /v2/nginx/manifests/nginx-1.11.8-20161229 HTTP/1.1" 403 483 "-" "docker/1.12.6 go/go1.7.4 kernel/4.9.3-200.fc25.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.12.6 \x5C(linux\x5C))"
10.179.128.4 - - [17/Jan/2017:08:06:20 -0500] TLSv1.2/ECDHE-RSA-AES128-GCM-SHA256 "Authorization: -" "WWW-Authenticate: -" 0.001s "GET /v2/nginx/manifests/nginx-1.11.8-20161229 HTTP/1.1" 403 483 "-" "docker/1.12.6 go/go1.7.4 kernel/4.9.3-200.fc25.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.12.6 \x5C(linux\x5C))"

Fedora 24 / Docker 1.10.3-55:

10.179.128.11 - - [17/Jan/2017:08:06:57 -0500] TLSv1.2/ECDHE-RSA-AES128-GCM-SHA256 "Authorization: -" "WWW-Authenticate: Bearer realm=\x22https://prism.ciena.com/v2/token\x22,service=\x22prism.ciena.com\x22,scope=\x22repository:docker-prism.ciena.com-v2-local:pull,push\x22" 0.001s "GET /v2/ HTTP/1.1" 401 512 "-" "docker/1.10.3 go/go1.6.3 kernel/4.8.16-200.fc24.x86_64 os/linux arch/amd64"
10.179.128.11 - mmielke [17/Jan/2017:08:06:59 -0500] TLSv1.2/ECDHE-RSA-AES128-GCM-SHA256 "Authorization: Basic ...base 64 encoded password..." "WWW-Authenticate: -" 1.636s "GET /v2/token?account=mmielke&scope=repository%3Anginx%3Apull&service=prism.ciena.com HTTP/1.1" 200 409 "-" "docker/1.10.3 go/go1.6.3 kernel/4.8.16-200.fc24.x86_64 os/linux arch/amd64"
10.179.128.11 - - [17/Jan/2017:08:06:59 -0500] TLSv1.2/ECDHE-RSA-AES128-GCM-SHA256 "Authorization: Bearer ... base 64 encoded bearer token ..." "WWW-Authenticate: -" 0.012s "GET /v2/nginx/manifests/nginx-1.11.8-20161229 HTTP/1.1" 200 1894 "-" "docker/1.10.3 go/go1.6.3 kernel/4.8.16-200.fc24.x86_64 os/linux arch/amd64"

This clearly shows that Docker 1.10.3-55 is using Basic auth when obtaining the bearer token, while Docker 1.12.6-4 apparently couldn't be bothered? :-)

Could it be the format of the "WWW-Authenticate" response that is causing Docker to ignore it? Or is this something to do with private repositories that don't end in "docker.com"?
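
For reference, the challenge itself looks well-formed once the `\x22` escapes in the nginx log are read as double quotes. A small sketch of how a client could parse it and build the token request follows. The function names are hypothetical, and the regex only handles the simple quoted `key="value"` form seen in the log above, not every header a server could legally send.

```python
import re
from urllib.parse import urlencode


def parse_bearer_challenge(header):
    """Split a `Bearer k="v",k="v"` challenge into a dict of parameters."""
    scheme, _, rest = header.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError(f"unsupported auth scheme: {scheme!r}")
    # Commas inside a quoted value (e.g. "pull,push") stay part of the value.
    return dict(re.findall(r'(\w+)="([^"]*)"', rest))


def token_url(challenge, account, scope):
    """Build the token request URL from the challenge's realm and service."""
    params = parse_bearer_challenge(challenge)
    query = urlencode(
        {"account": account, "scope": scope, "service": params["service"]}
    )
    return f'{params["realm"]}?{query}'
```

Fed the challenge from the log together with account `mmielke` and scope `repository:nginx:pull`, this yields the same /v2/token request that the Fedora 24 client made, so the header format on its own does not look like the obstacle.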

Comment 19 Antonio Murdaca 2017-01-17 13:21:09 UTC
Mark, could you open another BZ with your debug output and assign it to me (amurdaca)?

Comment 20 Antonio Murdaca 2017-01-17 13:27:16 UTC
Also, any chance you can test this (now-old) update in your F25 box and tell me if that works? https://bodhi.fedoraproject.org/updates/FEDORA-2016-2f28826f76

Comment 21 Mark Mielke 2017-01-17 13:52:27 UTC
Ok, Antonio. I opened: 1413987 . I don't think I have "assign" privileges, so you will have to do this yourself.

I will see if I can check the update and report results on the new issue.