Bug 1561989

Summary: Secret created by `oc create secret docker-registry` cannot pull image from docker.io private registry
Product: OpenShift Container Platform Reporter: Xingxing Xia <xxia>
Component: Node Assignee: Valentin Rothberg <vrothber>
Node sub component: Kubelet QA Contact: Weinan Liu <weinliu>
Status: CLOSED WONTFIX Docs Contact:
Severity: urgent    
Priority: urgent CC: adamk, akhaire, amcdermo, aos-bugs, bmchugh, dakini, dwalsh, gburges, haowang, hgomes, jfiala, jkaur, jnovy, jokerman, kjartan.paulsen, kmendez, ksalunkh, malonso, maszulik, maupadhy, mmccomas, mvardhan, openshift-bugs-escalate, pkanthal, pstrick, public, rheinzma, rhowe, rsunog, sjenning, trogers, wmeng, wzheng, xtian, yufchang
Version: 3.10.0 Keywords: OnlineStarter, Regression, Reopened
Target Milestone: ---   
Target Release: 3.10.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1578088 1600539 Environment:
Last Closed: 2019-07-22 20:48:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1578088, 1600539    

Description Xingxing Xia 2018-03-29 10:00:49 UTC
Description of problem:
Secret created by `oc create secret docker-registry` cannot pull an image from an external registry. (It can, however, pull another user's image from the internal registry docker-registry.default.svc:5000.)

Version-Release number of selected component (if applicable):
v3.10.0-0.15.0 (oc v3.10.0-0.15.0 against server v3.10.0-0.15.0)
(Tried OCP v3.9.14 also reproduces)

How reproducible:
Always

Steps to Reproduce:
$ MYPASSWORD=**** MYEMAIL=****@qq.com
$ oc create secret docker-registry mydocker --docker-server=docker.io --docker-username=starxia --docker-password=$MYPASSWORD --docker-email=$MYEMAIL
$ oc secrets link default mydocker --for=pull
$ oc new-app docker.io/starxia/myprivate:hello-openshift --name myapp
$ oc edit dc myapp # Make sure imagePullPolicy is 'Always' in case node already has the image.
$ oc get pod -w

Actual results:
`oc get pod -w` shows:
myapp-1-cvlw6    0/1       ImagePullBackOff   0          8s

Expected results:
Pod should be running

Additional info:
Tried the above steps in an environment running a different version; there the pod reaches Running.

Comment 1 Xingxing Xia 2018-04-08 06:49:24 UTC
Searched https://github.com/openshift/origin/pulls; there seems to be no PR for this bug yet. Could you open one? It currently blocks verification of bug 1561996, which is done through the web console.
Thanks

Comment 2 Xingxing Xia 2018-04-10 09:35:20 UTC
Adding keyword TestBlocker because this blocks verification of bug 1561996, which in turn blocks acceptance of feature card https://trello.com/c/XzkI9of3/

Comment 3 Maciej Szulik 2018-04-10 12:47:00 UTC
I've double-checked with the latest master version (oc v3.10.0-alpha.0+09f841f-626) and this is working as expected.

Comment 4 Wang Haoran 2018-04-11 06:53:46 UTC
(In reply to Maciej Szulik from comment #3)
> I've double-checked with the latest master version (oc
> v3.10.0-alpha.0+09f841f-626) and this is working as expected.

Hi Maciej, could you please explain why this is working as expected? What should we do if we want to deploy a private image for now?

Comment 5 Maciej Szulik 2018-04-11 08:44:35 UTC
The reason I moved it to QA and said it's working is that the steps described in comment 1 worked just fine for me. After creating the secret and linking it to the default SA, I was able to deploy a private image from Docker Hub without any problems.

Comment 6 Maciej Szulik 2018-04-11 17:07:59 UTC
It looks like the secret format is correct, since the image importer was able to pick it up and import image metadata properly. I've tried both with index.docker.io and just docker.io, as well as creating secret as described in comment 1 and from a docker login created file. In all cases the latest master on an AWS VM failed to run the image. I've verified this with Juan who was able to successfully run the image locally (same as me). 

Since this is stepping into the Pod team's territory, I'm moving it to them: it looks like the kubelet is not picking up the secrets properly.

Comment 34 Ben Parees 2018-04-24 12:56:19 UTC
*** Bug 1565944 has been marked as a duplicate of this bug. ***

Comment 37 Ben Parees 2018-04-25 14:28:19 UTC
*** Bug 1565944 has been marked as a duplicate of this bug. ***

Comment 47 Seth Jennings 2018-05-08 21:30:11 UTC
Opened this as a WIP carry PR to see if it will pass the tests
https://github.com/openshift/origin/pull/19656

This bug is related to https://bugzilla.redhat.com/show_bug.cgi?id=1518378

Whatever we end up doing, it needs to go back to 3.9 as well.

Comment 50 Brendan Mchugh 2018-05-10 14:56:37 UTC
A customer is reporting the same behaviour in an OpenShift Online Pro cluster; reproduced locally with:
oc v3.9.25
kubernetes v1.9.1+a0ce1bc657

Comment 51 Stefanie Forrester 2018-05-10 23:34:55 UTC
The syntax for storing docker secrets has changed and caused some customer confusion. Here's where we mention it in the release notes:

https://docs.openshift.com/container-platform/3.9/release_notes/ocp_3_9_release_notes.html#ocp-39-several-oc-secrets-subcommands-now-deprecated

Here's an example of a secret that I was able to successfully create in 3.9. So the customer will probably have better luck using this syntax:

oc create secret generic <secret-name> --from-file=.dockerconfigjson=/root/.docker/config.json --type=kubernetes.io/dockerconfigjson -o=json -n <project>
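
For readers without a local `docker login` file, a secret of the same `kubernetes.io/dockerconfigjson` type can be assembled by hand. The following is a minimal sketch in Python, not the method used by `oc`; the function name `make_pull_secret` and the placeholder credentials are hypothetical. It relies only on the standard Docker config layout, in which `auth` is the base64 encoding of `username:password`.

```python
import base64
import json


def make_pull_secret(name, server, username, password):
    """Build a kubernetes.io/dockerconfigjson Secret manifest as a dict."""
    # Docker stores basic-auth credentials as base64("username:password").
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    docker_config = {"auths": {server: {"auth": auth}}}
    # Values under a Secret's "data" key must themselves be base64-encoded.
    payload = base64.b64encode(json.dumps(docker_config).encode()).decode()
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {".dockerconfigjson": payload},
    }


if __name__ == "__main__":
    # Placeholder credentials (hypothetical); substitute real ones before use.
    secret = make_pull_secret("mydocker", "docker.io", "exampleuser", "examplepass")
    print(json.dumps(secret, indent=2))
```

The printed JSON could then be passed to `oc create -f -` in the target project.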

Comment 52 Kjartan Ivar Paulsen 2018-05-14 17:09:01 UTC
This is not a complete solution; at best it is a workaround. The same issue happens with image registry secrets created in an OpenShift Online Pro cluster. If this were the solution, OpenShift Online Pro would also need an update to follow the new 3.9 syntax.

Comment 53 Seth Jennings 2018-05-14 19:03:42 UTC
https://github.com/openshift/origin/pull/19656 merged.

Cloning bz for 3.9 backport.

Comment 54 Kjartan Ivar Paulsen 2018-05-14 19:49:37 UTC
I also tried the new syntax as a workaround, but it did not work for me on Windows 10 (using PowerShell). The credentials are stored in the credential store. See below (the credential store is also explained here: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/):

PS C:\Users\KjartanIvar> docker login docker.io
Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head over to https://hub.docker.com to create one.
Username (kipdev):
Password:
Login Succeeded
PS C:\Users\KjartanIvar> cat c:\\Users\\KjartanIvar\\.docker\\config.json
{
        "auths": {
                "https://index.docker.io/v1/": {}
        },
        "HttpHeaders": {
                 "User-Agent": "Docker-Client/17.09.1-ce (windows)"
        },
        "credsStore": "wincred"
}
PS C:\Users\KjartanIvar> oc create secret generic dockeriohub2 --from-file=.dockerconfigjson=c:\Users\KjartanIvar\.docker\config.json --type=kubernetes.io/dockerconfigjson -o=json -n ebc
{
    "kind": "Secret",
    "apiVersion": "v1",
    "metadata": {
        "name": "dockeriohub2",
        "namespace": "ebc",
        "selfLink": "/api/v1/namespaces/ebc/secrets/dockeriohub2",
        "uid": "17e7bc63-57a9-11e8-ab87-123713f594ec",
        "resourceVersion": "290365728",
        "creationTimestamp": "2018-05-14T19:00:39Z"
    },
    "data": {
        ".dockerconfigjson": "ewoJImF1dGhzIjogewoJCSJodHRwczovL2luZGV4LmRvY2tlci5pby92MS8iOiB7fQoJfSwKCSJIdHRwSGVhZGVycyI6IHsKCQkiVXNlci1BZ2VudCI6ICJEb2NrZXItQ2xpZW50LzE3LjA5LjEtY2UgKHdpbmRvd3MpIgoJfSwKCSJjcmVkc1N0b3JlIjogIndpbmNyZWQiCn0="
    },
    "type": "kubernetes.io/dockerconfigjson"
}

The revealed secret contains only the credential store name "wincred"; there is no username or password in it.

---
dockeriohub2 created 28 minutes ago
kubernetes.io/dockerconfigjson Hide Secret

https://index.docker.io/v1/
No username and password.
Credentials Store

wincred

There are no annotations on this resource.
--- 
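
A config.json in this state can be detected before it is turned into a secret. The sketch below is illustrative only: the helper name `registries_without_credentials` is hypothetical, and it assumes that usable inline credentials are either an `auth` token or a `username`/`password` pair inside each `auths` entry. The sample input is the config.json shown above.

```python
import json


def registries_without_credentials(config_text):
    """Return registries in a Docker config.json that carry no inline credentials."""
    config = json.loads(config_text)
    missing = []
    for registry, entry in config.get("auths", {}).items():
        # Inline credentials are either an "auth" token or a username/password pair.
        has_auth = bool(entry.get("auth"))
        has_userpass = bool(entry.get("username")) and bool(entry.get("password"))
        if not (has_auth or has_userpass):
            missing.append(registry)
    return missing


# The config.json from this comment, written by Docker with a credential store.
WINCRED_CONFIG = """
{
    "auths": {"https://index.docker.io/v1/": {}},
    "HttpHeaders": {"User-Agent": "Docker-Client/17.09.1-ce (windows)"},
    "credsStore": "wincred"
}
"""

if __name__ == "__main__":
    # Lists every registry entry that has no credentials stored inline.
    print(registries_without_credentials(WINCRED_CONFIG))
    # Shows which external credential store, if any, holds them instead.
    print(json.loads(WINCRED_CONFIG).get("credsStore"))
```

A non-empty result together with a `credsStore` value suggests the credentials live outside the file, so a secret built from it would carry nothing the cluster can use.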

I tried to add this secret to the service accounts, but I get the same issue. 
See events:

---
9:45:41 PM 	Warning 	Failed  	Error: ImagePullBackOff
47 times in the last 6 minutes
9:41:05 PM 	Normal 	Sandbox Changed  	Pod sandbox changed, it will be killed and re-created.
7 times in the last 6 minutes
9:41:05 PM 	Normal 	Back-off  	Back-off pulling image "docker.io/kipdev/ebc@sha256:ecfcb18a43c3d589c67193b977b20a23d4f14a7746ba9da0ab7a3501f481803e"
5 times in the last 6 minutes
9:40:54 PM 	Warning 	Failed  	Failed to pull image "docker.io/kipdev/ebc@sha256:ecfcb18a43c3d589c67193b977b20a23d4f14a7746ba9da0ab7a3501f481803e": rpc error: code = Unknown desc = repository docker.io/kipdev/ebc not found: does not exist or no pull access
2 times in the last 6 minutes
9:40:54 PM 	Warning 	Failed  	Error: ErrImagePull
2 times in the last 6 minutes
9:40:53 PM 	Normal 	Pulling  	pulling image "docker.io/kipdev/ebc@sha256:ecfcb18a43c3d589c67193b977b20a23d4f14a7746ba9da0ab7a3501f481803e"
2 times in the last 6 minutes
9:40:39 PM 	Normal 	Successful Mount Volume  	MountVolume.SetUp succeeded for volume "default-token-42hfg"
9:40:39 PM 	Normal 	Scheduled  	Successfully assigned ebctest3-1-55xcj to ip-172-31-51-9.ec2.internal 
---

Comment 55 Xingxing Xia 2018-05-29 01:58:59 UTC
Verified in both OCP and free-int (oc and server version v3.10.0-0.53.0, which includes the above PR): the issue has been fixed. Following the steps reported in this bug, a private image from external docker.io can now be pulled and the pod reaches Running. Please move to ON_QA.

Comment 58 Xingxing Xia 2018-06-05 08:18:52 UTC
Verified again in OCP v3.10.0-0.58.0, bug is fixed. The fix is not causing bug 1583500 per https://bugzilla.redhat.com/show_bug.cgi?id=1583500#c4 , thus moving to VERIFIED

Comment 62 Ryan Howe 2018-08-16 19:54:54 UTC
The following steps worked for me as a workaround for this issue.

# oc version
oc v3.10.14
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://openshift.internal.test:443
openshift v3.10.14
kubernetes v1.10.0+b81c8f8

# docker version
Client:
 Version:         1.13.1
 API version:     1.26
 Package version: docker-1.13.1-68.gitdded712.el7.x86_64
 Go version:      go1.9.2
 Git commit:      dded712/1.13.1
 Built:           Tue Jun 12 18:30:09 2018
 OS/Arch:         linux/amd64

Server:
 Version:         1.13.1
 API version:     1.26 (minimum version 1.12)
 Package version: docker-1.13.1-68.gitdded712.el7.x86_64
 Go version:      go1.9.2
 Git commit:      dded712/1.13.1
 Built:           Tue Jun 12 18:30:09 2018
 OS/Arch:         linux/amd64
 Experimental:    false

-------------------------------------------------------

1. Add the following line to /etc/sysconfig/docker 

```
  ADD_REGISTRY='--add-registry docker.io --add-registry registry.access.redhat.com'
```
 - Did not work when I added docker.io to [registries.search] in /etc/containers/registries.conf


# systemctl restart docker 


2. Create secret 

# rm ~/.docker/config.json
# docker login -u $USER -p $PASS docker.io

# oc create secret generic dockertest \
    --from-file=.dockerconfigjson="$HOME/.docker/config.json" \
    --type=kubernetes.io/dockerconfigjson


3. Create pod

# oc create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: sleep-test-pod
spec:
  containers:
  - name: sleep-test
    image: docker.io/rhowe/test-fed28
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]
  imagePullSecrets:
  - name: test
  restartPolicy: Never
EOF

# oc get pods -o wide 
sleep-test-pod              1/1       Running   0          8s        10.1.0.118   openshift.internal.test

Comment 63 Ryan Howe 2018-08-17 15:06:59 UTC
*** EDIT  ***

Adding docker.io to [registries.search] in /etc/containers/registries.conf did work; not sure where I screwed up before.

Also, step 3 was copied incorrectly.

The secret name is dockertest, not test:

  imagePullSecrets:
  - name: dockertest

Comment 64 Ryan Howe 2018-08-17 15:17:34 UTC
I discovered that for the workaround to work, docker.io needs to be listed first among the registries. If it is not first in the list, we hit this issue.


Works: 
# docker info | grep "^Registries"
Registries: docker.io (secure), registry.access.redhat.com (secure), docker.io (secure)


Does not Work with Openshift Pull secret: 
# docker info | grep "^Registries"
Registries: registry.access.redhat.com (secure), docker.io (secure), docker.io (secure)
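
The ordering condition can be checked mechanically from the `docker info` output. This is a rough sketch that assumes the "Registries:" line has the comma-separated form shown in the two samples above; the function names are hypothetical.

```python
def registry_search_order(info_line):
    """Parse the 'Registries:' line of `docker info` into an ordered list of hosts."""
    _, _, rest = info_line.partition("Registries:")
    hosts = []
    for item in rest.split(","):
        fields = item.split()
        if fields:
            # The first field is the host; the rest is a flag such as "(secure)".
            hosts.append(fields[0])
    return hosts


def docker_io_is_first(info_line):
    """True when docker.io is the first registry in the search order."""
    order = registry_search_order(info_line)
    return bool(order) and order[0] == "docker.io"


if __name__ == "__main__":
    working = "Registries: docker.io (secure), registry.access.redhat.com (secure), docker.io (secure)"
    failing = "Registries: registry.access.redhat.com (secure), docker.io (secure), docker.io (secure)"
    print(docker_io_is_first(working))
    print(docker_io_is_first(failing))
```

Run against the two lines above, the first check passes and the second does not, matching the observed behaviour.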

Comment 65 hgomes 2018-08-22 16:36:01 UTC
This seems to be hitting OpenShift Online too.

$ oc new-project hgomes
$ oc create secret docker-registry hevs-secret --docker-server=docker.io --docker-username=songbird159 --docker-password=XXXXX --docker-email=hevs159
$ oc secrets link default hevs-secret --for=pull
$ oc new-app docker.io/songbird159/test:latest --name test
$ oc describe pod test-1-ms5sv

  Warning  Failed                 13s               kubelet, ip-172-31-61-174.eu-west-1.compute.internal  Failed to pull image "songbird159/test@sha256:aaea76ff622d2f8bcb32e538e7b3cd0ef6d291953f3e7c9f556c1ba5baf47e2e": rpc error: code = Unknown desc = repository docker.io/songbird159/test not found: does not exist or no pull access
  Warning  Failed                 13s               kubelet, ip-172-31-61-174.eu-west-1.compute.internal  Error: ErrImagePull
  Normal   BackOff                4s (x3 over 11s)  kubelet, ip-172-31-61-174.eu-west-1.compute.internal  Back-off pulling image "songbird159/test@sha256:aaea76ff622d2f8bcb32e538e7b3cd0ef6d291953f3e7c9f556c1ba5baf47e2e"
  Warning  Failed                 4s (x3 over 11s)  kubelet, ip-172-31-61-174.eu-west-1.compute.internal  Error: ImagePullBackOff
  Normal   SandboxChanged         3s (x4 over 13s)  kubelet, ip-172-31-61-174.eu-west-1.compute.internal  Pod sandbox changed, it will be killed and re-created.
  Normal   Pulling                0s (x2 over 18s)  kubelet, ip-172-31-61-174.eu-west-1.compute.internal  pulling image "songbird159/test@sha256:aaea76ff622d2f8bcb32e538e7b3cd0ef6d291953f3e7c9f556c1ba5baf47e2e"

--
Do we have any workarounds for OpenShift Online customers?

Comment 67 Maciej Szulik 2018-09-17 10:30:03 UTC
*** Bug 1629021 has been marked as a duplicate of this bug. ***

Comment 68 Wenjing Zheng 2018-09-20 08:55:00 UTC
(In reply to Ryan Howe from comment #64)
> I discovered that for the workaround to work, docker.io needs to be listed
> first among the registries. If it is not first in the list, we hit this
> issue.
> 
> 
> Works: 
> # docker info | grep "^Registries"
> Registries: docker.io (secure), registry.access.redhat.com (secure),
> docker.io (secure)
> 
> 
> Does not Work with Openshift Pull secret: 
> # docker info | grep "^Registries"
> Registries: registry.access.redhat.com (secure), docker.io (secure),
> docker.io (secure)

I tried this workaround, but it still failed:
[root@preserve-devexp-cluster-test ~]# oc get builds
ruby-hello-world-1   Source    Git@7ccd324   Failed (PullBuilderImageFailed)   7 minutes ago    5s
ruby-hello-world-2   Source    Git@7ccd324   Failed (PullBuilderImageFailed)   5 minutes ago    5s

[root@preserve-devexp-cluster-test ~]# docker info | grep "^Registries"
  WARNING: You're not using the default seccomp profile
Registries: docker.io (secure), registry.redhat.io (secure), registry.access.redhat.com (secure), docker.io (secure)

I just added ADD_REGISTRY='--add-registry docker.io' locally, not in the OCP cluster. Is this what you mean?

Comment 79 kedar 2019-01-07 06:53:02 UTC
Hello,

Are there any updates on this issue?

Regards,
Kedar Salunkhe

Comment 102 Robert Heinzmann 2020-10-16 10:31:09 UTC
According to my tests, another mitigation for the "deployment from private dockerhub repo with 3.11" issue mentioned here, is to switch the deployment to image streams with "pull through" [1] (referencePolicy = local). 

In this case the pulling is done by the registry, using credentials in the namespace the ImageStream resides in, so this bug is not hit. Deployments therefore work if a docker secret for docker.io has been created in that namespace. Please be aware of https://access.redhat.com/solutions/2301081 [2], which describes another fix needed to connect to Docker Hub with authentication.
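
As a rough illustration of the mitigation, the sketch below generates an ImageStream manifest whose tag uses `referencePolicy` type `Local`. The project name, stream name and image reference are hypothetical placeholders, and the field layout reflects my understanding of the ImageStream API; it should be checked against the 3.11 documentation in [1] before use.

```python
import json


def pullthrough_image_stream(name, namespace, tag, external_image):
    """Build an ImageStream manifest whose tag is served through the internal registry."""
    return {
        "apiVersion": "image.openshift.io/v1",
        "kind": "ImageStream",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "tags": [
                {
                    "name": tag,
                    # The external (private) image the tag tracks.
                    "from": {"kind": "DockerImage", "name": external_image},
                    # "Local" makes consumers pull via the internal registry.
                    "referencePolicy": {"type": "Local"},
                }
            ]
        },
    }


if __name__ == "__main__":
    # All names below are placeholders (hypothetical), not values from this bug.
    manifest = pullthrough_image_stream(
        "myprivate", "myproject", "latest", "docker.io/exampleuser/myprivate:latest"
    )
    print(json.dumps(manifest, indent=2))
```

The deployment would then reference the image stream tag rather than the docker.io image directly.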

---
[1] https://docs.openshift.com/container-platform/3.11/install_config/registry/extended_registry_configuration.html#middleware-repository-pullthrough
[2] https://access.redhat.com/solutions/2301081