Bug 1984592 - global pull secret not working in OCP4.7.4+ for additional private registries
Summary: global pull secret not working in OCP4.7.4+ for additional private registries
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: ImageStreams
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Nichita Gutu
QA Contact: XiuJuan Wang
URL:
Whiteboard:
Duplicates: 2019710 2024856
Depends On:
Blocks: 2047331
Reported: 2021-07-21 17:49 UTC by daniel
Modified: 2022-03-12 04:36 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: When the global pull secret is updated, existing apiserver pods do not pick up the change. Consequence: Existing apiserver pods keep using the outdated pull secret. Fix: The mount point for the pull secret was changed from the /var/lib/kubelet/config.json file to the /var/lib/kubelet directory. Result: The updated pull secret now appears in existing apiserver pods.
Clone Of:
Environment:
Last Closed: 2022-03-12 04:36:01 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-openshift-apiserver-operator pull 485 0 None Merged Bug 1984592: global pull secret not working in OCP4.7.4+ for additio… 2022-03-22 17:28:41 UTC
Red Hat Knowledge Base (Solution) 6476871 0 None None None 2021-11-23 11:04:30 UTC
Red Hat Knowledge Base (Solution) 6527981 0 None None None 2021-11-23 11:05:56 UTC
Red Hat Product Errata RHSA-2022:0056 0 None None None 2022-03-12 04:36:22 UTC

Description daniel 2021-07-21 17:49:19 UTC
Description of problem:
After the global pull secret is updated with credentials for an additional private registry, image pulls from that registry still fail on OCP versions >= 4.7.4.

Version-Release number of selected component (if applicable):
>=4.7.4

How reproducible:


Steps to Reproduce:
1. Set up OCP 4.7.4 or later with the pull secret from cloud.redhat.com.
2. Once the installation has completed, add an HTTP auth provider.
3. Export the current secret:
$ oc get secret/pull-secret -n openshift-config --template='{{index .data ".dockerconfigjson" | base64decode}}' >pullsecret.orig
4. $ cp pullsecret.orig pull.json
5. Add the private registry and its secret to pull.json.
6. Re-import the secret:
$ oc set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=./pull.json
7. Wait for the rollout and verify it, e.g. via:
$ for node in `oc get no |awk -F " " '/Ready/ {print $1}'`; do oc debug node/$node -- chroot /host cat /var/lib/kubelet/config.json;done
8. Once the rollout of the secret is confirmed, try importing an image from the private repository:
$ oc import-image docker.io/damamo/private:2.0  --confirm
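
Steps 4 to 6 can be sketched as follows. This is a minimal sketch, not the reporter's exact procedure: the helper names `registry_auth` and `add_registry_auth` are hypothetical, the user name and password are placeholders, and it assumes that `jq` is installed and that the registry entry stores basic-auth credentials as base64 of `user:password` in its `auth` field.

```shell
# Hypothetical helper: build the value of the "auth" field,
# i.e. base64 of "user:password".
# tr strips the line wrapping that some base64 implementations add.
registry_auth() {
  printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# Hypothetical helper: print a copy of the dockerconfigjson file $1 with an
# entry for registry $2 added (or replaced), using user $3 and password $4.
# Requires jq; the input file itself is not modified.
add_registry_auth() {
  jq --arg reg "$2" --arg auth "$(registry_auth "$3" "$4")" \
    '.auths[$reg] = {auth: $auth}' "$1"
}
```

A possible use, with placeholder credentials, is `add_registry_auth pullsecret.orig docker.io myuser mypassword > pull.json`, followed by the `oc set data` command from step 6.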

Actual results:
In versions >= 4.7.4 the import fails with:
~~~
$ oc import-image docker.io/damamo/private:2.0  --confirm
error: tag 2.0 failed: you may not have access to the container image "docker.io/damamo/private:2.0"
imagestream.image.openshift.io/private imported with errors

Name:			private
Namespace:		bla
Created:		Less than a second ago
Labels:			<none>
Annotations:		openshift.io/image.dockerRepositoryCheck=2021-07-21T13:22:07Z
Image Repository:	image-registry.openshift-image-registry.svc:5000/bla/private
Image Lookup:		local=false
Unique Images:		0
Tags:			1

2.0
  tagged from docker.io/damamo/private:2.0

  ! error: Import failed (Unauthorized): you may not have access to the container image "docker.io/damamo/private:2.0"
      Less than a second ago


~~~

Expected results:
In versions < 4.7.4 the import succeeds:
~~~

$ oc import-image docker.io/damamo/private:2.0  --confirm
imagestream.image.openshift.io/private imported

Name:			private
Namespace:		admin-1
Created:		Less than a second ago
Labels:			<none>
Annotations:		openshift.io/image.dockerRepositoryCheck=2021-07-21T17:35:47Z
Image Repository:	image-registry.openshift-image-registry.svc:5000/admin-1/private
Image Lookup:		local=false
Unique Images:		1
Tags:			1

2.0
  tagged from docker.io/damamo/private:2.0

  * docker.io/damamo/private@sha256:8e4aef7e6fee9d8decf3920363822698cfbe3883101a7f538e9635ff93807dc3
      Less than a second ago

Image Name:	private:2.0
Docker Image:	docker.io/damamo/private@sha256:8e4aef7e6fee9d8decf3920363822698cfbe3883101a7f538e9635ff93807dc3
Name:		sha256:8e4aef7e6fee9d8decf3920363822698cfbe3883101a7f538e9635ff93807dc3
Created:	Less than a second ago
Annotations:	image.openshift.io/dockerLayersOrder=ascending
Image Size:	81.44MB in 14 layers
Layers:		50.83MB	sha256:2fa61fedb54d576e17d9129a27fbd3c1ff8503b1e0c45622ba8de6a51fb6a9ef
		32B	sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
		32B	sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
		32B	sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
		32B	sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
		32B	sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
		32B	sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
		32B	sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
		32B	sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
		32B	sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
		32B	sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
		32B	sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
		30.61MB	sha256:f1c43efdbe73995dd2fe0ca6da77f858d96b820ee8a3ecb34b51ff216c84f18a
		32B	sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
Image Created:	4 months ago
Author:		<none>
Arch:		amd64
Command:	/usr/bin/bash
Working Dir:	<none>
User:		<none>
Exposes Ports:	<none>
Docker Labels:	description=image from fedora minimal includinf fio tools for checking disk performance of master etcd disk
		io.k8s.description=This image is p[art of Phased Gates review done to validate OCP 4 installation sanity
		io.k8s.display-name=fio command image to samle etcd data for PG
		io.openshift.expose-services=
		io.openshift.tags=fio
		license=MIT
		maintainer=dmoessne
		name=quay.io/dmoessne/fedora-fio
		run=podman run -it --name NAME --privileged --ipc=host --net=host --pid=host -e HOST=/host -e NAME=NAME -e IMAGE=IMAGE -v /run:/run -v /var/log:/var/log -v /etc/machine-id:/etc/machine-id -v /etc/localtime:/etc/localtime -v /:/host IMAGE
		summary=fedora image including fio to chect etcd disk performance
		usage=podman container runlabel RUN fio --rw=write --ioengine=sync --fdatasync=1 --directory=/host/var/lib/etcd --size=40m --bs=500 --name=etcdspeedtest
		vendor=Fedora Project
		version=1.6
Environment:	DISTTAG=f33container
		FGC=f33
		container=oci
		PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

~~~


Additional info:
- When I install a version < 4.7.4 and update to a later version, even up to 4.7.19, it works, but new installs of >= 4.7.4 always fail.
- I always tried the import both as an admin and as a normal user, each in their own namespace.

Comment 1 daniel 2021-07-21 20:56:49 UTC
Additional info:
- After installing OCP 4.7.13 and updating the secret after the install, as above, importing an image from a private registry fails.
- However, an install of 4.7.13 that uses a modified secret from the start, i.e. with the additional private registry already added, succeeds in importing the image from the private registry.

Comment 2 Qi Wang 2021-08-19 03:41:53 UTC
@dmoessne Hi, could you provide the must-gather logs? I would like to see whether the logs reveal more about the error "you may not have access to the container image", such as which tool is responsible for import-image and which pull secret file that tool is using.
Thanks.

Comment 3 daniel 2021-08-19 16:16:19 UTC
@qiwan, sure, here you are:

$ oc version 
Client Version: 4.7.13
Server Version: 4.7.25
Kubernetes Version: v1.20.0+4593a24

Verify the pull secret really works (modified OCP pull secret with docker private registry secret added)
$ podman pull docker.io/damamo/private:2.0 --authfile /data/lab/aws/pull-test/pull.json
Trying to pull docker.io/damamo/private:2.0...
Getting image source signatures
Copying blob 2fa61fedb54d skipped: already exists  
Copying blob a3ed95caeb02 done  
Copying blob a3ed95caeb02 done  
Copying blob a3ed95caeb02 done  
Copying blob a3ed95caeb02 done  
Copying blob a3ed95caeb02 done  
Copying blob a3ed95caeb02 done  
Copying blob a3ed95caeb02 skipped: already exists  
Copying blob a3ed95caeb02 skipped: already exists  
Copying blob a3ed95caeb02 skipped: already exists  
Copying blob a3ed95caeb02 skipped: already exists  
Copying blob a3ed95caeb02 skipped: already exists  
Copying blob f1c43efdbe73 skipped: already exists  
Copying blob a3ed95caeb02 skipped: already exists  
Writing manifest to image destination
Storing signatures
bfc24d931cbc812c6c3f1864ae66e3ee3f267401c74a902a892ec36a605a2fac

$ oc set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=/data/lab/aws/pull-test/pull.json

validate secret is rolled out:
$ for node in `oc get no |awk -F " " '/Ready/ {print $1}'`; do oc debug node/$node -- chroot /host cat /var/lib/kubelet/config.json|grep docker;done
$ oc new-project test1
$ oc import-image docker.io/damamo/private:2.0  --confirm
~~~
Now using project "test1" on server "https://api.cluster.ocp4-csa.coe.muc.redhat.com:6443".

You can add applications to this project with the 'new-app' command. For example, try:

    oc new-app rails-postgresql-example

to build a new example application in Ruby. Or use kubectl to deploy a simple Kubernetes application:

    kubectl create deployment hello-node --image=k8s.gcr.io/serve_hostname

$ oc import-image docker.io/damamo/private:2.0  --confirm
error: tag 2.0 failed: you may not have access to the container image "docker.io/damamo/private:2.0"
imagestream.image.openshift.io/private imported with errors

Name:			private
Namespace:		test1
Created:		1 second ago
Labels:			<none>
Annotations:		openshift.io/image.dockerRepositoryCheck=2021-08-19T15:56:44Z
Image Repository:	<none>
Image Lookup:		local=false
Unique Images:		0
Tags:			1

2.0
  tagged from docker.io/damamo/private:2.0

  ! error: Import failed (Unauthorized): you may not have access to the container image "docker.io/damamo/private:2.0"
      1 second ago
~~~

--> I also tried this in the project openshift, with the same result

-------------------------------

I read somewhere that the secret for docker needs to be added as index.docker.io, so I repeated the above test with a changed secret. The result is still the same.


$ podman pull docker.io/damamo/private:2.0 --authfile /data/lab/aws/pull-test/pull2.json
Trying to pull docker.io/damamo/private:2.0...
Getting image source signatures
Copying blob 2fa61fedb54d skipped: already exists  
Copying blob a3ed95caeb02 done  
Copying blob a3ed95caeb02 done  
Copying blob a3ed95caeb02 done  
Copying blob a3ed95caeb02 done  
Copying blob a3ed95caeb02 done  
Copying blob a3ed95caeb02 done  
Copying blob a3ed95caeb02 skipped: already exists  
Copying blob a3ed95caeb02 skipped: already exists  
Copying blob a3ed95caeb02 skipped: already exists  
Copying blob a3ed95caeb02 skipped: already exists  
Copying blob a3ed95caeb02 skipped: already exists  
Copying blob f1c43efdbe73 skipped: already exists  
Copying blob a3ed95caeb02 .                                        
Writing manifest to image destination
Storing signatures
bfc24d931cbc812c6c3f1864ae66e3ee3f267401c74a902a892ec36a605a2fac

$ oc set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=/data/lab/aws/pull-test/pull2.json
$ for node in `oc get no |awk -F " " '/Ready/ {print $1}'`; do oc debug node/$node -- chroot /host cat /var/lib/kubelet/config.json|grep docker;done
$ oc new-project test2
$ oc import-image docker.io/damamo/private:2.0  --confirm
error: tag 2.0 failed: you may not have access to the container image "docker.io/damamo/private:2.0"
imagestream.image.openshift.io/private imported with errors

Name:			private
Namespace:		test2
Created:		Less than a second ago
Labels:			<none>
Annotations:		openshift.io/image.dockerRepositoryCheck=2021-08-19T16:01:03Z
Image Repository:	<none>
Image Lookup:		local=false
Unique Images:		0
Tags:			1

2.0
  tagged from docker.io/damamo/private:2.0

  ! error: Import failed (Unauthorized): you may not have access to the container image "docker.io/damamo/private:2.0"
      Less than a second ago

$



--> The must-gather is too big to attach; it can be found at https://drive.google.com/file/d/10OhwOAnJhwPtuCJcddWUhQJWFIgbH1Md/view?usp=sharing

Comment 4 Qi Wang 2021-08-25 20:39:23 UTC
Hi daniel,
According to the documentation [1], importing from a private registry requires creating a secret with `oc create secret`. I reproduced this issue on OpenShift 4.7: `oc set data` failed for me too, but creating the secret with the following command worked. Have you tried this command, and does it solve your issue?

$ oc create secret generic <secret_name> --from-file=.dockerconfigjson=<file_absolute_path> --type=kubernetes.io/dockerconfigjson


[1]https://docs.openshift.com/container-platform/4.7/openshift_images/image-streams-manage.html#images-imagestream-import-images-private-registry_image-streams-managing

Comment 5 daniel 2021-08-30 09:00:33 UTC
Hi Qi,

Thanks for pointing that out. However, I think it should also work the other way [1] (as implemented by [2]), by updating the global pull secret. As pointed out, this works up to 4.7.3; past that version it no longer works unless the altered secret is already used during the install.
To be very precise:

- install, e.g. OCP 4.7.2 and update secret afterwards as outlined in [1] works
- install OCP 4.7.4+ and update afterwards secret as outlined in [1] doesn't work
- install OCP 4.7.4+ with already changed global pull secret works

It is my understanding that what you mentioned is more a per namespace implementation compared to [1] which is for a global implementation.

[1] https://docs.openshift.com/container-platform/4.7/openshift_images/managing_images/using-image-pull-secrets.html#images-update-global-pull-secret_using-image-pull-secrets
[2] https://issues.redhat.com/browse/DEVEXP-521

Comment 6 Oleg Bulatov 2021-09-01 16:04:43 UTC
daniel, can you check that you don't have secrets for docker.io in your namespace?

Comment 7 Qi Wang 2021-09-03 20:32:08 UTC
Did not get completed this sprint.

Comment 8 daniel 2021-09-04 15:14:23 UTC
(In reply to Oleg Bulatov from comment #6)
> daniel, can you check that you don't have secrets for docker.io in your namespace?

Hi Oleg,

I am not entirely sure I understand this correctly. I think you mean that the project in which I am trying the pull must not contain any other docker.io-related secret (as it would override the global one), right?

If so: once the secret has been changed and re-imported, I create a completely new project:

~~~
$ oc set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=/data/lab/aws/pull-test/pull.json

validate secret is rolled out:
$ for node in `oc get no |awk -F " " '/Ready/ {print $1}'`; do oc debug node/$node -- chroot /host cat /var/lib/kubelet/config.json|grep docker;done
[...]
$ oc new-project test1

Now using project "test1" on server "https://api.cluster.ocp4-csa.coe.muc.redhat.com:6443".

You can add applications to this project with the 'new-app' command. For example, try:

    oc new-app rails-postgresql-example

to build a new example application in Ruby. Or use kubectl to deploy a simple Kubernetes application:

    kubectl create deployment hello-node --image=k8s.gcr.io/serve_hostname

$ oc import-image docker.io/damamo/private:2.0  --confirm
error: tag 2.0 failed: you may not have access to the container image "docker.io/damamo/private:2.0"
imagestream.image.openshift.io/private imported with errors

Name:			private
Namespace:		test1
Created:		1 second ago
Labels:			<none>
Annotations:		openshift.io/image.dockerRepositoryCheck=2021-08-19T15:56:44Z
Image Repository:	<none>
Image Lookup:		local=false
Unique Images:		0
Tags:			1

2.0
  tagged from docker.io/damamo/private:2.0

  ! error: Import failed (Unauthorized): you may not have access to the container image "docker.io/damamo/private:2.0"
      1 second ago
~~~

So there shouldn't be any additional secrets. I have also not added any other secrets in any other namespace. The only thing I was doing on a freshly installed cluster was adding a new one to the global pull secret.

I hope I got that right. If not, pls let me know.

Comment 9 Qi Wang 2021-09-11 02:31:14 UTC
This bug is probably caused by the fact that updating the global pull secret does not trigger a reboot. The Note in the documentation [1] points out that a pull-secret update does not lead to a reboot.
I reproduced the bug using the steps from the bug Description. After rebooting the nodes, `oc import-image` succeeded.

[1]https://docs.openshift.com/container-platform/4.7/openshift_images/managing_images/using-image-pull-secrets.html#images-update-global-pull-secret_using-image-pull-secrets

Comment 12 Oleg Bulatov 2021-11-23 11:04:31 UTC
*** Bug 2019710 has been marked as a duplicate of this bug. ***

Comment 14 Oleg Bulatov 2021-11-23 11:05:57 UTC
*** Bug 2024856 has been marked as a duplicate of this bug. ***

Comment 15 XiuJuan Wang 2021-12-06 05:55:13 UTC
After launching a cluster from this PR, the openshift-apiserver deployment did not pick up the change, and I could still reproduce the issue.
$ oc get deploy apiserver -o yaml |
volumes:
- hostPath:
    path: /var/lib/kubelet/config.json
    type: File
  name: node-pullsecrets
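
This report does not spell out why a pod that mounts /var/lib/kubelet/config.json as a single file keeps seeing the old pull secret. One common explanation, given here as an assumption rather than a confirmed root cause, is that the file on the node is replaced by writing a new file and renaming it over the old one, while a single-file mount stays attached to the original file. The sketch below imitates that with an open file descriptor standing in for the file mount; `demo_inode_swap` and the temporary paths are made up for illustration.

```shell
# Hypothetical demo: a handle on the original file keeps the old contents
# after the path is replaced by rename, while a fresh lookup through the
# directory sees the new file.
demo_inode_swap() {
  dir=$(mktemp -d) || return 1
  printf 'old\n' > "$dir/config.json"

  # Hold the original file open on fd 3 (stand-in for a single-file mount).
  exec 3< "$dir/config.json"

  # Replace the file the "atomic" way: write a new file, then rename it
  # over the old name. The path now refers to a different file.
  printf 'new\n' > "$dir/config.json.tmp"
  mv "$dir/config.json.tmp" "$dir/config.json"

  printf 'held file: %s\n' "$(cat <&3)"
  printf 'via directory: %s\n' "$(cat "$dir/config.json")"

  # Clean up the descriptor and the temporary directory.
  exec 3<&-
  rm -rf "$dir"
}
```

Running `demo_inode_swap` prints `held file: old` followed by `via directory: new`. If the assumption holds, this is consistent with the fix described in Doc Text, which mounts the /var/lib/kubelet directory instead of the single file.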

Comment 16 Oleg Bulatov 2021-12-13 11:34:35 UTC
The PR has been updated.

Comment 17 XiuJuan Wang 2021-12-15 05:49:11 UTC
The openshift-apiserver pods cannot start and fail with the error below, which causes the installation to fail.
$ oc get pods -n openshift-apiserver
NAME                         READY   STATUS     RESTARTS   AGE
apiserver-5dd8696666-tsm8p   0/2     Init:0/1   0          102m
apiserver-5f7c97d6f6-dsdvv   0/2     Init:0/1   0          104m
apiserver-68c99d45c9-vvs8f   0/2     Init:0/1   0          103m

 Warning  FailedMount       98m                    kubelet            Unable to attach or mount volumes: unmounted volumes=[node-pullsecrets], unattached volumes=[serving-cert kube-api-access-8kvqd config image-import-ca trusted-ca-bundle audit-dir encryption-config node-pullsecrets etcd-client etcd-serving-ca audit]: timed out waiting for the condition
  Warning  FailedMount       4m39s (x47 over 100m)  kubelet            MountVolume.SetUp failed for volume "node-pullsecrets" : hostPath type check failed: /var/lib/kubelet/ is not a file

Comment 21 XiuJuan Wang 2022-01-13 03:36:04 UTC
1. Export the current secret:
$ oc get secret/pull-secret -n openshift-config --template='{{index .data ".dockerconfigjson" | base64decode}}' >pullsecret.orig
2. $ cp pullsecret.orig pull.json
3. Modify the registry and secret entries in pull.json.
4. Re-import the secret:
$ oc set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=./pull.json
5. Wait for the rollout and verify it, e.g. via:
$ for node in `oc get no |awk -F " " '/Ready/ {print $1}'`; do oc debug node/$node -- chroot /host cat /var/lib/kubelet/config.json;done
6. Diff the secrets; they are the same:
$ oc -n openshift-apiserver rsh apiserver-XXXXnnnn-xxxx cat /var/lib/kubelet/config.json | jq '.auths."registry.redhat.io".auth'  > apipod-pullsecret
$ diff pull.json apipod-pullsecret
Importing an image from the private registry then succeeds.

Verified on 4.10.0-0.nightly-2022-01-11-065245 version
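
The check in step 6 can also be run across all apiserver pods. The following is a rough sketch under assumptions: the function name `apiserver_pods_with_registry` is hypothetical, `oc exec` against each pod's default container is used in place of the `oc rsh` call above, and a fixed-string `grep` for the quoted registry name stands in for real JSON parsing.

```shell
# Hypothetical helper: for every pod in openshift-apiserver, report whether
# the pull secret visible inside the pod mentions registry $1.
apiserver_pods_with_registry() {
  registry=$1
  for pod in $(oc -n openshift-apiserver get pods -o name); do
    # Read the pull secret as the pod sees it and look for the quoted
    # registry name, e.g. "docker.io".
    if oc -n openshift-apiserver exec "$pod" -- \
         cat /var/lib/kubelet/config.json 2>/dev/null |
       grep -qF "\"$registry\""; then
      printf '%s: present\n' "$pod"
    else
      printf '%s: missing\n' "$pod"
    fi
  done
}
```

For example, `apiserver_pods_with_registry docker.io` prints one line per pod. A pod reported as `missing` after the node-level check in step 5 has passed would point at the stale-mount behavior this bug describes.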

Comment 28 errata-xmlrpc 2022-03-12 04:36:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

