Description of problem:
OpenShift builds generate their own registries.conf config that is independent of the node host.

Version-Release number of selected component (if applicable):
4.2.0

How reproducible:
Always

Steps to Reproduce:
1. Install an OpenShift cluster in a disconnected environment, with quay.io's fedora/fedora images mirrored
2. Run a Docker strategy build with a Dockerfile that starts with `FROM quay.io/fedora/fedora:latest`

Actual results:
The build fails because fedora/fedora:latest cannot be pulled from the public quay.io registry.

Expected results:
The build can pull quay.io/fedora/fedora:latest from the mirror.
openshift-controller-manager generates a `registries.conf` file that is mounted into build pods via a ConfigMap [1]. We need to do the following:

1. Watch for updates to the cluster ImageContentSourcePolicy [2].
2. Migrate our representation of `registries.conf` to the V2 `registries.conf` format [3].
3. Use the image content policy data to set the mirror registries.

[1] https://github.com/openshift/openshift-controller-manager/blob/master/pkg/build/controller/build/build_controller.go#L2026-L2056
[2] https://github.com/openshift/api/blob/master/operator/v1alpha1/types_image_content_source_policy.go#L56-L67
[3] https://github.com/containers/image/blob/master/pkg/sysregistriesv2/system_registries_v2.go#L137-L141
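For reference, a minimal sketch of what a V2-format registries.conf with one mirror entry looks like, based on the sysregistriesv2 types linked in [3]. The hostnames here are hypothetical, chosen only to match the bug's fedora/fedora scenario:

```toml
# Hypothetical sketch of a V2-format registries.conf with a single mirror.
unqualified-search-registries = ["docker.io"]

[[registry]]
  prefix = ""
  location = "quay.io/fedora/fedora"
  # Only digest (sha256) references are redirected to the mirror.
  mirror-by-digest-only = true

  [[registry.mirror]]
    location = "mirror.example.com:5000/fedora/fedora"
```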
/cc Miloslav and Nalin
PRs have merged ... bot must have missed this ... manually moving to modified
Since the disconnected-env install is blocked by a bug, we will verify this bug once we are able to install.
(In reply to wewang from comment #7)
> [root@Desktop test]# oc image mirror quay.io/drahnr/fedora:latest
> warning: Layer size mismatch for sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4: had 0, wrote 32
> warning: Layer size mismatch for sha256:0be2a68855d7bbbba01b447a79c873f137e6fb47362e79f2fd79c72575c9b73a: had 0, wrote 89867780

This is apparently a bug in the mirroring code: the source image uses schema1, which does not contain blob sizes. It seems harmless (cosmetic only), though.

> error: unable to push manifest to mirror-registry.qe.devcluster.openshift.com:5000/openshift/fedora1:latest: received unexpected HTTP status: 500 Internal Server Error

Yeah, that’s not very helpful. Maybe the registry log contains the actual error cause.

----

> info: Mirroring completed in 46.74s (1.922MB/s)
> error: one or more errors occurred while uploading images
>
> So I used the steps below to test:
> 1. docker tag quay.io/drahnr/fedora:latest mirror-registry.qe.devcluster.openshift.com:5000/openshift/fedora2:latest
> 2. docker push mirror-registry.qe.devcluster.openshift.com:5000/openshift/fedora2:latest

This is pretty likely to convert that schema1 image to schema2, and possibly change the manifest digest for other reasons. Use (skopeo copy docker://quay.io/drahnr/fedora:latest docker://mirror-registry.qe.devcluster.openshift.com:5000/openshift/fedora2:latest) instead.

> $oc new-build -D $'FROM quay.io/drahnr/fedora:latest\nRUN yum install -y

OOPS; mirroring is configured to only apply to digest references. This is not going to use the mirror anyway. If the image is correctly mirrored, using a digest reference (FROM quay.io/drahnr/fedora@sha256:5562f951443b829832cfc603eebc0057d5e23b2448db3192f7024dbb06abac04) should work. That would locally test that the fixes work correctly, but it’s not going to be all that helpful for ordinary use.
OK, I'm going to try to summarize a discussion that has been going on in #warroom-disconnected (some of which has been discussed previously in the context of https://bugzilla.redhat.com/show_bug.cgi?id=1741391):

1) Per Miloslav: mirrors are always set up with MirrorByDigestOnly, and that completely breaks FROM image:latest in Dockerfiles. OpenShift installations don’t mind because they always use digest references, but that’s not really reasonable for builds. The idea of MirrorByDigestOnly supposedly was that we don’t want to risk having several mirrors out of sync, but breaking builds to get that seems like a pretty wrong trade-off.

2) So that means either:
a) we try again to get Oleg's https://bugzilla.redhat.com/show_bug.cgi?id=1741391 in. But we don't want to do that at this point: Oleg's work here is complicated, and is still at risk for 4.2.
b) QE changes the scenario so the build references any input images via an image reference that uses a SHA and the mirror registry, i.e. no use of imagestream references.

3) The changes this bug's PRs revolved around updating the registries.conf used by the build, pulling in the ICSP mirror config.

So all that means: if we take Wen's test of

oc new-build -D $'FROM quay.io/drahnr/fedora:latest\nRUN yum install -y httpd' --strategy=docker

and 'fedora:latest' is changed to 'fedora:<sha reference>', then the new registries.conf should pick up the ICSP mirror definitions and pull from the quay.io image mirror that was previously set up.

Wen - change your test case in this fashion, and we'll go from there. At this time, we don't want to block on
1) run with build loglevel 5 (oc start-build foo --build-loglevel=5)
2) collect the buildconfig yaml, build yaml, pod yaml, imagestream yaml, and imagecontentsourcepolicy yaml
3) collect the build logs
(In reply to Ben Parees from comment #39)
> 1) run with build loglevel 5 (oc start-build foo --build-loglevel=5)
> 2) collect the buildconfig yaml, build yaml, pod yaml, imagestream yaml, and imagecontentsourcepolicy yaml
> 3) collect the build logs.

Here's the info: http://pastebin.test.redhat.com/799074
Line 32 of your pastebin indicates that this buildconfig is going to use "docker.io/nodeshift/centos7-s2i-nodejs:latest" to substitute the FROM line of your dockerfile (which is exactly what the logs show it doing). So it's definitely not utilizing your mirror, and this also means your cluster does have access to docker.io.

  strategy:
    dockerStrategy:
      from:
        kind: DockerImage
        name: docker.io/nodeshift/centos7-s2i-nodejs:latest

  Pulling image docker.io/nodeshift/centos7-s2i-nodejs ...
  STEP 1: FROM docker.io/nodeshift/centos7-s2i-nodejs

It's not clear to me *why* the buildconfig is being constructed that way (something to do with new-app, I assume, if that's how you're creating it), but a valid test would require you to remove the "from" section of the dockerStrategy from the buildconfig, or change the value of the name to:

docker.io/nodeshift/centos7-s2i-nodejs@sha256:eea192da5dc21ddfbfbc1a1947ecb3c73e074e2d9516e5bed7ce66015464cce9

instead of

docker.io/nodeshift/centos7-s2i-nodejs:latest

But of course none of that matters if the cluster isn't actually disconnected from docker.io.
Sending back to QE to re-validate this, since it seems like the validation performed wasn't correct.

That said, if this fails QE, we are not going to hold 4.2 for it. We'll move it to 4.3 and backport to 4.2.z as needed, if a code change is needed.
Sorry for the confusion. We figured out why it passed in the GCP disconnected cluster: a proxy was added to BuildDefaults to make github.com accessible. After we removed the proxy, the build fails because it cannot pull the mirror image even with the mirror rule defined. So moving this bug back to assigned : (
@Wenjing has provided me access to their clusters. If today I can:
a) find the projects and existing mirror images they have set up,
b) confirm their ICSP objects have properly set up the mirror,
c) and find some of the existing build configs they have tried,

then I will attempt to change the build configs so they do *NOT* override the dockerfile FROM with an imagestream, and instead use an image:sha reference, as I described in https://bugzilla.redhat.com/show_bug.cgi?id=1745192#c32 and as Ben reiterated in https://bugzilla.redhat.com/show_bug.cgi?id=1745192#c41

As I mentioned to Ben/Adam in slack yesterday, and as he noted in https://bugzilla.redhat.com/show_bug.cgi?id=1745192#c41, the key at this point is this line in the build log:

Pulling image docker.io/nodeshift/centos7-s2i-nodejs ...

The line needs to contain the sha of the image that was mirrored.

If that works, we can decide whether:
a) we leave this in 4.2, *I* mark this Verified, and we update the release notes / docs to clarify this need wrt builds ... even if the env is not disconnected, we've validated the change the PR for this bug was introducing; or
b) we move to 4.3, and get QE to try those precise steps above in a truly disconnected env, if that is what is deemed required.

If for some reason that does not work, then as Ben noted, we'll address this in 4.3 and backport to 4.2.x, and add the release note about the general issue with builds in disconnected environments.
OK, using the AWS kubeconfig provided, and looking at @Wen's attempt in project wewang1, with the build config ruby-22-centos7, I'll start dumping various artifacts.

Here are the ICSPs. You'll see an entry for docker.io/wewang58/ruby-22-centos7 in there. I don't have the expertise to fully know if it is correct, but it seems OK.

gmontero ~/QE_bzs/disconnected $ oc get imagecontentsourcepolicy --all-namespaces -o yaml
apiVersion: v1
items:
- apiVersion: operator.openshift.io/v1alpha1
  kind: ImageContentSourcePolicy
  metadata:
    creationTimestamp: "2019-09-20T01:38:51Z"
    generation: 1
    name: image-policy-0
    resourceVersion: "435"
    selfLink: /apis/operator.openshift.io/v1alpha1/imagecontentsourcepolicies/image-policy-0
    uid: 6611d381-db47-11e9-b240-021d1471e24a
  spec:
    repositoryDigestMirrors:
    - mirrors:
      - ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/ocp/release
      source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
- apiVersion: operator.openshift.io/v1alpha1
  kind: ImageContentSourcePolicy
  metadata:
    creationTimestamp: "2019-09-20T01:38:51Z"
    generation: 1
    name: image-policy-1
    resourceVersion: "436"
    selfLink: /apis/operator.openshift.io/v1alpha1/imagecontentsourcepolicies/image-policy-1
    uid: 662e9c12-db47-11e9-b240-021d1471e24a
  spec:
    repositoryDigestMirrors:
    - mirrors:
      - ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/ocp/release
      source: registry.svc.ci.openshift.org/ocp/release
- apiVersion: operator.openshift.io/v1alpha1
  kind: ImageContentSourcePolicy
  metadata:
    creationTimestamp: "2019-09-20T02:57:07Z"
    generation: 1
    name: image-policy-centos
    resourceVersion: "30937"
    selfLink: /apis/operator.openshift.io/v1alpha1/imagecontentsourcepolicies/image-policy-centos
    uid: 54ffa7de-db52-11e9-8952-02b2fa52eb60
  spec:
    repositoryDigestMirrors:
    - mirrors:
      - ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/wewang58/ruby-22-centos7
      source: docker.io/wewang58/ruby-22-centos7
- apiVersion: operator.openshift.io/v1alpha1
  kind: ImageContentSourcePolicy
  metadata:
    creationTimestamp: "2019-09-20T05:36:28Z"
    generation: 1
    name: image-policy-ruby22
    resourceVersion: "75832"
    selfLink: /apis/operator.openshift.io/v1alpha1/imagecontentsourcepolicies/image-policy-ruby22
    uid: 9820c333-db68-11e9-a378-06d03e4c03ea
  spec:
    repositoryDigestMirrors:
    - mirrors:
      - ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/centos/ruby-22-centos7
      source: docker.io/centos/ruby-22-centos7
- apiVersion: operator.openshift.io/v1alpha1
  kind: ImageContentSourcePolicy
  metadata:
    creationTimestamp: "2019-09-20T08:30:37Z"
    generation: 1
    name: image-policy-wzheng
    resourceVersion: "126387"
    selfLink: /apis/operator.openshift.io/v1alpha1/imagecontentsourcepolicies/image-policy-wzheng
    uid: ec37266e-db80-11e9-9592-02b2fa52eb60
  spec:
    repositoryDigestMirrors:
    - mirrors:
      - ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/nodeshift/centos7-s2i-nodejs
      source: docker.io/nodeshift/centos7-s2i-nodejs
Here is the registries.conf the build creates after accessing the ICSPs. I see the docker.io/wewang58/ruby-22-centos7 entry there, pointing to the same mirror as the ICSP:

apiVersion: v1
data:
  registries.conf: |
    unqualified-search-registries = ["docker.io"]

    [[registry]]
      prefix = ""
      location = "docker.io/centos/ruby-22-centos7"
      mirror-by-digest-only = true

      [[registry.mirror]]
        location = "ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/centos/ruby-22-centos7"

    [[registry]]
      prefix = ""
      location = "docker.io/wewang58/ruby-22-centos7"
      mirror-by-digest-only = true

      [[registry.mirror]]
        location = "ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/wewang58/ruby-22-centos7"

    [[registry]]
      prefix = ""
      location = "quay.io/openshift-release-dev/ocp-v4.0-art-dev"
      mirror-by-digest-only = true

      [[registry.mirror]]
        location = "ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/ocp/release"

    [[registry]]
      prefix = ""
      location = "registry.svc.ci.openshift.org/ocp/release"
      mirror-by-digest-only = true

      [[registry.mirror]]
        location = "ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/ocp/release"
kind: ConfigMap
The local imagestream oc new-build creates by default is odd, and this adds validation to our theory. You'll see the import did not work.

gmontero ~/QE_bzs/disconnected $ oc get is -o yaml
apiVersion: v1
items:
- apiVersion: image.openshift.io/v1
  kind: ImageStream
  metadata:
    annotations:
      openshift.io/generated-by: OpenShiftNewBuild
    creationTimestamp: "2019-09-20T03:05:14Z"
    generation: 1
    labels:
      build: ruby-22-centos7
    name: ruby-22-centos7
    namespace: wewang1
    resourceVersion: "34828"
    selfLink: /apis/image.openshift.io/v1/namespaces/wewang1/imagestreams/ruby-22-centos7
    uid: 7739aee2-db53-11e9-b6f4-0a580a820024
  spec:
    lookupPolicy:
      local: false
  status:
    dockerImageRepository: image-registry.openshift-image-registry.svc:5000/wewang1/ruby-22-centos7
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
Lastly, I edited the buildconfig to remove any From references in the strategy:

  source:
    dockerfile: FROM docker.io/wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa
    type: Dockerfile
  strategy:
    dockerStrategy: {}
    type: Docker
  successfulBuildsHistoryLimit: 5
  triggers:
  - github:
      secret: OdBXFputu9jqPMjIg111
    type: GitHub
  - generic:
      secret: MoYxBFDnGKKz_pXN4gjk
    type: Generic
  - type: ConfigChange

When I ran the build, the Pulling image reference is now correct, in that it has the sha:

Pulling image docker.io/wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa ...

But the pull fails for a different reason than before:

error: build error: failed to pull image: After retrying 2 times, Pull image still failed due to error: while pulling "docker://docker.io/wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa" as "docker.io/wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa": Error initializing source docker://wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa: pinging docker registry returned: Get https://registry-1.docker.io/v2/: Forbidden

I can pull that sha locally:

gmontero ~/QE_bzs/disconnected $ docker pull docker.io/wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa
sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa: Pulling from wewang58/ruby-22-centos7
Digest: sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa
Status: Downloaded newer image for wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa
docker.io/wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa
gmontero ~/QE_bzs/disconnected $

So progress, but something is still amiss. Not sure what it is at first blush.
I've moved this out to 4.3. I have a draft of the "this doesn't work" release note queued up, but will spend some more time understanding

Error initializing source docker://wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa: pinging docker registry returned: Get https://registry-1.docker.io/v2/: Forbidden

before submitting the draft.
Of course if the mirroring is correct, when does wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa get translated to the mirror location?
And is there perhaps a token that is needed to access the mirror ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000? Presumably it is a registry we have to log into to get a token, no? I cannot pull from that registry on my own, as I get a cert error ... though @Wen - going through this admittedly long bugzilla, I do see our exchanges about the mirror's cert being in the config map the build controller mounts into the build pod, but I'm not finding anything about the mirror's auth token.

Perhaps you or Wenjing could provide the kubeadmin password for the AWS.kubeconfig file you provided, and I could see about logging into the console to get a token, try docker login / oc registry login to at least get the token, and create the secret for that. Or, if you know that was not done, perhaps you can do that in the wewang1 project for the BC I have modified.

I'm guessing the answer to my question in #comment 50 is that it happens "under the covers" (i.e. in containers/image), and the Forbidden error stems from the fact that the mapping to the mirror has occurred "under the covers"; if in fact we do not have the token/creds for the mirror registry, that is why it is Forbidden.
(In reply to Gabe Montero from comment #50)
> Of course if the mirroring is correct, when does
> wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa
> get translated to the mirror location?

The c/image library, transparently to CRI-O, c/buildah, and openshift/builder. I’m afraid that does not currently show up in logs, even if accesses to the mirror fail. So, the

> pinging docker registry returned: Get https://registry-1.docker.io/v2/: Forbidden

error means “all accesses to mirrors, if any, failed in some way that is not reported; then the access to the actual docker.io registry failed with a Forbidden error.” (Is the Forbidden from docker.io consistent with the cluster setup?)

We definitely need to improve that, but right now the most practical way to figure out what is going on would, I think, be to do the mapping manually and see what `podman pull` reports (or, similarly, but as it turns out below, not quite equivalently, kick off an OpenShift build that references the mirror directly):

> podman --log-level=debug pull docker://ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa

or so.
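The transparent mirror rewriting described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual c/image code: the function name, the in-memory `registries` structure, and the candidate-list behavior are all invented for the sketch, loosely modeled on how sysregistriesv2 entries map a source location to mirror locations before falling back to the original registry.

```python
# Hypothetical sketch of mirror-candidate resolution (NOT the real c/image code).
def rewrite_to_candidates(image_ref, registries):
    """Return pull candidates: matching mirrors first, original reference last."""
    candidates = []
    for entry in registries:
        location = entry["location"]
        if image_ref.startswith(location):
            suffix = image_ref[len(location):]  # e.g. "@sha256:..." or ":latest"
            # mirror-by-digest-only: only digest references may use the mirror
            if "@sha256:" in image_ref or not entry.get("mirror_by_digest_only"):
                for mirror in entry.get("mirrors", []):
                    candidates.append(mirror + suffix)
    candidates.append(image_ref)  # always fall back to the original registry
    return candidates

registries = [{
    "location": "docker.io/wewang58/ruby-22-centos7",
    "mirror_by_digest_only": True,
    "mirrors": ["ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/wewang58/ruby-22-centos7"],
}]

ref = ("docker.io/wewang58/ruby-22-centos7"
       "@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa")
for candidate in rewrite_to_candidates(ref, registries):
    print(candidate)
```

The key point for debugging: the mirror candidate is attempted (and may fail silently) before the final fallback to docker.io, which is the only error that surfaces.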
Outside of the cluster, with no configuration at all, I get:

> ERRO[0001] error pulling image "docker://ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa": unable to pull docker://ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa: unable to pull image: Error initializing source docker://ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa: pinging docker registry returned: Get https://ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/v2/: x509: certificate signed by unknown authority

and with … pull --tls-verify=false …

> ERRO[0001] error pulling image "docker://ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa": unable to pull docker://ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa: unable to pull image: Error initializing source docker://ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/wewang58/ruby-22-centos7@sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa: Error reading manifest sha256:fda21bc1af022fb34abbecb798a0cb1a37c82ef159b57df3e21688a6adcef9fa in ec2-18-219-50-21.us-east-2.compute.amazonaws.com:5000/wewang58/ruby-22-centos7: unauthorized: authentication required

So, yes, the build must be configured to trust that registry, and authentication is probably required.
If I’m reading https://github.com/openshift/builder/blob/04c78176099139a5d229578a9a98ed2e1d17a19d/pkg/build/builder/daemonless.go#L275 and the surrounding code right, the build pod can actually receive all the necessary secrets (for the upstream repository as well as all mirrors) via $PULL_DOCKERCFG_PATH, but the “pull image” code path is structured in a way that only supports passing along exactly one secret: the one that matches the upstream repository (i.e. none of the mirrors). That is unlike e.g. https://github.com/openshift/builder/blob/04c78176099139a5d229578a9a98ed2e1d17a19d/pkg/build/builder/daemonless.go#L142 , which at least in principle seems to support providing multiple secrets.

I can’t see anything openshift/builder is explicitly doing to manage TLS trusted CAs; https://github.com/openshift/openshift-controller-manager/blob/bf63394ad3ad412202d00792612e9b5fbfd4dd27/pkg/build/controller/strategy/util.go#L506 presumably already works for all kinds of registries and is not directly affected by the builder code, and it should work for the mirrors in the usual way (but it might have to be explicitly configured).
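The single-secret limitation above can be illustrated with a small sketch. This is not openshift/builder code; the function and data are invented, and the credentials are made up. It shows why, if the pull path selects one auth entry keyed by the upstream registry host, the mirror's credentials are never consulted even though the pull is transparently redirected to the mirror:

```python
# Illustrative sketch (hypothetical, not openshift/builder code): a lookup that
# picks exactly one auth entry by upstream host never sees the mirror's entry.
import base64

dockerconfigjson = {
    "auths": {
        "docker.io": {"auth": base64.b64encode(b"upstream-user:upstream-pass").decode()},
        "ec2-18-221-93-104.us-east-2.compute.amazonaws.com:5000": {
            "auth": base64.b64encode(b"dummy:dummy").decode()
        },
    }
}

def pick_single_auth(cfg, registry_host):
    """Mimics a 'one secret, matched to the upstream host' selection."""
    entry = cfg["auths"].get(registry_host)
    if entry is None:
        return None
    user, _, password = base64.b64decode(entry["auth"]).decode().partition(":")
    return (user, password)

# The build pulls FROM docker.io/..., so only docker.io's creds get selected;
# the mirror host's creds exist in the secret but are never passed along.
print(pick_single_auth(dockerconfigjson, "docker.io"))
```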
Gabe, I have sent the necessary authentication to you in the email. And after podman login with the correct username/password, I can podman pull the image:

# podman --log-level=debug pull docker://ec2-18-221-93-104.us-east-2.compute.amazonaws.com:5000/centos/ruby-22-centos7:latest
<snip>
DEBU[0280] set names of image "e42d0dccf073123561d83ea8bbc9f0cc5e491cfd07130a464a416cdb99ced387" to [ec2-18-221-93-104.us-east-2.compute.amazonaws.com:5000/centos/ruby-22-centos7:latest]
DEBU[0280] saved image metadata "{}"
DEBU[0280] parsed reference into "[overlay@/var/lib/containers/storage+/var/run/containers/storage]ec2-18-221-93-104.us-east-2.compute.amazonaws.com:5000/centos/ruby-22-centos7:latest"
e42d0dccf073123561d83ea8bbc9f0cc5e491cfd07130a464a416cdb99ced387
OK, I was able to log onto @Wenjing's cluster (fyi Wenjing, oc complained about the format of the kubeconfig you emailed, but I was able to pull the api server address from it and use the kubeadmin password you provided).

I was able to use the cert, id, and password to pull from the mirror via podman from my system, and then validate the latest flavor of build config QE has, validate that the auth/cert are getting injected into the build pod and the registries.conf file is properly created, and confirm the basic gist of Miloslav's theory in #comment 52. In addition to the links he noted, there is also https://github.com/openshift/builder/blob/04c78176099139a5d229578a9a98ed2e1d17a19d/pkg/build/builder/daemonless.go#L75-L80 for pulling the FROM image, and that is where we hit problems. More changes are in fact needed in openshift/builder to facilitate the disconnected scenario.

Gory details:

1) The latest version of QE's BC does the trick: they massaged the oc new-build generated BC as we discussed earlier. This time they had both a dockerfile FROM and a dockerStrategy from, but they made sure SHA image refs (no imagestreams) are used, and they included the pull secret for the mirrored registry. I also added BUILD_LOGLEVEL=8 to get the confirming debug we've been needing/asking for.

  source:
    dockerfile: FROM docker.io/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae
    type: Dockerfile
  strategy:
    dockerStrategy:
      env:
      - name: BUILD_LOGLEVEL
        value: "8"
      from:
        kind: DockerImage
        name: docker.io/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae
      pullSecret:
        name: pull
    type: Docker

Both the removal of imagestreams and the inclusion of the pull secret were items missing from previous runs.

2) I used oc debug on Wenjing's existing build pods to confirm that:

a) the build secret with CA cert mounting worked.
I confirmed that the disconnected cert she sent via email was in /etc/pki/ca-trust/extracted/tls-ca-bundle.pem

b) the pull secret she specified there was a dockerconfigjson secret that included an entry for her mirror. In particular, /run/secrets/openshift.io/pull/.dockerconfigjson contained:

  "ec2-18-221-93-104.us-east-2.compute.amazonaws.com:5000": {
    "auth": "ZHVtbXk6ZHVtbXk="
  },

Based on how that looks, I'm assuming a token was provided for this secret vs. the username/password Wenjing provided me.

c) we previously had confirmed the registries.conf file in the sysconfig configmap looked good after creating the ICSPs ... it still does with these latest runs:

  registries.conf: |
    unqualified-search-registries = ["docker.io"]

    [[registry]]
      prefix = ""
      location = "docker.io/centos/ruby-22-centos7"
      mirror-by-digest-only = true

      [[registry.mirror]]
        location = "ec2-18-221-93-104.us-east-2.compute.amazonaws.com:5000/centos/ruby-22-centos7"

    [[registry]]
      prefix = ""
      location = "quay.io/openshift-release-dev/ocp-v4.0-art-dev"
      mirror-by-digest-only = true

      [[registry.mirror]]
        location = "ec2-18-221-93-104.us-east-2.compute.amazonaws.com:5000/ocp/release"

    [[registry]]
      prefix = ""
      location = "registry.svc.ci.openshift.org/ocp/release"
      mirror-by-digest-only = true

      [[registry.mirror]]
        location = "ec2-18-221-93-104.us-east-2.compute.amazonaws.com:5000/ocp/release"

3) On re-running the build with loglevel 8, you can see Miloslav's theory in action: only Wenjing's creds for docker.io are being passed down. I'll attach the entire log file separately, but here is the key snippet (by the way, running at loglevel 6 or greater triggered debug level logging in c/image, CRI-O, buildah, etc.):

Asked to pull fresh copy of "docker.io/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae".
I0923 18:16:37.240758 1 daemonless.go:544] Setting authentication for registry "docker.io" for "docker.io/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae".
time="2019-09-23T18:16:37Z" level=debug msg="parsed reference into \"[overlay@/var/lib/containers/storage+/var/run/containers/storage:overlay.imagestore=/var/lib/shared]docker.io/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae\""
time="2019-09-23T18:16:37Z" level=debug msg="parsed reference into \"[overlay@/var/lib/containers/storage+/var/run/containers/storage:overlay.imagestore=/var/lib/shared]docker.io/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae\""
time="2019-09-23T18:16:37Z" level=debug msg="reference \"[overlay@/var/lib/containers/storage+/var/run/containers/storage:overlay.imagestore=/var/lib/shared]docker.io/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae\" does not resolve to an image ID"
time="2019-09-23T18:16:37Z" level=debug msg="registry \"docker.io\" is not listed in registries configuration \"/var/run/configs/openshift.io/build-system/registries.conf\", assuming it's not blocked"
time="2019-09-23T18:16:37Z" level=debug msg="parsed reference into \"[overlay@/var/lib/containers/storage+/var/run/containers/storage:overlay.imagestore=/var/lib/shared]docker.io/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae\""
time="2019-09-23T18:16:37Z" level=debug msg="parsed reference into \"[overlay@/var/lib/containers/storage+/var/run/containers/storage:overlay.imagestore=/var/lib/shared]docker.io/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae\""
time="2019-09-23T18:16:37Z" level=debug msg="copying \"docker://centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae\" to \"docker.io/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae\""
time="2019-09-23T18:16:37Z" level=debug msg="starting to write to image \"containers-storage:[overlay@/var/lib/containers/storage+/var/run/containers/storage:overlay.imagestore=/var/lib/shared]docker.io/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae\" using blob cache in \"/var/cache/blobs\""
time="2019-09-23T18:16:37Z" level=debug msg="reference rewritten from 'docker.io/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae' to 'ec2-18-221-93-104.us-east-2.compute.amazonaws.com:5000/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae'"
time="2019-09-23T18:16:37Z" level=debug msg="reference rewritten from 'docker.io/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae' to 'docker.io/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae'"
time="2019-09-23T18:16:37Z" level=debug msg="Trying to pull \"ec2-18-221-93-104.us-east-2.compute.amazonaws.com:5000/centos/ruby-22-centos7@sha256:a18c8706118a5c4c9f1adf045024d2abf06ba632b5674b23421019ee4d3edcae\""
time="2019-09-23T18:16:37Z" level=debug msg="Credentials not found"

I'll start working on a PR.
Created attachment 1618338 [details]
build log at loglevel 8
FYI ... things may also be further complicated by what the old fsouza client provides wrt auth config:

https://github.com/openshift/builder/blob/master/vendor/github.com/fsouza/go-dockerclient/auth.go#L25-L30

It seems to be only username/password. It doesn't seem like it handles token-based pull secrets like the one Wenjing provided:

  "ec2-18-221-93-104.us-east-2.compute.amazonaws.com:5000": {
    "auth": "ZHVtbXk6ZHVtbXk="
  },
OK, according to https://stackoverflow.com/questions/43441454/docker-login-auth-token, "auth" is the base64-encoded username:password, and I now see that when I base64 -d the value "ZHVtbXk6ZHVtbXk=".

That said, I do not believe the json serialization with the current fsouza struct will work.
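A quick sketch of that decoding, using the "ZHVtbXk6ZHVtbXk=" value from the secret above (which is just a placeholder dummy:dummy credential):

```python
import base64

# The "auth" field of a config.json entry is base64("username:password").
auth = "ZHVtbXk6ZHVtbXk="
decoded = base64.b64decode(auth).decode()   # → "dummy:dummy"
username, _, password = decoded.partition(":")
print(username, password)

# Round-trip: re-encoding the pair reproduces the stored value.
assert base64.b64encode(f"{username}:{password}".encode()).decode() == auth
```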
What is the precise type (`SecretType…`) of the secret? (Or what kind of object is it at what point in the API?)

The above looks like a reasonably accurate _partial fragment_ of https://github.com/projectatomic/docker/blob/f9f056ec099cc3849d15e36f08fec50130c20073/cliconfig/configfile/file.go#L24 , with the base64 encoding that is used in ~/.docker/config.json-formatted files. (I can’t see it in the log file, so I can’t tell whether that is what it is supposed to look like, or whether it is malformed.)

(Also, note that fsouza/go-dockerclient is really only relevant for Docker daemon connections; it’s not used at all for c/image and c/buildah. If you see that as the place where data is lost, either the code is reusing fsouza/go-dockerclient for not-strictly-related purposes, or something is very misconfigured to use Docker instead of CRI-O/buildah.)
My concern about auth getting lost in populating the fsouza struct is a red herring with respect to the situation here. To clarify, the fsouza structs are just an encapsulating mechanism for propagating the data through the code until the calls that set up containers/image. I was able to confirm via unit tests that the keyring code manages to take the data from "auth" and populate username and password, i.e. per https://stackoverflow.com/questions/43441454/docker-login-auth-token
Just did some bookkeeping ... I had to craft https://github.com/containers/image/pull/722 so that c/image could properly handle openshift build pull secrets (where they leveraged the legacy format via .dockercfg files and the like). I linked it to this bug directly, since I assumed the openshift bugzilla bot would not work for https://github.com/containers

https://github.com/openshift/builder/pull/102 is showing green with the e2e's, and we are in the process of final review.
OK, the last of the changes needed to allow authentication against a mirrored registry when performing builds has merged.

As this moves back to QE, a quick recap of the many things to reconcile when trying this again:

1) The input image reference in your builds must be by sha/digest, and that image needs to be mirrored.
2) The output from the mirror command should give you the info needed to create an ImageContentSourcePolicy; that is needed so the build controller can construct a proper registries.conf for containers/image and buildah.
3) Any certs needed to communicate with the mirrored registry need to be added either via a) the new global CA support introduced in 4.2, or b) an explicit CA secret supplied on the build / build config.
4) Any authentication needed to communicate with the mirrored registry needs to be added as a pull secret on the build / build config.

If you run into any problems, gather the logs from running the build at loglevel 8 (i.e. set the BUILD_LOGLEVEL env var on the build config). With that, we should get both openshift build and containers/image/buildah debug info to see where things are breaking down.
Created attachment 1640587 [details] Build log with log level 8
According to your repro steps @Wenjing, you only linked your secret to the default SA: $oc secrets link default pullsecret --for=pull You need to link it to the builder SA in order for it to get picked up, i.e. $oc secrets link builder pullsecret --for=pull Please retry with the secret linked to the builder SA as well, and we'll go from there.
Actually, I had tried linking my secret to the builder SA, and it still failed back then. But I have figured out why it fails now. If I create the secret with the command below, it fails:

$oc create secret docker-registry pullsecret \
  --docker-server=upshift.mirror-registry.qe.devcluster.openshift.com:5000 \
  --docker-username=xxxx \
  --docker-password=xxxx \
  --docker-email=wzheng

If I create the secret with the commands below, it succeeds:

$docker login upshift.mirror-registry.qe.devcluster.openshift.com:5000 -u xxxx -p xxxx
$oc create secret generic pull --from-file=.dockerconfigjson=/home/wzheng/.docker/config.json --type=kubernetes.io/dockerconfigjson

Anyway, this is not related to the current bug, so I will move this bug to verified on 4.3.0-0.nightly-2019-11-29-051144.
I believe that "oc create secret docker-registry pullsecret" creates a ".dockercfg" secret, which is a different format from a "dockerconfigjson" secret (dockerconfigjson is the newer format). I would have expected both secrets to work, however. It might be worth investigating a little further whether we lost the ability for builds to use ".dockercfg" secrets at some point, or whether they in particular are not working with the mirroring logic.
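For reference, the structural difference between the two formats is small (dummy registry and credentials below, for illustration only): the legacy .dockercfg keys registries at the top level, while .dockerconfigjson nests the same map under an "auths" key.

```shell
# Dummy illustration of the two secret data formats (made-up registry/creds).
# Legacy .dockercfg: registry hostnames are top-level keys.
dockercfg='{"registry.example.com:5000":{"username":"user","auth":"dXNlcjpwYXNz"}}'
# Newer .dockerconfigjson: the same map, wrapped under "auths".
dockerconfigjson='{"auths":{"registry.example.com:5000":{"username":"user","auth":"dXNlcjpwYXNz"}}}'
# The newer format is literally the old map wrapped in {"auths": ...}:
wrapped="{\"auths\":${dockercfg}}"
[ "$wrapped" = "$dockerconfigjson" ] && echo "same entries, different nesting"
```

Code that only understands one of the two nestings would explain exactly the asymmetry seen in comment above.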
Actually Ben, I just tried it and it creates a dockerconfigjson secret:

gmontero ~ $ oc create secret docker-registry pullsecret --docker-server=upshift.mirror-registry.qe.devcluster.openshift.com:5000 --docker-username=xxxx --docker-password=xxxx --docker-email=wzheng
secret/pullsecret created
gmontero ~ $ oc get secret pullsecret -o yaml
apiVersion: v1
data:
  .dockerconfigjson: eyJhdXRocyI6eyJ1cHNoaWZ0Lm1pcnJvci1yZWdpc3RyeS5xZS5kZXZjbHVzdGVyLm9wZW5zaGlmdC5jb206NTAwMCI6eyJ1c2VybmFtZSI6Inh4eHgiLCJwYXNzd29yZCI6Inh4eHgiLCJlbWFpbCI6Ind6aGVuZ0ByZWRoYXQuY29tIiwiYXV0aCI6ImVIaDRlRHA0ZUhoNCJ9fX0=
kind: Secret
metadata:
  creationTimestamp: "2019-12-02T19:09:27Z"
  name: pullsecret
  namespace: ggmtest
  resourceVersion: "39465"
  selfLink: /api/v1/namespaces/ggmtest/secrets/pullsecret
  uid: f2d40f7d-cae9-4de8-8a7b-6e8ffd329cc6
type: kubernetes.io/dockerconfigjson
gmontero ~ $

That said, I know the .dockercfg format works, because I had to submit a fix to containers/image in order to get the image registry pull secret to work for builds, since that still uses the .dockercfg format.

Using the second form, oc create secret generic vs. oc create secret docker-registry, creates a very similar secret, except the value for the ".dockerconfigjson" key is different, since you are pointing it at your entire config.json file. That would imply to me that containers/image does not like the format of the data stored in the ".dockerconfigjson" entry when one uses "oc create secret docker-registry ...", though in looking at the params for that command, perhaps the option

  --generator='secret-for-docker-registry/v1': The name of the API generator to use.

is needed to create the secret in a format containers/image wants.

At the end of the day, with whatever new bug is opened, who do we assign the debug/diagnosis to: the containers team or us?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062