Version:
ACM: v2.4.2-RC5
OCP: 4.10.9-x86_64

Steps to reproduce:
Attempted to deploy SNO/spoke.
Used this ICSP:
```
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: advanced-cluster-manager-icsp
spec:
  repositoryDigestMirrors:
  - mirrors:
    - brew.registry.redhat.io/rhacm2
    source: registry.redhat.io/rhacm2
  - mirrors:
    - brew.registry.redhat.io/openshift4/ose-oauth-proxy
    source: registry.access.redhat.com/openshfit4/ose-oauth-proxy
```

Result:
The agent wouldn't start. It reported that it was unable to pull the image (Source image rejected: A signature was required, but no signature exists).

The file /etc/containers/policy.json on the agent had:
{
    "default": [
        {
            "type": "insecureAcceptAnything"
        }
    ],
    "transports": {
        "docker": {
            "registry.access.redhat.com": [
                {
                    "type": "signedBy",
                    "keyType": "GPGKeys",
                    "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
                }
            ],
            "registry.redhat.io": [
                {
                    "type": "signedBy",
                    "keyType": "GPGKeys",
                    "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
                }
            ]
        },
        "docker-daemon": {
            "": [
                {
                    "type": "insecureAcceptAnything"
                }
            ]
        }
    }
}

After I updated the file to have only the following:
{
    "default": [
        {
            "type": "insecureAcceptAnything"
        }
    ],
    "transports": {
        "docker-daemon": {
            "": [
                {
                    "type": "insecureAcceptAnything"
                }
            ]
        }
    }
}

The image was successfully pulled and the agent started.
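For anyone triaging a similar failure, a quick way to see which registries a policy.json demands signatures for is sketched below. This is only an illustration: the helper name `registries_requiring_signatures` is made up for this example, and the embedded policy is the one quoted in the report (on a real host you would load /etc/containers/policy.json instead).

```python
import json

# The policy quoted in the report, embedded here so the sketch is self-contained.
POLICY_JSON = """
{
  "default": [{"type": "insecureAcceptAnything"}],
  "transports": {
    "docker": {
      "registry.access.redhat.com": [
        {"type": "signedBy", "keyType": "GPGKeys",
         "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"}
      ],
      "registry.redhat.io": [
        {"type": "signedBy", "keyType": "GPGKeys",
         "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"}
      ]
    },
    "docker-daemon": {"": [{"type": "insecureAcceptAnything"}]}
  }
}
"""


def registries_requiring_signatures(policy: dict) -> dict:
    """Hypothetical helper: map each transport to the scopes that have
    at least one requirement other than insecureAcceptAnything."""
    strict = {}
    for transport, scopes in policy.get("transports", {}).items():
        for scope, requirements in scopes.items():
            if any(r.get("type") != "insecureAcceptAnything" for r in requirements):
                strict.setdefault(transport, []).append(scope)
    return strict


policy = json.loads(POLICY_JSON)
strict = registries_requiring_signatures(policy)
for transport, scopes in strict.items():
    for scope in scopes:
        print(f"{transport}: {scope} requires a signature")
```

With the policy from the report this lists registry.access.redhat.com and registry.redhat.io under the docker transport, which matches the registry the agent image is pulled from.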
[core@api ~]$ sudo journalctl -u agent.service -l
-- Logs begin at Thu 2022-04-07 22:34:26 UTC, end at Thu 2022-04-07 22:37:00 UTC. --
Apr 07 22:36:18 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Starting agent.service...
Apr 07 22:36:18 api.qe1.kni.lab.eng.bos.redhat.com agent-fix-bz1964591[3136]: Error: registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b>
Apr 07 22:36:18 api.qe1.kni.lab.eng.bos.redhat.com podman[3249]: Trying to pull registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b6d6...
Apr 07 22:36:19 api.qe1.kni.lab.eng.bos.redhat.com podman[3249]: Error: Source image rejected: A signature was required, but no signature exists
Apr 07 22:36:19 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Control process exited, code=exited status=125
Apr 07 22:36:19 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Failed with result 'exit-code'.
Apr 07 22:36:19 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Failed to start agent.service.
Apr 07 22:36:23 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Service RestartSec=3s expired, scheduling restart.
Apr 07 22:36:23 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Scheduled restart job, restart counter is at 1.
Apr 07 22:36:23 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Stopped agent.service.
Apr 07 22:36:23 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Starting agent.service...
Apr 07 22:36:23 api.qe1.kni.lab.eng.bos.redhat.com agent-fix-bz1964591[3313]: Error: registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b>
Apr 07 22:36:23 api.qe1.kni.lab.eng.bos.redhat.com podman[3427]: Trying to pull registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b6d6...
Apr 07 22:36:24 api.qe1.kni.lab.eng.bos.redhat.com podman[3427]: Error: Source image rejected: A signature was required, but no signature exists
Apr 07 22:36:24 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Control process exited, code=exited status=125
Apr 07 22:36:24 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Failed with result 'exit-code'.
Apr 07 22:36:24 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Failed to start agent.service.
Apr 07 22:36:27 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Service RestartSec=3s expired, scheduling restart.
Apr 07 22:36:27 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Scheduled restart job, restart counter is at 2.
Apr 07 22:36:27 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Stopped agent.service.
Apr 07 22:36:27 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Starting agent.service...
Apr 07 22:36:27 api.qe1.kni.lab.eng.bos.redhat.com agent-fix-bz1964591[3486]: Error: registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b>
Apr 07 22:36:27 api.qe1.kni.lab.eng.bos.redhat.com podman[3595]: Trying to pull registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b6d6...
Apr 07 22:36:28 api.qe1.kni.lab.eng.bos.redhat.com podman[3595]: Error: Source image rejected: A signature was required, but no signature exists
Apr 07 22:36:28 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Control process exited, code=exited status=125
Apr 07 22:36:28 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Failed with result 'exit-code'.
Apr 07 22:36:28 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Failed to start agent.service.
Apr 07 22:36:31 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Service RestartSec=3s expired, scheduling restart.
Apr 07 22:36:31 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Scheduled restart job, restart counter is at 3.
Apr 07 22:36:31 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Stopped agent.service.
Apr 07 22:36:31 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Starting agent.service...
Apr 07 22:36:31 api.qe1.kni.lab.eng.bos.redhat.com agent-fix-bz1964591[3656]: Error: registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b>
Apr 07 22:36:32 api.qe1.kni.lab.eng.bos.redhat.com podman[3770]: Trying to pull registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b6d6...
Apr 07 22:36:32 api.qe1.kni.lab.eng.bos.redhat.com podman[3770]: Error: Source image rejected: A signature was required, but no signature exists
Apr 07 22:36:32 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Control process exited, code=exited status=125
Apr 07 22:36:32 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Failed with result 'exit-code'.
Apr 07 22:36:32 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Failed to start agent.service.
Apr 07 22:36:36 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Service RestartSec=3s expired, scheduling restart.
Apr 07 22:36:36 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Scheduled restart job, restart counter is at 4.
Apr 07 22:36:36 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Stopped agent.service.
Apr 07 22:36:36 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Starting agent.service...
Apr 07 22:36:36 api.qe1.kni.lab.eng.bos.redhat.com agent-fix-bz1964591[3830]: Error: registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b>
Apr 07 22:36:36 api.qe1.kni.lab.eng.bos.redhat.com podman[3945]: Trying to pull registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b6d6...
Apr 07 22:36:37 api.qe1.kni.lab.eng.bos.redhat.com podman[3945]: Error: Source image rejected: A signature was required, but no signature exists
Apr 07 22:36:37 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Control process exited, code=exited status=125
Apr 07 22:36:37 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Failed with result 'exit-code'.
Apr 07 22:36:37 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Failed to start agent.service.
Apr 07 22:36:40 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Service RestartSec=3s expired, scheduling restart.
Apr 07 22:36:40 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Scheduled restart job, restart counter is at 5.
Apr 07 22:36:40 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Stopped agent.service.
Apr 07 22:36:40 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Starting agent.service..
Where does this policy.json come from?
I am seeing this issue with 4.11 as well:

- cpuArchitecture: x86_64
  openshiftVersion: "4.11"
  rootFSUrl: http://registry.kni-qe-0.lab.eng.rdu2.redhat.com:8080/images/pub/openshift-v4/dependencies/rhcos/pre-release/latest-4.11/rhcos-live-rootfs.x86_64.img
  url: http://registry.kni-qe-0.lab.eng.rdu2.redhat.com:8080/images/pub/openshift-v4/dependencies/rhcos/pre-release/latest-4.11/rhcos-live.x86_64.iso
  version: 4.11.0-0.nightly-2022-05-10-045003
(In reply to Marius Cornea from comment #4)
> I am seeing this issue with 4.11 as well:
>
> - cpuArchitecture: x86_64
>   openshiftVersion: "4.11"
>   rootFSUrl: http://registry.kni-qe-0.lab.eng.rdu2.redhat.com:8080/images/pub/openshift-v4/dependencies/rhcos/pre-release/latest-4.11/rhcos-live-rootfs.x86_64.img
>   url: http://registry.kni-qe-0.lab.eng.rdu2.redhat.com:8080/images/pub/openshift-v4/dependencies/rhcos/pre-release/latest-4.11/rhcos-live.x86_64.iso
>   version: 4.11.0-0.nightly-2022-05-10-045003

The images were mirrored from https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/latest-4.11/
In RHEL (including RHEL CoreOS, which is what the live ISO is), the contents of `/etc/containers/policy.json` come from the podman stack, specifically `containers-common`.

However - a huge caveat here is that in OCP the MCO (specifically the container runtime config controller) actually overwrites all those defaults, see https://github.com/openshift/machine-config-operator/blob/master/templates/master/01-master-container-runtime/_base/files/policy.yaml

I suspect what happened here is the podman stack made a change to enable this, but we didn't update the MCO defaults to match the new package changes.

The Live ISO does *not* include the MCO override templates (IOW it's more RHEL, not OCP), which is why you're only seeing this there.

If you want to work around this today, you can include an Ignition config which writes `/etc/containers/policy.json` into the Live environment.

I think though this is a conflict between mirroring and signing - the ideal case is that mirroring preserves signatures, but I don't think that happens in all cases today.

In the short term, our instructions for mirroring probably need to be updated to mention disabling signature verification - that's useful for anyone operating outside of "OCP-RHEL-CoreOS". But we should also rationalize this change to the podman defaults and consider doing it for all OCP nodes by default.

This relates to https://github.com/openshift/machine-config-operator/issues/1349
AFAICS, this change to enable signature verification only appeared in RHEL 8.6. But I cannot make heads or tails of the git history of containers-common here.
For a signature-requiring signature policy to work with mirrors, the signatures have to be mirrored as well. That is, sadly, not automatic and effortless, and requires:

- Using a registry that natively supports signatures (the OpenShift integrated registry is AFAIK the only one), or setting up a sigstore-staging mechanism + a sigstore web server and configuring the nodes to use it in /etc/containers/registries.d/*.yaml . Compare e.g. https://github.com/containers/podman/blob/056f492f59c333d521ebbbe186abde0278e815db/docs/tutorials/image_signing.md (ignore the part about creating new signatures, focus on sigstore-staging with file:// and sigstore with http:// ).
- Using a mirroring mechanism that mirrors per-image signatures, e.g. (skopeo sync). I don't think (oc adm mirror) does this; IIRC that only mirrors the OCP-specific signature for the OpenShift release image, using an OpenShift-specific config map, nothing else.

Note that in a configuration that enforces the original Red Hat signatures in combination with ICSP, policy.json refers to the _original_ names (e.g. redhat.io/…), while registries.d/*.yaml describes sigstores attached to the _mirror_ registry (brew.registry.redhat.io/…).
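To make the original-name versus mirror-name distinction concrete, here is a rough sketch of the lookup described above. It is not how the containers libraries implement this; `ICSP_MIRRORS`, `resolve_mirror` and `registry_of` are names invented for the illustration, and the mapping only copies the first entry of the ICSP from the report.

```python
# Simplified source -> mirror mapping, taken from the ICSP in the report.
ICSP_MIRRORS = {
    "registry.redhat.io/rhacm2": "brew.registry.redhat.io/rhacm2",
}


def registry_of(image_ref: str) -> str:
    """Return the registry host part of an image reference."""
    return image_ref.split("/", 1)[0]


def resolve_mirror(image_ref: str, mirrors: dict) -> str:
    """Hypothetical helper: rewrite a reference to its mirror location
    by replacing the longest matching source prefix."""
    for source in sorted(mirrors, key=len, reverse=True):
        if image_ref == source or image_ref.startswith(
            (source + "/", source + "@", source + ":")
        ):
            return mirrors[source] + image_ref[len(source):]
    return image_ref


original = (
    "registry.redhat.io/rhacm2/assisted-installer-agent-rhel8"
    "@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b6d6"
)
pulled_from = resolve_mirror(original, ICSP_MIRRORS)

# Per the comment above: the policy.json scope is chosen by the ORIGINAL name,
# while the registries.d sigstore entry is looked up for the MIRROR registry.
policy_scope = registry_of(original)
sigstore_registry = registry_of(pulled_from)

print("image is pulled from:      ", pulled_from)
print("policy.json scope matched: ", policy_scope)
print("registries.d entry needed: ", sigstore_registry)
```

In this sketch the signedBy requirement for registry.redhat.io still applies even though the bytes come from brew.registry.redhat.io, so a signature would have to be reachable through a sigstore configured for the mirror.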
I had a quick chat with Tom Sweeney and am moving this to Containers
I have encountered this issue in an IPv4 connected environment as well; attaching the must-gather file.
It seems this issue happens in 4.10 too.
Another flow is impacted: Late binding spoke does not reference a clusterImageSet so it requests the latest image in AgentServiceConfig osImages list to initially boot from. If 4.11 is in the osImages list and is the latest, the late binding spoke will try to boot from it and hit the same issue described in this bz, even if the target OCP is not 4.11 (ie 4.10).
Workaround for day 1 cluster deploy (late binding not tested yet) is to apply an ignition override via the InfraEnv to set policy.json to insecure.

Include the ignitionConfigOverride listed below in the InfraEnv:

apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
  name: infraenvname
  namespace: infraenvnamespace
spec:
  clusterRef:
  [...]
  ignitionConfigOverride: '{"ignition": {"version": "3.1.0"}, "storage": {"files": [{"overwrite": true, "path": "/etc/containers/policy.json", "contents": {"source":"data:text/plain;base64,ewogICAgImRlZmF1bHQiOiBbCiAgICAgICAgewogICAgICAgICAgICAidHlwZSI6ICJpbnNlY3VyZUFjY2VwdEFueXRoaW5nIgogICAgICAgIH0KICAgIF0sCiAgICAidHJhbnNwb3J0cyI6CiAgICAgICAgewogICAgICAgICAgICAiZG9ja2VyLWRhZW1vbiI6CiAgICAgICAgICAgICAgICB7CiAgICAgICAgICAgICAgICAgICAgIiI6IFt7InR5cGUiOiJpbnNlY3VyZUFjY2VwdEFueXRoaW5nIn1dCiAgICAgICAgICAgICAgICB9CiAgICAgICAgfQp9Cgo="}}]}}'

## which sets the following on the node(s):
cat /etc/containers/policy.json
{
    "default": [
        {
            "type": "insecureAcceptAnything"
        }
    ],
    "transports":
        {
            "docker-daemon":
                {
                    "": [{"type":"insecureAcceptAnything"}]
                }
        }
}

## Tested with this image on the spoke:
4.11.0 :4.11.0-0.nightly-2022-05-20-213928
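If you need to regenerate the override string (for example to carry a different policy), a minimal sketch is below. The function name `build_ignition_override` is made up for this example; the Ignition version, file path and permissive policy are the ones from the workaround above. The base64 payload will not be byte-identical to the one pasted above because the JSON whitespace differs, but it decodes to the same policy.

```python
import base64
import json

# The permissive policy from the workaround above.
permissive_policy = {
    "default": [{"type": "insecureAcceptAnything"}],
    "transports": {
        "docker-daemon": {"": [{"type": "insecureAcceptAnything"}]},
    },
}


def build_ignition_override(policy: dict,
                            path: str = "/etc/containers/policy.json") -> str:
    """Hypothetical helper: return an Ignition config (as a JSON string)
    that overwrites `path` with the given policy."""
    encoded = base64.b64encode(json.dumps(policy, indent=4).encode("utf-8"))
    ignition = {
        "ignition": {"version": "3.1.0"},
        "storage": {
            "files": [
                {
                    "overwrite": True,
                    "path": path,
                    "contents": {
                        "source": "data:text/plain;base64," + encoded.decode("ascii")
                    },
                }
            ]
        },
    }
    return json.dumps(ignition)


override = build_ignition_override(permissive_policy)
# Paste the printed value, wrapped in single quotes, into
# spec.ignitionConfigOverride of the InfraEnv.
print(override)
```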
Hi Jindrich - I was wondering if there is a plan for this, as I saw early on that Colin had listed some options. Not a blocker right now as there is a workaround, just checking.
I meant it was not a "test blocker" by my earlier statement. I would definitely consider this a blocker for release as it impacts existing users heavily. Needs to keep blocker flag.
Hi, I tested with the release image (4.11.0-0.nightly-2022-06-28-160049) but the issue still persists for me. I noted some different values pasted above, however. Just to clear up any confusion, I will explain the process for what we are doing. We are booting the machine using a rootfs + live ISO (like the ones that I listed above). This is where we are running into issues with the default values in policy.json that are causing the failures that Sasha initially posted. The verification above seems to be post-installation with the release image, but our issue is pre-installation. I'm not really sure where containers-common comes from in this case, but it appears to be different.
Verified that the fix was included in the 4.11 rc.1 RHCOS images: https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/latest-4.11/rhcos-4.11.0-rc.1-x86_64-live.x86_64.iso https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/latest-4.11/rhcos-4.11.0-rc.1-x86_64-live-rootfs.x86_64.img
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069