Bug 2073197 - Error in Spoke/SNO agent: Source image rejected: A signature was required, but no signature exists
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Containers
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 4.11.0
Assignee: Jindrich Novy
QA Contact: pmali
URL:
Whiteboard:
Depends On:
Blocks: ztpfw
Reported: 2022-04-07 21:06 UTC by Alexander Chuzhoy
Modified: 2023-07-11 00:34 UTC (History)

Fixed In Version: containers-common-1-21.rhaos4.11.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 11:05:18 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github stolostron backlog issues 21558 0 None None None 2022-04-08 00:37:08 UTC
Red Hat Product Errata RHSA-2022:5069 0 None None None 2022-08-10 11:05:41 UTC

Description Alexander Chuzhoy 2022-04-07 21:06:37 UTC
Version:
ACM: v2.4.2-RC5
OCP: 4.10.9-x86_64


Steps to reproduce:

Attempted to deploy SNO/spoke


Used this icsp:
```
    apiVersion: operator.openshift.io/v1alpha1
    kind: ImageContentSourcePolicy
    metadata:
      name: advanced-cluster-manager-icsp
    spec:
      repositoryDigestMirrors:
      - mirrors:
        - brew.registry.redhat.io/rhacm2
        source: registry.redhat.io/rhacm2
      - mirrors:
        - brew.registry.redhat.io/openshift4/ose-oauth-proxy
        source: registry.access.redhat.com/openshfit4/ose-oauth-proxy
```

Result:
The agent did not start. It reported that it was unable to pull the image (Source image rejected: A signature was required, but no signature exists).



The file /etc/containers/policy.json on the agent had:

{
    "default": [
        {
            "type": "insecureAcceptAnything"
        }
    ],
    "transports": {
        "docker": {
            "registry.access.redhat.com": [
                {
                    "type": "signedBy",
                    "keyType": "GPGKeys",
                    "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
                }
            ],
            "registry.redhat.io": [
                {
                    "type": "signedBy",
                    "keyType": "GPGKeys",
                    "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
                }
            ]
        },
        "docker-daemon": {
            "": [
                {
                    "type": "insecureAcceptAnything"
                }
            ]
        }
    }
}





After I updated the file to have only the following:
{
    "default": [
        {
            "type": "insecureAcceptAnything"
        }
    ],
    "transports": {
        "docker-daemon": {
            "": [
                {
                    "type": "insecureAcceptAnything"
                }
            ]
        }
    }
}


The image was successfully pulled and the agent started.
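
To illustrate why the first policy rejects the pull and the second does not, here is a minimal Python sketch of how a policy.json scope is selected for a docker reference. It is a simplification written for this report, not the containers/image implementation: it ignores wildcard scopes, the function names are made up for illustration, and the digest is left elided as in the logs.

```python
# Simplified sketch of policy.json scope lookup (hypothetical helper names).
# Assumption: scopes are tried from most specific to least specific, then the
# transport-wide "" scope, then the global "default". Wildcards are ignored.

SIGNED_BY = [{
    "type": "signedBy",
    "keyType": "GPGKeys",
    "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release",
}]
ACCEPT = [{"type": "insecureAcceptAnything"}]

# Policy found on the agent, as pasted in the description.
REPORTED_POLICY = {
    "default": ACCEPT,
    "transports": {
        "docker": {
            "registry.access.redhat.com": SIGNED_BY,
            "registry.redhat.io": SIGNED_BY,
        },
        "docker-daemon": {"": ACCEPT},
    },
}

# Policy after the manual edit.
WORKAROUND_POLICY = {
    "default": ACCEPT,
    "transports": {"docker-daemon": {"": ACCEPT}},
}


def docker_scopes(ref):
    """Yield candidate scopes for a reference, most specific first."""
    yield ref
    repo = ref.split("@", 1)[0]            # drop a digest, if any
    head, sep, tail = repo.rpartition("/")
    if sep and ":" in tail:                # drop a tag, if any
        tail = tail.split(":", 1)[0]
    repo = head + sep + tail
    if repo != ref:
        yield repo
    while "/" in repo:                     # namespaces, then the hostname
        repo = repo.rsplit("/", 1)[0]
        yield repo
    yield ""                               # transport-wide default scope


def requirements_for(policy, transport, ref):
    """Return (matched scope, requirements) for a reference."""
    scopes = policy.get("transports", {}).get(transport, {})
    for scope in docker_scopes(ref):
        if scope in scopes:
            return scope, scopes[scope]
    return "default", policy["default"]


if __name__ == "__main__":
    # Digest elided here, as it is truncated in the journal output.
    ref = "registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:<digest>"
    for name, policy in (("reported", REPORTED_POLICY),
                         ("workaround", WORKAROUND_POLICY)):
        scope, reqs = requirements_for(policy, "docker", ref)
        print(name, repr(scope), reqs[0]["type"])
```

Under these assumptions the reported policy matches the `registry.redhat.io` scope and demands `signedBy`, while the edited policy falls through to the global default and accepts the image.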

Comment 1 Alexander Chuzhoy 2022-04-07 22:40:09 UTC
[core@api ~]$ sudo journalctl -u agent.service -l
-- Logs begin at Thu 2022-04-07 22:34:26 UTC, end at Thu 2022-04-07 22:37:00 UTC. --
Apr 07 22:36:18 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Starting agent.service...
Apr 07 22:36:18 api.qe1.kni.lab.eng.bos.redhat.com agent-fix-bz1964591[3136]: Error: registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b>
Apr 07 22:36:18 api.qe1.kni.lab.eng.bos.redhat.com podman[3249]: Trying to pull registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b6d6...
Apr 07 22:36:19 api.qe1.kni.lab.eng.bos.redhat.com podman[3249]: Error: Source image rejected: A signature was required, but no signature exists
Apr 07 22:36:19 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Control process exited, code=exited status=125
Apr 07 22:36:19 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Failed with result 'exit-code'.
Apr 07 22:36:19 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Failed to start agent.service.
Apr 07 22:36:23 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Service RestartSec=3s expired, scheduling restart.
Apr 07 22:36:23 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Scheduled restart job, restart counter is at 1.
Apr 07 22:36:23 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Stopped agent.service.
Apr 07 22:36:23 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Starting agent.service...
Apr 07 22:36:23 api.qe1.kni.lab.eng.bos.redhat.com agent-fix-bz1964591[3313]: Error: registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b>
Apr 07 22:36:23 api.qe1.kni.lab.eng.bos.redhat.com podman[3427]: Trying to pull registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b6d6...
Apr 07 22:36:24 api.qe1.kni.lab.eng.bos.redhat.com podman[3427]: Error: Source image rejected: A signature was required, but no signature exists
Apr 07 22:36:24 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Control process exited, code=exited status=125
Apr 07 22:36:24 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Failed with result 'exit-code'.
Apr 07 22:36:24 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Failed to start agent.service.
Apr 07 22:36:27 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Service RestartSec=3s expired, scheduling restart.
Apr 07 22:36:27 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Scheduled restart job, restart counter is at 2.
Apr 07 22:36:27 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Stopped agent.service.
Apr 07 22:36:27 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Starting agent.service...
Apr 07 22:36:27 api.qe1.kni.lab.eng.bos.redhat.com agent-fix-bz1964591[3486]: Error: registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b>
Apr 07 22:36:27 api.qe1.kni.lab.eng.bos.redhat.com podman[3595]: Trying to pull registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b6d6...
Apr 07 22:36:28 api.qe1.kni.lab.eng.bos.redhat.com podman[3595]: Error: Source image rejected: A signature was required, but no signature exists
Apr 07 22:36:28 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Control process exited, code=exited status=125
Apr 07 22:36:28 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Failed with result 'exit-code'.
Apr 07 22:36:28 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Failed to start agent.service.
Apr 07 22:36:31 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Service RestartSec=3s expired, scheduling restart.
Apr 07 22:36:31 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Scheduled restart job, restart counter is at 3.
Apr 07 22:36:31 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Stopped agent.service.
Apr 07 22:36:31 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Starting agent.service...
Apr 07 22:36:31 api.qe1.kni.lab.eng.bos.redhat.com agent-fix-bz1964591[3656]: Error: registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b>
Apr 07 22:36:32 api.qe1.kni.lab.eng.bos.redhat.com podman[3770]: Trying to pull registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b6d6...
Apr 07 22:36:32 api.qe1.kni.lab.eng.bos.redhat.com podman[3770]: Error: Source image rejected: A signature was required, but no signature exists
Apr 07 22:36:32 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Control process exited, code=exited status=125
Apr 07 22:36:32 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Failed with result 'exit-code'.
Apr 07 22:36:32 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Failed to start agent.service.
Apr 07 22:36:36 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Service RestartSec=3s expired, scheduling restart.
Apr 07 22:36:36 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Scheduled restart job, restart counter is at 4.
Apr 07 22:36:36 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Stopped agent.service.
Apr 07 22:36:36 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Starting agent.service...
Apr 07 22:36:36 api.qe1.kni.lab.eng.bos.redhat.com agent-fix-bz1964591[3830]: Error: registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b>
Apr 07 22:36:36 api.qe1.kni.lab.eng.bos.redhat.com podman[3945]: Trying to pull registry.redhat.io/rhacm2/assisted-installer-agent-rhel8@sha256:2b6139743524958dde64cf7282da4ae7a73ef856994733b1fd3d9c58e1a5b6d6...
Apr 07 22:36:37 api.qe1.kni.lab.eng.bos.redhat.com podman[3945]: Error: Source image rejected: A signature was required, but no signature exists
Apr 07 22:36:37 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Control process exited, code=exited status=125
Apr 07 22:36:37 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Failed with result 'exit-code'.
Apr 07 22:36:37 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Failed to start agent.service.
Apr 07 22:36:40 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Service RestartSec=3s expired, scheduling restart.
Apr 07 22:36:40 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: agent.service: Scheduled restart job, restart counter is at 5.
Apr 07 22:36:40 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Stopped agent.service.
Apr 07 22:36:40 api.qe1.kni.lab.eng.bos.redhat.com systemd[1]: Starting agent.service..

Comment 2 Michael Filanov 2022-04-10 15:03:06 UTC
Where does the policy come from?

Comment 5 Marius Cornea 2022-05-11 15:43:13 UTC
(In reply to Marius Cornea from comment #4)
> I am seeing this issue with 4.11 as well:
> 
>   - cpuArchitecture: x86_64
>     openshiftVersion: "4.11"
>     rootFSUrl:
> http://registry.kni-qe-0.lab.eng.rdu2.redhat.com:8080/images/pub/openshift-
> v4/dependencies/rhcos/pre-release/latest-4.11/rhcos-live-rootfs.x86_64.img
>     url:
> http://registry.kni-qe-0.lab.eng.rdu2.redhat.com:8080/images/pub/openshift-
> v4/dependencies/rhcos/pre-release/latest-4.11/rhcos-live.x86_64.iso
>     version: 4.11.0-0.nightly-2022-05-10-045003

The images were mirrored from https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/latest-4.11/

Comment 6 Colin Walters 2022-05-16 12:34:38 UTC
In RHEL (including RHEL CoreOS, which is what the live ISO is), the contents of `/etc/containers/policy.json` come from the podman stack, specifically `containers-common`.

However - a huge caveat here is that in OCP the MCO (specifically the container runtime config controller) actually overwrites all those defaults, see

https://github.com/openshift/machine-config-operator/blob/master/templates/master/01-master-container-runtime/_base/files/policy.yaml

I suspect what happened here is the podman stack made a change to enable this, but we didn't update the MCO defaults to match the new package changes.

The Live ISO does *not* include the MCO override templates (IOW it's more RHEL, not OCP), which is why you're only seeing this there.

If you want to work around this today, you can include an Ignition config which writes `/etc/containers/policy.json` into the Live environment.

I think though this is a conflict between mirroring and signing - the ideal case is that mirroring preserves signatures, but I don't think that happens in all cases today.

In the short term, our instructions for mirroring probably need to be updated to mention disabling signature verification - that's useful for anyone operating outside of "OCP-RHEL-CoreOS".

But we should also rationalize this change to the podman defaults and consider doing it for all OCP nodes by default.  This relates to https://github.com/openshift/machine-config-operator/issues/1349

Comment 7 Colin Walters 2022-05-16 13:03:43 UTC
AFAICS, this change to enable signature verification only appeared in RHEL 8.6.  But I cannot make heads or tails of the git history of containers-common here.

Comment 8 Miloslav Trmač 2022-05-16 14:04:48 UTC
For a signature-requiring signature policy to work with mirrors, the signatures have to be mirrored as well. That is, sadly, not automatic and effortless, and requires
- Using a registry that natively supports signatures (the OpenShift integrated registry is AFAIK the only one), or setting up a sigstore-staging mechanism + a sigstore web server and configuring the nodes to use it in /etc/containers/registries.d/*.yaml . Compare e.g. https://github.com/containers/podman/blob/056f492f59c333d521ebbbe186abde0278e815db/docs/tutorials/image_signing.md (ignore the part about creating new signatures, focus on sigstore-staging with file:// and sigstore with http:// ). 
- Using a mirroring mechanism that mirrors per-image signatures, e.g. (skopeo sync). I don’t think (oc adm mirror) does this; IIRC that only mirrors the OCP-specific signature for the OpenShift release image, using an OpenShift-specific config map, and nothing else.

Note that when enforcing original Red Hat signatures in combination with ICSP, policy.json refers to the _original_ names (e.g. redhat.io/…), while registries.d/*.yaml describes sigstores attached to the _mirror_ registry (brew.registry.redhat.io/…).
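
As a rough illustration of the registries.d side of that setup, the Python sketch below renders a drop-in file that attaches a sigstore to the mirror scope. The `docker:` and `sigstore:` keys are the registries.d keys of this era (newer containers/image releases name the latter `lookaside`); the sigstore URL and the helper name are hypothetical and only show the shape of the file.

```python
# Sketch only: render a registries.d drop-in for a mirror registry.
# The URL below is a made-up placeholder, not a real sigstore server.

def registries_d_yaml(mirror_scope, sigstore_url):
    """Return YAML text attaching a sigstore to a mirror registry scope."""
    return (
        "docker:\n"
        f"  {mirror_scope}:\n"
        f"    sigstore: {sigstore_url}\n"
    )


if __name__ == "__main__":
    # The scope is the mirror from the ICSP, not the original registry name.
    print(registries_d_yaml("brew.registry.redhat.io",
                            "http://sigstore.example.com/sigstore"), end="")
```

The resulting text would be saved as a file under /etc/containers/registries.d/ on the node, while policy.json keeps referring to the original registry name.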

Comment 10 Derrick Ornelas 2022-05-16 14:11:32 UTC
I had a quick chat with Tom Sweeney and am moving this to Containers

Comment 14 epassaro 2022-05-18 11:58:17 UTC
I have encountered this issue in an IPv4 connected environment as well; attaching the must-gather file.

Comment 16 Shelly Miron 2022-05-19 11:53:58 UTC
It seems this issue happens in 4.10 too.

Comment 17 Chad Crum 2022-05-19 15:35:16 UTC
Another flow is impacted:

A late-binding spoke does not reference a clusterImageSet, so it requests the latest image in the AgentServiceConfig osImages list for its initial boot.

If 4.11 is in the osImages list and is the latest, the late-binding spoke will try to boot from it and hit the same issue described in this bz, even if the target OCP is not 4.11 (e.g. 4.10).

Comment 19 Chad Crum 2022-05-25 01:04:00 UTC
The workaround for day 1 cluster deploy (late binding not tested yet) is to apply an ignition override via the InfraEnv to set policy.json to insecure.

Include the ignitionConfigOverride listed below in the InfraEnv:


apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
  name: infraenvname
  namespace: infraenvnamespace
spec:
  clusterRef:
[...]
  ignitionConfigOverride: '{"ignition": {"version": "3.1.0"}, "storage": {"files": [{"overwrite": true, "path": "/etc/containers/policy.json", "contents": {"source":"data:text/plain;base64,ewogICAgImRlZmF1bHQiOiBbCiAgICAgICAgewogICAgICAgICAgICAidHlwZSI6ICJpbnNlY3VyZUFjY2VwdEFueXRoaW5nIgogICAgICAgIH0KICAgIF0sCiAgICAidHJhbnNwb3J0cyI6CiAgICAgICAgewogICAgICAgICAgICAiZG9ja2VyLWRhZW1vbiI6CiAgICAgICAgICAgICAgICB7CiAgICAgICAgICAgICAgICAgICAgIiI6IFt7InR5cGUiOiJpbnNlY3VyZUFjY2VwdEFueXRoaW5nIn1dCiAgICAgICAgICAgICAgICB9CiAgICAgICAgfQp9Cgo="}}]}}'


## which sets the following on the node(s):
cat /etc/containers/policy.json 
{
    "default": [
        {
            "type": "insecureAcceptAnything"
        }
    ],
    "transports":
        {
            "docker-daemon":
                {
                    "": [{"type":"insecureAcceptAnything"}]
                }
        }
}

## Tested with this image on the spoke:

4.11.0    :4.11.0-0.nightly-2022-05-20-213928
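
For anyone who needs to regenerate the override with a different policy, the string can be built programmatically instead of base64-encoding by hand. The Python sketch below is one way to do it; the function name is made up, and the Ignition version and file path are simply copied from the override above.

```python
import base64
import json

# Permissive policy equivalent to the one used in the workaround above.
INSECURE_POLICY = {
    "default": [{"type": "insecureAcceptAnything"}],
    "transports": {
        "docker-daemon": {"": [{"type": "insecureAcceptAnything"}]},
    },
}


def build_ignition_override(policy, path="/etc/containers/policy.json"):
    """Return a JSON string usable as ignitionConfigOverride (sketch)."""
    policy_text = json.dumps(policy, indent=4) + "\n"
    encoded = base64.b64encode(policy_text.encode("utf-8")).decode("ascii")
    ignition = {
        "ignition": {"version": "3.1.0"},
        "storage": {
            "files": [{
                "overwrite": True,
                "path": path,
                "contents": {"source": "data:text/plain;base64," + encoded},
            }],
        },
    }
    return json.dumps(ignition)


if __name__ == "__main__":
    # Paste the printed string into spec.ignitionConfigOverride, single-quoted.
    print(build_ignition_override(INSECURE_POLICY))
```

The output will not be byte-identical to the string above because the JSON whitespace differs, but it decodes to the same policy content.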

Comment 20 Chad Crum 2022-06-08 12:06:15 UTC
Hi Jindrich - I was wondering if there was a plan for this as I saw early on Colin had listed some options.

Not a blocker right now as there is a workaround, just checking.

Comment 25 Chad Crum 2022-06-17 13:24:16 UTC
I meant it was not a "test blocker" by my earlier statement. I would definitely consider this a blocker for release as it impacts existing users heavily.

Needs to keep blocker flag.

Comment 33 Trey West 2022-06-29 17:25:46 UTC
Hi, I tested with the release image (4.11.0-0.nightly-2022-06-28-160049), but the issue still persists for me. I noted some different values pasted above, however. To clear up any confusion, I will explain what we are doing.

We are booting the machine using a rootfs + live ISO (like the ones that I listed above). This is where we are running into issues with the default values in policy.json, which are causing the failures that Sasha initially posted. The verification above seems to be post-installation with the release image, but our issue is pre-installation. I'm not really sure where containers-common comes from in this case, but it appears to be different.

Comment 40 errata-xmlrpc 2022-08-10 11:05:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

