Bug 1903632 - After upgrading a customer openshift cluster to 4.6.4 the openshift marketplace pods are in ImagePullBackOff state
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.6
Hardware: x86_64
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.9.0
Assignee: Dave Gordon
QA Contact: Jian Zhang
URL:
Whiteboard:
Duplicates: 1929238 1939361
Depends On:
Blocks:
 
Reported: 2020-12-02 14:29 UTC by Andy Bartlett
Modified: 2024-03-25 17:21 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-18 17:28:52 UTC
Target Upstream Version:
Embargoed:
tflannag: needinfo-
davegord: needinfo-



Links
System                             ID              Private  Priority  Status  Summary  Last Updated
Red Hat Knowledge Base (Solution)  5671401         0        None      None    None     2020-12-28 09:30:54 UTC
Red Hat Product Errata             RHSA-2021:3759  0        None      None    None     2021-10-18 17:29:06 UTC

Internal Links: 2009373

Description Andy Bartlett 2020-12-02 14:29:22 UTC
Description of problem:
After upgrading a customer openshift cluster to 4.6.4 the openshift marketplace pods are in ImagePullBackOff state.
Cluster update history:
    history:
    - completionTime: "2020-11-30T19:33:15Z"
      image: quay.io/openshift-release-dev/ocp-release@sha256:6681fc3f83dda0856b43cecd25f2d226c3f90e8a42c7144dbc499f6ee0a086fc
      startedTime: "2020-11-30T18:18:45Z"
      state: Completed
      verified: false
      version: 4.6.4
    - completionTime: "2020-11-30T18:13:45Z"
      image: quay.io/openshift-release-dev/ocp-release@sha256:bae5510f19324d8e9c313aaba767e93c3a311902f5358fe2569e380544d9113e
      startedTime: "2020-11-30T17:33:17Z"
      state: Completed
      verified: false
      version: 4.5.19
    - completionTime: "2020-11-02T20:17:11Z"
      image: quay.io/openshift-release-dev/ocp-release@sha256:adb5ef06c54ff75ca9033f222ac5e57f2fd82e49bdd84f737d460ae542c8af60
      startedTime: "2020-11-02T19:31:57Z"
      state: Completed
      verified: true
      version: 4.5.16
    - completionTime: "2020-07-29T12:30:00Z"
      image: quay.io/openshift-release-dev/ocp-release@sha256:8f923b7b8efdeac619eb0e7697106c1d17dd3d262c49d8742b38600417cf7d1d
      startedTime: "2020-07-29T12:06:21Z"
      state: Completed
      verified: false
      version: 4.5.2

[root@control-host-01 ~]# oc get pods -n openshift-marketplace
NAME                                    READY   STATUS             RESTARTS   AGE
certified-operators-4psvb               0/1     ImagePullBackOff   0          37h
certified-operators-s87bq               0/1     ImagePullBackOff   0          37h
community-operators-22pbq               0/1     ImagePullBackOff   0          37h
community-operators-2dps8               0/1     ImagePullBackOff   0          37h
marketplace-operator-5bbff88564-wzb4x   1/1     Running            0          37h
redhat-marketplace-7j4fr                0/1     ImagePullBackOff   0          37h
redhat-marketplace-fjb2b                0/1     ImagePullBackOff   0          37h
redhat-operators-6ztm9                  0/1     ImagePullBackOff   0          37h
redhat-operators-k8bmh                  0/1     ImagePullBackOff   0          37h


[root@control-host-01 ~]# oc get events -n openshift-marketplace
LAST SEEN   TYPE      REASON    OBJECT                          MESSAGE
140m        Normal    Pulling   pod/certified-operators-4psvb   Pulling image "registry.redhat.io/redhat/certified-operator-index:v4.6"
4h25m       Warning   Failed    pod/certified-operators-4psvb   Failed to pull image "registry.redhat.io/redhat/certified-operator-index:v4.6": rpc error: code = Unknown desc = Source image rejected: A signature was required, but no signature exists
15m         Warning   Failed    pod/certified-operators-4psvb   Error: ErrImagePull
5m26s       Normal    BackOff   pod/certified-operators-4psvb   Back-off pulling image "registry.redhat.io/redhat/certified-operator-index:v4.6"
14s         Warning   Failed    pod/certified-operators-4psvb   Error: ImagePullBackOff
79m         Normal    Pulling   pod/certified-operators-s87bq   Pulling image "registry.redhat.io/redhat/certified-operator-index:v4.6"
3h9m        Warning   Failed    pod/certified-operators-s87bq   Failed to pull image "registry.redhat.io/redhat/certified-operator-index:v4.6": rpc error: code = Unknown desc = Source image rejected: A signature was required, but no signature exists
4m9s        Normal    BackOff   pod/certified-operators-s87bq   Back-off pulling image "registry.redhat.io/redhat/certified-operator-index:v4.6"
14m         Warning   Failed    pod/certified-operators-s87bq   Error: ImagePullBackOff
74m         Normal    Pulling   pod/community-operators-22pbq   Pulling image "registry.redhat.io/redhat/community-operator-index:latest"
79m         Warning   Failed    pod/community-operators-22pbq   Error: ErrImagePull
9m10s       Normal    BackOff   pod/community-operators-22pbq   Back-off pulling image "registry.redhat.io/redhat/community-operator-index:latest"
4m12s       Warning   Failed    pod/community-operators-22pbq   Error: ImagePullBackOff
110m        Normal    Pulling   pod/community-operators-2dps8   Pulling image "registry.redhat.io/redhat/community-operator-index:latest"
10m         Normal    BackOff   pod/community-operators-2dps8   Back-off pulling image "registry.redhat.io/redhat/community-operator-index:latest"
16s         Warning   Failed    pod/community-operators-2dps8   Error: ImagePullBackOff
19m         Normal    Pulling   pod/redhat-marketplace-7j4fr    Pulling image "registry.redhat.io/redhat/redhat-marketplace-index:v4.6"
154m        Warning   Failed    pod/redhat-marketplace-7j4fr    Failed to pull image "registry.redhat.io/redhat/redhat-marketplace-index:v4.6": rpc error: code = Unknown desc = Source image rejected: A signature was required, but no signature exists
5h59m       Warning   Failed    pod/redhat-marketplace-7j4fr    Error: ErrImagePull
9m13s       Normal    BackOff   pod/redhat-marketplace-7j4fr    Back-off pulling image "registry.redhat.io/redhat/redhat-marketplace-index:v4.6"
4m13s       Warning   Failed    pod/redhat-marketplace-7j4fr    Error: ImagePullBackOff
70m         Normal    Pulling   pod/redhat-marketplace-fjb2b    Pulling image "registry.redhat.io/redhat/redhat-marketplace-index:v4.6"
3h20m       Warning   Failed    pod/redhat-marketplace-fjb2b    Failed to pull image "registry.redhat.io/redhat/redhat-marketplace-index:v4.6": rpc error: code = Unknown desc = Source image rejected: A signature was required, but no signature exists
33s         Normal    BackOff   pod/redhat-marketplace-fjb2b    Back-off pulling image "registry.redhat.io/redhat/redhat-marketplace-index:v4.6"
5m34s       Warning   Failed    pod/redhat-marketplace-fjb2b    Error: ImagePullBackOff
79m         Normal    Pulling   pod/redhat-operators-6ztm9      Pulling image "registry.redhat.io/redhat/redhat-operator-index:v4.6"
3h45m       Warning   Failed    pod/redhat-operators-6ztm9      Error: ErrImagePull
5m1s        Normal    BackOff   pod/redhat-operators-6ztm9      Back-off pulling image "registry.redhat.io/redhat/redhat-operator-index:v4.6"
0s          Warning   Failed    pod/redhat-operators-6ztm9      Error: ImagePullBackOff
9m15s       Normal    Pulling   pod/redhat-operators-k8bmh      Pulling image "registry.redhat.io/redhat/redhat-operator-index:v4.6"
14m         Warning   Failed    pod/redhat-operators-k8bmh      Error: ErrImagePull
34m         Normal    BackOff   pod/redhat-operators-k8bmh      Back-off pulling image "registry.redhat.io/redhat/redhat-operator-index:v4.6"
4m15s       Warning   Failed    pod/redhat-operators-k8bmh      Error: ImagePullBackOff

Version-Release number of selected component (if applicable):

OCP 4.6.4

How reproducible:
100%

Steps to Reproduce:
1. Create machineconfig_update.yaml:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-zzz-worker-registry-trust
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,(...)
        filesystem: root
        mode: 420
        path: /etc/containers/registries.d/registry.redhat.io.yaml
      - contents:
          source: data:text/plain;charset=utf-8;base64,(...)
        filesystem: root
        mode: 420
        path: /etc/containers/registries.d/registry.access.redhat.com.yaml
      - contents:
          source: data:text/plain;charset=utf-8;base64,(...)

        filesystem: root
        mode: 420
        path: /etc/containers/policy.json

2. oc create -f machineconfig_update.yaml
3. Watch the pods in the openshift-marketplace namespace. After a few minutes they fail with ImagePullBackOff.
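
For reference, the `source` fields in step 1 are base64 `data:` URLs carrying the file contents, and `mode: 420` is the decimal form of octal 0644. A minimal Python sketch of how one such entry could be built. The helper name `file_entry` and the sample contents are illustrative assumptions; the real contents are elided as `(...)` in this report.

```python
import base64


def file_entry(path: str, contents: str, mode: int = 0o644) -> dict:
    """Build one storage.files entry in the shape used above (hypothetical helper)."""
    encoded = base64.b64encode(contents.encode("utf-8")).decode("ascii")
    return {
        "contents": {"source": "data:text/plain;charset=utf-8;base64," + encoded},
        "filesystem": "root",
        "mode": mode,  # 0o644 is 420 in decimal, as written in the YAML
        "path": path,
    }


# Sample contents only; this is not the customer's policy.
entry = file_entry("/etc/containers/policy.json", '{"default": [{"type": "reject"}]}')
print(entry["mode"])
print(entry["contents"]["source"])
```

Serializing a list of such entries under `spec.config.storage.files` gives the same structure as machineconfig_update.yaml.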

Actual results:

NAME                                    READY   STATUS             RESTARTS   AGE
certified-operators-4psvb               0/1     ImagePullBackOff   0          37h
certified-operators-s87bq               0/1     ImagePullBackOff   0          37h
community-operators-22pbq               0/1     ImagePullBackOff   0          37h
community-operators-2dps8               0/1     ImagePullBackOff   0          37h
marketplace-operator-5bbff88564-wzb4x   1/1     Running            0          37h
redhat-marketplace-7j4fr                0/1     ImagePullBackOff   0          37h
redhat-marketplace-fjb2b                0/1     ImagePullBackOff   0          37h
redhat-operators-6ztm9                  0/1     ImagePullBackOff   0          37h
redhat-operators-k8bmh                  0/1     ImagePullBackOff   0          37h

Expected results:

We need the marketplace to function.

Additional info:

Comment 1 Andy Bartlett 2020-12-02 15:04:49 UTC
I have replicated this on my AWS cluster and am repeating the test now to confirm. My plan is to:

1. Install OCP 4.5.11.
2. Apply the MachineConfig and confirm operation.
3. Upgrade to 4.5.20 and confirm operation again.
4. Upgrade to 4.6.6 and see whether it breaks like the previous test did.

Previous test:

Upgraded OCP 4.5.20 to OCP 4.6.6, then applied the MachineConfig (which broke).

Regards,

Andy

Comment 2 Brian Cook 2020-12-02 15:10:04 UTC
Signing the community and certified operator indexes is currently blocked by https://projects.engineering.redhat.com/browse/KODIAK-243 (in development, completing in Q1).

Comment 3 Andy Bartlett 2020-12-02 15:14:27 UTC
@Brian, In that case is the best course of action to advise the customer not to do this currently?

Comment 5 Kevin Rizza 2020-12-02 15:44:21 UTC
This seems like a docs oversight. There's not a quick fix here and it's something that we would probably need to advise the customer not to do until all of the catalogs shipped outside of the actual openshift payload themselves were signed. Even then, this would still prevent any operator images from working correctly if they were not also signed (which I assume will always be the case for the majority of community operators from the community catalog at the very least).

Comment 7 Kevin Rizza 2020-12-02 16:09:07 UTC
It works fine in OCP 4.5 because the catalogs were built at runtime by bootstrapping a container image that was included in the OCP payload. In 4.6 and later, we are now shipping the catalog separately from OCP itself. Those images come from separate sources and aren't part of the OCP build, and currently aren't signed.

Comment 9 Ben Parees 2020-12-02 17:11:22 UTC
While we should fix this (get the indexes signed, at least everything except community), I think we should also spawn an RFE to make the "signature required" feature more flexible, specifically by having allow/blocklists that let you bypass the signature requirement for specific registries/repositories/tags/SHAs.

Comment 10 Evan Cordell 2020-12-02 18:08:02 UTC
MCO takes as input a containers policy file - not sure if this is the canonical doc location, but these docs describe the format: https://www.mankier.com/5/containers-policy.json

By my read of it, we should be able to add exceptions for the registry images, like:

```
{
  "default": [
    {
      "type": "insecureAcceptAnything"
    }
  ],
  "transports": {
    "docker": {
      "registry.access.redhat.com": [
        {
          "type": "signedBy",
          "keyType": "GPGKeys",
          "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
        }
      ],
      "registry.redhat.io/redhat/redhat-operator-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io": [
        {
          "type": "signedBy",
          "keyType": "GPGKeys",
          "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
        }
      ]
    },
    "docker-daemon": {
      "": [
        {
          "type": "insecureAcceptAnything"
        }
      ]
    }
  }
}
```

(this is untested and should be tailored for the customer)
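
A rough Python sketch of how the most specific scope is selected, based on my reading of containers-policy.json(5). It is a simplification, not the actual containers/image code: it handles tags only (no digests or wildcard scopes), and `candidate_scopes`, `requirements_for`, and the quay.io image name are made up for illustration.

```python
def candidate_scopes(image: str) -> list[str]:
    """List docker-transport scopes for an image, most specific first."""
    last = image.rsplit("/", 1)[-1]
    repo = image.rsplit(":", 1)[0] if ":" in last else image
    scopes = [image] if repo != image else []
    parts = repo.split("/")
    # Repository, then each parent namespace, then the registry host.
    for end in range(len(parts), 0, -1):
        scopes.append("/".join(parts[:end]))
    return scopes


def requirements_for(policy: dict, image: str, transport: str = "docker") -> list:
    """Return the requirement list that applies to image."""
    table = policy.get("transports", {}).get(transport, {})
    for scope in candidate_scopes(image):
        if scope in table:
            return table[scope]
    # Transport-wide default (empty scope), then the global default.
    return table.get("", policy["default"])


signed = [{"type": "signedBy", "keyType": "GPGKeys",
           "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"}]
accept = [{"type": "insecureAcceptAnything"}]
policy = {
    "default": accept,
    "transports": {"docker": {
        "registry.redhat.io/redhat/redhat-operator-index": accept,
        "registry.redhat.io": signed,
    }},
}
for image in ("registry.redhat.io/redhat/redhat-operator-index:v4.6",
              "registry.redhat.io/redhat/certified-operator-index:v4.6",
              "quay.io/example/image:latest"):  # hypothetical image
    print(image, "->", requirements_for(policy, image)[0]["type"])
```

Under these assumptions only the redhat-operator-index image bypasses verification. The other index images still fall through to the registry.redhat.io rule, so each catalog would need its own entry.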

If this approach works, we could probably close this out with an update to the linked docs: https://access.redhat.com/verify-images-ocp4

Comment 12 Evan Cordell 2020-12-02 18:13:51 UTC
> We still want to get these images (the non-community catalog) signed though.

+1 

I'm not clear on what we plan to sign, but if it's everything from registry.redhat.io we can stop there.

If there's ever a world where images are expected to be on registry.redhat.io but not signed (community operator images mirrored in?), there may be an RFE for tooling that can construct the policy file exceptions for users given an index.
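
As a sketch of what such tooling might do (no such tool exists in this bug; `with_exceptions` is a hypothetical helper, and the repository list is simply the four catalog indexes seen in the events), in Python:

```python
import copy
import json

INDEX_REPOS = [  # catalog index repositories named in this bug
    "registry.redhat.io/redhat/redhat-operator-index",
    "registry.redhat.io/redhat/redhat-marketplace-index",
    "registry.redhat.io/redhat/certified-operator-index",
    "registry.redhat.io/redhat/community-operator-index",
]


def with_exceptions(policy: dict, repos: list[str], transport: str = "docker") -> dict:
    """Return a copy of policy that accepts unsigned images from repos."""
    updated = copy.deepcopy(policy)
    scopes = updated.setdefault("transports", {}).setdefault(transport, {})
    for repo in repos:
        scopes[repo] = [{"type": "insecureAcceptAnything"}]
    return updated


base = {
    "default": [{"type": "insecureAcceptAnything"}],
    "transports": {"docker": {"registry.redhat.io": [{
        "type": "signedBy",
        "keyType": "GPGKeys",
        "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release",
    }]}},
}
patched = with_exceptions(base, INDEX_REPOS)
print(json.dumps(patched, indent=2))
```

A real tool would also have to enumerate the operator and bundle images referenced by the index, which this sketch does not attempt.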

Comment 13 Ben Parees 2020-12-02 18:30:50 UTC
I've created https://projects.engineering.redhat.com/browse/CLOUDDST-3881 to get the catalog images signed.

Comment 14 Andy Bartlett 2020-12-03 08:55:30 UTC
Thanks guys for all the great input on this topic. It would be really good to get the images signed, as I am having to explain to the customer that he can use signed images in 4.5 but not in 4.6 currently (obviously that doesn't look great). Ben, I will track the JIRA ticket and this BZ using the customer case listed above.

Many thanks,

Andy

Comment 15 Evan Cordell 2020-12-03 14:15:14 UTC
> as I am having to explain to the customer that he can used signed images in 4.5 but not in 4.6 currently (obviously that doesn't look great).

Did you attempt the mitigation I suggested, which keeps signature checks for all images except the index images?

That should fix the ImagePullBackOff issues for the indexes, though additional exceptions may need to be added for content installed from the indexes.

Comment 17 Andy Bartlett 2020-12-03 16:16:19 UTC
@Evan, this is the config we were testing with the customer (I have removed any customer registries)

{
    "default": [
        {
            "type": "reject"
        }
    ],
    "transports": {
        "atomic": {
            "quay.io": [
                {
                    "type": "insecureAcceptAnything"
                }
            ],
            "registry.access.redhat.com": [
                {
                    "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release",
                    "keyType": "GPGKeys",
                    "type": "signedBy"
                }
            ],
            "registry.redhat.io": [
                {
                    "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release",
                    "keyType": "GPGKeys",
                    "type": "signedBy"
                }
            ]
        },
        "docker": {
            "quay.io": [
                {
                    "type": "insecureAcceptAnything"
                }
            ],
            "registry.access.redhat.com": [
                {
                    "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release",
                    "keyType": "GPGKeys",
                    "type": "signedBy"
                }
            ],
            "registry.redhat.io": [
                {
                    "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release",
                    "keyType": "GPGKeys",
                    "type": "signedBy"
                }
            ]
        }
    }
}

This failed with the image pull error.

Comment 18 Evan Cordell 2020-12-03 18:25:03 UTC
Could you try this and see if it addresses the issue?

{
    "default": [
        {
            "type": "reject"
        }
    ],
    "transports": {
        "atomic": {
            "quay.io": [
                {
                    "type": "insecureAcceptAnything"
                }
            ],
            "registry.access.redhat.com": [
                {
                    "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release",
                    "keyType": "GPGKeys",
                    "type": "signedBy"
                }
            ],
            "registry.redhat.io/redhat/redhat-operator-index": [
                {
                    "type": "insecureAcceptAnything"
                }
            ],
            "registry.redhat.io/redhat/redhat-marketplace-index": [
                {
                    "type": "insecureAcceptAnything"
                }
            ],
            "registry.redhat.io/redhat/certified-operator-index": [
                {
                    "type": "insecureAcceptAnything"
                }
            ],
            "registry.redhat.io/redhat/community-operator-index": [
                {
                    "type": "insecureAcceptAnything"
                }
            ],
            "registry.redhat.io": [
                {
                    "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release",
                    "keyType": "GPGKeys",
                    "type": "signedBy"
                }
            ]
        },
        "docker": {
            "quay.io": [
                {
                    "type": "insecureAcceptAnything"
                }
            ],
            "registry.access.redhat.com": [
                {
                    "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release",
                    "keyType": "GPGKeys",
                    "type": "signedBy"
                }
            ],
            "registry.redhat.io": [
                {
                    "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release",
                    "keyType": "GPGKeys",
                    "type": "signedBy"
                }
            ]
        }
    }
}

Comment 24 Andreas Karis 2020-12-24 23:07:53 UTC
I just checked this in my lab. 

I assume the original policy looks the same as the one in https://access.redhat.com/verify-images-ocp4.

2 remarks about that article:
* for whatever reason that article sets default to insecureAcceptAnything and only forces validation for Red Hat images
* transport docker-daemon does not exist according to my man page:
~~~
[akaris@linux sigver2]$ man containers-policy.json | grep docker-daemon
[akaris@linux sigver2]$ 
~~~

That said, I reproduced the issue with that original policy:
~~~
cat <<'EOF' > policy.json
{
  "default": [
    {
      "type": "insecureAcceptAnything"
    }
  ],
  "transports": {
    "docker": {
      "registry.access.redhat.com": [
        {
          "type": "signedBy",
          "keyType": "GPGKeys",
          "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
        }
      ],
      "registry.redhat.io": [
        {
          "type": "signedBy",
          "keyType": "GPGKeys",
          "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
        }
      ]
    },
    "docker-daemon": {
      "": [
        {
          "type": "insecureAcceptAnything"
        }
      ]
    }
  }
}
EOF
~~~

Then this should be changed to:
~~~
cat <<'EOF' > policy.json
{
  "default": [
    {
      "type": "insecureAcceptAnything"
    }
  ],
  "transports": {
    "docker": {
      "registry.access.redhat.com": [
        {
          "type": "signedBy",
          "keyType": "GPGKeys",
          "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
        }
      ],
      "registry.redhat.io/redhat/redhat-operator-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io/redhat/redhat-marketplace-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io/redhat/certified-operator-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io/redhat/community-operator-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io": [
        {
          "type": "signedBy",
          "keyType": "GPGKeys",
          "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
        }
      ]
    },
    "docker-daemon": {
      "": [
        {
          "type": "insecureAcceptAnything"
        }
      ]
    }
  }
}
EOF
~~~

Note that CRI-O seems to pull via the docker: transport, not via the atomic: transport, so the exceptions need to be set for the docker: transport (as was done in comment #10; the example from comment #18, which set them only under atomic:, did not work for me).
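
To illustrate the point about transports, here is a small Python sketch that reports which index repositories lack an insecureAcceptAnything entry under a given transport. The helper `missing_exceptions` is hypothetical, and the sample policy is an abbreviated version of the comment #18 layout with only two of the four indexes.

```python
INDEX_REPOS = [
    "registry.redhat.io/redhat/redhat-operator-index",
    "registry.redhat.io/redhat/certified-operator-index",
]


def missing_exceptions(policy: dict, repos: list[str], transport: str) -> list[str]:
    """Return the repos without an insecureAcceptAnything scope under transport."""
    scopes = policy.get("transports", {}).get(transport, {})
    missing = []
    for repo in repos:
        requirements = scopes.get(repo, [])
        if not any(r.get("type") == "insecureAcceptAnything" for r in requirements):
            missing.append(repo)
    return missing


accept = [{"type": "insecureAcceptAnything"}]
signed = [{"type": "signedBy", "keyType": "GPGKeys",
           "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"}]
# Abbreviated comment #18 layout: exceptions exist only under "atomic".
policy = {
    "default": [{"type": "reject"}],
    "transports": {
        "atomic": {**{repo: accept for repo in INDEX_REPOS}, "registry.redhat.io": signed},
        "docker": {"registry.redhat.io": signed},
    },
}
print(missing_exceptions(policy, INDEX_REPOS, "atomic"))
print(missing_exceptions(policy, INDEX_REPOS, "docker"))
```

With this sample nothing is missing under atomic, while both index repositories are reported as missing under docker, which is the transport that matters here.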

---------------------------------------------------------------

The workaround then is to manually connect to all worker nodes with oc debug node/<node name> and run:
~~~
oc debug node/...
chroot /host

cat <<'EOF' > /etc/containers/policy.json
{
  "default": [
    {
      "type": "insecureAcceptAnything"
    }
  ],
  "transports": {
    "docker": {
      "registry.access.redhat.com": [
        {
          "type": "signedBy",
          "keyType": "GPGKeys",
          "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
        }
      ],
      "registry.redhat.io/redhat/redhat-operator-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io/redhat/redhat-marketplace-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io/redhat/certified-operator-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io/redhat/community-operator-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io": [
        {
          "type": "signedBy",
          "keyType": "GPGKeys",
          "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
        }
      ]
    },
    "docker-daemon": {
      "": [
        {
          "type": "insecureAcceptAnything"
        }
      ]
    }
  }
}
EOF
~~~

One can test immediately if this works by pulling the images manually with crictl:
~~~
crictl pull registry.redhat.io/redhat/certified-operator-index:v4.6
~~~

---------------------------

Then, one should also push the exception list via MachineConfig:
~~~
cat <<'EOF' > policy.json
{
  "default": [
    {
      "type": "insecureAcceptAnything"
    }
  ],
  "transports": {
    "docker": {
      "registry.access.redhat.com": [
        {
          "type": "signedBy",
          "keyType": "GPGKeys",
          "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
        }
      ],
      "registry.redhat.io/redhat/redhat-operator-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io/redhat/redhat-marketplace-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io/redhat/certified-operator-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io/redhat/community-operator-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io": [
        {
          "type": "signedBy",
          "keyType": "GPGKeys",
          "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
        }
      ]
    },
    "docker-daemon": {
      "": [
        {
          "type": "insecureAcceptAnything"
        }
      ]
    }
  }
}
EOF

cat <<EOF > registry.access.redhat.com.yaml
docker:
     registry.access.redhat.com:
         sigstore: https://access.redhat.com/webassets/docker/content/sigstore
EOF

cat <<EOF > registry.redhat.io.yaml
docker:
     registry.redhat.io:
         sigstore: https://registry.redhat.io/containers/sigstore
EOF

export ARC_REG=$( cat registry.access.redhat.com.yaml | base64 -w0 )
export RIO_REG=$( cat registry.redhat.io.yaml | base64 -w0 )
export POLICY_CONFIG=$( cat policy.json | base64 -w0 )

cat > 51-worker-rh-registry-trust.yaml <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 51-worker-rh-registry-trust
spec:
  config:
    ignition:
      config: {}
      security:
        tls: {}
      timeouts: {}
      version: 2.2.0
    networkd: {}
    passwd: {}
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,${ARC_REG}
          verification: {}
        filesystem: root
        mode: 420
        path: /etc/containers/registries.d/registry.access.redhat.com.yaml
      - contents:
          source: data:text/plain;charset=utf-8;base64,${RIO_REG}
          verification: {}
        filesystem: root
        mode: 420
        path: /etc/containers/registries.d/registry.redhat.io.yaml
      - contents:
          source: data:text/plain;charset=utf-8;base64,${POLICY_CONFIG}
          verification: {}
        filesystem: root
        mode: 420
        path: /etc/containers/policy.json
  osImageURL: ""
EOF

oc apply -f 51-worker-rh-registry-trust.yaml
~~~

Comment 25 Andreas Karis 2020-12-24 23:16:47 UTC
(making non-confidential private comments public)

Comment 32 Kevin Rizza 2021-02-08 19:46:00 UTC
Moving this bz to medium given that there is a workaround that should not impact clusters and this BZ is only open for tracking purposes at this point -- the OLM component does not control the production of these images and just serves as an intermediary.

Comment 33 Vu Dinh 2021-02-18 15:51:19 UTC
*** Bug 1929238 has been marked as a duplicate of this bug. ***

Comment 34 tflannag 2021-03-16 19:17:36 UTC
*** Bug 1939361 has been marked as a duplicate of this bug. ***

Comment 38 Ben Parees 2021-04-01 13:52:23 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1903632#c24 appears to outline the "correct" workaround steps.

Which as you say, are basically the same as https://access.redhat.com/verify-images-ocp4 and https://docs.openshift.com/container-platform/4.6/security/container_security/security-deploy.html#security-deploy-image-sources_security-deploy

Except that the policy.json content is slightly different: it needs to be what is contained in comment #24, namely with the exceptions for the catalog images:

{
  "default": [
    {
      "type": "insecureAcceptAnything"
    }
  ],
  "transports": {
    "docker": {
      "registry.access.redhat.com": [
        {
          "type": "signedBy",
          "keyType": "GPGKeys",
          "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
        }
      ],
      "registry.redhat.io/redhat/redhat-operator-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io/redhat/redhat-marketplace-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io/redhat/certified-operator-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io/redhat/community-operator-index": [
        {
          "type": "insecureAcceptAnything"
        }
      ],
      "registry.redhat.io": [
        {
          "type": "signedBy",
          "keyType": "GPGKeys",
          "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
        }
      ]
    },
    "docker-daemon": {
      "": [
        {
          "type": "insecureAcceptAnything"
        }
      ]
    }
  }
}


So can we just update https://access.redhat.com/verify-images-ocp4 w/ the modified policy.json and be sure that it calls out that this means the catalog images are not signature-verified (because they are not signed yet)?

Soon the redhat operator catalog will be signed and we can remove that exception.  The other 3 will be slightly trickier as they are going to be signed with a different key that doesn't exist on rhcos today.

Comment 52 Jian Zhang 2021-09-15 05:50:21 UTC
[cloud-user@preserve-olm-env jian]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-09-14-200602   True        False         3h41m   Cluster version is 4.9.0-0.nightly-2021-09-14-200602

1. Update the container policy via the MachineConfig.
[cloud-user@preserve-olm-env jian]$ cat policy.json |base64 
ewoKICAgICJkZWZhdWx0IjogWwogICAgICAgIHsKICAgICAgICAgICAgInR5cGUiOiAiaW5zZWN1
cmVBY2NlcHRBbnl0aGluZyIKICAgICAgICB9CiAgICBdLAogICAgInRyYW5zcG9ydHMiOiB7CiAg
ICAgICAgImRvY2tlci1kYWVtb24iOiB7CiAgICAgICAgICAgICIiOiBbCiAgICAgICAgICAgICAg
ICB7CiAgICAgICAgICAgICAgICAgICAgInR5cGUiOiAiaW5zZWN1cmVBY2NlcHRBbnl0aGluZyIK
ICAgICAgICAgICAgICAgIH0KICAgICAgICAgICAgXQogICAgICAgIH0sCiAgICAgICAgImRvY2tl
ciI6IHsKICAgICAgICAgICAgInJlZ2lzdHJ5LnJlZGhhdC5pby9yZWRoYXQvY2VydGlmaWVkLW9w
ZXJhdG9yLWluZGV4IjogWwogICAgICAgICAgICAgICAgewogICAgICAgICAgICAgICAgICAgICJ0
eXBlIjogInNpZ25lZEJ5IiwKICAgICAgICAgICAgICAgICAgICAia2V5VHlwZSI6ICJHUEdLZXlz
IiwKICAgICAgICAgICAgICAgICAgICAia2V5UGF0aCI6ICIvdXNyL3NoYXJlL29wZW5zaGlmdC9n
cGcta2V5cy9jb250YWluZXJpc3ZzaWduLmFzYyIKICAgICAgICAgICAgICAgIH0KICAgICAgICAg
ICAgXSwKICAgICAgICAgICAgInJlZ2lzdHJ5LnJlZGhhdC5pby9yZWRoYXQvY29tbXVuaXR5LW9w
ZXJhdG9yLWluZGV4IjogWwogICAgICAgICAgICAgICAgewogICAgICAgICAgICAgICAgICAgICJ0
eXBlIjogInNpZ25lZEJ5IiwKICAgICAgICAgICAgICAgICAgICAia2V5VHlwZSI6ICJHUEdLZXlz
IiwKICAgICAgICAgICAgICAgICAgICAia2V5UGF0aCI6ICIvdXNyL3NoYXJlL29wZW5zaGlmdC9n
cGcta2V5cy9jb250YWluZXJpc3ZzaWduLmFzYyIKICAgICAgICAgICAgICAgIH0KICAgICAgICAg
ICAgXSwKICAgICAgICAgICAgInJlZ2lzdHJ5LnJlZGhhdC5pby9yZWRoYXQvcmVkaGF0LW1hcmtl
dHBsYWNlLWluZGV4IjogWwogICAgICAgICAgICAgICAgewogICAgICAgICAgICAgICAgICAgICJ0
eXBlIjogInNpZ25lZEJ5IiwKICAgICAgICAgICAgICAgICAgICAia2V5VHlwZSI6ICJHUEdLZXlz
IiwKICAgICAgICAgICAgICAgICAgICAia2V5UGF0aCI6ICIvdXNyL3NoYXJlL29wZW5zaGlmdC9n
cGcta2V5cy9jb250YWluZXJpc3ZzaWduLmFzYyIKICAgICAgICAgICAgICAgIH0KICAgICAgICAg
ICAgXSwKICAgICAgICAgICAgInJlZ2lzdHJ5LnJlZGhhdC5pbyI6IFsKICAgICAgICAgICAgICAg
IHsKICAgICAgICAgICAgICAgICAgICAidHlwZSI6ICJzaWduZWRCeSIsCiAgICAgICAgICAgICAg
ICAgICAgImtleVR5cGUiOiAiR1BHS2V5cyIsCiAgICAgICAgICAgICAgICAgICAgImtleVBhdGgi
OiAiL2V0Yy9wa2kvcnBtLWdwZy9SUE0tR1BHLUtFWS1yZWRoYXQtcmVsZWFzZSIKICAgICAgICAg
ICAgICAgIH0KICAgICAgICAgICAgXQogICAgICAgIH0KICAgIH0KCn0K

[cloud-user@preserve-olm-env jian]$ cat machineconfig_update.yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-zzz-worker-registry-trust
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,ewoKICAgICJkZWZhdWx0IjogWwogICAgICAgIHsKICAgICAgICAgICAgInR5cGUiOiAiaW5zZWN1cmVBY2NlcHRBbnl0aGluZyIKICAgICAgICB9CiAgICBdLAogICAgInRyYW5zcG9ydHMiOiB7CiAgICAgICAgImRvY2tlci1kYWVtb24iOiB7CiAgICAgICAgICAgICIiOiBbCiAgICAgICAgICAgICAgICB7CiAgICAgICAgICAgICAgICAgICAgInR5cGUiOiAiaW5zZWN1cmVBY2NlcHRBbnl0aGluZyIKICAgICAgICAgICAgICAgIH0KICAgICAgICAgICAgXQogICAgICAgIH0sCiAgICAgICAgImRvY2tlciI6IHsKICAgICAgICAgICAgInJlZ2lzdHJ5LnJlZGhhdC5pby9yZWRoYXQvY2VydGlmaWVkLW9wZXJhdG9yLWluZGV4IjogWwogICAgICAgICAgICAgICAgewogICAgICAgICAgICAgICAgICAgICJ0eXBlIjogInNpZ25lZEJ5IiwKICAgICAgICAgICAgICAgICAgICAia2V5VHlwZSI6ICJHUEdLZXlzIiwKICAgICAgICAgICAgICAgICAgICAia2V5UGF0aCI6ICIvdXNyL3NoYXJlL29wZW5zaGlmdC9ncGcta2V5cy9jb250YWluZXJpc3ZzaWduLmFzYyIKICAgICAgICAgICAgICAgIH0KICAgICAgICAgICAgXSwKICAgICAgICAgICAgInJlZ2lzdHJ5LnJlZGhhdC5pby9yZWRoYXQvY29tbXVuaXR5LW9wZXJhdG9yLWluZGV4IjogWwogICAgICAgICAgICAgICAgewogICAgICAgICAgICAgICAgICAgICJ0eXBlIjogInNpZ25lZEJ5IiwKICAgICAgICAgICAgICAgICAgICAia2V5VHlwZSI6ICJHUEdLZXlzIiwKICAgICAgICAgICAgICAgICAgICAia2V5UGF0aCI6ICIvdXNyL3NoYXJlL29wZW5zaGlmdC9ncGcta2V5cy9jb250YWluZXJpc3ZzaWduLmFzYyIKICAgICAgICAgICAgICAgIH0KICAgICAgICAgICAgXSwKICAgICAgICAgICAgInJlZ2lzdHJ5LnJlZGhhdC5pby9yZWRoYXQvcmVkaGF0LW1hcmtldHBsYWNlLWluZGV4IjogWwogICAgICAgICAgICAgICAgewogICAgICAgICAgICAgICAgICAgICJ0eXBlIjogInNpZ25lZEJ5IiwKICAgICAgICAgICAgICAgICAgICAia2V5VHlwZSI6ICJHUEdLZXlzIiwKICAgICAgICAgICAgICAgICAgICAia2V5UGF0aCI6ICIvdXNyL3NoYXJlL29wZW5zaGlmdC9ncGcta2V5cy9jb250YWluZXJpc3ZzaWduLmFzYyIKICAgICAgICAgICAgICAgIH0KICAgICAgICAgICAgXSwKICAgICAgICAgICAgInJlZ2lzdHJ5LnJlZGhhdC5pbyI6IFsKICAgICAgICAgICAgICAgIHsKICAgICAgICAgICAgICAgICAgICAidHlwZSI6ICJzaWduZWRCeSIsCiAgICAgICAgICAgICAgICAgICAgImtleVR5cGUiOiAiR1BHS2V5cyIsCiAgICAgICAgICAgICAgICAgICAgImtleVBhdGgiOiAiL2V0Yy9wa2kvcnBtLWdwZy9SUE0tR1BHLUtFWS1yZWRoYXQtcmVsZWFzZSIKICAgICAgICAgICAgICAgIH0KICAgICAgICAgICAgXQogICAgICAgIH0KICAgIH0KCn0K
        filesystem: root
        mode: 420
        path: /etc/containers/policy.json
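
The `source` value above is a data URL whose payload is the base64-encoded policy.json, and `mode: 420` is the decimal spelling of octal 0644. A minimal Python sketch of building and decoding such a value follows. The shortened sample policy and the helper names `encode_source` and `decode_source` are assumptions for illustration only; they are not part of the MachineConfig in this comment.

```python
import base64
import json

PREFIX = "data:text/plain;charset=utf-8;base64,"

def encode_source(text: str) -> str:
    """Build a base64 data URL from file contents (hypothetical helper)."""
    return PREFIX + base64.b64encode(text.encode("utf-8")).decode("ascii")

def decode_source(source: str) -> str:
    """Recover the file contents from a base64 data URL (hypothetical helper)."""
    header, _, payload = source.partition(",")
    if not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    return base64.b64decode(payload).decode("utf-8")

# Shortened sample policy, used only to demonstrate the round trip.
sample_policy = json.dumps({"default": [{"type": "insecureAcceptAnything"}]}, indent=4)

source = encode_source(sample_policy)
decoded = json.loads(decode_source(source))

# The mode is written in decimal in the MachineConfig; 420 decimal is 0o644.
file_mode = oct(420)
```

Passing the real `source` string to `decode_source` should yield the same JSON that `cat /etc/containers/policy.json` prints on the node in step 2.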

[cloud-user@preserve-olm-env jian]$ oc create -f machineconfig_update.yaml 
machineconfig.machineconfiguration.openshift.io/99-zzz-worker-registry-trust created

[cloud-user@preserve-olm-env jian]$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h6m
00-worker                                          2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h6m
01-master-container-runtime                        2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h6m
01-master-kubelet                                  2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h6m
01-worker-container-runtime                        2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h6m
01-worker-kubelet                                  2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h6m
99-master-generated-registries                     2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h6m
99-master-ssh                                                                                 3.2.0             4h9m
99-worker-generated-registries                     2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h6m
99-worker-ssh                                                                                 3.2.0             4h9m
99-zzz-worker-registry-trust                                                                  3.1.0             126m
rendered-master-ade75d770442fd1258ee537395df5b35   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             3h41m
rendered-master-b10059e938cd827d5755ad9e27f967d4   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h6m
rendered-worker-168fcbfa9ad94a120f7e8059fdc776df   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             125m
rendered-worker-56a2bd19ad9b6f7d406e1bf8f397558e   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             3h41m
rendered-worker-7826ab548d5fa3264a9c2304564dcc65   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h6m
[cloud-user@preserve-olm-env jian]$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-ade75d770442fd1258ee537395df5b35   True      False      False      3              3                   3                     0                      4h8m
worker   rendered-worker-168fcbfa9ad94a120f7e8059fdc776df   True      False      False      3              3                   3                     0                      4h8m

2, Check that the policy was written to a worker node
[cloud-user@preserve-olm-env jian]$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-137-191.us-east-2.compute.internal   Ready    worker   3h57m   v1.22.0-rc.0+75ee307
ip-10-0-145-105.us-east-2.compute.internal   Ready    master   4h1m    v1.22.0-rc.0+75ee307
ip-10-0-172-210.us-east-2.compute.internal   Ready    worker   3h56m   v1.22.0-rc.0+75ee307
ip-10-0-173-43.us-east-2.compute.internal    Ready    master   4h3m    v1.22.0-rc.0+75ee307
ip-10-0-201-188.us-east-2.compute.internal   Ready    master   4h1m    v1.22.0-rc.0+75ee307
ip-10-0-213-186.us-east-2.compute.internal   Ready    worker   3h56m   v1.22.0-rc.0+75ee307
[cloud-user@preserve-olm-env jian]$ oc debug node/ip-10-0-137-191.us-east-2.compute.internal
Starting pod/ip-10-0-137-191us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.137.191
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# 
sh-4.4# 
sh-4.4# cat /etc/containers/policy.json
{

    "default": [
        {
            "type": "insecureAcceptAnything"
        }
    ],
    "transports": {
        "docker-daemon": {
            "": [
                {
                    "type": "insecureAcceptAnything"
                }
            ]
        },
        "docker": {
            "registry.redhat.io/redhat/certified-operator-index": [
                {
                    "type": "signedBy",
                    "keyType": "GPGKeys",
                    "keyPath": "/usr/share/openshift/gpg-keys/containerisvsign.asc"
                }
            ],
            "registry.redhat.io/redhat/community-operator-index": [
                {
                    "type": "signedBy",
                    "keyType": "GPGKeys",
                    "keyPath": "/usr/share/openshift/gpg-keys/containerisvsign.asc"
                }
            ],
            "registry.redhat.io/redhat/redhat-marketplace-index": [
                {
                    "type": "signedBy",
                    "keyType": "GPGKeys",
                    "keyPath": "/usr/share/openshift/gpg-keys/containerisvsign.asc"
                }
            ],
            "registry.redhat.io": [
                {
                    "type": "signedBy",
                    "keyType": "GPGKeys",
                    "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"
                }
            ]
        }
    }

}
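
As far as I understand containers-policy.json(5), the most specific matching scope under a transport applies, and the top-level `default` is used when nothing matches. The sketch below is a simplified longest-prefix lookup over the policy shown above. It is not the containers/image implementation, which also considers tagged references and wildcard scopes. The image tags and the `quay.io/example/app` name are hypothetical.

```python
import json

# Condensed copy of the policy.json printed above.
POLICY = json.loads("""
{
  "default": [{"type": "insecureAcceptAnything"}],
  "transports": {
    "docker-daemon": {"": [{"type": "insecureAcceptAnything"}]},
    "docker": {
      "registry.redhat.io/redhat/certified-operator-index": [
        {"type": "signedBy", "keyType": "GPGKeys",
         "keyPath": "/usr/share/openshift/gpg-keys/containerisvsign.asc"}],
      "registry.redhat.io/redhat/community-operator-index": [
        {"type": "signedBy", "keyType": "GPGKeys",
         "keyPath": "/usr/share/openshift/gpg-keys/containerisvsign.asc"}],
      "registry.redhat.io/redhat/redhat-marketplace-index": [
        {"type": "signedBy", "keyType": "GPGKeys",
         "keyPath": "/usr/share/openshift/gpg-keys/containerisvsign.asc"}],
      "registry.redhat.io": [
        {"type": "signedBy", "keyType": "GPGKeys",
         "keyPath": "/etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release"}]
    }
  }
}
""")

def requirements_for(policy, image, transport="docker"):
    """Return (scope, requirements) for an image using a simplified lookup."""
    # Drop a digest or tag so only the repository name remains.
    repo = image.split("@", 1)[0]
    if ":" in repo.rsplit("/", 1)[-1]:
        repo = repo.rsplit(":", 1)[0]
    scopes = policy.get("transports", {}).get(transport, {})
    candidate = repo
    while candidate:
        if candidate in scopes:
            return candidate, scopes[candidate]
        if "/" not in candidate:
            break
        # Strip the last path component and try the parent scope.
        candidate = candidate.rsplit("/", 1)[0]
    if "" in scopes:
        return "", scopes[""]
    return "default", policy["default"]

# Tags and the quay.io image name below are made up for illustration.
index_scope, index_reqs = requirements_for(
    POLICY, "registry.redhat.io/redhat/certified-operator-index:v4.9")
host_scope, host_reqs = requirements_for(
    POLICY, "registry.redhat.io/redhat/redhat-operator-index:v4.9")
other_scope, other_reqs = requirements_for(POLICY, "quay.io/example/app:latest")
```

Under this lookup the three index repositories listed explicitly are checked against containerisvsign.asc, any other registry.redhat.io image falls back to the host-level scope and the RPM-GPG-KEY-redhat-release key, and images from other registries are accepted by the default rule.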

3, After a while, check the pods in the openshift-marketplace project.

[cloud-user@preserve-olm-env jian]$ oc get pods -n openshift-marketplace
NAME                                    READY   STATUS              RESTARTS   AGE
certified-operators-5r7zd               0/1     Running             0          4s
certified-operators-s5g5n               1/1     Running             0          117m
community-operators-td4nq               1/1     Running             0          117m
marketplace-operator-6df6565cd6-ffphw   1/1     Running             0          4h3m
qe-app-registry-wmc4c                   1/1     Running             0          117m
qe-app-registry-zxwd4                   0/1     Running             0          4s
redhat-marketplace-lpc9b                1/1     Running             0          117m
redhat-operators-2fn2g                  0/1     ContainerCreating   0          4s
redhat-operators-95cnq                  1/1     Running             0          117m

[cloud-user@preserve-olm-env jian]$ oc get pods -n openshift-marketplace
NAME                                    READY   STATUS    RESTARTS   AGE
certified-operators-s5g5n               1/1     Running   0          126m
community-operators-td4nq               1/1     Running   0          126m
marketplace-operator-6df6565cd6-ffphw   1/1     Running   0          4h12m
qe-app-registry-wmc4c                   1/1     Running   0          126m
redhat-marketplace-lpc9b                1/1     Running   0          126m
redhat-operators-95cnq                  1/1     Running   0          126m

No crashes. LGTM, marking this bug as verified.
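
The pod check in step 3 can also be scripted. The following Python sketch parses saved `oc get pods` output and reports any pod that is not fully ready. The rows are a subset copied from the two listings above; the function names and the readiness criterion (STATUS is Running and READY is n/n) are assumptions of this sketch.

```python
DURING_ROLLOUT = """\
NAME                                    READY   STATUS              RESTARTS   AGE
certified-operators-5r7zd               0/1     Running             0          4s
certified-operators-s5g5n               1/1     Running             0          117m
qe-app-registry-zxwd4                   0/1     Running             0          4s
redhat-operators-2fn2g                  0/1     ContainerCreating   0          4s
"""

AFTER_ROLLOUT = """\
NAME                                    READY   STATUS    RESTARTS   AGE
certified-operators-s5g5n               1/1     Running   0          126m
community-operators-td4nq               1/1     Running   0          126m
marketplace-operator-6df6565cd6-ffphw   1/1     Running   0          4h12m
redhat-operators-95cnq                  1/1     Running   0          126m
"""

def parse_pods(text):
    """Turn whitespace-aligned `oc get pods` output into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]

def not_ready(pods):
    """Return names of pods that are not Running with all containers ready."""
    bad = []
    for pod in pods:
        ready, total = pod["READY"].split("/")
        if pod["STATUS"] != "Running" or ready != total:
            bad.append(pod["NAME"])
    return bad

pending = not_ready(parse_pods(DURING_ROLLOUT))
settled = not_ready(parse_pods(AFTER_ROLLOUT))
```

An empty `settled` list corresponds to the second listing, where every catalog pod is 1/1 Running and nothing is in ImagePullBackOff.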

Comment 60 errata-xmlrpc 2021-10-18 17:28:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

