Bug 1840385
Summary: MCD fails when adding replicas to MachineSet and error isn't visible enough

| Field | Value | Field | Value |
|---|---|---|---|
| Product | OpenShift Container Platform | Reporter | Naveen Malik <nmalik> |
| Component | Machine Config Operator | Assignee | Antonio Murdaca <amurdaca> |
| Status | CLOSED ERRATA | QA Contact | Michael Nguyen <mnguyen> |
| Severity | high | Docs Contact | |
| Priority | high | | |
| Version | 4.3.z | CC | agarcial, amurdaca, cattias, jaharrin, jbeakley, kgarriso, pbergene, tparikh |
| Target Milestone | --- | Keywords | ServiceDeliveryImpact |
| Target Release | 4.6.0 | | |
| Hardware | Unspecified | | |
| OS | Unspecified | | |
| Fixed In Version | | Doc Type | If docs needed, set a value |
| Doc Text | | Story Points | --- |
| Last Closed | 2020-10-27 16:01:40 UTC | Type | Bug |
Description
Naveen Malik
2020-05-26 20:29:54 UTC
Machines are stuck on "Provisioned", which means the infrastructure was created fine. The issue is either that the nodes are not able to authenticate themselves during bootstrap, or that something is wrong with the machine-config being passed. Can we please get the logs for the cluster-machine-approver pod and the output of `kubectl get csr`?

Unfortunately cluster-machine-approver is missing from must-gather. Attaching logs.

```
$ oc get csr
No resources found.
```

Created attachment 1693033 [details]
machine-approver-controller container logs
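
As an aside, the `oc get csr` check above can also be done programmatically. Below is a minimal client-go sketch of the same check, assuming kubeconfig access from the local machine; it is illustrative only and not part of the MCO or the machine-approver.

```go
// List CertificateSigningRequests and print their approval state,
// i.e. the same information as `oc get csr` above.
package main

import (
	"context"
	"fmt"
	"log"
	"path/filepath"

	certificatesv1 "k8s.io/api/certificates/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/homedir"
)

func main() {
	kubeconfig := filepath.Join(homedir.HomeDir(), ".kube", "config")
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatal(err)
	}

	csrs, err := client.CertificatesV1().CertificateSigningRequests().List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}
	if len(csrs.Items) == 0 {
		fmt.Println("No resources found.")
		return
	}
	for _, csr := range csrs.Items {
		// A CSR with no Approved/Denied condition is still pending,
		// which is what blocks a new machine from becoming a node.
		state := "Pending"
		for _, cond := range csr.Status.Conditions {
			switch cond.Type {
			case certificatesv1.CertificateApproved:
				state = "Approved"
			case certificatesv1.CertificateDenied:
				state = "Denied"
			}
		}
		fmt.Printf("%s\t%s\t%s\n", csr.Name, csr.Spec.SignerName, state)
	}
}
```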
All seems OK from the machine API and infrastructure point of view. Machines report "Provisioned" appropriately and there aren't even any pending CSRs. For some reason the software running within the instances has failed, which has prevented the instances from running a kubelet, requesting to authenticate themselves, and becoming nodes. From one of the machine-config-daemons:

```
ocp-v4.0-art-dev@sha256:b920c6ed07fa42565949ce61f14b3b9dd7bea64d2cfb13539fcea7068ea799f2
2020-05-19T13:06:46.445346629Z Error: error pulling image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b920c6ed07fa42565949ce61f14b3b9dd7bea64d2cfb13539fcea7068ea799f2": unable to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b920c6ed07fa42565949ce61f14b3b9dd7bea64d2cfb13539fcea7068ea799f2: unable to pull image: Error initializing source docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b920c6ed07fa42565949ce61f14b3b9dd7bea64d2cfb13539fcea7068ea799f2: error pinging docker registry quay.io: Get https://quay.io/v2/: EOF
2020-05-19T13:06:46.445346629Z W0519 13:06:46.222937 2996778 run.go:40] podman failed: exit status 125; retrying...
2020-05-19T13:08:06.702105132Z I0519 13:08:06.223106 2996778 run.go:16] Running: podman pull -q --authfile /var/lib/kubelet/config.json quay.io/openshift-release-dev/
```

Would be good to surface this up to the Machine CR; good to know we have some indicator at least!

This happened while quay.io was having problems last week, May 19, so it's not surprising this is the issue. I will try to kick one of the stuck machines by deleting it and report findings.

New Machine came online with its EC2 instance and Node just fine. Perhaps we re-state this BZ as "the pull error on node provisioning is not surfaced to the Machine CR"?

(In reply to Naveen Malik from comment #8)
> New Machine came online with EC2 instance and Node just fine. Perhaps we
> re-state this BZ as the pull error on node provisioning is not surfaced to
> the Machine CR?

Given that, and the fact that the pull error is shadowing the real reason behind this BZ anyway and the domain is still MAO, I'm bringing this back to MAO for further triage, as the MCD should just pull that image once quay works.

> given that and the fact that the pull error is shadowing the real reason behind this BZ anyway and the domain is still MAO, I'm bringing this back to MAO for further triage as MCD should just pull that image once quay works
I'm not sure what you mean. The pull error is not shadowing the real reason; there is no other real reason than the pull error. There's no further triage needed here. This happened because quay was suffering an outage, and so the kubelet couldn't join the cluster. That's the root cause.

This highlights a product shortfall in signalling this failure in a meaningful manner. The operational knowledge for managing this config resides in the MCO. I'd suggest this component explore ways to put semantics, where it might make sense, on top of possible failures like this one and bubble those up to where they are more visible to users/consumers.
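
For illustration, here is a minimal sketch of the kind of bubbling-up being suggested. The `Condition` struct and `nodeDegraded` helper are hypothetical stand-ins, not the MCO's actual types; the point is only that the underlying failure text travels up into the condition message instead of being collapsed into a bare exit status.

```go
// Sketch: carry the real error, verbatim, into a degraded status condition.
package main

import (
	"errors"
	"fmt"
)

// Condition is a stand-in for a MachineConfigPool/ClusterOperator status condition.
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// nodeDegraded builds the condition from whatever error the daemon hit, keeping
// the full error chain in the message so `oc get mcp -o yaml` shows the cause.
func nodeDegraded(node string, err error) Condition {
	return Condition{
		Type:    "NodeDegraded",
		Status:  "True",
		Reason:  "1 nodes are reporting degraded status on sync",
		Message: fmt.Sprintf("Node %s is reporting: %q", node, err.Error()),
	}
}

func main() {
	// Pretend this is the low-level failure coming out of the OS update path.
	pullErr := errors.New("error pinging docker registry quay.io: Get https://quay.io/v2/: EOF")
	// Wrapping (rather than replacing) the error preserves the cause end to end.
	osUpdateErr := fmt.Errorf("failed to run pivot: %w", pullErr)

	cond := nodeDegraded("ip-10-0-213-191.us-west-2.compute.internal", osUpdateErr)
	fmt.Printf("%s=%s: %s\n", cond.Type, cond.Status, cond.Message)
}
```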
(In reply to Alberto from comment #11)
> > given that and the fact that the pull error is shadowing the real reason behind this BZ anyway and the domain is still MAO, I'm bringing this back to MAO for further triage as MCD should just pull that image once quay works
>
> I'm not sure what you mean. The pull error is not shadowing the real reason;
> there is no other real reason than the pull error. There's no further triage
> needed here. This happened because quay was suffering an outage, and so the
> kubelet couldn't join the cluster. That's the root cause.
>
> This highlights a product shortfall in signalling this failure in a
> meaningful manner. The operational knowledge for managing this config
> resides in the MCO. I'd suggest this component explore ways to put
> semantics, where it might make sense, on top of possible failures like this
> one and bubble those up to where they are more visible to users/consumers.

Right, the MCO's cluster object should report this up to the CO itself. I'll move this back to try and better assess how we communicate the error up the stack, but since it was caused by the quay.io outage and is not a regression, I'm moving it to 4.6 as well. In other words, this error would have happened exactly the same way in any UPI environment or similar where the machine API is not involved, and it would have been even more hidden.
>Right, the MCO's cluster object should report this up to the CO itself, I'll move it back to try and assess better how we communicate the error up the stack but since it was because of the quay.io outage and not a regression, I'm moving this to 4.6 as well.
Thanks, that makes sense to me. We need to work together across all components to make the product better, exploring ways to make these failures more visible and signalling them bottom-up from the MCO.
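
As background on the log lines quoted earlier ("podman failed: exit status 125; retrying...") and the "(6 tries): timed out waiting for the condition" text that shows up in the verification output below: the daemon retries the image pull a bounded number of times and only then fails. A minimal sketch of that pattern follows, assuming hypothetical helper names; the real code lives in the MCD (run.go) and may differ in structure. The key design point is that the final timeout error wraps the last podman failure, so what gets reported upward still names the real cause.

```go
// Sketch: bounded retry around `podman pull` that preserves the last failure.
package main

import (
	"fmt"
	"os/exec"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// pullWithRetries keeps retrying `podman pull` and, if it never succeeds,
// returns the timeout error wrapped around the last podman failure so the
// caller (and ultimately the pool condition) sees why the pull failed.
func pullWithRetries(image string, tries int) error {
	var lastErr error
	attempt := 0
	interval := 10 * time.Second
	err := wait.Poll(interval, time.Duration(tries)*interval, func() (bool, error) {
		attempt++
		out, err := exec.Command("podman", "pull", "-q", image).CombinedOutput()
		if err != nil {
			lastErr = fmt.Errorf("running podman pull -q %s failed: %s: %w", image, string(out), err)
			fmt.Printf("podman failed: %v; retrying...\n", err)
			return false, nil // not done yet, keep polling
		}
		return true, nil
	})
	if err != nil {
		// err is wait.ErrWaitTimeout, i.e. "timed out waiting for the condition".
		return fmt.Errorf("failed to run command podman (%d tries): %v: %v", attempt, err, lastErr)
	}
	return nil
}

func main() {
	// The digest below is a placeholder, not a real release image.
	image := "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0000000000000000000000000000000000000000000000000000000000000000"
	if err := pullWithRetries(image, 6); err != nil {
		fmt.Println(err)
	}
}
```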
Adding UpcomingSprint as this won't make the current sprint. We'll try to work on this bug in the next sprint.

The cluster from the must-gather reconciled eventually, and we can see from the logs that the pull worked (quay worked at some point), so the worker pool reconciled successfully and cleared any previous error that we _were reporting_. Unfortunately, what we report to the MCP and the MCO Cluster object is just:

```
failed to run pivot: failed to start machine-config-daemon-host.service: exit status 1
```

which doesn't help us understand that "pulling from quay.io" was the issue. I've attached a PR that aims at bubbling up any error we encounter during an OS update; that should take care of reporting the correct error. I want to just stress the obvious: we clear the error reporting in the MCO if the reconcile was successful (as in the must-gather), but I think we all agree on that behaviour.

RED HAT CONFIDENTIAL

Verified on 4.6.0-0.nightly-2020-07-07-083718

```
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-07-07-083718   True        False         5h28m   Cluster version is 4.6.0-0.nightly-2020-07-07-083718
```

1. Modify the osimageurl to have a different sha256

```
$ oc -n openshift-machine-config-operator edit cm/machine-config-osimageurl
```

2. Wait for errors to occur during the upgrade and watch the MCD log

```
$ oc -n openshift-machine-config-operator logs machine-config-daemon-26hck -c machine-config-daemon
...snip...
I0707 20:17:42.354418  221218 run.go:18] Running: podman pull -q --authfile /var/lib/kubelet/config.json quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4
Error: error pulling image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4": unable to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4: unable to pull image: Error initializing source docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4: Error reading manifest sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4 in quay.io/openshift-release-dev/ocp-v4.0-art-dev: manifest unknown: manifest unknown
W0707 20:17:43.479307  221218 run.go:44] podman failed: running podman pull -q --authfile /var/lib/kubelet/config.json quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4 failed: Error: error pulling image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4": unable to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4: unable to pull image: Error initializing source docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4: Error reading manifest sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4 in quay.io/openshift-release-dev/ocp-v4.0-art-dev: manifest unknown: manifest unknown : exit status 125; retrying...
```
3. Verify the error is rolled up to the Machine Config Pool

```
$ oc get mcp/worker -o yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  creationTimestamp: "2020-07-07T14:31:10Z"
  generation: 7
  labels:
    machineconfiguration.openshift.io/mco-built-in: ""
  managedFields:
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:machineconfiguration.openshift.io/mco-built-in: {}
      f:spec:
        .: {}
        f:configuration: {}
        f:machineConfigSelector:
          .: {}
          f:matchLabels:
            .: {}
            f:machineconfiguration.openshift.io/role: {}
        f:nodeSelector:
          .: {}
          f:matchLabels:
            .: {}
            f:node-role.kubernetes.io/worker: {}
        f:paused: {}
    manager: machine-config-operator
    operation: Update
    time: "2020-07-07T14:31:10Z"
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:configuration:
          f:name: {}
          f:source: {}
      f:status:
        .: {}
        f:conditions: {}
        f:configuration:
          .: {}
          f:name: {}
          f:source: {}
        f:degradedMachineCount: {}
        f:machineCount: {}
        f:observedGeneration: {}
        f:readyMachineCount: {}
        f:unavailableMachineCount: {}
        f:updatedMachineCount: {}
    manager: machine-config-controller
    operation: Update
    time: "2020-07-07T20:09:29Z"
  name: worker
  resourceVersion: "145806"
  selfLink: /apis/machineconfiguration.openshift.io/v1/machineconfigpools/worker
  uid: 28a56f99-651b-4d2d-baed-bd1a3f29071d
spec:
  configuration:
    name: rendered-worker-8e7f1827931649f08fb55d3ce0c698e0
    source:
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 00-worker
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 01-worker-container-runtime
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 01-worker-kubelet
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-worker-generated-registries
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-worker-ssh
  machineConfigSelector:
    matchLabels:
      machineconfiguration.openshift.io/role: worker
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker: ""
  paused: false
status:
  conditions:
  - lastTransitionTime: "2020-07-07T14:31:58Z"
    message: ""
    reason: ""
    status: "False"
    type: RenderDegraded
  - lastTransitionTime: "2020-07-07T20:05:49Z"
    message: ""
    reason: ""
    status: "False"
    type: Updated
  - lastTransitionTime: "2020-07-07T20:05:49Z"
    message: All nodes are updating to rendered-worker-8e7f1827931649f08fb55d3ce0c698e0
    reason: ""
    status: "True"
    type: Updating
  - lastTransitionTime: "2020-07-07T20:09:29Z"
    message: 'Node ip-10-0-213-191.us-west-2.compute.internal is reporting: "failed to run pivot: failed to run pivot: exit status 1: failed to run command podman (6 tries): timed out waiting for the condition: running podman pull -q --authfile /var/lib/kubelet/config.json quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4 failed: Error: error pulling image \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4\": unable to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4: unable to pull image: Error initializing source docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4: Error reading manifest sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4 in quay.io/openshift-release-dev/ocp-v4.0-art-dev: manifest unknown: manifest unknown\n: exit status 125"'
    reason: 1 nodes are reporting degraded status on sync
    status: "True"
    type: NodeDegraded
  - lastTransitionTime: "2020-07-07T20:09:29Z"
    message: ""
    reason: ""
    status: "True"
    type: Degraded
  configuration:
    name: rendered-worker-f2257ee53059916ea51a9652971094c5
    source:
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 00-worker
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 01-worker-container-runtime
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 01-worker-kubelet
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-worker-generated-registries
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-worker-ssh
  degradedMachineCount: 1
  machineCount: 3
  observedGeneration: 7
  readyMachineCount: 0
  unavailableMachineCount: 1
  updatedMachineCount: 0
```

4. Verified the error is rolled up to the clusteroperator

```
$ oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h43m
cloud-credential                           4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h53m
cluster-autoscaler                         4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h41m
config-operator                            4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h48m
console                                    4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h33m
csi-snapshot-controller                    4.6.0-0.nightly-2020-07-07-083718   True        False         False      13m
dns                                        4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h47m
etcd                                       4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h45m
image-registry                             4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h36m
ingress                                    4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h36m
insights                                   4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h41m
kube-apiserver                             4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h44m
kube-controller-manager                    4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h44m
kube-scheduler                             4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h46m
kube-storage-version-migrator              4.6.0-0.nightly-2020-07-07-083718   True        False         False      13m
machine-api                                4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h38m
machine-approver                           4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h44m
machine-config                             4.6.0-0.nightly-2020-07-07-083718   False       False         True       3m59s
marketplace                                4.6.0-0.nightly-2020-07-07-083718   True        False         False      12m
monitoring                                 4.6.0-0.nightly-2020-07-07-083718   True        False         False      149m
network                                    4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h48m
node-tuning                                4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h48m
openshift-apiserver                        4.6.0-0.nightly-2020-07-07-083718   True        False         True       5h43m
openshift-controller-manager               4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h42m
openshift-samples                          4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h41m
operator-lifecycle-manager                 4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h47m
operator-lifecycle-manager-catalog         4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h47m
operator-lifecycle-manager-packageserver   4.6.0-0.nightly-2020-07-07-083718   True        False         False      13m
service-ca                                 4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h48m
storage                                    4.6.0-0.nightly-2020-07-07-083718   True        False         False      5h42m
```

```
$ oc describe co/machine-config
Name:         machine-config
Namespace:
Labels:       <none>
Annotations:  exclude.release.openshift.io/internal-openshift-hosted: true
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2020-07-07T14:25:54Z
  Generation:          1
  Managed Fields:
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:exclude.release.openshift.io/internal-openshift-hosted:
      f:spec:
      f:status:
    Manager:      cluster-version-operator
    Operation:    Update
    Time:         2020-07-07T14:25:54Z
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
        f:extension:
          .:
          f:lastSyncError:
          f:master:
          f:worker:
        f:relatedObjects:
        f:versions:
    Manager:         machine-config-operator
    Operation:       Update
    Time:            2020-07-07T20:16:02Z
  Resource Version:  148084
  Self Link:         /apis/config.openshift.io/v1/clusteroperators/machine-config
  UID:               6fde0884-7b00-4d7a-8a5e-5d36915d1aed
Spec:
Status:
  Conditions:
    Last Transition Time:  2020-07-07T14:32:25Z
    Message:               Cluster version is 4.6.0-0.nightly-2020-07-07-083718
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2020-07-07T20:15:46Z
    Message:               Failed to resync 4.6.0-0.nightly-2020-07-07-083718 because: timed out waiting for the condition during syncRequiredMachineConfigPools: error pool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 0, updated: 0, unavailable: 1)
    Reason:                RequiredPoolsFailed
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2020-07-07T20:15:46Z
    Message:               Cluster not available for 4.6.0-0.nightly-2020-07-07-083718
    Status:                False
    Type:                  Available
    Last Transition Time:  2020-07-07T14:32:24Z
    Reason:                AsExpected
    Status:                True
    Type:                  Upgradeable
  Extension:
    Last Sync Error:  error pool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 2, updated: 2, unavailable: 1)
    Master:           pool is degraded because nodes fail with "1 nodes are reporting degraded status on sync": "Node ip-10-0-211-227.us-west-2.compute.internal is reporting: \"failed to run pivot: failed to run pivot: exit status 1: failed to run command podman (6 tries): timed out waiting for the condition: running podman pull -q --authfile /var/lib/kubelet/config.json quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4 failed: Error: error pulling image \\\"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4\\\": unable to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4: unable to pull image: Error initializing source docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4: Error reading manifest sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4 in quay.io/openshift-release-dev/ocp-v4.0-art-dev: manifest unknown: manifest unknown\\n: exit status 125\""
    Worker:           pool is degraded because nodes fail with "1 nodes are reporting degraded status on sync": "Node ip-10-0-213-191.us-west-2.compute.internal is reporting: \"failed to run pivot: failed to run pivot: exit status 1: failed to run command podman (6 tries): timed out waiting for the condition: running podman pull -q --authfile /var/lib/kubelet/config.json quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4 failed: Error: error pulling image \\\"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4\\\": unable to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4: unable to pull image: Error initializing source docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4: Error reading manifest sha256:d36a8e91d2affa7999b0da6d30e01360f9727b7d2cbfd250d0f7d5f385314ca4 in quay.io/openshift-release-dev/ocp-v4.0-art-dev: manifest unknown: manifest unknown\\n: exit status 125\""
  Related Objects:
    Group:
    Name:      openshift-machine-config-operator
    Resource:  namespaces
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  machineconfigpools
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  controllerconfigs
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  kubeletconfigs
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  containerruntimeconfigs
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  machineconfigs
  Versions:
    Name:     operator
    Version:  4.6.0-0.nightly-2020-07-07-083718
Events:  <none>
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196