Created attachment 1699473 [details]
before update - operators 4.5-rc.2

Description of problem:
--------------------------
Tried to update from 4.5.0-rc.2 to 4.5.0-rc.4 in a disconnected env without using the force flag. When the process reached 84%, it failed with this error:

[kni@provisionhost-0-0 ~]$ oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-rc.2   True        True          145m    Unable to apply 4.5.0-rc.4: the cluster operator openshift-apiserver is degraded

Also, some operators degraded after the process failed:

[kni@provisionhost-0-0 ~]$ oc get co
NAME                                       VERSION      AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.5.0-rc.4   True        False         False      17h
cloud-credential                           4.5.0-rc.4   True        False         False      18h
cluster-autoscaler                         4.5.0-rc.4   True        False         False      17h
config-operator                            4.5.0-rc.4   True        False         False      17h
console                                    4.5.0-rc.4   True        False         False      112m
csi-snapshot-controller                    4.5.0-rc.4   True        False         False      112m
dns                                        4.5.0-rc.4   True        True          False      17h
etcd                                       4.5.0-rc.4   True        False         False      17h
image-registry                             4.5.0-rc.4   True        False         False      113m
ingress                                    4.5.0-rc.4   True        False         False      17h
insights                                   4.5.0-rc.4   True        False         False      17h
kube-apiserver                             4.5.0-rc.4   True        False         False      17h
kube-controller-manager                    4.5.0-rc.4   True        False         False      17h
kube-scheduler                             4.5.0-rc.4   True        False         False      17h
kube-storage-version-migrator              4.5.0-rc.4   True        False         False      113m
machine-api                                4.5.0-rc.4   True        False         False      17h
machine-approver                           4.5.0-rc.4   True        False         False      17h
machine-config                             4.5.0-rc.2   False       True          True       100m
marketplace                                4.5.0-rc.4   True        False         False      111m
monitoring                                 4.5.0-rc.4   True        False         False      17h
network                                    4.5.0-rc.4   True        True          True       17h
node-tuning                                4.5.0-rc.4   True        False         False      139m
openshift-apiserver                        4.5.0-rc.4   True        False         True       0s
openshift-controller-manager               4.5.0-rc.4   True        False         False      17h
openshift-samples                          4.5.0-rc.4   True        False         False      130m
operator-lifecycle-manager                 4.5.0-rc.4   True        False         False      17h
operator-lifecycle-manager-catalog         4.5.0-rc.4   True        False         False      17h
operator-lifecycle-manager-packageserver   4.5.0-rc.4   True        False         False      93m
service-ca                                 4.5.0-rc.4   True        False         False      17h
storage                                    4.5.0-rc.4   True        False         False      139m

In addition, when I tried to update again it shows:

[kni@provisionhost-0-0 ~]$ oc adm upgrade --to 4.5.0-rc.4
info: Cluster is already at version 4.5.0-rc.4

although the cluster is still at version 4.5.0-rc.2:

[kni@provisionhost-0-0 ~]$ oc describe clusterversion
Name:         version
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterVersion
Metadata:
  Creation Timestamp:  2020-06-30T14:19:30Z
  Generation:          6
  Managed Fields:
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    ......
    ............
    .................
Spec:
  Channel:     candidate-4.5
  Cluster ID:  0dce70b4-e916-43b2-979d-ab264cd2a1bc
  Desired Update:
    Force:    false
    Image:    quay.io/openshift-release-dev/ocp-release@sha256:acdef4d62b87c5a1e256084c55b5cfaae5ca42b7f2c49b69913a509b8954c798
    Version:  4.5.0-rc.4
  Upstream:   http://registry.ocp-edge-cluster-0.qe.lab.redhat.com:8080/images/update_graph
Status:
  Available Updates:  <nil>
  Conditions:
    Last Transition Time:  2020-06-30T15:10:03Z
    Message:               Done applying 4.5.0-rc.2
    Status:                True
    Type:                  Available
    Last Transition Time:  2020-07-01T06:41:55Z
    Message:               Cluster operator openshift-apiserver is reporting a failure: APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver
    Reason:                ClusterOperatorDegraded
    Status:                True
    Type:                  Failing
    Last Transition Time:  2020-07-01T06:03:55Z
    Message:               Unable to apply 4.5.0-rc.4: the cluster operator openshift-apiserver is degraded
    Reason:                ClusterOperatorDegraded
    Status:                True
    Type:                  Progressing
    Last Transition Time:  2020-07-01T05:57:10Z
    Status:                True
    Type:                  RetrievedUpdates
  Desired:
    Force:    false
    Image:    quay.io/openshift-release-dev/ocp-release@sha256:acdef4d62b87c5a1e256084c55b5cfaae5ca42b7f2c49b69913a509b8954c798
    Version:  4.5.0-rc.4
  History:
    Completion Time:  <nil>
    Image:            quay.io/openshift-release-dev/ocp-release@sha256:acdef4d62b87c5a1e256084c55b5cfaae5ca42b7f2c49b69913a509b8954c798
    Started Time:     2020-07-01T06:03:55Z
    State:            Partial
    Verified:         true
    Version:          4.5.0-rc.4
    Completion Time:  2020-06-30T15:10:03Z
    Image:            quay.io/openshift-release-dev/ocp-release@sha256:986674e3202ab46c944b02b44c8b836bd0b52372195fa1526cb3d7291579d79a
    Started Time:     2020-06-30T14:19:39Z
    State:            Completed
    Verified:         false
    Version:          4.5.0-rc.2
  Observed Generation:  6
  Version Hash:         -TbXeAYSQ04=
Events:                 <none>

Version-Release number of selected component (if applicable):
---------------------------------------------------------------
Current version:
------------------
quay.io/openshift-release-dev/ocp-release:4.5.0-rc.2-x86_64

Target version:
------------------
quay.io/openshift-release-dev/ocp-release:4.5.0-rc.4-x86_64

How reproducible:
------------------

Steps to Reproduce:
---------------------
1. Run the update on OCP 4.5 without using the force flag (see the example sketch after this comment):
   - Mirror the image for the update using oc adm release mirror
   - Create an ImageContentSourcePolicy
   - Create a custom update graph in /opt/cached_disconnected_images
   - Create a config map
   - Point the CVO to the update graph
   - oc patch clusterversion/version --patch '{"spec": {"channel": "candidate-4.5"}}' --type=merge
   - oc adm upgrade --to 4.5.0-rc.4

Actual results:
----------------------
Cluster failed to update to version 4.5.0-rc.4, not all operators are available, and the cluster is not stable.

Expected results:
----------------------
Cluster updates successfully to version 4.5.0-rc.4, all operators are available, and the cluster is stable.

Additional info:
----------------------
Must-gather: https://drive.google.com/file/d/1hw_ylqz2L7zXuYBzQg2agvQf4FqiFrUj/view?usp=sharing

Images added.
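For reference, a minimal sketch of the disconnected-update wiring described in the steps above. The mirror registry hostname and repository paths are placeholders (not values taken from this environment); only the upstream URL is the one shown in the cluster spec:

$ cat <<'EOF' | oc apply -f -
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: release-mirror
spec:
  repositoryDigestMirrors:
  - mirrors:
    - mirror.example.com:5000/ocp4/openshift-release-dev
    source: quay.io/openshift-release-dev/ocp-release
  - mirrors:
    - mirror.example.com:5000/ocp4/openshift-release-dev
    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF

# Point the CVO at the custom update graph (same --type=merge pattern as the channel patch)
$ oc patch clusterversion/version --type=merge \
    --patch '{"spec": {"upstream": "http://registry.ocp-edge-cluster-0.qe.lab.redhat.com:8080/images/update_graph"}}'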
Created attachment 1699474 [details] openshift apiserver error msg
Created attachment 1699475 [details]
oc describe clusterversion
The MCO has a huge list of failures related to reaching the API:

2020-07-01T06:55:55.585755596Z E0701 06:55:55.585646 4995 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Node: Get https://172.30.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: i/o timeout
2020-07-01T06:56:24.98593869Z I0701 06:56:24.985814 4995 trace.go:116] Trace[919889828]: "Reflector ListAndWatch" name:github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101 (started: 2020-07-01 06:55:54.984097113 +0000 UTC m=+762.181157899) (total time: 30.001675941s):
2020-07-01T06:56:24.98593869Z Trace[919889828]: [30.001675941s] [30.001675941s] END
2020-07-01T06:56:24.98593869Z E0701 06:56:24.985841 4995 reflector.go:178] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: Failed to list *v1.MachineConfig: Get https://172.30.0.1:443/apis/machineconfiguration.openshift.io/v1/machineconfigs?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: i/o timeout
2020-07-01T06:57:01.074614704Z I0701 06:57:01.074458 4995 trace.go:116] Trace[1465987202]: "Reflector ListAndWatch" name:k8s.io/client-go/informers/factory.go:135 (started: 2020-07-01 06:56:31.073558086 +0000 UTC m=+798.270618844) (total time: 30.000861779s):
2020-07-01T06:57:01.074614704Z Trace[1465987202]: [30.000861779s] [30.000861779s] END
2020-07-01T06:57:01.074614704Z E0701 06:57:01.074507 4995 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Node: Get https://172.30.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: i/o timeout
2020-07-01T06:57:51.896917749Z I0701 06:57:51.896831 4995 trace.go:116] Trace[1980435746]: "Reflector ListAndWatch" name:github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101 (started: 2020-07-01 06:57:21.895861641 +0000 UTC m=+849.092922432) (total time: 30.000934291s):
2020-07-01T06:57:51.896917749Z Trace[1980435746]: [30.000934291s] [30.000934291s] END
2020-07-01T06:57:51.897039438Z E0701 06:57:51.897021 4995 reflector.go:178] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: Failed to list *v1.MachineConfig: Get https://172.30.0.1:443/apis/machineconfiguration.openshift.io/v1/machineconfigs?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: i/o timeout
2020-07-01T06:58:30.44373067Z I0701 06:58:30.443647 4995 trace.go:116] Trace[1059014376]: "Reflector ListAndWatch" name:k8s.io/client-go/informers/factory.go:135 (started: 2020-07-01 06:58:00.442651145 +0000 UTC m=+887.639711898) (total time: 30.000952587s):
2020-07-01T06:58:30.44373067Z Trace[1059014376]: [30.000952587s] [30.000952587s] END
2020-07-01T06:58:30.44373067Z E0701 06:58:30.443678 4995 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Node: Get https://172.30.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: i/o timeout
2020-07-01T06:58:54.623398605Z I0701 06:58:54.623263 4995 trace.go:116] Trace[2050729718]: "Reflector ListAndWatch" name:github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101 (started: 2020-07-01 06:58:24.622346502 +0000 UTC m=+911.819407282) (total time: 30.000866922s):
2020-07-01T06:58:54.623398605Z Trace[2050729718]: [30.000866922s] [30.000866922s] END
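A quick, hedged way to check whether the service network (the 172.30.0.1:443 VIP the MCO is timing out against) is reachable from a master at all; the node name here is just an example, not a conclusion from the must-gather:

$ oc debug node/master-0-2 -- chroot /host \
    curl -k --connect-timeout 5 https://172.30.0.1:443/readyz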
Preliminary findings:

1) openshift-apiserver is reporting degraded because not all its pods could be scheduled
2) pods could not be scheduled because not all master nodes are available
3) not all master nodes are available because of issues contacting the k8s apiserver (see MCO errors in comment 3)
4) MCO + Networking are also reporting degraded
5) the k8s apiserver itself is reporting available but degraded:

  conditions:
  - lastTransitionTime: "2020-07-01T09:32:15Z"
    message: |-
      InstallerPodContainerWaitingDegraded: Pod "installer-9-master-0-2" on node "master-0-2" container "installer" is waiting for 13m31.141901586s because ""
      InstallerPodNetworkingDegraded: Pod "installer-9-master-0-2" on node "master-0-2" observed degraded networking: Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_installer-9-master-0-2_openshift-kube-apiserver_e80c4c50-23e5-4332-ba9f-ab705ec3df67_0(b99028b166da8edf9beaa3d25b38251bb5b5b574c0e7a98d31a6d392eb42a054): Multus: [openshift-kube-apiserver/installer-9-master-0-2]: PollImmediate error waiting for ReadinessIndicatorFile: timed out waiting for the condition
    reason: InstallerPodContainerWaiting_ContainerCreating::InstallerPodNetworking_FailedCreatePodSandBox
    status: "True"
    type: Degraded
  - lastTransitionTime: "2020-07-01T09:24:38Z"
    message: 'NodeInstallerProgressing: 3 nodes are at revision 8; 0 nodes have achieved new revision 10'
    reason: NodeInstaller
    status: "True"
    type: Progressing
  - lastTransitionTime: "2020-06-30T14:38:08Z"
    message: 'StaticPodsAvailable: 3 nodes are active; 3 nodes are at revision 8; 0 nodes have achieved new revision 10'
    reason: AsExpected
    status: "True"
    type: Available
  - lastTransitionTime: "2020-06-30T14:36:05Z"
    reason: AsExpected
    status: "True"
    type: Upgradeable
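A rough set of commands to cross-check findings 1-4 above; these are illustrative only, not invocations taken from the must-gather:

$ oc get nodes -l node-role.kubernetes.io/master=
$ oc -n openshift-apiserver get pods -o wide
$ oc get co openshift-apiserver machine-config network -o yaml | grep -B6 'type: Degraded'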
Seeing as rc.5 to rc.6 is updatable, I'm moving this to ON_QA to indicate it's being re-tested. If it works, we can close this one.
Per comment 6, the cause is network. Checked the must-gather (via cm/cluster-config-v1 in namespaces/kube-system/core/configmaps.yaml); it is a baremetal disconnected OVN env. Moving to the Networking component.

BTW, I have already triggered four baremetal disconnected envs, with and without OVN, for reproducing later.
After retesting again, updating from rc.5 to rc.6 without the force flag, this happened:

[kni@provisionhost-0-0 ~]$ oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-rc.5   True        True          11h     Unable to apply 4.5.0-rc.6: the image may not be safe to use

[kni@provisionhost-0-0 ~]$ oc get co
NAME                                       VERSION      AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.5.0-rc.5   True        False         False      13h
cloud-credential                           4.5.0-rc.5   True        False         False      14h
cluster-autoscaler                         4.5.0-rc.5   True        False         False      13h
config-operator                            4.5.0-rc.5   True        False         False      13h
console                                    4.5.0-rc.5   True        False         False      13h
csi-snapshot-controller                    4.5.0-rc.5   True        False         False      13h
dns                                        4.5.0-rc.5   True        False         False      13h
etcd                                       4.5.0-rc.5   True        False         False      13h
image-registry                             4.5.0-rc.5   True        False         False      13h
ingress                                    4.5.0-rc.5   True        False         False      13h
insights                                   4.5.0-rc.5   True        False         False      13h
kube-apiserver                             4.5.0-rc.5   True        False         False      13h
kube-controller-manager                    4.5.0-rc.5   True        False         False      13h
kube-scheduler                             4.5.0-rc.5   True        False         False      13h
kube-storage-version-migrator              4.5.0-rc.5   True        False         False      13h
machine-api                                4.5.0-rc.5   True        False         False      13h
machine-approver                           4.5.0-rc.5   True        False         False      13h
machine-config                             4.5.0-rc.5   True        False         False      13h
marketplace                                4.5.0-rc.5   True        False         False      13h
monitoring                                 4.5.0-rc.5   True        False         False      13h
network                                    4.5.0-rc.5   True        False         False      13h
node-tuning                                4.5.0-rc.5   True        False         False      13h
openshift-apiserver                        4.5.0-rc.5   True        False         False      72m
openshift-controller-manager               4.5.0-rc.5   True        False         False      13h
openshift-samples                          4.5.0-rc.5   True        False         False      13h
operator-lifecycle-manager                 4.5.0-rc.5   True        False         False      13h
operator-lifecycle-manager-catalog         4.5.0-rc.5   True        False         False      13h
operator-lifecycle-manager-packageserver   4.5.0-rc.5   True        False         False      13h
service-ca                                 4.5.0-rc.5   True        False         False      13h
storage                                    4.5.0-rc.5   True        False         False      13h

However, when updating with the force flag, the update succeeded.
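The exact forced invocation isn't shown above; with the custom graph serving rc.6, it would presumably have been something along these lines (a sketch, not copied from the terminal):

# --force bypasses release-image signature verification, which is what
# "the image may not be safe to use" is complaining about
$ oc adm upgrade --to 4.5.0-rc.6 --force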
InstallerPodNetworkingDegraded: Pod "installer-11-master-0-2" on node "master-0-2" observed degraded networking: (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_installer-11-master-0-2_openshift-kube-apiserver_569a19e5-fe46-4e34-9f5e-0ae67b259786_0(c4275101c2593ab24480e17d6b7d36b2b4001a16974d073633f948ffda0cbf11): Multus: [openshift-kube-apiserver/installer-11-master-0-2]: PollImmediate error waiting for ReadinessIndicatorFile: timed out waiting for the condition

Sounds like the networking pod failed to get created due to a node/crio issue?

Networking itself reports:

status:
  conditions:
  - lastTransitionTime: "2020-07-09T14:50:38Z"
    message: DaemonSet "openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - last change 2020-07-09T14:39:36Z
    reason: RolloutHung
    status: "True"
    type: Degraded
  - lastTransitionTime: "2020-07-09T11:29:21Z"
    status: "True"
    type: Upgradeable
  - lastTransitionTime: "2020-07-09T14:38:30Z"
    message: |-
      DaemonSet "openshift-multus/multus-admission-controller" is not available (awaiting 1 nodes)
      DaemonSet "openshift-ovn-kubernetes/ovnkube-node" is not available (awaiting 1 nodes)
    reason: Deploying
    status: "True"
    type: Progressing

The networking pod ovnkube-node-lqphr is showing:

  - containerID: cri-o://d6fde6e77032e51c11a18e3e27440b684dea8256fb4fb80a9b44f63c0227a81f
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8f3c2711b2f0e762862981c97143e2871b39af1bcde90fdbd5d7147b4a91b764
    imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8f3c2711b2f0e762862981c97143e2871b39af1bcde90fdbd5d7147b4a91b764
    lastState:
      terminated:
        containerID: cri-o://d6fde6e77032e51c11a18e3e27440b684dea8256fb4fb80a9b44f63c0227a81f
        exitCode: 1
        finishedAt: "2020-07-12T14:02:44Z"
        message: |
          + [[ -f /env/master-0-2 ]]
          + cp -f /usr/libexec/cni/ovn-k8s-cni-overlay /cni-bin-dir/
          + ovn_config_namespace=openshift-ovn-kubernetes
          + retries=0
          + true
          ++ kubectl get ep -n openshift-ovn-kubernetes ovnkube-db -o 'jsonpath={.subsets[0].addresses[0].ip}'
          Unable to connect to the server: dial tcp: lookup api-int.ocp-edge-cluster-0.qe.lab.redhat.com on 192.168.123.1:53: no such host
          + db_ip=
        reason: Error
        startedAt: "2020-07-12T14:02:44Z"
    name: ovnkube-node

So I guess I agree that this seems DNS related.
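A hedged spot-check of that failing lookup from the node itself (node name taken from the pod output above; the assumption is that getent and a readable /etc/resolv.conf are available on the host):

$ oc debug node/master-0-2 -- chroot /host \
    getent hosts api-int.ocp-edge-cluster-0.qe.lab.redhat.com
$ oc debug node/master-0-2 -- chroot /host cat /etc/resolv.conf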
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409
I think we accidentally forgot to pull this BZ from the errata. Re-opening.
The DNS Operator is available but indicates a Progressing condition:

status:
  conditions:
  - lastTransitionTime: "2020-07-09T14:39:52Z"
    message: All desired DNS DaemonSets available and operand Namespace exists
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2020-07-09T14:38:30Z"
    message: At least 1 DNS DaemonSet is progressing.
    reason: Reconciling
    status: "True"
    type: Progressing
  - lastTransitionTime: "2020-07-09T11:35:36Z"
    message: At least 1 DNS DaemonSet available
    reason: AsExpected
    status: "True"
    type: Available

# One of the dns daemonset pods ("dns-default-4dbgg") is unavailable:
status:
  currentNumberScheduled: 5
  desiredNumberScheduled: 5
  numberAvailable: 4
  numberMisscheduled: 0
  numberReady: 4
  numberUnavailable: 1
  observedGeneration: 2
  updatedNumberScheduled: 5

# All containers in pod "dns-default-4dbgg" are not ready:
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-07-09T13:32:46Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2020-07-09T14:39:05Z"
    message: 'containers with unready status: [dns kube-rbac-proxy dns-node-resolver]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2020-07-09T14:40:24Z"
    message: 'containers with unready status: [dns kube-rbac-proxy dns-node-resolver]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2020-07-09T13:32:46Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c09633512a460fda547cd079565554ab79cbfbe767c827bba075f05b47e71d4a
    imageID: ""
    lastState: {}
    name: dns
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        reason: ContainerCreating
  - image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b93a3f13057991466caf3ba6517493015299a856c6b752bd49b7d4c294312177
    imageID: ""
    lastState: {}
    name: dns-node-resolver
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        reason: ContainerCreating
  - image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0b9be905dc8404760427a4bfbb9274545b2fb03774d85cd8ee5d93f847c69293
    imageID: ""
    lastState: {}
    name: kube-rbac-proxy
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        reason: ContainerCreating

192.168.123.114 is the InternalIP address of node "master-0-2", where pod "dns-default-4dbgg" was scheduled.
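For completeness, the kind of commands used to pull the pod state above (assumed for triage, not taken verbatim from the must-gather):

$ oc -n openshift-dns get pods -o wide | grep dns-default-4dbgg
$ oc -n openshift-dns describe pod dns-default-4dbgg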
The node conditions are as expected:

conditions:
- lastHeartbeatTime: "2020-07-12T14:05:56Z"
  lastTransitionTime: "2020-07-09T14:40:16Z"
  message: kubelet has sufficient memory available
  reason: KubeletHasSufficientMemory
  status: "False"
  type: MemoryPressure
- lastHeartbeatTime: "2020-07-12T14:05:56Z"
  lastTransitionTime: "2020-07-09T14:40:16Z"
  message: kubelet has no disk pressure
  reason: KubeletHasNoDiskPressure
  status: "False"
  type: DiskPressure
- lastHeartbeatTime: "2020-07-12T14:05:56Z"
  lastTransitionTime: "2020-07-09T14:40:16Z"
  message: kubelet has sufficient PID available
  reason: KubeletHasSufficientPID
  status: "False"
  type: PIDPressure
- lastHeartbeatTime: "2020-07-12T14:05:56Z"
  lastTransitionTime: "2020-07-09T14:40:16Z"
  message: kubelet is posting ready status
  reason: KubeletReady
  status: "True"
  type: Ready

Events indicate an issue creating the pod network sandbox for pod "dns-default-4dbgg":

message: '(combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_dns-default-4dbgg_openshift-dns_df0adbd5-dc00-4367-b02e-07c62a925a4b_0(f771c552839c5276e622b6f0980a84f0ae496a90c39bab1b1157f7dc8d357a6d): Multus: [openshift-dns/dns-default-4dbgg]: PollImmediate error waiting for ReadinessIndicatorFile: timed out waiting for the condition'

CRI-O logs indicate the same error for dns pod "dns-default-4dbgg":

Jul 11 05:41:37.152475 master-0-2 crio[1821]: 2020-07-11T05:41:37Z [error] Multus: [openshift-dns/dns-default-4dbgg]: PollImmediate error waiting for ReadinessIndicatorFile (on del): timed out waiting for the condition
Jul 11 05:41:37.154347 master-0-2 crio[1821]: time="2020-07-11 05:41:37.154247983Z" level=error msg="Error deleting network: Multus: [openshift-dns/dns-default-4dbgg]: PollImmediate error waiting for ReadinessIndicatorFile (on del): timed out waiting for the condition"
Jul 11 05:41:37.154347 master-0-2 crio[1821]: time="2020-07-11 05:41:37.154332566Z" level=error msg="Error while removing pod from CNI network \"multus-cni-network\": Multus: [openshift-dns/dns-default-4dbgg]: PollImmediate error waiting for ReadinessIndicatorFile (on del): timed out waiting for the condition"
Jul 11 05:41:37.154557 master-0-2 crio[1821]: time="2020-07-11 05:41:37.154451137Z" level=error msg="Error stopping network on cleanup: failed to destroy network for pod sandbox k8s_dns-default-4dbgg_openshift-dns_df0adbd5-dc00-4367-b02e-07c62a925a4b_0(9064208bb220d12adb8a12c24492db4aea36419f66f0f6b932a065925429ffb2): Multus: [openshift-dns/dns-default-4dbgg]: PollImmediate error waiting for ReadinessIndicatorFile (on del): timed out waiting for the condition" id=e6548c60-b32b-41dd-a4a7-ebde016067e7 name=/runtime.v1alpha2.RuntimeService/RunPodSandbox

This BZ appears to be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1805444. Maybe the fix in BZ 1805444 needs to be ported to OVN? Reassigning to the SDN team for confirmation and further investigation.
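The same CRI-O/Multus errors can be pulled straight from the node journal without a full must-gather; a sketch (the unit name and grep pattern are assumptions):

$ oc adm node-logs master-0-2 -u crio | grep -i ReadinessIndicatorFile | tail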
Assigned to Doug to see if it's the same as (or similar to) the other issue that Dane found.
A `PollImmediate error waiting for ReadinessIndicatorFile` means that (in the context of ovn-kubernetes in OCP) Multus CNI could not find the file `/var/run/multus/cni/net.d/10-ovn-kubernetes.conf`. This is the "readiness indicator file": its absence indicates that the default network (in this case, ovn-kubernetes) is not ready, and that there may be some failure of the process that writes that CNI configuration file to disk. Without it, we can't be certain that OVN is ready to handle network traffic from workloads, so Multus waits for this readiness indication from the default network's CNI configuration file.
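A minimal way to check for that readiness indicator file on the affected node, assuming oc debug access (node name taken from the earlier comments):

$ oc debug node/master-0-2 -- chroot /host \
    ls -l /var/run/multus/cni/net.d/10-ovn-kubernetes.conf

If the file is missing, then presumably whatever writes it never got that far, which would be consistent with the ovnkube-node container error shown in comment 17.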
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.5 image release advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409
(In reply to Eric Paris from comment #20)
> I think we accidentally forgot to pull this BZ from the errata. Re-opening.

Don't do that. Once it has been shipped in an errata, it can never be removed or shipped again; bugs shipped by an errata are intended to be immutable. It needs to be cloned to proceed. As the ET comment indicates:

> If the solution does not work for you, open a new bug report.

I've cloned it as https://bugzilla.redhat.com/show_bug.cgi?id=1867718

I'll return the bug to CLOSED ERRATA, although it is clearly not actually fixed.