Created attachment 1817693
must-gather logs

Description of problem:
Created a single-node cluster on AWS using cluster bot. Once it was up, I applied a MachineConfig (MCO) resource, but the cluster did not come back up even after a long wait. The must-gather logs are attached.

Version-Release number of selected component (if applicable):
4.9.0-0.nightly-2021-08-24-235829

How reproducible:
- Create a single-node cluster.
- Apply an MCO resource that makes the node reboot.

```
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: chronyd-mask
spec:
  config:
    ignition:
      version: 3.2.0
    systemd:
      units:
      - name: chronyd.service
        mask: true
```

Actual results:
```
bash-3.2$ oc get pods -A
NAMESPACE   NAME   READY   STATUS   RESTARTS   AGE
openshift-apiserver-operator   openshift-apiserver-operator-5c6cff879d-bp5tg   0/1   ContainerCreating   1 (26m ago)   28m
openshift-apiserver   apiserver-f9999f5c7-twwjj   0/2   PodInitializing   2   23m
openshift-authentication-operator   authentication-operator-6b6d58ff5b-rzmr2   0/1   ContainerCreating   1 (26m ago)   28m
openshift-authentication   oauth-openshift-66cb75d9b6-xjv7k   0/1   ContainerCreating   1   19m
openshift-cloud-controller-manager-operator   cluster-cloud-controller-manager-operator-ddff5fc56-gfzmb   0/2   ContainerCreating   2   28m
openshift-cloud-credential-operator   cloud-credential-operator-978ff46f8-7rqz6   0/2   ContainerCreating   2   29m
openshift-cloud-credential-operator   pod-identity-webhook-7cfdd7dfb9-cfn4r   0/1   ContainerCreating   1   23m
openshift-cluster-csi-drivers   aws-ebs-csi-driver-controller-69fdc967f8-2mw9k   0/11   CreateContainerConfigError   27 (19m ago)   27m
openshift-cluster-csi-drivers   aws-ebs-csi-driver-node-4pfjw   0/3   ContainerCreating   3   27m
openshift-cluster-csi-drivers   aws-ebs-csi-driver-operator-594c88bd99-n7s5x   0/1   ContainerCreating   1   27m
openshift-cluster-machine-approver   machine-approver-6ff87b47b9-c55c9   0/2   ContainerCreating   2   28m
openshift-cluster-node-tuning-operator   cluster-node-tuning-operator-6b4d57f5f6-5p82k   0/1   ContainerCreating   1   28m
openshift-cluster-node-tuning-operator   tuned-j2874   0/1   ContainerCreating   1   26m
openshift-cluster-samples-operator   cluster-samples-operator-7b7468bc49-w8744   0/2   ContainerCreating   2   23m
openshift-cluster-storage-operator   cluster-storage-operator-77d66554c8-p2mwm   0/1   ContainerCreating   1 (25m ago)   28m
openshift-cluster-storage-operator   csi-snapshot-controller-85b5595f65-mr87d   0/1   ContainerCreating   1   27m
openshift-cluster-storage-operator   csi-snapshot-controller-operator-7458db4694-b87v2   0/1   ContainerCreating   1   28m
openshift-cluster-storage-operator   csi-snapshot-webhook-696b489f7b-xls44   0/1   ContainerCreating   1   27m
openshift-cluster-version   cluster-version-operator-5b4ccd6696-k58kt   0/1   ContainerCreating   1   28m
openshift-config-operator   openshift-config-operator-78b7fcbc77-7fsls   0/1   ContainerCreating   1 (26m ago)   29m
openshift-console-operator   console-operator-75fcc679d5-949js   0/1   ContainerCreating   1   23m
openshift-console   console-dc6f59bd-zqgfj   0/1   ContainerCreating   1   20m
openshift-console   downloads-77c996cd45-7w866   0/1   ContainerCreating   1 (20m ago)   22m
openshift-controller-manager-operator   openshift-controller-manager-operator-5fb85db56b-4gxdf   0/1   ContainerCreating   1 (25m ago)   28m
openshift-controller-manager   controller-manager-mkq9d   0/1   ContainerCreating   1   20m
openshift-dns-operator   dns-operator-7474bfcff4-njvl4   0/2   ContainerCreating   2   28m
openshift-dns   dns-default-cjwht   0/2   ContainerCreating   2   26m
openshift-dns   node-resolver-f8js5   0/1   ContainerCreating   1   26m
openshift-etcd-operator   etcd-operator-6cb74bfc4d-wk78x   0/1   ContainerCreating   1 (25m ago)   29m
openshift-etcd   etcd-ip-10-0-169-138.us-west-1.compute.internal   4/4   Running   6   26m
openshift-etcd   installer-2-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   26m
openshift-image-registry   cluster-image-registry-operator-5cc6b97c9b-xcqqr   0/1   ContainerCreating   1   28m
openshift-image-registry   image-registry-7b79d5789-kdv62   0/1   Error   1   22m
openshift-image-registry   node-ca-ctpsr   0/1   ContainerCreating   1   22m
openshift-ingress-canary   ingress-canary-bxkhw   0/1   ContainerCreating   1   23m
openshift-ingress-operator   ingress-operator-54487bb8bc-czgtc   0/2   ContainerCreating   4 (24m ago)   28m
openshift-ingress   router-default-549f6d7845-6j9t7   0/1   Pending   0   6m8s
openshift-ingress   router-default-549f6d7845-v5chg   0/1   Preempting   1   23m
openshift-insights   insights-operator-54f5c864cd-fgh65   0/1   ContainerCreating   1 (26m ago)   28m
openshift-kube-apiserver-operator   kube-apiserver-operator-dccf66bf-wjjzd   0/1   ContainerCreating   1 (26m ago)   28m
openshift-kube-apiserver   installer-3-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   25m
openshift-kube-apiserver   installer-4-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   23m
openshift-kube-apiserver   installer-5-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   22m
openshift-kube-apiserver   installer-7-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   20m
openshift-kube-apiserver   kube-apiserver-ip-10-0-169-138.us-west-1.compute.internal   4/5   Running   5   19m
openshift-kube-apiserver   revision-pruner-7-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   19m
openshift-kube-controller-manager-operator   kube-controller-manager-operator-6c749b4c77-8bzdd   0/1   ContainerCreating   1 (26m ago)   28m
openshift-kube-controller-manager   installer-3-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   26m
openshift-kube-controller-manager   installer-4-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   25m
openshift-kube-controller-manager   installer-5-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   24m
openshift-kube-controller-manager   installer-6-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   22m
openshift-kube-controller-manager   installer-7-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   20m
openshift-kube-controller-manager   installer-8-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   16m
openshift-kube-controller-manager   kube-controller-manager-ip-10-0-169-138.us-west-1.compute.internal   4/4   Running   0   16m
openshift-kube-controller-manager   revision-pruner-4-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   24m
openshift-kube-controller-manager   revision-pruner-5-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   23m
openshift-kube-controller-manager   revision-pruner-6-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   21m
openshift-kube-controller-manager   revision-pruner-7-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   19m
openshift-kube-controller-manager   revision-pruner-8-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   16m
openshift-kube-scheduler-operator   openshift-kube-scheduler-operator-569fcddbd9-sp8bb   0/1   ContainerCreating   1 (25m ago)   28m
openshift-kube-scheduler   installer-2-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   26m
openshift-kube-scheduler   installer-4-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   26m
openshift-kube-scheduler   installer-5-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   24m
openshift-kube-scheduler   installer-6-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   22m
openshift-kube-scheduler   installer-7-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   20m
openshift-kube-scheduler   openshift-kube-scheduler-ip-10-0-169-138.us-west-1.compute.internal   2/3   Running   3   20m
openshift-kube-scheduler   revision-pruner-5-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   23m
openshift-kube-scheduler   revision-pruner-6-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   20m
openshift-kube-scheduler   revision-pruner-7-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   18m
openshift-kube-storage-version-migrator-operator   kube-storage-version-migrator-operator-6bb764d44c-h6qlm   0/1   ContainerCreating   1 (25m ago)   28m
openshift-kube-storage-version-migrator   migrator-5785b7f85f-b5mlr   0/1   ContainerCreating   1   27m
openshift-machine-api   cluster-autoscaler-operator-6967d6fcc5-lnskj   0/2   ContainerCreating   2   28m
openshift-machine-api   cluster-baremetal-operator-5846d55944-sbsq7   0/2   ContainerCreating   2 (25m ago)   28m
openshift-machine-api   machine-api-controllers-65f8d75f77-6lfsh   0/7   ContainerCreating   7   26m
openshift-machine-api   machine-api-operator-644676cbf8-rbx4w   0/2   ContainerCreating   2   28m
openshift-machine-config-operator   machine-config-controller-7bd8cc86fc-cftqw   0/1   ContainerCreating   1   26m
openshift-machine-config-operator   machine-config-daemon-wtwr9   0/2   ContainerCreating   2   27m
openshift-machine-config-operator   machine-config-operator-66759b58c8-gg7lx   0/1   ContainerCreating   1   28m
openshift-machine-config-operator   machine-config-server-42pl9   0/1   ContainerCreating   1   25m
openshift-marketplace   certified-operators-p8mzk   0/1   Pending   0   3m48s
openshift-marketplace   certified-operators-pnfzb   0/1   ContainerCreating   1   25m
openshift-marketplace   community-operators-9b9d9   0/1   ContainerCreating   1   25m
openshift-marketplace   community-operators-gkmfd   0/1   Pending   0   68s
openshift-marketplace   marketplace-operator-75b4666dd-tr94c   0/1   ContainerCreating   1   28m
openshift-marketplace   redhat-marketplace-kzjnr   0/1   Pending   0   4m37s
openshift-marketplace   redhat-marketplace-wqd4h   0/1   ContainerCreating   1   25m
openshift-marketplace   redhat-operators-jvtl2   0/1   Pending   0   66s
openshift-marketplace   redhat-operators-qndj2   0/1   ContainerCreating   1   25m
openshift-monitoring   alertmanager-main-0   0/5   ContainerCreating   5   20m
openshift-monitoring   cluster-monitoring-operator-fccf5697-nmzg6   0/2   ContainerCreating   5 (26m ago)   28m
openshift-monitoring   grafana-5c6cbf4977-wqhfl   0/2   ContainerCreating   2   20m
openshift-monitoring   kube-state-metrics-59b87859b8-fxpfg   0/3   ContainerCreating   3   26m
openshift-monitoring   node-exporter-gxj85   0/2   PodInitializing   2   26m
openshift-monitoring   openshift-state-metrics-66585c8c7c-8vsp5   0/3   ContainerCreating   3   26m
openshift-monitoring   prometheus-adapter-6fb846d8f6-4mw6k   0/1   ContainerCreating   1   24m
openshift-monitoring   prometheus-k8s-0   0/7   PodInitializing   7   20m
openshift-monitoring   prometheus-operator-78b5644557-mw5hl   0/2   ContainerCreating   2 (26m ago)   27m
openshift-monitoring   telemeter-client-8895d564d-th7cr   0/3   ContainerCreating   3   23m
openshift-monitoring   thanos-querier-768dcfcc7d-j8mqt   0/5   ContainerCreating   5   20m
openshift-multus   multus-additional-cni-plugins-jfwnl   0/1   PodInitializing   1   28m
openshift-multus   multus-admission-controller-42xbc   0/2   ContainerCreating   2   27m
openshift-multus   multus-qp2bw   0/1   ContainerCreating   1   28m
openshift-multus   network-metrics-daemon-jdn4m   0/2   ContainerCreating   2   28m
openshift-network-diagnostics   network-check-source-75749bc6b4-q9gx8   0/1   ContainerCreating   1   28m
openshift-network-diagnostics   network-check-target-65g85   0/1   Pending   0   6m42s
openshift-network-operator   network-operator-59c687c84-6cj9d   0/1   ContainerCreating   1   28m
openshift-oauth-apiserver   apiserver-55dc9cbf74-qbwhw   0/1   PodInitializing   2 (24m ago)   25m
openshift-operator-lifecycle-manager   catalog-operator-869c9cf896-v8c7t   0/1   ContainerCreating   1   28m
openshift-operator-lifecycle-manager   collect-profiles-27164550--1-bk8pf   0/1   Completed   0   19m
openshift-operator-lifecycle-manager   collect-profiles-27164565--1-npx7f   0/1   Pending   0   4m13s
openshift-operator-lifecycle-manager   olm-operator-75c5889c65-zfvq8   0/1   ContainerCreating   1   28m
openshift-operator-lifecycle-manager   package-server-manager-79d9cf4c5b-z78g8   0/1   ContainerCreating   1   28m
openshift-operator-lifecycle-manager   packageserver-545bbd88bd-mjg9p   0/1   ContainerCreating   1   26m
openshift-sdn   sdn-controller-2gcfk   0/1   ContainerCreating   1   28m
openshift-sdn   sdn-sz6ns   0/2   ContainerCreating   2   28m
openshift-service-ca-operator   service-ca-operator-d6bb8446c-n5xsg   0/1   ContainerCreating   1 (25m ago)   28m
openshift-service-ca   service-ca-68994c6756-kwr8d   0/1   ContainerCreating   1   27m

$ oc adm must-gather
[must-gather ] OUT the server is currently unable to handle the request (get imagestreams.image.openshift.io must-gather)
[must-gather ] OUT
[must-gather ] OUT
Using must-gather plug-in image: registry.redhat.io/openshift4/ose-must-gather:latest
When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information.
ClusterID: 67e57b0d-c303-441d-9a5b-ad6f2e536886
ClusterVersion: Stable at "4.9.0-0.nightly-2021-08-24-235829"
ClusterOperators:
  clusteroperator/authentication is not available (APIServerDeploymentAvailable: no apiserver.openshift-oauth-apiserver pods available on any node. APIServicesAvailable: PreconditionNotReady OAuthServerServiceEndpointAccessibleControllerAvailable: Get "https://172.30.16.204:443/healthz": dial tcp 172.30.16.204:443: connect: connection refused) because APIServerDeploymentDegraded: 1 of 1 requested instances are unavailable for apiserver.openshift-oauth-apiserver (crashlooping container is waiting in apiserver-55dc9cbf74-qbwhw pod) IngressStateEndpointsDegraded: No subsets found for the endpoints of oauth-server OAuthServerServiceEndpointAccessibleControllerDegraded: Get "https://172.30.16.204:443/healthz": dial tcp 172.30.16.204:443: connect: connection refused OAuthServerServiceEndpointsEndpointAccessibleControllerDegraded: oauth service endpoints are not ready
  clusteroperator/csi-snapshot-controller is not available (Available: Waiting for Deployment to deploy csi-snapshot-controller pods CSISnapshotWebhookControllerAvailable: Waiting for a validating webhook Deployment pod to start) because All is well
  clusteroperator/dns is not available (DNS "default" is unavailable.) because DNS default is degraded
  clusteroperator/image-registry is not available (Available: The registry is ready NodeCADaemonAvailable: The daemon set node-ca does not have available replicas ImagePrunerAvailable: Pruner CronJob has been created) because
  clusteroperator/kube-storage-version-migrator is not available (Available: deployment/migrator.openshift-kube-storage-version-migrator: no replicas are available) because All is well
  clusteroperator/machine-config is not upgradeable because One or more machine config pools are updating, please see `oc get mcp` for further details
  clusteroperator/network is progressing: DaemonSet "openshift-multus/multus" is not available (awaiting 1 nodes) DaemonSet "openshift-multus/multus-additional-cni-plugins" is not available (awaiting 1 nodes) DaemonSet "openshift-multus/network-metrics-daemon" is waiting for other operators to become ready DaemonSet "openshift-multus/multus-admission-controller" is waiting for other operators to become ready DaemonSet "openshift-sdn/sdn-controller" is not available (awaiting 1 nodes) DaemonSet "openshift-sdn/sdn" is not available (awaiting 1 nodes) DaemonSet "openshift-network-diagnostics/network-check-target" is waiting for other operators to become ready Deployment "openshift-network-diagnostics/network-check-source" is waiting for other operators to become ready
  clusteroperator/node-tuning is not available (DaemonSet "tuned" has no available Pod(s)) because DaemonSet "tuned" available
  clusteroperator/openshift-apiserver is not available (APIServerDeploymentAvailable: no apiserver.openshift-apiserver pods available on any node. APIServicesAvailable: PreconditionNotReady) because APIServerDeploymentDegraded: 1 of 1 requested instances are unavailable for apiserver.openshift-apiserver (2 containers are waiting in apiserver-f9999f5c7-twwjj pod)
  clusteroperator/openshift-controller-manager is not available (Available: no daemon pods available on any node.) because All is well
  clusteroperator/operator-lifecycle-manager-packageserver is not available (ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver observed in phase Failed with reason: InstallCheckFailed, message: install timeout) because
  clusteroperator/service-ca is progressing: Progressing: Progressing: service-ca does not have available replicas
  clusteroperator/storage is not available (AWSEBSCSIDriverOperatorCRAvailable: AWSEBSDriverControllerServiceControllerAvailable: Waiting for Deployment AWSEBSCSIDriverOperatorCRAvailable: AWSEBSDriverNodeServiceControllerAvailable: Waiting for the DaemonSet to deploy the CSI Node Service) because [container "thanos-query" in pod "thanos-querier-768dcfcc7d-j8mqt" is terminated, previous terminated container "thanos-query" in pod "thanos-querier-768dcfcc7d-j8mqt" not found, previous terminated container "oauth-proxy" in pod "thanos-querier-768dcfcc7d-j8mqt" not found, container "oauth-proxy" in pod "thanos-querier-768dcfcc7d-j8mqt" is terminated, previous terminated container "kube-rbac-proxy" in pod "thanos-querier-768dcfcc7d-j8mqt" not found, container "kube-rbac-proxy" in pod "thanos-querier-768dcfcc7d-j8mqt" is terminated, previous terminated container "prom-label-proxy" in pod "thanos-querier-768dcfcc7d-j8mqt" not found, container "prom-label-proxy" in pod "thanos-querier-768dcfcc7d-j8mqt" is terminated, previous terminated container "kube-rbac-proxy-rules" in pod "thanos-querier-768dcfcc7d-j8mqt" not found, container "kube-rbac-proxy-rules" in pod "thanos-querier-768dcfcc7d-j8mqt" is terminated]],
skipping gathering namespaces/openshift-multus due to error: one or more errors ocurred while gathering pod-specific data for namespace: openshift-multus [one or more errors ocurred while gathering container data for pod multus-additional-cni-plugins-jfwnl: [container "kube-multus-additional-cni-plugins" in pod "multus-additional-cni-plugins-jfwnl" is terminated, previous terminated container "kube-multus-additional-cni-plugins" in pod "multus-additional-cni-plugins-jfwnl" not found], one or more errors ocurred while gathering container data for pod multus-admission-controller-42xbc: [previous terminated container "multus-admission-controller" in pod "multus-admission-controller-42xbc" not found, container "multus-admission-controller" in pod "multus-admission-controller-42xbc" is terminated, previous terminated container "kube-rbac-proxy" in pod "multus-admission-controller-42xbc" not found, container "kube-rbac-proxy" in pod "multus-admission-controller-42xbc" is terminated], one or more errors ocurred while gathering container data for pod multus-qp2bw: [container "kube-multus" in pod "multus-qp2bw" is terminated, previous terminated container "kube-multus" in pod "multus-qp2bw" not found], one or more errors ocurred while gathering container data for pod network-metrics-daemon-jdn4m: [previous terminated container "network-metrics-daemon" in pod "network-metrics-daemon-jdn4m" not found, container "network-metrics-daemon" in pod "network-metrics-daemon-jdn4m" is terminated, previous terminated container "kube-rbac-proxy" in pod "network-metrics-daemon-jdn4m" not found, container "kube-rbac-proxy" in pod "network-metrics-daemon-jdn4m" is terminated]],
skipping gathering namespaces/openshift-sdn due to error: one or more errors ocurred while gathering pod-specific data for namespace: openshift-sdn [one or more errors ocurred while gathering container data for pod sdn-controller-2gcfk: [container "sdn-controller" in pod "sdn-controller-2gcfk" is terminated, previous terminated container "sdn-controller" in pod "sdn-controller-2gcfk" not found], one or more errors ocurred while gathering container data for pod sdn-sz6ns: [container "sdn" in pod "sdn-sz6ns" is terminated, previous terminated container "sdn" in pod "sdn-sz6ns" not found, container "kube-rbac-proxy" in pod "sdn-sz6ns" is terminated, previous terminated container "kube-rbac-proxy" in pod "sdn-sz6ns" not found]],
skipping gathering EgressFirewall.k8s.ovn.org due to error: the server doesn't have a resource type "EgressFirewall",
skipping gathering EgressIP.k8s.ovn.org due to error: the server doesn't have a resource type "EgressIP",
skipping gathering endpoints/host-etcd-2 due to error: endpoints "host-etcd-2" not found,
skipping gathering namespaces/openshift-cluster-csi-drivers due to error: one or more errors ocurred while gathering pod-specific data for namespace: openshift-cluster-csi-drivers one or more errors ocurred while gathering container data for pod aws-ebs-csi-driver-controller-69fdc967f8-2mw9k: [previous terminated container "csi-driver" in pod "aws-ebs-csi-driver-controller-69fdc967f8-2mw9k" not found, container "csi-driver" in pod "aws-ebs-csi-driver-controller-69fdc967f8-2mw9k" is terminated, previous terminated container "driver-kube-rbac-proxy" in pod "aws-ebs-csi-driver-controller-69fdc967f8-2mw9k" not found, container "driver-kube-rbac-proxy" in pod "aws-ebs-csi-driver-controller-69fdc967f8-2mw9k" is terminated, previous terminated container "provisioner-kube-rbac-proxy" in pod "aws-ebs-csi-driver-controller-69fdc967f8-2mw9k" not found, container "provisioner-kube-rbac-proxy" in pod "aws-ebs-csi-driver-controller-69fdc967f8-2mw9k" is terminated, previous terminated container "attacher-kube-rbac-proxy" in pod "aws-ebs-csi-driver-controller-69fdc967f8-2mw9k" not found, container "attacher-kube-rbac-proxy" in pod "aws-ebs-csi-driver-controller-69fdc967f8-2mw9k" is terminated, previous terminated container "resizer-kube-rbac-proxy" in pod "aws-ebs-csi-driver-controller-69fdc967f8-2mw9k" not found, container "resizer-kube-rbac-proxy" in pod "aws-ebs-csi-driver-controller-69fdc967f8-2mw9k" is terminated, container "snapshotter-kube-rbac-proxy" in pod "aws-ebs-csi-driver-controller-69fdc967f8-2mw9k" is terminated, previous terminated container "snapshotter-kube-rbac-proxy" in pod "aws-ebs-csi-driver-controller-69fdc967f8-2mw9k" not found, previous terminated container "csi-liveness-probe" in pod "aws-ebs-csi-driver-controller-69fdc967f8-2mw9k" not found, container "csi-liveness-probe" in pod "aws-ebs-csi-driver-controller-69fdc967f8-2mw9k" is terminated],
skipping gathering namespaces/openshift-manila-csi-driver due to error: namespaces "openshift-manila-csi-driver" not found]
error: gather did not start for pod must-gather-zqlvj: timed out waiting for the condition

bash-3.2$ oc get pods -A
NAMESPACE   NAME   READY   STATUS   RESTARTS   AGE
openshift-apiserver-operator   openshift-apiserver-operator-5c6cff879d-bp5tg   1/1   Running   2   137m
openshift-apiserver   apiserver-f9999f5c7-twwjj   2/2   Running   2   131m
openshift-authentication-operator   authentication-operator-6b6d58ff5b-rzmr2   1/1   Running   3 (102m ago)   137m
openshift-authentication   oauth-openshift-66cb75d9b6-xjv7k   1/1   Running   1   128m
openshift-cloud-controller-manager-operator   cluster-cloud-controller-manager-operator-ddff5fc56-gfzmb   2/2   Running   2   137m
openshift-cloud-credential-operator   cloud-credential-operator-978ff46f8-7rqz6   2/2   Running   2   137m
openshift-cloud-credential-operator   pod-identity-webhook-7cfdd7dfb9-cfn4r   1/1   Running   1   131m
openshift-cluster-csi-drivers   aws-ebs-csi-driver-controller-69fdc967f8-2mw9k   0/11   CreateContainerConfigError   27 (128m ago)   135m
openshift-cluster-csi-drivers   aws-ebs-csi-driver-node-4pfjw   3/3   Running   3   135m
openshift-cluster-csi-drivers   aws-ebs-csi-driver-operator-594c88bd99-n7s5x   1/1   Running   1   135m
openshift-cluster-machine-approver   machine-approver-6ff87b47b9-c55c9   2/2   Running   3 (115m ago)   137m
openshift-cluster-node-tuning-operator   cluster-node-tuning-operator-6b4d57f5f6-5p82k   1/1   Running   1   137m
openshift-cluster-node-tuning-operator   tuned-j2874   1/1   Running   1   134m
openshift-cluster-samples-operator   cluster-samples-operator-7b7468bc49-w8744   2/2   Running   2   131m
openshift-cluster-storage-operator   cluster-storage-operator-77d66554c8-p2mwm   1/1   Running   2   137m
openshift-cluster-storage-operator   csi-snapshot-controller-85b5595f65-mr87d   1/1   Running   1   135m
openshift-cluster-storage-operator   csi-snapshot-controller-operator-7458db4694-b87v2   1/1   Running   1   137m
openshift-cluster-storage-operator   csi-snapshot-webhook-696b489f7b-xls44   1/1   Running   1   136m
openshift-cluster-version   cluster-version-operator-5b4ccd6696-k58kt   1/1   Running   1   136m
openshift-config-operator   openshift-config-operator-78b7fcbc77-7fsls   1/1   Running   2   137m
openshift-console-operator   console-operator-75fcc679d5-949js   1/1   Running   2 (105m ago)   131m
openshift-console   console-dc6f59bd-zqgfj   0/1   Running   13 (3m1s ago)   129m
openshift-console   downloads-77c996cd45-7w866   1/1   Running   2   130m
openshift-controller-manager-operator   openshift-controller-manager-operator-5fb85db56b-4gxdf   1/1   Running   2   137m
openshift-controller-manager   controller-manager-mkq9d   1/1   Running   1   129m
openshift-dns-operator   dns-operator-7474bfcff4-njvl4   2/2   Running   2   137m
openshift-dns   dns-default-cjwht   2/2   Running   2   134m
openshift-dns   node-resolver-f8js5   1/1   Running   1   134m
openshift-etcd-operator   etcd-operator-6cb74bfc4d-wk78x   1/1   Running   2   137m
openshift-etcd   etcd-ip-10-0-169-138.us-west-1.compute.internal   4/4   Running   6   134m
openshift-etcd   installer-2-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   135m
openshift-image-registry   cluster-image-registry-operator-5cc6b97c9b-xcqqr   1/1   Running   1   137m
openshift-image-registry   image-registry-7b79d5789-kdv62   0/1   Error   1   130m
openshift-image-registry   node-ca-ctpsr   1/1   Running   1   130m
openshift-ingress-canary   ingress-canary-bxkhw   1/1   Running   1   131m
openshift-ingress-operator   ingress-operator-54487bb8bc-czgtc   2/2   Running   6 (114m ago)   137m
openshift-ingress   router-default-549f6d7845-6j9t7   1/1   Terminating   0   114m
openshift-ingress   router-default-549f6d7845-v5chg   0/1   Error   1   131m
openshift-insights   insights-operator-54f5c864cd-fgh65   1/1   Running   2   137m
openshift-kube-apiserver-operator   kube-apiserver-operator-dccf66bf-wjjzd   1/1   Running   2   137m
openshift-kube-apiserver   installer-3-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   133m
openshift-kube-apiserver   installer-4-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   131m
openshift-kube-apiserver   installer-5-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   130m
openshift-kube-apiserver   installer-7-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   128m
openshift-kube-apiserver   kube-apiserver-ip-10-0-169-138.us-west-1.compute.internal   5/5   Running   5   128m
openshift-kube-apiserver   revision-pruner-7-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   127m
openshift-kube-controller-manager-operator   kube-controller-manager-operator-6c749b4c77-8bzdd   1/1   Running   2   137m
openshift-kube-controller-manager   installer-3-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   135m
openshift-kube-controller-manager   installer-4-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   134m
openshift-kube-controller-manager   installer-5-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   132m
openshift-kube-controller-manager   installer-6-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   130m
openshift-kube-controller-manager   installer-7-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   129m
openshift-kube-controller-manager   installer-8-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   125m
openshift-kube-controller-manager   kube-controller-manager-ip-10-0-169-138.us-west-1.compute.internal   4/4   Running   4   125m
openshift-kube-controller-manager   revision-pruner-4-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   132m
openshift-kube-controller-manager   revision-pruner-5-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   131m
openshift-kube-controller-manager   revision-pruner-6-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   129m
openshift-kube-controller-manager   revision-pruner-7-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   128m
openshift-kube-controller-manager   revision-pruner-8-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   125m
openshift-kube-scheduler-operator   openshift-kube-scheduler-operator-569fcddbd9-sp8bb   1/1   Running   2   137m
openshift-kube-scheduler   installer-2-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   135m
openshift-kube-scheduler   installer-4-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   135m
openshift-kube-scheduler   installer-5-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   132m
openshift-kube-scheduler   installer-6-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   130m
openshift-kube-scheduler   installer-7-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   129m
openshift-kube-scheduler   openshift-kube-scheduler-ip-10-0-169-138.us-west-1.compute.internal   3/3   Running   3   128m
openshift-kube-scheduler   revision-pruner-5-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   131m
openshift-kube-scheduler   revision-pruner-6-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   129m
openshift-kube-scheduler   revision-pruner-7-ip-10-0-169-138.us-west-1.compute.internal   0/1   Completed   0   127m
openshift-kube-storage-version-migrator-operator   kube-storage-version-migrator-operator-6bb764d44c-h6qlm   1/1   Running   2   137m
openshift-kube-storage-version-migrator   migrator-5785b7f85f-b5mlr   1/1   Running   1   136m
openshift-machine-api   cluster-autoscaler-operator-6967d6fcc5-lnskj   2/2   Running   2   137m
openshift-machine-api   cluster-baremetal-operator-5846d55944-sbsq7   2/2   Running   4 (114m ago)   137m
openshift-machine-api   machine-api-controllers-65f8d75f77-6lfsh   7/7   Running   7   134m
openshift-machine-api   machine-api-operator-644676cbf8-rbx4w   2/2   Running   2   137m
openshift-machine-config-operator   machine-config-controller-7bd8cc86fc-cftqw   1/1   Running   1   134m
openshift-machine-config-operator   machine-config-daemon-wtwr9   2/2   Running   2   136m
openshift-machine-config-operator   machine-config-operator-66759b58c8-gg7lx   1/1   Running   1   137m
openshift-machine-config-operator   machine-config-server-42pl9   1/1   Running   1   134m
openshift-marketplace   certified-operators-pnfzb   1/1   Running   1   134m
openshift-marketplace   community-operators-9b9d9   1/1   Running   1   134m
openshift-marketplace   marketplace-operator-75b4666dd-tr94c   1/1   Running   1   137m
openshift-marketplace   redhat-marketplace-wqd4h   1/1   Running   1   134m
openshift-marketplace   redhat-operators-qndj2   1/1   Running   1   134m
openshift-monitoring   alertmanager-main-0   5/5   Running   5   129m
openshift-monitoring   cluster-monitoring-operator-fccf5697-nmzg6   2/2   Running   6   137m
openshift-monitoring   grafana-5c6cbf4977-wqhfl   2/2   Running   2   129m
openshift-monitoring   kube-state-metrics-59b87859b8-fxpfg   3/3   Running   3   135m
openshift-monitoring   node-exporter-gxj85   2/2   Running   2   135m
openshift-monitoring   openshift-state-metrics-66585c8c7c-8vsp5   3/3   Running   3   135m
openshift-monitoring   prometheus-adapter-6fb846d8f6-4mw6k   1/1   Running   1   132m
openshift-monitoring   prometheus-k8s-0   7/7   Running   7   129m
openshift-monitoring   prometheus-operator-78b5644557-mw5hl   2/2   Running   3   136m
openshift-monitoring   telemeter-client-8895d564d-th7cr   3/3   Running   3   131m
openshift-monitoring   thanos-querier-768dcfcc7d-j8mqt   5/5   Running   5   129m
openshift-multus   multus-additional-cni-plugins-jfwnl   1/1   Running   1   137m
openshift-multus   multus-admission-controller-42xbc   2/2   Running   2   136m
openshift-multus   multus-qp2bw   1/1   Running   1   137m
openshift-multus   network-metrics-daemon-jdn4m   2/2   Running   2   137m
openshift-network-diagnostics   network-check-source-75749bc6b4-q9gx8   1/1   Running   1   136m
openshift-network-diagnostics   network-check-target-65g85   1/1   Running   0   115m
openshift-network-operator   network-operator-59c687c84-6cj9d   1/1   Running   1   137m
openshift-oauth-apiserver   apiserver-55dc9cbf74-qbwhw   1/1   Running   3   134m
openshift-operator-lifecycle-manager   catalog-operator-869c9cf896-v8c7t   1/1   Running   1   137m
openshift-operator-lifecycle-manager   collect-profiles-27164640--1-fqvpj   0/1   Completed   0   37m
openshift-operator-lifecycle-manager   collect-profiles-27164655--1-wwgn2   0/1   Completed   0   22m
openshift-operator-lifecycle-manager   collect-profiles-27164670--1-z8mhj   0/1   Completed   0   7m52s
openshift-operator-lifecycle-manager   olm-operator-75c5889c65-zfvq8   1/1   Running   1   137m
openshift-operator-lifecycle-manager   package-server-manager-79d9cf4c5b-z78g8   1/1   Running   1   137m
openshift-operator-lifecycle-manager   packageserver-545bbd88bd-mjg9p   1/1   Running   1   135m
openshift-sdn   sdn-controller-2gcfk   1/1   Running   1   136m
openshift-sdn   sdn-sz6ns   2/2   Running   2   136m
openshift-service-ca-operator   service-ca-operator-d6bb8446c-n5xsg   1/1   Running   2   137m
openshift-service-ca   service-ca-68994c6756-kwr8d   1/1   Running   1   135m

$ oc get co
NAME   VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication   4.9.0-0.nightly-2021-08-24-235829   False   False   True   115m   OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.ci-ln-ssknvhb-d5d6b.origin-ci-int-aws.dev.rhcloud.com/healthz": EOF
baremetal   4.9.0-0.nightly-2021-08-24-235829   True   False   False   136m
cloud-controller-manager   4.9.0-0.nightly-2021-08-24-235829   True   False   False   139m
cloud-credential   4.9.0-0.nightly-2021-08-24-235829   True   False   False   139m
cluster-autoscaler   4.9.0-0.nightly-2021-08-24-235829   True   False   False   136m
config-operator   4.9.0-0.nightly-2021-08-24-235829   True   False   False   138m
console   4.9.0-0.nightly-2021-08-24-235829   False   True   False   55m   DeploymentAvailable: 0 pods available for console deployment RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.ci-ln-ssknvhb-d5d6b.origin-ci-int-aws.dev.rhcloud.com): Get "https://console-openshift-console.apps.ci-ln-ssknvhb-d5d6b.origin-ci-int-aws.dev.rhcloud.com": EOF
csi-snapshot-controller   4.9.0-0.nightly-2021-08-24-235829   True   False   False   56m
dns   4.9.0-0.nightly-2021-08-24-235829   True   False   False   57m
etcd   4.9.0-0.nightly-2021-08-24-235829   True   False   False   135m
image-registry   4.9.0-0.nightly-2021-08-24-235829   False   True   True   56m   Available: The deployment does not have available replicas NodeCADaemonAvailable: The daemon set node-ca has available replicas ImagePrunerAvailable: Pruner CronJob has been created
ingress   4.9.0-0.nightly-2021-08-24-235829   False   True   True   56m   The "default" ingress controller reports Available=False: IngressControllerUnavailable: One or more status conditions indicate unavailable: DeploymentAvailable=False (DeploymentUnavailable: The deployment has Available status condition set to False (reason: MinimumReplicasUnavailable) with message: Deployment does not have minimum availability.)
insights   4.9.0-0.nightly-2021-08-24-235829   True   False   False   132m
kube-apiserver   4.9.0-0.nightly-2021-08-24-235829   True   False   False   134m
kube-controller-manager   4.9.0-0.nightly-2021-08-24-235829   True   False   False   135m
kube-scheduler   4.9.0-0.nightly-2021-08-24-235829   True   False   False   134m
kube-storage-version-migrator   4.9.0-0.nightly-2021-08-24-235829   True   False   False   57m
machine-api   4.9.0-0.nightly-2021-08-24-235829   True   False   False   133m
machine-approver   4.9.0-0.nightly-2021-08-24-235829   True   False   False   137m
machine-config   4.9.0-0.nightly-2021-08-24-235829   True   False   False   57m
marketplace   4.9.0-0.nightly-2021-08-24-235829   True   False   False   137m
monitoring   4.9.0-0.nightly-2021-08-24-235829   True   False   False   55m
network   4.9.0-0.nightly-2021-08-24-235829   True   False   False   138m
node-tuning   4.9.0-0.nightly-2021-08-24-235829   True   False   False   57m
openshift-apiserver   4.9.0-0.nightly-2021-08-24-235829   True   False   False   56m
openshift-controller-manager   4.9.0-0.nightly-2021-08-24-235829   True   False   False   56m
openshift-samples   4.9.0-0.nightly-2021-08-24-235829   True   False   False   131m
operator-lifecycle-manager   4.9.0-0.nightly-2021-08-24-235829   True   False   False   137m
operator-lifecycle-manager-catalog   4.9.0-0.nightly-2021-08-24-235829   True   False   False   137m
operator-lifecycle-manager-packageserver   4.9.0-0.nightly-2021-08-24-235829   True   False   False   56m
service-ca   4.9.0-0.nightly-2021-08-24-235829   True   False   False   138m
storage   4.9.0-0.nightly-2021-08-24-235829   False
True False 114m AWSEBSCSIDriverOperatorCRAvailable: AWSEBSDriverControllerServiceControllerAvailable: Waiting for Deployment ``` Expected results: Should successful. Additional info:
I can reproduce this reliably with a reboot alone (without having to apply any machine config). It seems to happen almost every time on reboot with single node + this nightly:

- Create a single node cluster (I used cluster bot)
- Reboot it (oc debug node; chroot /host; shutdown -r 1; )

My results are similar to Praveen's:

NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.9.0-0.nightly-2021-08-24-235829 False False True 95m OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.ci-ln-pgbd8zb-d5d6b.origin-ci-int-aws.dev.rhcloud.com/healthz": EOF
baremetal 4.9.0-0.nightly-2021-08-24-235829 True False False 128m
cloud-controller-manager 4.9.0-0.nightly-2021-08-24-235829 True False False 132m
cloud-credential 4.9.0-0.nightly-2021-08-24-235829 True False False 132m
cluster-autoscaler 4.9.0-0.nightly-2021-08-24-235829 True False False 129m
config-operator 4.9.0-0.nightly-2021-08-24-235829 True False False 130m
console 4.9.0-0.nightly-2021-08-24-235829 False True False 36m DeploymentAvailable: 0 pods available for console deployment RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.ci-ln-pgbd8zb-d5d6b.origin-ci-int-aws.dev.rhcloud.com): Get "https://console-openshift-console.apps.ci-ln-pgbd8zb-d5d6b.origin-ci-int-aws.dev.rhcloud.com": EOF
csi-snapshot-controller 4.9.0-0.nightly-2021-08-24-235829 True False False 37m
dns 4.9.0-0.nightly-2021-08-24-235829 True False False 37m
etcd 4.9.0-0.nightly-2021-08-24-235829 True False False 127m
image-registry 4.9.0-0.nightly-2021-08-24-235829 False True True 36m Available: The deployment does not have available replicas NodeCADaemonAvailable: The daemon set node-ca has available replicas ImagePrunerAvailable: Pruner CronJob has been created
ingress 4.9.0-0.nightly-2021-08-24-235829 False True True 36m The "default" ingress controller reports Available=False: IngressControllerUnavailable: One or more status conditions indicate unavailable: DeploymentAvailable=False (DeploymentUnavailable: The deployment has Available status condition set to False (reason: MinimumReplicasUnavailable) with message: Deployment does not have minimum availability.)
insights 4.9.0-0.nightly-2021-08-24-235829 True False True 123m Reporting was not allowed: your Red Hat account is not enabled for remote support or your token has expired: UHC services authentication failed
kube-apiserver 4.9.0-0.nightly-2021-08-24-235829 True False False 127m
kube-controller-manager 4.9.0-0.nightly-2021-08-24-235829 True False False 127m
kube-scheduler 4.9.0-0.nightly-2021-08-24-235829 True False False 128m
kube-storage-version-migrator 4.9.0-0.nightly-2021-08-24-235829 True False False 37m
machine-api 4.9.0-0.nightly-2021-08-24-235829 True False False 126m
machine-approver 4.9.0-0.nightly-2021-08-24-235829 True False False 129m
machine-config 4.9.0-0.nightly-2021-08-24-235829 True False False 37m
marketplace 4.9.0-0.nightly-2021-08-24-235829 True False False 129m
monitoring 4.9.0-0.nightly-2021-08-24-235829 True False False 35m
network 4.9.0-0.nightly-2021-08-24-235829 True False False 131m
node-tuning 4.9.0-0.nightly-2021-08-24-235829 True False False 37m
openshift-apiserver 4.9.0-0.nightly-2021-08-24-235829 True False False 37m
openshift-controller-manager 4.9.0-0.nightly-2021-08-24-235829 True False False 37m
openshift-samples 4.9.0-0.nightly-2021-08-24-235829 True False False 126m
operator-lifecycle-manager 4.9.0-0.nightly-2021-08-24-235829 True False False 130m
operator-lifecycle-manager-catalog 4.9.0-0.nightly-2021-08-24-235829 True False False 130m
operator-lifecycle-manager-packageserver 4.9.0-0.nightly-2021-08-24-235829 True False False 37m
service-ca 4.9.0-0.nightly-2021-08-24-235829 True False False 130m
storage 4.9.0-0.nightly-2021-08-24-235829 False True False 96m AWSEBSCSIDriverOperatorCRAvailable: AWSEBSDriverControllerServiceControllerAvailable: Waiting for Deployment
As John's reproduction shows, I don't think this is necessarily an MCO issue. One way to debug it is to start at the first failing operator and see what caused that failure. I'm also not sure what the state of SNO CI is. Maybe we should ask whether this is a known SNO issue? What do you think, @Praveen? I'm not sure where this should live, but I don't think it should be on the MCO board.
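To make "start at the first failing operator" a bit more concrete, here is a rough Python sketch that picks the unhealthy rows out of saved `oc get co` text and orders them oldest first. The function names are made up for illustration, and it assumes that SINCE roughly reflects how long ago each operator entered its current state, which may not hold for every operator. It only parses the plain-text table; it does not talk to the cluster.

```python
import re

# One table row: NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE [MESSAGE].
# The header and wrapped message lines do not match and are skipped.
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<version>\S+)\s+"
    r"(?P<available>True|False|Unknown)\s+"
    r"(?P<progressing>True|False|Unknown)\s+"
    r"(?P<degraded>True|False|Unknown)\s+"
    r"(?P<since>\S+)\s*(?P<message>.*)$"
)

UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def age_seconds(age):
    """Convert an age such as '95m' or '7m52s' into seconds."""
    return sum(int(n) * UNIT_SECONDS[u] for n, u in re.findall(r"(\d+)([dhms])", age))


def unhealthy_operators(text):
    """Return (name, since, message) for operators that are not Available
    or are Degraded, with the longest-standing state first."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m is None:
            continue
        if m["available"] != "True" or m["degraded"] == "True":
            rows.append((m["name"], m["since"], m["message"]))
    # Largest SINCE first: assumed here to be the earliest state change.
    rows.sort(key=lambda r: age_seconds(r[1]), reverse=True)
    return rows
```

Fed John's table above (one operator per line), this would list insights (123m) first, then storage (96m) and authentication (95m), with console, image-registry and ingress (36m) last. The insights entry looks like an unrelated token problem, so storage and authentication are probably the ones to look at first.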
@Yu I am able to reproduce what @John pointed to, and I have also asked the SNO team to take a look. We can remove the MCO component, but I am not sure which component we should target instead :(
This shouldn't be a release blocker, since this bug is not known to impact regular OCP or Single Node OpenShift clusters.
@Rom I have to test this with rc0 or the latest nightly, because this is something John observed on the SNO side. I will update this bug with the results.
I tested this with the latest 4.9 nightly and didn't encounter the issue anymore. Closing it; will reopen if it appears again.