Description of problem:
Set up an SNO cluster with template ipi-on-aws/versioned-installer-customer_vpc-disconnected-sno-ci and enable CCM from the start. Cluster installation failed, and the ingress and authentication cluster operators are degraded.

Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          3h3m    Unable to apply 4.9.0-0.nightly-2021-09-06-055314: some cluster operators have not yet rolled out

How reproducible:
always

Steps to Reproduce:
1. Set up an SNO cluster with template ipi-on-aws/versioned-installer-customer_vpc-disconnected-sno-ci and enable CCM from the start:

cat <<EOF > manifests/manifest_feature_gate.yaml
---
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  annotations:
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    release.openshift.io/create-only: "true"
  name: cluster
spec:
  featureSet: TechPreviewNoUpgrade
EOF

Actual results:
Cluster installation failed.

09-07 14:25:11.350 level=debug msg=Still waiting for the cluster to initialize: Working towards 4.9.0-0.nightly-2021-09-06-055314: 713 of 734 done (97% complete)
09-07 14:26:48.048 level=debug msg=Still waiting for the cluster to initialize: Some cluster operators are still updating: authentication, console
09-07 14:58:19.736 level=error msg=Cluster operator authentication Degraded is True with OAuthServerRouteEndpointAccessibleController_SyncError::ProxyConfigController_SyncError: OAuthServerRouteEndpointAccessibleControllerDegraded: Get "https://oauth-openshift.apps.zhsun97s.qe.devcluster.openshift.com/healthz": EOF
09-07 14:58:19.736 level=error msg=ProxyConfigControllerDegraded: endpoint("https://oauth-openshift.apps.zhsun97s.qe.devcluster.openshift.com/healthz") is unreachable with proxy(Get "https://oauth-openshift.apps.zhsun97s.qe.devcluster.openshift.com/healthz": EOF) and without proxy(Get "https://oauth-openshift.apps.zhsun97s.qe.devcluster.openshift.com/healthz": context deadline exceeded)
09-07 14:58:19.736 level=info msg=Cluster operator authentication Available is False with OAuthServerRouteEndpointAccessibleController_EndpointUnavailable: OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.zhsun97s.qe.devcluster.openshift.com/healthz": EOF
09-07 14:58:19.736 level=info msg=Cluster operator baremetal Disabled is True with UnsupportedPlatform: Nothing to do on this Platform
09-07 14:58:19.736 level=info msg=Cluster operator console Progressing is True with SyncLoopRefresh_InProgress: SyncLoopRefreshProgressing: Working toward version 4.9.0-0.nightly-2021-09-06-055314, 0 replicas available
09-07 14:58:19.737 level=info msg=Cluster operator console Available is False with Deployment_InsufficientReplicas::RouteHealth_FailedGet: DeploymentAvailable: 0 replicas available for console deployment
09-07 14:58:19.737 level=info msg=RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.zhsun97s.qe.devcluster.openshift.com): Get "https://console-openshift-console.apps.zhsun97s.qe.devcluster.openshift.com": EOF
09-07 14:58:19.737 level=info msg=Cluster operator etcd RecentBackup is Unknown with ControllerStarted:
09-07 14:58:19.737 level=error msg=Cluster operator ingress Degraded is True with IngressDegraded: The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)
09-07 14:58:19.737 level=info msg=Cluster operator insights Disabled is True with Disabled: Health reporting is disabled
09-07 14:58:19.737 level=info msg=Cluster operator network ManagementStateDegraded is False with :
09-07 14:58:19.737 level=error msg=Cluster initialization failed because one or more operators are not functioning properly.
09-07 14:58:19.737 level=error msg=The cluster should be accessible for troubleshooting as detailed in the documentation linked below,
09-07 14:58:19.737 level=error msg=https://docs.openshift.com/container-platform/latest/support/troubleshooting/troubleshooting-installations.html
09-07 14:58:19.737 level=error msg=The 'wait-for install-complete' subcommand can then be used to continue the installation
09-07 14:58:19.738 level=fatal msg=failed to initialize the cluster: Some cluster operators are still updating: authentication, console

$ oc get node
NAME                                        STATUS   ROLES           AGE    VERSION
ip-10-0-57-132.us-east-2.compute.internal   Ready    master,worker   179m   v1.22.0-rc.0+75ee307

$ oc get co
NAME                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication             4.9.0-0.nightly-2021-09-06-055314   False       False         True       179m    OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.zhsun97s.qe.devcluster.openshift.com/healthz": EOF
baremetal                  4.9.0-0.nightly-2021-09-06-055314   True        False         False      176m
cloud-controller-manager   4.9.0-0.nightly-2021-09-06-055314   True        False         False      3h
cloud-credential           4.9.0-0.nightly-2021-09-06-055314   True        False         False      3h2m
cluster-autoscaler         4.9.0-0.nightly-2021-09-06-055314   True        False         False      176m
config-operator            4.9.0-0.nightly-2021-09-06-055314   True        False         False      178m
console                    4.9.0-0.nightly-2021-09-06-055314   False       True          False      170m    DeploymentAvailable: 0 replicas available for console deployment RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.zhsun97s.qe.devcluster.openshift.com): Get "https://console-openshift-console.apps.zhsun97s.qe.devcluster.openshift.com": EOF
csi-snapshot-controller    4.9.0-0.nightly-2021-09-06-055314   True        False         False      178m
dns                        4.9.0-0.nightly-2021-09-06-055314   True        False         False      176m
etcd                       4.9.0-0.nightly-2021-09-06-055314   True        False         False      177m
image-registry             4.9.0-0.nightly-2021-09-06-055314   True        False         False      174m
ingress                    4.9.0-0.nightly-2021-09-06-055314   True        False         True       170m    The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)

Expected results:
Cluster installation succeeds.

Additional info:
must-gather: http://file.rdu.redhat.com/~zhsun/must-gather.local.8824229733135208710.tar.gz
kubeconfig: https://mastern-jenkins-csb-openshift-qe.apps.ocp4.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/40558/artifact/workdir/install-dir/auth/kubeconfig/*view*/
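
For anyone triaging a similar install failure, the cluster operator status reported above can also be checked programmatically. The following is a minimal sketch, not part of the original report: it assumes the output of `oc get clusteroperators -o json` is passed in as a string, and the function name `unhealthy_operators`, the helper `_co`, and the `SAMPLE` data (condensed by hand from the table in this report) are all hypothetical.

```python
import json


def unhealthy_operators(co_list_json):
    """Return {operator name: [reasons]} for operators that are
    not Available or are Degraded, given `oc get co -o json` output."""
    problems = {}
    for item in json.loads(co_list_json).get("items", []):
        name = item["metadata"]["name"]
        # Map each condition type (Available, Degraded, ...) to its status string.
        conds = {
            c["type"]: c["status"]
            for c in item.get("status", {}).get("conditions", [])
        }
        reasons = []
        if conds.get("Available") != "True":
            reasons.append("Available=" + conds.get("Available", "Unknown"))
        if conds.get("Degraded") == "True":
            reasons.append("Degraded=True")
        if reasons:
            problems[name] = reasons
    return problems


def _co(name, available, degraded):
    """Build a minimal ClusterOperator-shaped dict (illustrative only)."""
    return {
        "metadata": {"name": name},
        "status": {
            "conditions": [
                {"type": "Available", "status": available},
                {"type": "Degraded", "status": degraded},
            ]
        },
    }


# Hand-built sample mirroring four rows of the table in this report.
SAMPLE = json.dumps({
    "items": [
        _co("authentication", "False", "True"),
        _co("console", "False", "False"),
        _co("dns", "True", "False"),
        _co("ingress", "True", "True"),
    ]
})

if __name__ == "__main__":
    for name, reasons in unhealthy_operators(SAMPLE).items():
        print(name, ", ".join(reasons))
```

With the sample data, only authentication, console, and ingress are flagged, which matches the operators named in the installer log.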
Verified.
clusterversion: 4.10.0-0.nightly-2021-09-15-220746

$ oc get node
NAME                                        STATUS   ROLES           AGE    VERSION
ip-10-0-65-233.us-east-2.compute.internal   Ready    master,worker   173m   v1.22.0-rc.0+75ee307

$ oc get featuregate cluster -o yaml
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  annotations:
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    release.openshift.io/create-only: "true"
  creationTimestamp: "2021-09-16T02:41:16Z"
  generation: 1
  name: cluster
  resourceVersion: "993"
  uid: 80c405d9-e8be-4193-a2a9-2b1b27be6264
spec:
  featureSet: TechPreviewNoUpgrade

sh-4.4# cat /etc/systemd/system/kubelet.service
--cloud-provider=external \

$ oc describe po kube-controller-manager-ip-10-0-65-233.us-east-2.compute.internal -n openshift-kube-controller-manager | grep cloud-provider -C 20
--cloud-provider=external
--requestheader-client-ca-file=/etc/kubernetes/static-pod-certs/configmaps/aggregator-client-ca/ca-bundle.crt
-v=2
--tls-cert-file=/etc/kubernetes/static-pod-resources/secrets/serving-cert/tls.crt
--tls-private-key-file=/etc/kubernetes/static-pod-resources/secrets/serving-cert/tls.key
--allocate-node-cidrs=false
--cert-dir=/var/run/kubernetes
--cloud-provider=external
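
The verification above confirms by eye that both the kubelet unit and the kube-controller-manager pod carry `--cloud-provider=external`. As a rough sketch of the same check in script form (the function names are hypothetical, and the input is assumed to be captured text such as the unit file contents or the `oc describe` output):

```python
import re

# Matches the flag value up to whitespace or a trailing line-continuation backslash.
FLAG_RE = re.compile(r"--cloud-provider=([^\s\\]+)")


def cloud_provider_values(text):
    """Return every value given to --cloud-provider in the text, in order."""
    return FLAG_RE.findall(text)


def uses_external_cloud_provider(text):
    """True only if the flag is present and every occurrence is 'external'."""
    values = cloud_provider_values(text)
    return bool(values) and all(v == "external" for v in values)


if __name__ == "__main__":
    kubelet_line = "--cloud-provider=external \\"
    print(uses_external_cloud_provider(kubelet_line))
```

Requiring every occurrence to be `external` is a deliberate choice here, since the kube-controller-manager arguments in this report repeat the flag.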
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056