Bug 1856425

Summary: OCP 4.6.x installation fails because the authentication operator goes into a degraded state.
Product: OpenShift Container Platform
Reporter: pmali
Component: apiserver-auth
Assignee: Maru Newby <mnewby>
Status: CLOSED DUPLICATE
QA Contact: pmali
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.6
CC: aos-bugs, mfojtik
Target Milestone: ---
Target Release: ---
Hardware: All
OS: All
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-07-13 16:05:03 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description pmali 2020-07-13 15:38:32 UTC
Description of problem:

Installation of version 4.6 fails most of the time because the authentication operator is in a degraded state, and even kubeadmin cannot log in. The installer reports the following errors:

~~~~~
level=error msg="Cluster operator authentication Degraded is True with IngressStateEndpoints_MissingSubsets: IngressStateEndpointsDegraded: No subsets found for the endpoints of oauth-server"
level=info msg="Cluster operator authentication Progressing is True with OAuthServerDeployment_ReplicasNotAvailable::OAuthVersionDeployment_ReplicasNotAvailable::OAuthVersionRoute_WaitingForRoute: OAuthVersionDeploymentProgressing: Waiting for 2 replicas of OAuth server to be available\nOAuthVersionRouteProgressing: Request to \"https://oauth-openshift.apps.pmali13.qe.azure.devcluster.openshift.com/healthz\" not successfull yet\nOAuthServerDeploymentProgressing: Waiting for 2 replicas of OAuth server to be available"
level=info msg="Cluster operator authentication Available is False with OAuthVersionRoute_RequestFailed: OAuthVersionRouteAvailable: HTTP request to \"https://oauth-openshift.apps.pmali13.qe.azure.devcluster.openshift.com/healthz\" failed: EOF"
level=error msg="Cluster operator console Degraded is True with RouteHealth_StatusError: RouteHealthDegraded: route not yet available, https://console-openshift-console.apps.pmali13.qe.azure.devcluster.openshift.com/health returns '503 Service Unavailable'"
level=info msg="Cluster operator console Progressing is True with SyncLoopRefresh_InProgress: SyncLoopRefreshProgressing: Working toward version 4.6.0-0.nightly-2020-07-12-232219"
level=info msg="Cluster operator console Available is False with Deployment_InsufficientReplicas: DeploymentAvailable: 0 pods available for console deployment"
level=info msg="Cluster operator insights Disabled is False with AsExpected: "
level=fatal msg="failed to initialize the cluster: Cluster operator console is reporting a failure: RouteHealthDegraded: route not yet available, https://console-openshift-console.apps.pmali13.qe.azure.devcluster.openshift.com/health returns '503 Service Unavailable'"
~~~~~
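
For triage, the Degraded conditions can be pulled out of installer output like the above with a small filter. This is a minimal sketch, not part of the original report: the function name `degraded_operators` is hypothetical, and it assumes the log lines keep the `msg="Cluster operator <name> Degraded is True with <reason>: ..."` shape shown above.

```shell
# Hypothetical helper (not an installer feature): read installer log lines
# on stdin and print "<operator> <reason>" for each operator whose
# Degraded condition is True. Assumes the msg="..." layout shown above.
degraded_operators() {
  sed -n 's/.*msg="Cluster operator \([^ ]*\) Degraded is True with \([^:]*\):.*/\1 \2/p'
}

# Example usage, assuming the installer output was saved to a file named
# installer.log (hypothetical path):
#   degraded_operators < installer.log
```

Run against the excerpt above, the filter would report only the authentication and console operators, since those are the only lines carrying `Degraded is True`.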

$ oc get pod -n openshift-authentication 
NAME                               READY   STATUS              RESTARTS   AGE
oauth-openshift-548dff5775-5brw4   0/1     ContainerCreating   0          4h2m
oauth-openshift-548dff5775-klg2k   0/1     ContainerCreating   0          4h2m

$ oc describe -n openshift-authentication pod oauth-openshift-548dff5775-klg2k
...
    Mounts:
      /var/config/system/configmaps/v4-0-config-system-cliconfig from v4-0-config-system-cliconfig (ro)
      /var/config/system/configmaps/v4-0-config-system-service-ca from v4-0-config-system-service-ca (ro)
      /var/config/system/configmaps/v4-0-config-system-trusted-ca-bundle from v4-0-config-system-trusted-ca-bundle (ro)
      /var/config/system/secrets/v4-0-config-system-ocp-branding-template from v4-0-config-system-ocp-branding-template (ro)
      /var/config/system/secrets/v4-0-config-system-router-certs from v4-0-config-system-router-certs (ro)
      /var/config/system/secrets/v4-0-config-system-serving-cert from v4-0-config-system-serving-cert (ro)
      /var/config/system/secrets/v4-0-config-system-session from v4-0-config-system-session (ro)
      /var/config/user/template/secret/v4-0-config-user-template-error from v4-0-config-user-template-error (ro)
      /var/config/user/template/secret/v4-0-config-user-template-login from v4-0-config-user-template-login (ro)
      /var/config/user/template/secret/v4-0-config-user-template-provider-selection from v4-0-config-user-template-provider-selection (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from oauth-openshift-token-srhvp (ro)

...

Events:
  Type     Reason       Age                      From                             Message
  ----     ------       ----                     ----                             -------
  Warning  FailedMount  8m56s (x125 over 3h40m)  kubelet, pmali13-xb7wt-master-0  (combined from similar events): Unable to attach or mount volumes: unmounted volumes=[v4-0-config-system-cliconfig], unattached volumes=[v4-0-config-user-template-login v4-0-config-system-trusted-ca-bundle v4-0-config-system-session v4-0-config-user-template-provider-selection v4-0-config-system-router-certs v4-0-config-system-ocp-branding-template v4-0-config-system-serving-cert v4-0-config-user-template-error oauth-openshift-token-srhvp v4-0-config-system-cliconfig v4-0-config-system-service-ca]: timed out waiting for the condition
  Warning  FailedMount  3m9s (x102 over 3h59m)   kubelet, pmali13-xb7wt-master-0  MountVolume.SetUp failed for volume "v4-0-config-system-cliconfig" : configmap "v4-0-config-system-cliconfig" not found
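
The FailedMount events above name the missing object directly. As a rough sketch (the function name `missing_configmaps` is hypothetical, and it assumes the kubelet message keeps the `configmap "<name>" not found` wording shown above), the names can be extracted from `oc describe` output like this:

```shell
# Hypothetical helper: read `oc describe pod` output on stdin and print
# each configmap that a FailedMount event reports as not found, once.
missing_configmaps() {
  sed -n 's/.*configmap "\([^"]*\)" not found.*/\1/p' | sort -u
}

# Example usage against the pod from this report:
#   oc describe -n openshift-authentication pod oauth-openshift-548dff5775-klg2k | missing_configmaps
# Each printed name can then be checked with:
#   oc get configmap -n openshift-authentication <name>
```

For the events shown here the only name extracted is `v4-0-config-system-cliconfig`, which matches the single entry listed under `unmounted volumes`.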


Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-07-12-232219

How reproducible:
Often

Steps to Reproduce:
1. Start an installation with a 4.6.x nightly build (4.6.x.nightly.*).
2. Once the installer reports the failure, inspect the OAuth server pods:
export KUBECONFIG=path/to/kubeconfig
oc get pod ...
oc describe pod

Actual results:
1. Installation fails.

Expected results:
1. Installation should not fail. 

Additional info:

Comment 1 Maru Newby 2020-07-13 16:05:03 UTC

*** This bug has been marked as a duplicate of bug 1856316 ***