Bug 1711126 - Failing Install due to Authentication Operator not deploying
Summary: Failing Install due to Authentication Operator not deploying
Keywords:
Status: CLOSED DUPLICATE of bug 1711127
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: apiserver-auth
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Erica von Buelow
QA Contact: Chuan Yu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-05-17 03:17 UTC by Eric Rich
Modified: 2019-05-17 08:02 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-05-17 08:02:45 UTC
Target Upstream Version:
Embargoed:



Description Eric Rich 2019-05-17 03:17:49 UTC
Description of problem: The Authentication Operator fails to deploy or start up.

Version-Release number of selected component (if applicable):

$ oc --config test_cluster/auth/kubeconfig get  clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-rc.0   False       True          92m     Unable to apply 4.1.0-rc.0: some cluster operators have not yet rolled out

How reproducible: Only seen this once

Steps to Reproduce:
1. https://docs.openshift.com/container-platform/4.1/installing/installing_bare_metal/installing-bare-metal.html#installation-approve-csrs_installing-bare-metal

Actual results:

> E0517 03:07:03.918897       1 controller.go:129] {🐼 🐼} failed with: error checking current version: unable to check route health: failed to GET route: net/http: TLS handshake timeout
> I0517 03:07:03.920730       1 status_controller.go:159] clusteroperator/authentication diff {"status":{"conditions":[{"lastTransitionTime":"2019-05-17T03:07:03Z","message":"Failing: error checking current version: unable to check route health: failed to GET route: net/http: TLS handshake timeout","reason":"AsExpected","status":"False","type":"Failing"},{"lastTransitionTime":"2019-05-17T01:48:36Z","reason":"NoData","status":"Unknown","type":"Progressing"},{"lastTransitionTime":"2019-05-17T01:48:36Z","reason":"NoData","status":"Unknown","type":"Available"},{"lastTransitionTime":"2019-05-17T01:48:36Z","reason":"NoData","status":"Unknown","type":"Upgradeable"}]}}
> I0517 03:07:03.927443       1 event.go:221] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-authentication-operator", Name:"openshift-authentication-operator", UID:"cb20bece-7844-11e9-9275-525400fb04b2", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for operator authentication changed: Failing changed from True to False ("Failing: error checking current version: unable to check route health: failed to GET route: net/http: TLS handshake timeout")
> E0517 03:07:13.990209       1 controller.go:129] {🐼 🐼} failed with: error checking current version: unable to check route health: failed to GET route: net/http: TLS handshake timeout

Expected results: The authentication operator should tell you what it's failing to connect to (what endpoint it's trying to reach). 
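
To illustrate the kind of message being asked for, here is a minimal sketch of a health check whose error names the endpoint it tried to reach. This is not the operator's code (the operator is written in Go); the names `check_route_health`, `RouteHealthError`, and `default_fetch`, and the URL in the usage note, are hypothetical and exist only for this example.

```python
import urllib.request
from typing import Callable


class RouteHealthError(Exception):
    """Raised when a route health check fails; the message includes the endpoint."""


def default_fetch(url: str, timeout: float) -> int:
    # Plain standard-library GET; returns the HTTP status code.
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def check_route_health(
    url: str,
    fetch: Callable[[str, float], int] = default_fetch,
    timeout: float = 10.0,
) -> None:
    """GET the route and raise an error that says which URL failed."""
    try:
        status = fetch(url, timeout)
    except OSError as err:
        # Timeouts, TLS failures, and urllib errors are all OSError subclasses.
        # The URL is placed in the message so the log shows what was unreachable.
        raise RouteHealthError(
            f"unable to check route health: failed to GET route {url}: {err}"
        ) from err
    if status != 200:
        raise RouteHealthError(
            f"unable to check route health: GET {url} returned HTTP {status}"
        )
```

With a check like this, a failure against a placeholder address such as https://oauth.example.invalid/healthz would be reported with that address in the message, rather than only "TLS handshake timeout".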

Additional info:

This operator's failed deployment seems to be holding up the console deployment:

I0517 03:15:26.663474       1 option.go:62] Console: handling update openshift-console/console: /apis/route.openshift.io/v1/namespaces/openshift-console/routes/console
time="2019-05-17T03:15:26Z" level=info msg="started syncing operator \"cluster\" (2019-05-17 03:15:26.671787061 +0000 UTC m=+1242.374271175)"
time="2019-05-17T03:15:26Z" level=info msg="console is in a managed state."
time="2019-05-17T03:15:26Z" level=info msg="running sync loop 4.0.0"
time="2019-05-17T03:15:26Z" level=info msg="validating console route..."
time="2019-05-17T03:15:26Z" level=info msg="route ingress 'default' found and admitted, host: console-openshift-console.apps.thoran.dwarf.mine \n"
time="2019-05-17T03:15:26Z" level=info msg="route exists and is in the correct state"
time="2019-05-17T03:15:26Z" level=info msg="validating console service..."
time="2019-05-17T03:15:26Z" level=info msg="service exists and is in the correct state"
time="2019-05-17T03:15:26Z" level=info msg="validating console configmap..."
time="2019-05-17T03:15:26Z" level=info msg="generated console config yaml:"
time="2019-05-17T03:15:26Z" level=info msg="apiVersion: console.openshift.io/v1\nauth:\n  clientID: console\n  clientSecretFile: /var/oauth-config/clientSecret\n  logoutRedirect: \"\"\n  oauthEndpointCAFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt\nclusterInfo:\n  consoleBaseAddress: https://console-openshift-console.apps.thoran.dwarf.mine\n  consoleBasePath: \"\"\n  masterPublicURL: https://api.thoran.dwarf.mine:6443\ncustomization:\n  branding: ocp\n  documentationBaseURL: https://docs.openshift.com/container-platform/4.1/\nkind: ConsoleConfig\nservingInfo:\n  bindAddress: https://0.0.0.0:8443\n  certFile: /var/serving-cert/tls.crt\n  keyFile: /var/serving-cert/tls.key\n \n"
time="2019-05-17T03:15:26Z" level=info msg="configmap exists and is in the correct state"
time="2019-05-17T03:15:26Z" level=info msg="validating service-ca configmap..."
time="2019-05-17T03:15:26Z" level=info msg="service-ca configmap exists and is in the correct state"
time="2019-05-17T03:15:26Z" level=info msg="validating oauth secret..."
time="2019-05-17T03:15:26Z" level=info msg="secret exists and is in the correct state"
time="2019-05-17T03:15:26Z" level=info msg="validating oauthclient..."
time="2019-05-17T03:15:26Z" level=info msg="route ingress 'default' found and admitted, host: console-openshift-console.apps.thoran.dwarf.mine \n"
time="2019-05-17T03:15:26Z" level=info msg="oauthclient exists and is in the correct state"
time="2019-05-17T03:15:26Z" level=info msg="validating console deployment..."
time="2019-05-17T03:15:26Z" level=info msg="deployment exists and is in the correct state"
time="2019-05-17T03:15:26Z" level=info msg=-----------------------
time="2019-05-17T03:15:26Z" level=info msg="sync loop 4.0.0 resources updated: false \n"
time="2019-05-17T03:15:26Z" level=info msg=-----------------------
time="2019-05-17T03:15:26Z" level=info msg="deployment is available: false \n"
time="2019-05-17T03:15:26Z" level=info msg="sync_v400: updating console status"
time="2019-05-17T03:15:26Z" level=info msg="route ingress 'default' found and admitted, host: console-openshift-console.apps.thoran.dwarf.mine \n"
time="2019-05-17T03:15:26Z" level=info msg="updating console.config.openshift.io with hostname: console-openshift-console.apps.thoran.dwarf.mine \n"
time="2019-05-17T03:15:26Z" level=info msg="sync loop 4.0.0 complete:"
time="2019-05-17T03:15:26Z" level=info msg="\t service changed: false"
time="2019-05-17T03:15:26Z" level=info msg="\t route changed: false"
time="2019-05-17T03:15:26Z" level=info msg="\t configMap changed: false"
time="2019-05-17T03:15:26Z" level=info msg="\t secret changed: false"
time="2019-05-17T03:15:26Z" level=info msg="\t oauth changed: false"
time="2019-05-17T03:15:26Z" level=info msg="\t deployment changed: false"
time="2019-05-17T03:15:26Z" level=info msg=Operator.Status.Conditions
time="2019-05-17T03:15:26Z" level=info msg="Status.Condition.UnsupportedConfigOverridesUpgradeable: True"
time="2019-05-17T03:15:26Z" level=info msg="Status.Condition.Available: False | (NoPodsAvailable) No pods available for console deployment."
time="2019-05-17T03:15:26Z" level=info msg="Status.Condition.Progressing: True | (SyncLoopProgressing) Moving to version 4.1.0"
time="2019-05-17T03:15:26Z" level=info msg="Status.Condition.Failing: False"
time="2019-05-17T03:15:26Z" level=info msg="finished syncing operator \"cluster\" (341.775µs) \n\n"


> tail -f test_cluster/.openshift_install.log
> time="2019-05-16T22:42:54-04:00" level=info msg="Waiting up to 30m0s for the cluster at https://api.thoran.dwarf.mine:6443 to initialize..."
> time="2019-05-16T22:42:54-04:00" level=debug msg="Still waiting for the cluster to initialize: Some cluster operators are still updating: authentication, console, image-registry"
> time="2019-05-16T22:43:14-04:00" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.1.0-rc.0: 98% complete"
> time="2019-05-16T22:46:29-04:00" level=debug msg="Still waiting for the cluster to initialize: Some cluster operators are still updating: authentication, console"
> time="2019-05-16T23:12:54-04:00" level=fatal msg="failed to initialize the cluster: Some cluster operators are still updating: authentication, console: timed out waiting for the condition"

Comment 1 Xiaoli Tian 2019-05-17 08:02:45 UTC

*** This bug has been marked as a duplicate of bug 1711127 ***

