The console regresses to being "unavailable" after the CVO completes install.

fail [github.com/openshift/origin/test/extended/operators/operators.go:164]: Mar 28 21:06:31.162: Some cluster operators never became available <nil>/console

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_installer/1488/pull-ci-openshift-installer-master-e2e-aws/4766/
We can't connect to well-known:

2019/03/28 21:06:03 auth: error contacting auth provider (retrying in 10s): discovery through endpoint https://172.30.0.1:443/.well-known/oauth-authorization-server failed: 404 Not Found

cc Mo
This should be fixed if we create separate readiness and liveness probes for the console.
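Purely as an illustration of that idea (not the shipped configuration), a minimal sketch of split probes using the k8s.io/api types the operator already vendors. The /health endpoint on port 8443 comes from the probe events quoted later in this bug; the separate /ready path is an assumption.

package probes

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// consoleProbes returns separate liveness and readiness probes for the
// console container. Liveness only checks that the process is serving
// (/health on 8443, as in the pod events quoted below); readiness uses a
// hypothetical /ready path that could also gate on OAuth discovery
// succeeding, so an unready auth stack keeps the pod out of rotation
// without restarting it.
//
// NOTE: field names assume the k8s.io/api release vendored in the 4.1 era,
// where Probe embeds Handler; newer releases use ProbeHandler instead.
func consoleProbes() (liveness, readiness *corev1.Probe) {
	liveness = &corev1.Probe{
		Handler: corev1.Handler{
			HTTPGet: &corev1.HTTPGetAction{
				Path:   "/health",
				Port:   intstr.FromInt(8443),
				Scheme: corev1.URISchemeHTTPS,
			},
		},
		InitialDelaySeconds: 30,
		PeriodSeconds:       10,
		FailureThreshold:    3,
	}
	readiness = &corev1.Probe{
		Handler: corev1.Handler{
			HTTPGet: &corev1.HTTPGetAction{
				// Hypothetical endpoint, not something the console serves today.
				Path:   "/ready",
				Port:   intstr.FromInt(8443),
				Scheme: corev1.URISchemeHTTPS,
			},
		},
		PeriodSeconds:    10,
		FailureThreshold: 3,
	}
	return liveness, readiness
}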
On further investigation, this may be in the operator logic; I'll take the bug back.
We also usually hit the following error; at the moment the console cannot be accessed. It seems the console is waiting for the well-known data from the authentication server to become ready. The console works fine after several minutes.

$ oc logs console-5484f684b9-t985l -n openshift-console
2019/04/2 06:49:07 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://openshift-authentication-openshift-authentication.apps.zitang-401.qe.devcluster.openshift.com/oauth/token failed: Head https://openshift-authentication-openshift-authentication.apps.zitang-401.qe.devcluster.openshift.com: EOF

And the pod's events show:

Readiness probe failed: Get https://10.128.0.65:8443/health: dial tcp 10.128.0.65:8443: connect: connection refused
(In reply to shahan from comment #4)
> We usually also met following error, atm console cannot be accessed atm. It
> seems that the console is waiting for well-known related stuff ready from
> the authentication server.
> The console will work well after several minutes.
> $ oc logs console-5484f684b9-t985l -n openshift-console
> 2019/04/2 06:49:07 auth: error contacting auth provider (retrying in 10s):
> request to OAuth issuer endpoint
> https://openshift-authentication-openshift-authentication.apps.zitang-401.qe.
> devcluster.openshift.com/oauth/token failed: Head
> https://openshift-authentication-openshift-authentication.apps.zitang-401.qe.
> devcluster.openshift.com: EOF

This is a normal, transient error during a new install. The console blocks until the OAuth server is ready. Are you seeing it go back into this state after an install completes successfully, or only when first installing?
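For context, a minimal sketch (not the console's actual code) of the kind of discovery retry the console performs; the 10-second interval and the endpoint URL come from the logs quoted in this bug, everything else is illustrative:

package main

import (
	"fmt"
	"net/http"
	"time"
)

// waitForOAuthDiscovery polls the OAuth discovery endpoint until it answers
// with 200, mirroring the "retrying in 10s" messages in the console log.
// TLS and trust handling are omitted; the real console wires in the cluster CA.
func waitForOAuthDiscovery(url string) error {
	for {
		resp, err := http.Get(url)
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return nil
			}
			err = fmt.Errorf("discovery through endpoint %s failed: %s", url, resp.Status)
		}
		fmt.Printf("auth: error contacting auth provider (retrying in 10s): %v\n", err)
		time.Sleep(10 * time.Second)
	}
}

func main() {
	// Endpoint taken from the logs above; inside the cluster this is the
	// kubernetes.default service IP.
	_ = waitForOAuthDiscovery("https://172.30.0.1:443/.well-known/oauth-authorization-server")
}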
I hit this issue after an install completed successfully. After a long time (maybe more than ten minutes; I cannot remember the exact duration), the console came back. If it is not an issue in the console, I think it might be in the OAuth server. In short, this is not the expected behavior.
OK, you could be seeing a symptom of this bug. I believe the console is reporting Available before it's really ready in some circumstances.
New instance of this [1]. Reconstructed timeline:

18:56:45.342354  kube-apiserver Available
18:58:36.101073  console Available, but "route ingress not yet ready for console"
18:59:04.395264  console not Available, "Operator sync loop failed to complete" and "No pods available for console deployment"
18:59:14         console-7485df4cdb-kdz85 started
18:59:19         console-7485df4cdb-cpj52 started
19:01:11.071431  authentication not Available, OAuth 404
19:01:13.248047  authentication Available
19:01:14         Install complete
19:02:10         console-7485df4cdb-kdz85 killed, OAuth certificate signed by unknown authority
19:02:10         console-7485df4cdb-kdz85 restarted
19:02:32.179     failing test, console unavailable
19:02:41.006344  console Available
19:02:57         console-7485df4cdb-cpj52 killed, OAuth certificate signed by unknown authority and 404s
19:02:59         console-7485df4cdb-cpj52 restarted
19:03:01.265096  authentication not Available, OAuth 404
19:03:03.252500  authentication Available
19:03:05.312084  authentication not Available, OAuth 404
19:03:07.279044  authentication Available

The CVO stops checking an operator during bootstrap once it claims Available and sets a version, so the early flap means we didn't block "Install complete". Fixing the OAuth issues would be nice too, but there is still a bug in the console operator's reporting.

[1]: https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_installer/1646/pull-ci-openshift-installer-master-e2e-aws/5422
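To make that gating explicit, here is a hypothetical, heavily simplified sketch of the rule described above (the real logic lives in the cluster-version-operator and is more involved): once an operator reports Available=True at the expected version, the CVO stops waiting on it, so a later flap no longer blocks the install.

package main

import "fmt"

// Simplified, hypothetical model of a ClusterOperator's reported state.
type clusterOperator struct {
	availableStatus string   // "True", "False", or "Unknown"
	versions        []string // versions the operator claims to be running
}

// bootstrapSatisfied mirrors the rule described in this comment: the CVO
// considers the operator done during bootstrap once it is Available=True at
// the expected version. After that the CVO moves on, so a later flap back to
// Available=False no longer holds up "Install complete".
func bootstrapSatisfied(co clusterOperator, wantVersion string) bool {
	if co.availableStatus != "True" {
		return false
	}
	for _, v := range co.versions {
		if v == wantVersion {
			return true
		}
	}
	return false
}

func main() {
	// At 18:58:36 in the timeline above the console claimed Available=True,
	// which is enough for the CVO even though it flapped shortly afterwards.
	// The version string is a placeholder.
	console := clusterOperator{availableStatus: "True", versions: []string{"4.1.0-0.ci"}}
	fmt.Println(bootstrapSatisfied(console, "4.1.0-0.ci")) // true
}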
The error seems to be occurring here. We incorrectly sync status twice. The second request fails because the object has been modified. Somewhere Available is defaulted to True even though we're not explicitly setting it.

time="2019-04-17T18:58:36Z" level=info msg=Operator.Status.Conditions
time="2019-04-17T18:58:36Z" level=info msg="Status.Condition.UnsupportedConfigOverridesUpgradeable: True"
time="2019-04-17T18:58:36Z" level=info msg="Status.Condition.Available: Unknown | (SyncError) \"oauth\": route ingress not yet ready for console\n"
time="2019-04-17T18:58:36Z" level=info msg="Status.Condition.Progressing: True | (SyncError) \"oauth\": route ingress not yet ready for console\n"
time="2019-04-17T18:58:36Z" level=info msg="Status.Condition.Failing: True | (SyncError) \"oauth\": route ingress not yet ready for console\n"
I0417 18:58:36.093332 1 option.go:62] Console: handling update /cluster: /apis/operator.openshift.io/v1/consoles/cluster
time="2019-04-17T18:58:36Z" level=info msg=Operator.Status.Conditions
time="2019-04-17T18:58:36Z" level=info msg="Status.Condition.UnsupportedConfigOverridesUpgradeable: True"
time="2019-04-17T18:58:36Z" level=info msg="Status.Condition.Failing: True | (SyncLoopError) Operator sync loop failed to complete."
I0417 18:58:36.093775 1 status_controller.go:159] clusteroperator/console diff {"status":{"conditions":[{"lastTransitionTime":"2019-04-17T18:58:36Z","message":"Failing: \"oauth\": route ingress not yet ready for console\nFailing: ","reason":"AsExpected","status":"False","type":"Failing"},{"lastTransitionTime":"2019-04-17T18:58:36Z","message":"Progressing: \"oauth\": route ingress not yet ready for console\nProgressing: ","reason":"ProgressingSyncError","status":"True","type":"Progressing"},{"lastTransitionTime":"2019-04-17T18:58:36Z","message":"Available: \"oauth\": route ingress not yet ready for console\nAvailable: ","reason":"AsExpected","status":"True","type":"Available"},{"lastTransitionTime":"2019-04-17T18:58:34Z","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
time="2019-04-17T18:58:36Z" level=error msg="status update error: Operation cannot be fulfilled on consoles.operator.openshift.io \"cluster\": the object has been modified; please apply your changes to the latest version and try again \n"
time="2019-04-17T18:58:36Z" level=info msg="finished syncing operator \"cluster\" (35.247µs) \n\n"
E0417 18:58:36.098020 1 controller.go:130] {🐼 🐼} failed with: route ingress not yet ready for console
I0417 18:58:36.101073 1 event.go:221] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-console-operator", Name:"console-operator", UID:"bf4421e7-6142-11e9-ad06-1204721f74ca", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for operator console changed: Failing changed from Unknown to False ("Failing: \"oauth\": route ingress not yet ready for console\nFailing: "),Progressing changed from Unknown to True ("Progressing: \"oauth\": route ingress not yet ready for console\nProgressing: "),Available changed from Unknown to True ("Available: \"oauth\": route ingress not yet ready for console\nAvailable: ")
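The actual fix is the PR below; purely to illustrate the idea, a sketch of writing an explicit Available condition on every sync (using the operator.openshift.io/v1 types) instead of leaving it Unknown and letting a later default or merge flip it to True. The setAvailable helper is hypothetical, not the operator's real code.

package sync

import (
	operatorv1 "github.com/openshift/api/operator/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// setAvailable always writes an explicit Available condition into the
// operator status, so a sync error such as "route ingress not yet ready for
// console" results in Available=False rather than an Unknown condition that
// something downstream defaults to True.
func setAvailable(status *operatorv1.OperatorStatus, available bool, reason, message string) {
	cond := operatorv1.OperatorCondition{
		Type:               operatorv1.OperatorStatusTypeAvailable,
		Status:             operatorv1.ConditionFalse,
		Reason:             reason,
		Message:            message,
		LastTransitionTime: metav1.Now(),
	}
	if available {
		cond.Status = operatorv1.ConditionTrue
	}
	for i := range status.Conditions {
		if status.Conditions[i].Type == cond.Type {
			// Preserve the transition time if the status value isn't changing.
			if status.Conditions[i].Status == cond.Status {
				cond.LastTransitionTime = status.Conditions[i].LastTransitionTime
			}
			status.Conditions[i] = cond
			return
		}
	}
	status.Conditions = append(status.Conditions, cond)
}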
https://github.com/openshift/console-operator/pull/195
4.1.0-0.nightly-2019-04-24-221323
console operator image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6d7c17581d4c264aa7a9d9c6374e57f1fc4d3d849cff532adc308795a85f582
io.openshift.build.commit.url=https://github.com/openshift/console-operator/commit/58eb2fd3a5938c662c904441ada55c925e31ceec

Checked on an OCP 4.0 env with the above image version; the console installation completed successfully and the console could be accessed. Checked the console pod log and the console operator pod log.

The console pod log shows auth errors:

$ oc logs pod/console-5c74978b4d-j8jp4 -n openshift-console --config=ui25.kubeconfig
2019/04/25 00:56:32 auth: error contacting auth provider (retrying in 10s): discovery through endpoint https://172.30.0.1:443/.well-known/oauth-authorization-server failed: 404 Not Found
2019/04/25 00:56:42 auth: error contacting auth provider (retrying in 10s): discovery through endpoint https://172.30.0.1:443/.well-known/oauth-authorization-server failed: 404 Not Found
2019/04/25 00:56:52 auth: error contacting auth provider (retrying in 10s): discovery through endpoint https://172.30.0.1:443/.well-known/oauth-authorization-server failed: 404 Not Found
2019/04/25 00:57:02 auth: error contacting auth provider (retrying in 10s): discovery through endpoint https://172.30.0.1:443/.well-known/oauth-authorization-server failed: 404 Not Found
2019/04/25 00:57:12 auth: error contacting auth provider (retrying in 10s): discovery through endpoint https://172.30.0.1:443/.well-known/oauth-authorization-server failed: 404 Not Found
2019/04/25 00:57:22 auth: error contacting auth provider (retrying in 10s): discovery through endpoint https://172.30.0.1:443/.well-known/oauth-authorization-server failed: 404 Not Found
2019/04/25 00:57:32 auth: error contacting auth provider (retrying in 10s): discovery through endpoint https://172.30.0.1:443/.well-known/oauth-authorization-server failed: 404 Not Found

Console operator pod log: http://pastebin.test.redhat.com/758248

@Ben, there is an auth error in the console log; is it expected? The fix PR https://github.com/openshift/console-operator/pull/195 is for the console operator, and the console operator pod log looks normal.
The auth error is expected and harmless as long as it can eventually connect. Several components are starting at once; this simply indicates the OAuth server isn't ready yet. The key thing to check for this bug is that the console operator doesn't report Available: True -> Available: False -> Available: True during initial install. Once it becomes available, it should stay that way. (The operator prints its status to its pod log.)
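As a convenience, a small sketch for eyeballing that: it reads an operator log on stdin and prints every change in the Available value, assuming the "Status.Condition.Available: ..." log format shown in the comments above (the filename checkflap.go and the exact parsing are illustrative, not part of the product).

// checkflap.go: print every transition of the Available condition found in
// a console-operator log, so a True -> False -> True flap is easy to spot.
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	const marker = "Status.Condition.Available:"
	last := ""
	scanner := bufio.NewScanner(os.Stdin)
	scanner.Buffer(make([]byte, 0, 1024*1024), 1024*1024) // log lines can be long
	for scanner.Scan() {
		line := scanner.Text()
		i := strings.Index(line, marker)
		if i < 0 {
			continue
		}
		// The value is the first word after the marker: "True", "False", or "Unknown".
		fields := strings.Fields(line[i+len(marker):])
		if len(fields) == 0 {
			continue
		}
		// Trim the closing quote that the msg="..." log format leaves attached.
		value := strings.Trim(fields[0], `"`)
		if value != last {
			fmt.Printf("Available: %s\n", value)
			last = value
		}
	}
}

Something like `oc logs deployment/console-operator -n openshift-console-operator | go run checkflap.go` should then show a single False -> True transition on a healthy install, matching the verification in the next comment.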
4.1.0-0.nightly-2019-04-25-121505
console operator image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6d7c17581d4c264aa7a9d9c6374e57f1fc4d3d849cff532adc308795a85f582
io.openshift.build.commit.url=https://github.com/openshift/console-operator/commit/58eb2fd3a5938c662c904441ada55c925e31ceec

Installed an OCP 4.0 env with the above version. Checked the console operator pod log from beginning to end; the status changed from False to True:

time="2019-04-28T02:40:17Z" level=info msg="Status.Condition.Available: False | (NoPodsAvailable) 0 pods available for console deployment"
time="2019-04-28T02:43:01Z" level=info msg="Status.Condition.Available: True"

The bug has been fixed, so moving it to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758