1699469 – Cluster operator console has not yet reported success: https://172.30.0.1:443/.well-known/oauth-authorization-server failed: 404 Not Found

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Master
Sub Component:
Version:	4.1.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Target Release:	---
Assignee:	Michal Fojtik
QA Contact:	Xingxing Xia
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-04-12 19:20 UTC by W. Trevor King
Modified:	2019-04-12 21:06 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-04-12 21:06:48 UTC
Target Upstream Version:
Embargoed:

Attachments
Failures related to this issue (565.51 KB, image/png) 2019-04-12 21:06 UTC, W. Trevor King	no flags

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift cluster-config-operator pull 43	0	'None'	closed	remove possibly failing CRs	2020-06-11 00:00:09 UTC

Description W. Trevor King 2019-04-12 19:20:10 UTC

Description of problem:

Around 2019-04-12T13:27Z today, CI e2e-aws success dropped to near 0% with errors like:

$ curl -s https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cluster-config-operator/34/pull-ci-openshift-cluster-config-operator-master-e2e-aws/148/ | grep 'Cluster operator console'
Apr 11 12:46:40.467 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
Apr 11 12:46:40.467 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
$ curl -s https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cluster-config-operator/34/pull-ci-openshift-cluster-config-operator-master-e2e-aws/154/ | grep 'Cluster operator console'
Apr 11 14:19:02.495 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
Apr 11 14:19:02.495 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
$ curl -s https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cluster-config-operator/34/pull-ci-openshift-cluster-config-operator-master-e2e-aws/159/ | grep 'Cluster operator console'
Apr 11 15:43:07.639 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
Apr 11 15:43:07.639 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
$ curl -s https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cluster-config-operator/34/pull-ci-openshift-cluster-config-operator-master-e2e-aws/164/ | grep 'Cluster operator console'
Apr 11 17:48:05.132 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
Apr 11 17:55:20.136 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
Apr 11 17:48:05.132 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
Apr 11 17:55:20.136 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
$ curl -s https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cluster-config-operator/34/pull-ci-openshift-cluster-config-operator-master-e2e-aws/167/ | grep 'Cluster operator console'
level=fatal msg="failed to initialize the cluster: Cluster operator console has not yet reported success: timed out waiting for the condition"

Comparing random jobs across the transition:

$ diff -u <(curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_console/1422/pull-ci-openshift-console-master-e2e-aws-console/695/artifacts/release-latest/release-payload-latest/image-references | sed 's|ci-[^/]*/stable|.../stable|') <(curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-4.0/6694/artifacts/release-images-latest/release-images-latest | sed 's|ci-[^/]*/stable|.../stable|') | grep -B3 io.openshift.build.source-location
-          "io.openshift.build.commit.id": "debd02db8f6c49aa0436d359310e42a32319c2e8",
+          "io.openshift.build.commit.id": "eed9a57bae31b26f4ed6dd323cc061173d8094ce",
          "io.openshift.build.commit.ref": "master",
          "io.openshift.build.source-location": "https://github.com/openshift/cluster-config-operator"
--
-          "io.openshift.build.commit.id": "df320f64d59b867d6fd0bdd77b5026d3c53083c8",
+          "io.openshift.build.commit.id": "46e1c20984d134cd04fcb046bc67ed0091edd56c",
          "io.openshift.build.commit.ref": "master",
          "io.openshift.build.source-location": "https://github.com/openshift/cluster-kube-controller-manager-operator"

Looking at those changes turned up the suspect [1], which has been partially rolled back in [2].  Hopefully that fixes CI.

[1]: https://github.com/openshift/cluster-config-operator/pull/34
[2]: https://github.com/openshift/cluster-config-operator/pull/43
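
The payload comparison above can also be done structurally rather than with diff and grep. The following is a minimal sketch, assuming each image-references document is JSON with a spec.tags list whose entries carry the two annotations shown in the diff output. The function names and the tag names in the sample data are hypothetical, and the sample documents are trimmed stand-ins that reuse only the cluster-config-operator commit IDs from the diff.

```python
import json

COMMIT = "io.openshift.build.commit.id"
SOURCE = "io.openshift.build.source-location"


def commits_by_repo(image_references):
    """Map each source repository URL to the set of commit IDs built from it."""
    commits = {}
    # Assumption: tags live under spec.tags, each with an "annotations" mapping.
    for tag in image_references.get("spec", {}).get("tags", []):
        annotations = tag.get("annotations") or {}
        repo = annotations.get(SOURCE)
        commit = annotations.get(COMMIT)
        if repo and commit:
            commits.setdefault(repo, set()).add(commit)
    return commits


def changed_repos(before, after):
    """Return {repo: (old_commits, new_commits)} for repos whose commits differ."""
    old, new = commits_by_repo(before), commits_by_repo(after)
    return {
        repo: (sorted(old.get(repo, ())), sorted(new.get(repo, ())))
        for repo in sorted(set(old) | set(new))
        if old.get(repo) != new.get(repo)
    }


if __name__ == "__main__":
    repo = "https://github.com/openshift/cluster-config-operator"
    # Hypothetical, trimmed documents; real payloads contain many more tags.
    before = json.loads(json.dumps({"spec": {"tags": [
        {"name": "example-tag", "annotations": {
            COMMIT: "debd02db8f6c49aa0436d359310e42a32319c2e8", SOURCE: repo}},
    ]}}))
    after = json.loads(json.dumps({"spec": {"tags": [
        {"name": "example-tag", "annotations": {
            COMMIT: "eed9a57bae31b26f4ed6dd323cc061173d8094ce", SOURCE: repo}},
    ]}}))
    for name, (old_ids, new_ids) in changed_repos(before, after).items():
        print(name, old_ids, "->", new_ids)
```

Grouping by source repository instead of by tag name lists each changed repository once, even when several images are built from it.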

Comment 1 W. Trevor King 2019-04-12 19:24:38 UTC

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-kube-scheduler-operator/97/pull-ci-openshift-cluster-kube-scheduler-operator-master-e2e-aws-operator/275/artifacts/e2e-aws-operator/pods/openshift-console_console-785b77b769-6hqmt_console_previous.log.gz | gunzip | head -n1
2019/04/12 16:39:55 auth: error contacting auth provider (retrying in 10s): discovery through endpoint https://172.30.0.1:443/.well-known/oauth-authorization-server failed: 404 Not Found
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-kube-scheduler-operator/97/pull-ci-openshift-cluster-kube-scheduler-operator-master-e2e-aws-operator/275/artifacts/e2e-aws-operator/pods/openshift-console_console-785b77b769-6hqmt_console.log.gz | gunzip | tail -n1
2019/04/12 16:50:12 auth: error contacting auth provider (retrying in 10s): discovery through endpoint https://172.30.0.1:443/.well-known/oauth-authorization-server failed: 404 Not Found

^ This is where the 404 messages in the subject come from.
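
For scanning many console pod logs for the same symptom, the failing endpoint and status can be pulled out of each line. This is a minimal sketch: the pattern is inferred only from the two log lines quoted above, so other console versions may format the message differently, and the function name is hypothetical.

```python
import re

# Pattern inferred from the two console log lines quoted in this comment.
DISCOVERY_FAILURE = re.compile(
    r"^(?P<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"auth: error contacting auth provider "
    r"\(retrying in (?P<retry>\d+)s\): "
    r"discovery through endpoint (?P<endpoint>\S+) "
    r"failed: (?P<status>\d{3}) (?P<reason>.+)$"
)


def parse_discovery_failure(line):
    """Return the fields of an OAuth discovery failure line, or None if no match."""
    match = DISCOVERY_FAILURE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["retry"] = int(fields["retry"])    # retry delay in seconds
    fields["status"] = int(fields["status"])  # HTTP status code
    return fields


if __name__ == "__main__":
    sample = (
        "2019/04/12 16:39:55 auth: error contacting auth provider "
        "(retrying in 10s): discovery through endpoint "
        "https://172.30.0.1:443/.well-known/oauth-authorization-server "
        "failed: 404 Not Found"
    )
    print(parse_discovery_failure(sample))
```

Lines that do not match return None, so the function can be applied to every line of a log and the results filtered.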

Comment 2 W. Trevor King 2019-04-12 21:06:23 UTC

Created attachment 1554866
Failures related to this issue

Looks fixed to me. The big blue dots in the attachment mark [1,2] (their y values are random).

[1]: https://github.com/openshift/cluster-config-operator/pull/34#event-2272384982
[2]: https://github.com/openshift/cluster-config-operator/pull/43#event-2273345198
