Bug 1699469

Summary: Cluster operator console has not yet reported success: https://172.30.0.1:443/.well-known/oauth-authorization-server failed: 404 Not Found
Product: OpenShift Container Platform
Reporter: W. Trevor King <wking>
Component: Master
Assignee: Michal Fojtik <mfojtik>
Status: CLOSED CURRENTRELEASE
QA Contact: Xingxing Xia <xxia>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 4.1.0
CC: aos-bugs, jokerman, mmccomas
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-04-12 21:06:48 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  Failures related to this issue (flags: none)

Description W. Trevor King 2019-04-12 19:20:10 UTC
Description of problem:

Around 2019-04-12T13:27Z, the CI e2e-aws success rate dropped to near 0%, with errors like:

$ curl -s https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cluster-config-operator/34/pull-ci-openshift-cluster-config-operator-master-e2e-aws/148/ | grep 'Cluster operator console'
Apr 11 12:46:40.467 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
Apr 11 12:46:40.467 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
$ curl -s https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cluster-config-operator/34/pull-ci-openshift-cluster-config-operator-master-e2e-aws/154/ | grep 'Cluster operator console'
Apr 11 14:19:02.495 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
Apr 11 14:19:02.495 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
$ curl -s https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cluster-config-operator/34/pull-ci-openshift-cluster-config-operator-master-e2e-aws/159/ | grep 'Cluster operator console'
Apr 11 15:43:07.639 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
Apr 11 15:43:07.639 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
$ curl -s https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cluster-config-operator/34/pull-ci-openshift-cluster-config-operator-master-e2e-aws/164/ | grep 'Cluster operator console'
Apr 11 17:48:05.132 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
Apr 11 17:55:20.136 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
Apr 11 17:48:05.132 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
Apr 11 17:55:20.136 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
$ curl -s https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cluster-config-operator/34/pull-ci-openshift-cluster-config-operator-master-e2e-aws/167/ | grep 'Cluster operator console'
level=fatal msg="failed to initialize the cluster: Cluster operator console has not yet reported success: timed out waiting for the condition"
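
The grep output above lists most events twice, which makes it hard to see how many distinct failures a job had. A minimal Python sketch for collapsing such output into distinct events follows. It assumes the log text has already been downloaded, and the helper name `console_failures` and the regular expression are illustrative, not part of any CI tooling.

```python
import re

# Event lines in the shape shown above, for example:
# Apr 11 12:46:40.467 E clusterversion/version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator console has not yet reported success
# The pattern is an assumption derived only from those sample lines.
EVENT = re.compile(
    r"^(?P<when>\w{3} \d{1,2} \d{2}:\d{2}:\d{2}\.\d+) E clusterversion/version "
    r"changed Failing to True: (?P<reason>\w+): (?P<message>.*)$"
)


def console_failures(log_text):
    """Return distinct (timestamp, message) pairs about the console operator.

    Repeated lines are reported once, in first-seen order. Lines that do not
    match the event shape (such as the installer's level=fatal line) are skipped.
    """
    seen = []
    for line in log_text.splitlines():
        match = EVENT.match(line.strip())
        if match and "Cluster operator console" in match["message"]:
            event = (match["when"], match["message"])
            if event not in seen:
                seen.append(event)
    return seen
```

Feeding it the four lines from job 164 would leave two distinct events, one per timestamp.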

Comparing random jobs across the transition:

$ diff -u <(curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_console/1422/pull-ci-openshift-console-master-e2e-aws-console/695/artifacts/release-latest/release-payload-latest/image-references | sed 's|ci-[^/]*/stable|.../stable|') <(curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-4.0/6694/artifacts/release-images-latest/release-images-latest | sed 's|ci-[^/]*/stable|.../stable|') | grep -B3 io.openshift.build.source-location
-          "io.openshift.build.commit.id": "debd02db8f6c49aa0436d359310e42a32319c2e8",
+          "io.openshift.build.commit.id": "eed9a57bae31b26f4ed6dd323cc061173d8094ce",
          "io.openshift.build.commit.ref": "master",
          "io.openshift.build.source-location": "https://github.com/openshift/cluster-config-operator"
--
-          "io.openshift.build.commit.id": "df320f64d59b867d6fd0bdd77b5026d3c53083c8",
+          "io.openshift.build.commit.id": "46e1c20984d134cd04fcb046bc67ed0091edd56c",
          "io.openshift.build.commit.ref": "master",
          "io.openshift.build.source-location": "https://github.com/openshift/cluster-kube-controller-manager-operator"
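
The same comparison can be done structurally instead of with diff and grep. The sketch below is a rough Python equivalent, assuming both image-references documents have already been fetched as JSON text and that each entry under `spec.tags` carries the two annotations visible in the diff. That layout is an assumption based on the output above, and the function names are hypothetical.

```python
import json

COMMIT = "io.openshift.build.commit.id"
SOURCE = "io.openshift.build.source-location"


def commits_by_repo(image_references_json):
    """Map each source repository to its build commit in one document.

    Assumes an ImageStream-style layout with annotations under spec.tags[].
    If several tags share a repository, the last one seen wins.
    """
    document = json.loads(image_references_json)
    commits = {}
    for tag in document.get("spec", {}).get("tags", []):
        annotations = tag.get("annotations") or {}
        if SOURCE in annotations and COMMIT in annotations:
            commits[annotations[SOURCE]] = annotations[COMMIT]
    return commits


def changed_repos(old_json, new_json):
    """Return {repo: (old_commit, new_commit)} for repos whose commit differs.

    Repositories present in only one of the two documents are ignored.
    """
    old = commits_by_repo(old_json)
    new = commits_by_repo(new_json)
    return {
        repo: (old[repo], new[repo])
        for repo in sorted(old.keys() & new.keys())
        if old[repo] != new[repo]
    }
```

Unlike the sed-and-diff pipeline, this ignores differences in image pull specs entirely, so no `ci-.../stable` normalization is needed.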

Looking at those changes turned up the suspect [1], which has been partially rolled back in [2].  Hopefully that fixes CI.

[1]: https://github.com/openshift/cluster-config-operator/pull/34
[2]: https://github.com/openshift/cluster-config-operator/pull/43

Comment 2 W. Trevor King 2019-04-12 21:06:23 UTC
Created attachment 1554866
Failures related to this issue

Looks fixed to me. In the attached plot, the big blue dots mark [1] and [2] (their y values are random).

[1]: https://github.com/openshift/cluster-config-operator/pull/34#event-2272384982
[2]: https://github.com/openshift/cluster-config-operator/pull/43#event-2273345198