Bug 1879131

Summary: oauth-openshift sometimes gets stuck waiting for operator cert secret to sync
Product: OpenShift Container Platform
Reporter: Jeremy Poulin <jpoulin>
Component: apiserver-auth
Assignee: Maru Newby <mnewby>
Status: CLOSED ERRATA
QA Contact: pmali
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 4.6
CC: aos-bugs, dorzel, eparis, mfojtik, psundara, rdossant, slaznick
Target Milestone: ---
Target Release: 4.6.0
Hardware: s390x
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-10-27 16:41:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jeremy Poulin 2020-09-15 14:03:20 UTC
Description of problem:
For greater context, see the Slack thread:
https://coreos.slack.com/archives/CB48XQ4KZ/p1600115386274900

The bug as it manifests involves the following behavior:
1. On our s390x installs, the installation often fails because oauth and console don't come up properly. (Console depends on oauth, so the console failure is expected.)
2. Upon inspection, we can see that `v4-0-config-system-router-certs` is missing from the openshift-authentication namespace.
3. According to the starter.go code, this secret is supposed to be synced from openshift-config-managed: <https://github.com/openshift/cluster-authentication-operator/blob/ee0dce672a145d198e4704f385a8afc976d22420/pkg/operator/starter.go#L206-L208>
4. Upon inspection of that namespace, the certs *are* present there, but the continuous syncing doesn't appear to be working properly.
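
The inspection in steps 2-4 can be sketched as a small script that shells out to `oc`. This is only an illustration, not part of the original report: the source secret name `router-certs` in openshift-config-managed is an assumption (the report names only the target secret), and the helper names are hypothetical.

```python
import subprocess
from typing import Callable, List

# (namespace, secret name) pairs. The target comes from the report;
# the source secret name "router-certs" is an ASSUMPTION for illustration.
SOURCE = ("openshift-config-managed", "router-certs")
TARGET = ("openshift-authentication", "v4-0-config-system-router-certs")


def get_secret_cmd(namespace: str, name: str) -> List[str]:
    """Build the `oc get secret` command for one secret."""
    return ["oc", "get", "secret", name, "-n", namespace, "-o", "name"]


def secret_exists(namespace: str, name: str, run: Callable = subprocess.run) -> bool:
    """Return True if `oc get secret` exits with status 0."""
    result = run(get_secret_cmd(namespace, name), capture_output=True, text=True)
    return result.returncode == 0


def sync_looks_stuck(run: Callable = subprocess.run) -> bool:
    """True when the source secret exists but the synced copy is missing."""
    return secret_exists(*SOURCE, run=run) and not secret_exists(*TARGET, run=run)

# Example (requires a logged-in `oc` session):
#   print("sync stuck:", sync_looks_stuck())
```

The `run` parameter exists only so the check can be exercised without a cluster; by default it calls `subprocess.run`.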

Workaround:
According to Maru from the auth team, scaling openshift-authentication-operator down to 0 replicas and then back up to 1 appears to resolve the issue.
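
A minimal sketch of that workaround, again shelling out to `oc`. The deployment name `authentication-operator` and the namespace `openshift-authentication-operator` are assumptions not spelled out in this report, and `bounce_operator` is a hypothetical helper name.

```python
import subprocess
from typing import Callable, List

# ASSUMPTIONS: the operator runs as deployment "authentication-operator"
# in namespace "openshift-authentication-operator".
NAMESPACE = "openshift-authentication-operator"
DEPLOYMENT = "authentication-operator"


def scale_cmd(replicas: int, namespace: str = NAMESPACE,
              deployment: str = DEPLOYMENT) -> List[str]:
    """Build the `oc scale` command for the given replica count."""
    return ["oc", "scale", f"deployment/{deployment}",
            f"--replicas={replicas}", "-n", namespace]


def bounce_operator(run: Callable = subprocess.run) -> None:
    """Scale the operator down to 0, then back up to 1."""
    for replicas in (0, 1):
        # check=True raises CalledProcessError if `oc` exits non-zero.
        run(scale_cmd(replicas), check=True)

# Example (requires a logged-in `oc` session with sufficient rights):
#   bounce_operator()
```

On a cluster where the cluster-version operator manages this deployment, the replica count may be reset automatically, so the scale-up step may be redundant; it is kept here to mirror the workaround as described.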

Expected results:
The operator should come up healthy as soon as the secret becomes available from ingress.


Additional info: 
Must-Gather Logs - https://drive.google.com/file/d/1G_8CCMqXTzEdj1mkHqmtEfXgmP1dH1qP/view?usp=sharing

Comment 1 Standa Laznicka 2020-09-16 11:00:19 UTC
should be fixed by https://github.com/openshift/cluster-authentication-operator/pull/346

Comment 6 errata-xmlrpc 2020-10-27 16:41:13 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196