Bug 1879131 - oauth-openshift sometimes gets stuck waiting for operator cert secret to sync
Summary: oauth-openshift sometimes gets stuck waiting for operator cert secret to sync
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: apiserver-auth
Version: 4.6
Hardware: s390x
OS: Unspecified
Target Milestone: ---
Target Release: 4.6.0
Assignee: Maru Newby
QA Contact: pmali
Depends On:
Reported: 2020-09-15 14:03 UTC by Jeremy Poulin
Modified: 2020-10-27 16:41 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2020-10-27 16:41:13 UTC
Target Upstream Version:


System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 16:41:30 UTC

Description Jeremy Poulin 2020-09-15 14:03:20 UTC
Description of problem:
For greater context.

The bug as it manifests involves the following behavior:
1. On our s390x installs, the installation often fails because oauth and console don't come up properly. (Console depends on oauth, so the console failure is expected.)
2. Upon inspection, we can see that `v4-0-config-system-router-certs` is missing from the openshift-authentication namespace.
3. Looking at the starter.go code, it appears that this is supposed to be synced from openshift-config-managed <https://github.com/openshift/cluster-authentication-operator/blob/ee0dce672a145d198e4704f385a8afc976d22420/pkg/operator/starter.go#L206-L208>
4. Upon inspection of that namespace, the certs *are* present there, but the continuous syncing doesn't appear to be working properly.
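The inspection steps above can be sketched as a small check. This is only an illustration, not the operator's own logic: the source secret name `router-certs` is an assumption (the report does not name it), and `secret_names` and `diagnose` are hypothetical helpers. Only `v4-0-config-system-router-certs` and the two namespaces come from the report.

```python
import subprocess

SOURCE_NS = "openshift-config-managed"
TARGET_NS = "openshift-authentication"
SOURCE_SECRET = "router-certs"  # assumption: not named in this report
TARGET_SECRET = "v4-0-config-system-router-certs"


def secret_names(namespace):
    """Return the set of secret names in a namespace via `oc get secret -o name`."""
    out = subprocess.run(
        ["oc", "get", "secret", "-n", namespace, "-o", "name"],
        check=True, capture_output=True, text=True,
    ).stdout
    # Each output line looks like "secret/<name>"; keep only the name part.
    return {line.split("/", 1)[1] for line in out.splitlines() if "/" in line}


def diagnose(source_secrets, target_secrets):
    """Classify the sync state from the secret names seen in each namespace."""
    if SOURCE_SECRET not in source_secrets:
        return "source-missing"  # nothing to sync yet
    if TARGET_SECRET not in target_secrets:
        return "sync-stuck"      # the state described in this bug
    return "synced"
```

On an affected cluster, `diagnose(secret_names(SOURCE_NS), secret_names(TARGET_NS))` would be expected to return `"sync-stuck"`, assuming the source secret name above is correct.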

Workaround:
According to Maru from the auth team, scaling openshift-authentication-operator down to 0 and then back up to 1 appears to resolve the issue.
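The workaround could be scripted roughly as follows. This is a sketch under assumptions: the deployment name `authentication-operator` and the treatment of `openshift-authentication-operator` as its namespace are not stated in this report, and `scale_command` and `bounce_operator` are hypothetical helpers. The pause between the two steps is arbitrary.

```python
import subprocess
import time

NAMESPACE = "openshift-authentication-operator"     # assumption: used as the namespace
DEPLOYMENT = "deployment/authentication-operator"   # assumption: deployment name


def scale_command(replicas):
    """Build the `oc scale` argv for the operator deployment."""
    return ["oc", "scale", DEPLOYMENT, "-n", NAMESPACE, f"--replicas={replicas}"]


def bounce_operator(run=subprocess.run, pause_seconds=10):
    """Scale the operator to 0, wait briefly, then scale it back to 1."""
    for replicas in (0, 1):
        run(scale_command(replicas), check=True)
        if replicas == 0:
            # Give the old pod a moment to terminate before scaling up again.
            time.sleep(pause_seconds)
```

Calling `bounce_operator()` with a logged-in `oc` session runs both steps; the `run` parameter exists only so the command sequence can be inspected without touching a cluster.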

Expected results:
That the operator comes up healthy as soon as the secret becomes available from ingress.

Additional info: 
Must-Gather Logs - https://drive.google.com/file/d/1G_8CCMqXTzEdj1mkHqmtEfXgmP1dH1qP/view?usp=sharing

Comment 1 Standa Laznicka 2020-09-16 11:00:19 UTC
should be fixed by https://github.com/openshift/cluster-authentication-operator/pull/346

Comment 6 errata-xmlrpc 2020-10-27 16:41:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

