Bug 1841372 - Cluster Authentication Operator should check that the well-known endpoint is served by all apiservers
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: oauth-apiserver
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.6.0
Assignee: David Eads
QA Contact: pmali
Depends On:
Reported: 2020-05-29 02:49 UTC by Maru Newby
Modified: 2020-10-27 16:02 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Last Closed: 2020-10-27 16:01:56 UTC
Target Upstream Version:


System | ID | Private | Priority | Status | Summary | Last Updated
GitHub | openshift cluster-authentication-operator pull 318 | 0 | None | closed | Bug 1841372: prevent Available=true until all kube-apiservers have restarted | 2020-12-01 15:33:19 UTC
Red Hat Product Errata | RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 16:02:36 UTC

Description Maru Newby 2020-05-29 02:49:45 UTC
Currently, the cluster-authentication-operator (CAO) checks that every endpoint of the kube-apiserver service properly serves the well-known endpoint. However, in some instances (e.g. after initial install) the set of endpoints may not represent all apiservers running in a cluster. In those cases, a cluster with 3 masters, and therefore 3 desired apiservers, may have only 2 endpoints for the service while the cluster-kube-apiserver-operator is still progressing to bring the 3rd apiserver online. The CAO must not report Available until the well-known endpoint has been checked on all expected apiservers.
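
The gate described above can be sketched in Go as follows. This is only an illustration under stated assumptions: the helper name `wellKnownAvailable`, its parameters, and the endpoint addresses are hypothetical and are not taken from the operator's code. It assumes the desired apiserver count is already known.

```go
package main

import (
	"fmt"
	"sort"
)

// wellKnownAvailable is a hypothetical helper, not the operator's actual code.
// expected is the desired number of apiservers (for example, 3 masters).
// checks maps each service endpoint address to whether the well-known
// endpoint was served correctly there.
func wellKnownAvailable(expected int, checks map[string]bool) (bool, string) {
	// Fewer endpoints than expected apiservers: at least one apiserver is
	// not yet behind the service, so it cannot have been checked.
	if len(checks) < expected {
		return false, fmt.Sprintf("checked %d of %d expected apiservers", len(checks), expected)
	}
	// Sort the addresses so the reported failure is deterministic.
	addrs := make([]string, 0, len(checks))
	for addr := range checks {
		addrs = append(addrs, addr)
	}
	sort.Strings(addrs)
	for _, addr := range addrs {
		if !checks[addr] {
			return false, fmt.Sprintf("well-known endpoint not ready on %s", addr)
		}
	}
	return true, ""
}

func main() {
	// Two endpoints pass their check, but three apiservers are expected.
	ok, reason := wellKnownAvailable(3, map[string]bool{
		"10.0.0.1:6443": true,
		"10.0.0.2:6443": true,
	})
	fmt.Println(ok, reason)
}
```

With only 2 of 3 expected endpoints present, the sketch refuses to report available even though every endpoint it could check passed, which is the case this bug describes.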

Comment 1 Maru Newby 2020-05-29 02:53:12 UTC
The Slack discussion that prompted this filing seemed to suggest that the CAO should be responsible for determining whether it is checking the well-known endpoint on the desired count of apiservers. I wonder, though, whether the cluster-kube-apiserver-operator should instead report the number of intended replicas in its status for consumption by the CAO and others, rather than requiring the CAO to derive that number itself (e.g. by counting master nodes).
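
To make the two options concrete, here is a rough Go sketch that prefers a replica count published by the kube-apiserver operator and falls back to counting master nodes. The helper `expectedAPIServers` and the `reported` value are hypothetical; no such status field is confirmed by this report. The only real identifier assumed is the standard `node-role.kubernetes.io/master` node label.

```go
package main

import "fmt"

// masterRoleLabel is the standard node-role label carried by master nodes.
const masterRoleLabel = "node-role.kubernetes.io/master"

// expectedAPIServers is a hypothetical helper. reported stands in for a
// replica count published in the kube-apiserver operator's status (an
// assumed field; nil means it is not published). nodeLabels holds the
// label set of every node in the cluster.
func expectedAPIServers(reported *int, nodeLabels []map[string]string) int {
	// Prefer the count published by the owning operator, if any.
	if reported != nil {
		return *reported
	}
	// Otherwise derive it by counting nodes that carry the master label.
	count := 0
	for _, labels := range nodeLabels {
		if _, isMaster := labels[masterRoleLabel]; isMaster {
			count++
		}
	}
	return count
}

func main() {
	nodes := []map[string]string{
		{masterRoleLabel: ""},
		{masterRoleLabel: ""},
		{masterRoleLabel: ""},
		{"node-role.kubernetes.io/worker": ""},
	}
	fmt.Println(expectedAPIServers(nil, nodes))
}
```

Consuming a published count keeps the knowledge of how many apiservers should exist with the operator that owns them; counting nodes makes the CAO encode an assumption about where apiservers run.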

Comment 9 errata-xmlrpc 2020-10-27 16:01:56 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

