Bug 1841372

Summary: Cluster Authentication Operator should check that the well-known endpoint is served by all apiservers
Product: OpenShift Container Platform
Reporter: Maru Newby <mnewby>
Component: oauth-apiserver
Assignee: David Eads <deads>
Status: CLOSED ERRATA
QA Contact: pmali
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.5
CC: aos-bugs, mfojtik, scheng, xxia
Target Milestone: ---
Target Release: 4.6.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-10-27 16:01:56 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Maru Newby 2020-05-29 02:49:45 UTC
Currently the cluster-authentication-operator (CAO) checks that all endpoints of the kube apiserver service are properly serving the well-known endpoint. However, in some situations (e.g. right after initial install) the set of endpoints may not include every apiserver running in the cluster. For example, a cluster with 3 masters, and therefore 3 desired apiservers, may have only 2 endpoints for the service while the cluster-kube-apiserver-operator is still progressing to bring the 3rd apiserver online. The CAO should not report available until the well-known endpoint has been checked on all expected apiservers.
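
To make the requested behavior concrete, here is a minimal sketch of the availability decision, written in Python for brevity rather than in the operator's own language. It is not the CAO's actual code: `EndpointCheck`, `well_known_available`, and the example addresses are hypothetical names introduced only for illustration, and how the expected apiserver count is obtained is left open.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class EndpointCheck:
    """Result of probing one apiserver endpoint (hypothetical type)."""
    address: str         # endpoint address of one apiserver
    well_known_ok: bool  # True if the well-known endpoint was served correctly


def well_known_available(expected_apiservers: int,
                         checks: Sequence[EndpointCheck]) -> Tuple[bool, str]:
    """Return (available, reason) for the well-known endpoint check."""
    # Any endpoint that fails the probe blocks availability outright.
    failing = sorted({c.address for c in checks if not c.well_known_ok})
    if failing:
        return False, "well-known endpoint not served by: " + ", ".join(failing)

    # Count distinct addresses so a duplicated endpoint is not counted twice.
    checked = {c.address for c in checks}
    if len(checked) < expected_apiservers:
        return False, "checked %d of %d expected apiservers" % (
            len(checked), expected_apiservers)

    return True, "well-known endpoint served by all %d expected apiservers" % (
        expected_apiservers)


# The scenario from the description: 3 masters, only 2 endpoints so far.
two_of_three = [EndpointCheck("10.0.0.1", True), EndpointCheck("10.0.0.2", True)]
print(well_known_available(3, two_of_three))
```

With only the endpoint list to go on, the two healthy endpoints above would look like a complete success; comparing against the expected count is what keeps the operator from reporting available too early.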

Comment 1 Maru Newby 2020-05-29 02:53:12 UTC
The Slack discussion that prompted this filing seemed to suggest that the CAO should be responsible for determining whether it is checking the well-known endpoint on the desired number of apiservers. I wonder, though, whether the cluster-kube-apiserver-operator should instead report the number of intended replicas in its status, for consumption by the CAO and others, rather than requiring the CAO to derive that number itself (e.g. by counting master nodes).
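
The two ways of obtaining the expected count can be contrasted in a short sketch, again in Python for brevity. Everything here is an assumption for illustration: `expected_apiserver_count` is a hypothetical helper, `reported_replicas` stands in for a status field the cluster-kube-apiserver-operator does not necessarily publish, and `master_node_names` stands in for whatever node listing the CAO would otherwise have to count.

```python
from typing import Optional, Sequence


def expected_apiserver_count(reported_replicas: Optional[int],
                             master_node_names: Sequence[str]) -> int:
    """Pick the number of apiservers the well-known check should cover."""
    # Preferred: a count published by the operator that owns the apiservers
    # (hypothetical status field, proposed in this comment).
    if reported_replicas is not None:
        return reported_replicas
    # Otherwise: derive the count by counting distinct master nodes.
    return len(set(master_node_names))


print(expected_apiserver_count(None, ["master-0", "master-1", "master-2"]))
```

Keeping the count with the operator that manages the apiservers would mean consumers do not each have to encode the assumption that there is one apiserver per master node.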

Comment 9 errata-xmlrpc 2020-10-27 16:01:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196