Description of problem:
When looking at logs of an unhealthy API server, I noticed it appears to be hitting /healthz instead of /readyz for readiness probes.

Mar 15 03:01:36.711435 ip-10-0-146-200 hyperkube[1442]: I0315 03:01:36.711411 1442 prober.go:117] Readiness probe for "kube-apiserver-ip-10-0-146-200.us-west-1.compute.internal_openshift-kube-apiserver(22bd459e-677c-40ce-a715-9a263b663b2b):kube-apiserver" failed (failure): Get "https://10.0.146.200:6443/healthz": net/http: request canceled (Client.Timeout exceeded while awaiting headers)

See also must-gather here:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.8/1371290082278903808/artifacts/e2e-aws/

Version-Release number of selected component (if applicable):
4.8?

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
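
To spot this mismatch in other kubelet logs, a small helper can pull the requested path out of a prober failure message and compare it with /readyz. This is only a rough sketch: the function name probePath and the regular expression are hypothetical, and it assumes the failure message always contains a Get "<url>" fragment like the one in the log line quoted in the description.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// getURLPattern captures the quoted URL that follows `Get ` in a prober
// failure message. The pattern is an assumption based on the single log
// line quoted in this report.
var getURLPattern = regexp.MustCompile(`Get "([^"]+)"`)

// probePath (hypothetical helper) returns the URL path that a probe
// failure log line reports requesting.
func probePath(logLine string) (string, error) {
	m := getURLPattern.FindStringSubmatch(logLine)
	if m == nil {
		return "", fmt.Errorf("no probe URL found in log line")
	}
	u, err := url.Parse(m[1])
	if err != nil {
		return "", err
	}
	return u.Path, nil
}

func main() {
	// Message portion of the log line from the description.
	line := `Readiness probe for "kube-apiserver-ip-10-0-146-200.us-west-1.compute.internal_openshift-kube-apiserver(22bd459e-677c-40ce-a715-9a263b663b2b):kube-apiserver" failed (failure): Get "https://10.0.146.200:6443/healthz": net/http: request canceled (Client.Timeout exceeded while awaiting headers)`

	path, err := probePath(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Report which endpoint the probe hit and whether it is /readyz.
	fmt.Printf("probe path: %s (is /readyz: %t)\n", path, path == "/readyz")
}
```

Run against the line above, the helper reports the path the readiness probe actually requested, which makes it easy to confirm whether a given build still probes /healthz.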
/readyz isn't used to maintain the kube-apiserver service endpoints (the API server writes those directly), but it does affect metrics gathering.
Verified in 4.8.0-0.nightly-2021-04-06-162113, per the result in https://bugzilla.redhat.com/show_bug.cgi?id=1939227#c5
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438