Description Pablo Alonso Rodriguez
2021-09-08 16:14:29 UTC
Description of problem:
At one customer site, whenever an installation is attempted, the kubelet is inexplicably slow and does not start the kube-apiserver even after waiting for hours.
According to the crio logs, it does not even seem to try to start it, but I cannot point to any failure log.
Sar metrics were also collected and showed no apparent resource exhaustion (CPU, RAM, storage, and network all looked fine, and there was no high load).
So I am going to need help from the kubelet team to understand where the slowness could come from and whether it could be due to a kubelet bug.
Version-Release number of selected component (if applicable):
4.6 (several different errata)
How reproducible:
Only in one specific environment.
Steps to Reproduce:
1. Install a cluster
Actual results:
The bootstrap kube-apiserver pod never starts, apparently because of kubelet slowness.
Expected results:
The kube-apiserver pod starts.
Additional info:
Comment 14 Benjamin Gilbert
2021-09-17 18:41:56 UTC
Moving to POST because bug 1978268 has landed in a build, and we're just waiting for the bootimage bump.
Comment 15 Benjamin Gilbert
2021-09-22 21:47:43 UTC
The bootimage bump in bug 1981999 has landed. Moving to MODIFIED.
In payload quay.io/openshift-release-dev/ocp-release:4.9.0-rc.3-x86_64, RHCOS-49.84.202109172039-0 was used as the boot image.
[root@ip-10-0-13-79 ~]# rpm-ostree status
State: idle
Deployments:
* ostree://67a210b2d0d1c3787f813061995783c3528d132cfb97bd44b3eb003fb8dacde8
Version: 49.84.202109172039-0 (2021-09-17T20:43:24Z)
In QE's CI tests, we did not see the bootstrap failure with 4.9.0-rc.3-x86_64, so I am moving this bug to VERIFIED.
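For anyone repeating the boot image check on a node, a minimal shell sketch follows. It assumes the plain-text layout of `rpm-ostree status` shown in the paste above, where the booted deployment is marked with `*` (or `●` on some terminals) and is followed by a `Version:` line. The function name `booted_version` is hypothetical and not part of any tool.

```shell
# Hypothetical helper: print the version of the booted deployment.
# Assumption: input is `rpm-ostree status` text output in which the booted
# deployment line starts with '*' or '●' and a "Version:" line follows it.
booted_version() {
  awk '
    /^[ \t]*(\*|●)/                { booted = 1; next }  # booted deployment marker
    booted && $1 == "Version:"     { print $2; exit }     # first Version after it
  '
}

# Usage on a node (reads the command output from stdin):
#   rpm-ostree status | booted_version
```

With the output pasted above, this should print the same version string that appears in the RHCOS boot image name, which is the comparison being made in this comment.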
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:3759