Description of problem:
Recently we have seen evidence that the etcd operator (and other operators that run static pod installers) can, in some cases (usually during install), install multiple revisions in a short time, sometimes only a few seconds apart. This might trigger race conditions in the kubelet when it runs these static pods. In some cases, 5 or 6 revisions are installed on a single master node within 40 seconds during installation. We should throttle how often the static pod installer creates new revisions on a node, to decrease kubelet utilization and minimize possible side effects.

Version-Release number of selected component (if applicable):
4.11
4.10

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
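To illustrate the kind of throttling proposed in the description, here is a minimal sketch of a per-node minimum-interval gate. It is not the operator's actual implementation: the class `RevisionThrottle`, the helper `count_installs`, the `min_interval_s` setting, and the 30-second value are all hypothetical names and numbers chosen for illustration, and time is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass
class RevisionThrottle:
    """Hypothetical gate: allow one new revision per node per interval."""

    min_interval_s: float  # assumed setting, not a real operator option
    _last_install: Dict[str, float] = field(default_factory=dict)

    def try_install(self, node: str, now: float) -> bool:
        """Return True if a revision may be installed on `node` at time `now`."""
        last = self._last_install.get(node)
        if last is not None and now - last < self.min_interval_s:
            # Too soon after the previous install on this node: defer.
            # A later attempt can pick up whatever revision is newest by then.
            return False
        self._last_install[node] = now
        return True


def count_installs(throttle: RevisionThrottle, node: str,
                   request_times: Iterable[float]) -> int:
    """Count how many of the requested installs the throttle lets through."""
    return sum(throttle.try_install(node, t) for t in request_times)


if __name__ == "__main__":
    throttle = RevisionThrottle(min_interval_s=30.0)
    # Six revision requests on one master node within 40 seconds.
    requests = [0, 5, 12, 20, 31, 39]
    print(count_installs(throttle, "master-0", requests))
```

With these assumed numbers, only the requests at 0 and 31 seconds pass the gate, so the node sees two installs instead of six. Skipped requests are not lost in this model; the idea is that the next permitted install would roll out the latest revision directly.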
These fixes have already been merged in PRs around https://github.com/openshift/cluster-kube-scheduler-operator/pull/407
Ran several rounds of tests with 4.11.0-0.nightly-2022-02-24-054925 and saw 8 revisions in 20 minutes, which should be an acceptable improvement according to https://bugzilla.redhat.com/show_bug.cgi?id=2053148#c5
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069