Description of problem:

https://github.com/openshift/machine-config-operator/pull/1687 introduced a recycler-pod template but placed it in the kubelet's static manifests directory. The kubelet is trying to run it once per second:

Nov 09 09:55:44 master-0.ocp.variantweb.net hyperkube[1944]: I1109 09:55:44.815976 1944 kubelet.go:1891] SyncLoop (SYNC): 2 pods; recyler-pod-master-0.ocp.variantweb.net_openshift-infra(1974bdf306c9e1ace3cc502fe8a2f041), sdn-q9qqv_openshift-sdn(45f227b2-1dc4-4547-8198-a40a8b8ca516)
Nov 09 09:55:44 master-0.ocp.variantweb.net hyperkube[1944]: I1109 09:55:44.816022 1944 kubelet.go:1936] Pod "recyler-pod-master-0.ocp.variantweb.net_openshift-infra(1974bdf306c9e1ace3cc502fe8a2f041)" has completed, ignoring remaining sync work: sync
Nov 09 09:55:45 master-0.ocp.variantweb.net hyperkube[1944]: I1109 09:55:45.815114 1944 kubelet.go:1891] SyncLoop (SYNC): 2 pods; ovs-ktg7f_openshift-sdn(6e2cd68b-b8d0-49ca-ae20-340c0578407c), recyler-pod-master-0.ocp.variantweb.net_openshift-infra(1974bdf306c9e1ace3cc502fe8a2f041)
Nov 09 09:55:45 master-0.ocp.variantweb.net hyperkube[1944]: I1109 09:55:45.815171 1944 kubelet.go:1936] Pod "recyler-pod-master-0.ocp.variantweb.net_openshift-infra(1974bdf306c9e1ace3cc502fe8a2f041)" has completed, ignoring remaining sync work: sync

# journalctl -u kubelet --since="1 hour ago" | grep recyler-pod | wc -l
7266

Version-Release number of selected component (if applicable):
4.6.3

How reproducible:
Always on masters

Steps to Reproduce:
1. Install a cluster

Actual results:
Master kubelet logs fill with sync messages for the recycler pod

Expected results:
The recycler-pod template is not in the kubelet static manifests directory

Additional info:
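For anyone wanting to confirm which pod dominates the sync messages, here is a minimal sketch that counts "SyncLoop (SYNC)" mentions per pod. It assumes the name_namespace(uid) format seen in the log lines quoted in this report; the function name and the regular expression are illustrative and not part of any kubelet tooling. As a sanity check on the count in the report: 7266 matching lines in one hour is roughly two lines per second, which fits one sync per second, since each sync in the excerpt logs two lines that mention the pod.

```python
import re
from collections import Counter

# Pod identifiers appear as name_namespace(uid) in the quoted kubelet lines.
# This pattern is an assumption based on that excerpt only.
POD_RE = re.compile(r"([a-z0-9.-]+)_([a-z0-9-]+)\(([0-9a-f-]+)\)")


def count_sync_mentions(lines):
    """Count how many 'SyncLoop (SYNC)' lines mention each pod."""
    counts = Counter()
    for line in lines:
        if "SyncLoop (SYNC)" not in line:
            continue
        # Use a set so a pod listed twice on one line is counted once.
        for name, namespace, _uid in set(POD_RE.findall(line)):
            counts[f"{namespace}/{name}"] += 1
    return counts


# The four log lines from this report, with the journal prefix shortened.
SAMPLE = [
    "kubelet.go:1891] SyncLoop (SYNC): 2 pods; "
    "recyler-pod-master-0.ocp.variantweb.net_openshift-infra(1974bdf306c9e1ace3cc502fe8a2f041), "
    "sdn-q9qqv_openshift-sdn(45f227b2-1dc4-4547-8198-a40a8b8ca516)",
    'kubelet.go:1936] Pod "recyler-pod-master-0.ocp.variantweb.net_openshift-infra'
    '(1974bdf306c9e1ace3cc502fe8a2f041)" has completed, ignoring remaining sync work: sync',
    "kubelet.go:1891] SyncLoop (SYNC): 2 pods; "
    "ovs-ktg7f_openshift-sdn(6e2cd68b-b8d0-49ca-ae20-340c0578407c), "
    "recyler-pod-master-0.ocp.variantweb.net_openshift-infra(1974bdf306c9e1ace3cc502fe8a2f041)",
    'kubelet.go:1936] Pod "recyler-pod-master-0.ocp.variantweb.net_openshift-infra'
    '(1974bdf306c9e1ace3cc502fe8a2f041)" has completed, ignoring remaining sync work: sync',
]

if __name__ == "__main__":
    for pod, n in count_sync_mentions(SAMPLE).most_common():
        print(n, pod)
```

On a real node the input could be the output of the journalctl command above, read line by line from standard input instead of SAMPLE.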
The way I see it, this will be a three-step fix:
1. Move the location of the recycler pod in MCO
2. Change KCM to use the new location
3. Project an empty file at the old location (is there a way to remove a previously projected file?) so the kubelet doesn't try to start it all the time
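A rough way to check the end state of these three steps on a node is sketched below. It only inspects files: it does not verify how the kubelet treats an empty manifest, which remains an assumption of step 3. The function names are hypothetical, and the paths in the demo are temporary stand-ins, since the actual old and new template locations are not given in this comment.

```python
import os
import tempfile


def manifest_state(path):
    """Return 'absent', 'empty', or 'populated' for a manifest path."""
    if not os.path.exists(path):
        return "absent"
    if os.path.getsize(path) == 0:
        return "empty"
    return "populated"


def migration_done(old_path, new_path):
    """True when the old location carries no content and the new one does."""
    return (
        manifest_state(old_path) in ("absent", "empty")
        and manifest_state(new_path) == "populated"
    )


if __name__ == "__main__":
    # Demo with temporary files standing in for the real locations.
    with tempfile.TemporaryDirectory() as tmp:
        old = os.path.join(tmp, "old-recycler-pod.yaml")
        new = os.path.join(tmp, "new-recycler-pod.yaml")
        open(old, "w").close()  # step 3: empty file at the old location
        with open(new, "w") as f:  # step 1: template at the new location
            f.write("kind: Pod\n")
        print(manifest_state(old), manifest_state(new), migration_done(old, new))
```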
It's also possible that it keeps running continuously because it's never actually being created, due to https://github.com/openshift/machine-config-operator/pull/2215
Which is one more reason not to have it as a static pod.
First step PR: https://github.com/openshift/machine-config-operator/pull/2238
Second step PR: https://github.com/openshift/cluster-kube-controller-manager-operator/pull/482
Assigning to Seth as he is working on the PR. Also moving this over to the storage board, as that was the original component for the recycler pod as per bug 1805908. Note also that the original bug was cherry-picked to 4.4 (but not 4.5?), so there may be a need for a backport.
I'm still trying to figure out how this can be done in a backward compatible way.
(In reply to Seth Jennings from comment #7)
> I'm still trying to figure out how this can be done in a backward compatible
> way.

The idea I have to solve this is to move the rendering of the template to the KCM operator instead. Currently that's done in MCO, which can be problematic because the KCM operator can start before the template is rendered.
Verified on 4.8.0-0.nightly-2021-03-04-203700. NFS recycler works well and I changed the status to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438