Bug 1896226 - recycler-pod template should not be in kubelet static manifests directory
Summary: recycler-pod template should not be in kubelet static manifests directory
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Severity: high
Priority: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Seth Jennings
QA Contact: Wei Duan
URL:
Whiteboard:
Depends On:
Blocks: 1932860
 
Reported: 2020-11-10 04:38 UTC by Seth Jennings
Modified: 2021-07-27 22:34 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Removes a misplaced recycler pod template from the kubelet static pod manifests directory. The misplaced template resulted in kubelet log messages indicating failure to start the recycler static pod.
Clone Of:
Environment:
Last Closed: 2021-07-27 22:34:10 UTC
Target Upstream Version:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-kube-controller-manager-operator pull 488 | 0 | None | closed | Bug 1912888: Add recycler pod template as a ConfigMap | 2021-02-22 14:12:22 UTC
Github | openshift machine-config-operator pull 2318 | 0 | None | closed | Bug 1896226: Remove recycler pod templates | 2021-03-02 15:27:26 UTC
Red Hat Product Errata | RHSA-2021:2438 | 0 | None | None | None | 2021-07-27 22:34:30 UTC

Description Seth Jennings 2020-11-10 04:38:33 UTC
Description of problem:

https://github.com/openshift/machine-config-operator/pull/1687 introduced a recycler-pod template but placed it in the kubelet's static manifests directory.

The kubelet treats the template as a static pod and tries to run it once per second:

Nov 09 09:55:44 master-0.ocp.variantweb.net hyperkube[1944]: I1109 09:55:44.815976    1944 kubelet.go:1891] SyncLoop (SYNC): 2 pods; recyler-pod-master-0.ocp.variantweb.net_openshift-infra(1974bdf306c9e1ace3cc502fe8a2f041), sdn-q9qqv_openshift-sdn(45f227b2-1dc4-4547-8198-a40a8b8ca516)
Nov 09 09:55:44 master-0.ocp.variantweb.net hyperkube[1944]: I1109 09:55:44.816022    1944 kubelet.go:1936] Pod "recyler-pod-master-0.ocp.variantweb.net_openshift-infra(1974bdf306c9e1ace3cc502fe8a2f041)" has completed, ignoring remaining sync work: sync
Nov 09 09:55:45 master-0.ocp.variantweb.net hyperkube[1944]: I1109 09:55:45.815114    1944 kubelet.go:1891] SyncLoop (SYNC): 2 pods; ovs-ktg7f_openshift-sdn(6e2cd68b-b8d0-49ca-ae20-340c0578407c), recyler-pod-master-0.ocp.variantweb.net_openshift-infra(1974bdf306c9e1ace3cc502fe8a2f041)
Nov 09 09:55:45 master-0.ocp.variantweb.net hyperkube[1944]: I1109 09:55:45.815171    1944 kubelet.go:1936] Pod "recyler-pod-master-0.ocp.variantweb.net_openshift-infra(1974bdf306c9e1ace3cc502fe8a2f041)" has completed, ignoring remaining sync work: sync

# journalctl -u kubelet --since="1 hour ago" | grep recyler-pod | wc -l
7266
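
For reference, 7266 matching lines in one hour is about two lines per second, which fits the two log lines per one-second sync shown above. The following is a minimal Python sketch, not part of the original report, that groups matching journal lines by timestamp to make that rate visible. The function name and the regular expression are assumptions for illustration; the default search string is spelled "recyler-pod" because that is how the pod name appears in the kubelet log.

```python
import re
from collections import Counter

# Matches the timestamp prefix of a short-format journal line,
# e.g. "Nov 09 09:55:44 " (assumed format, taken from the log excerpt above).
TIMESTAMP = re.compile(r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) ")


def sync_lines_per_second(lines, needle="recyler-pod"):
    """Count journal lines containing `needle`, grouped by timestamp second."""
    counts = Counter()
    for line in lines:
        if needle not in line:
            continue  # unrelated kubelet message
        match = TIMESTAMP.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the lines from `journalctl -u kubelet` should show a steady count for every second if the kubelet is syncing the pod on each loop iteration.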

Version-Release number of selected component (if applicable):
4.6.3

How reproducible:
Always on masters

Steps to Reproduce:
1. Install a cluster

Actual results:
Master kubelet logs fill with syncs for the recycler pod

Expected results:
The recycler-pod template is not in the kubelet static manifests directory
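
A minimal Python sketch, not from the original report, of how the expected result could be checked on a master node. It assumes the kubelet static manifests directory is /etc/kubernetes/manifests and that the template's file name contains "recycler"; both the default values and the function name are assumptions for illustration.

```python
from pathlib import Path


def find_recycler_manifests(manifest_dir="/etc/kubernetes/manifests",
                            needle="recycler"):
    """Return sorted names of regular files in manifest_dir containing needle.

    Both defaults are assumptions: adjust them to the node's actual
    static pod path and the template's actual file name.
    """
    directory = Path(manifest_dir)
    if not directory.is_dir():
        return []  # nothing to check if the directory does not exist
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if entry.is_file() and needle in entry.name
    )
```

An empty list would match the expected result; any returned file name is a candidate for the misplaced template.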

Additional info:

Comment 1 Seth Jennings 2020-11-18 19:52:21 UTC
The way I see it, this will be a 3-step fix:

1. Move the location of the recycler pod template in MCO
2. Change KCM to use the new location
3. Project an empty file at the old location (is there a way to remove a previously projected file?) so the kubelet doesn't keep trying to start it

Comment 2 Peter Hunt 2020-11-18 19:55:58 UTC
It's also possible that it is continuously running because it's never actually being created, due to https://github.com/openshift/machine-config-operator/pull/2215

Comment 3 Peter Hunt 2020-11-18 19:56:26 UTC
which is more reason not to make it a static pod

Comment 4 Seth Jennings 2020-11-18 19:59:09 UTC
first step PR
https://github.com/openshift/machine-config-operator/pull/2238

Comment 6 Yu Qi Zhang 2020-12-08 17:24:54 UTC
Assigning to Seth as he is working on the PR. Also moving this over to the storage board, as that was the original component for the recycler pod as per bug 1805908.

Note also that the original bug was cherry-picked to 4.4 (but not 4.5?), so a backport may be needed.

Comment 7 Seth Jennings 2020-12-08 17:34:32 UTC
I'm still trying to figure out how this can be done in a backward compatible way.

Comment 8 Fabio Bertinatto 2020-12-11 14:21:55 UTC
(In reply to Seth Jennings from comment #7)
> I'm still trying to figure out how this can be done in a backward compatible
> way.

My idea for solving this is to move the rendering of the template to the KCM operator instead. Currently that's done in MCO, which can be problematic because the KCM operator can start before the template is rendered.

Comment 11 Wei Duan 2021-03-05 10:08:01 UTC
Verified on 4.8.0-0.nightly-2021-03-04-203700.
NFS recycler works well and I changed the status to Verified.

Comment 14 errata-xmlrpc 2021-07-27 22:34:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

