Bug 1994277 - Changing the memory manager policy via the kubelet config will drop the node to NotReady state
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.9
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.9.0
Assignee: Artyom
QA Contact: Walid A.
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-08-17 08:13 UTC by Artyom
Modified: 2021-10-18 17:47 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-18 17:46:44 UTC
Target Upstream Version:



Links
System ID Private Priority Status Summary Last Updated
Github openshift machine-config-operator pull 2718 0 None None None 2021-08-19 14:11:54 UTC
Red Hat Product Errata RHSA-2021:3759 0 None None None 2021-10-18 17:47:00 UTC

Description Artyom 2021-08-17 08:13:55 UTC
Description of problem:
Changing the memory manager policy via the kubelet config drops the node to the NotReady state. The memory manager expects its state file to be deleted before the kubelet restarts with a different policy. If the file still records the previous policy, the kubelet fails to initialize the memory manager.
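
To illustrate the failure mode, here is a minimal Go sketch of the kind of check that produces it. This is not the kubelet's actual code: the `checkpoint` type, `restoreState`, and `errPolicyMismatch` are hypothetical names used only for this illustration, and the real state file contains more than the policy name.

```go
package main

import (
	"errors"
	"fmt"
)

// checkpoint is a simplified, hypothetical stand-in for the state that is
// persisted in /var/lib/kubelet/memory_manager_state.
type checkpoint struct {
	PolicyName string
}

// errPolicyMismatch marks the condition reported in this bug.
var errPolicyMismatch = errors.New("configured policy differs from state checkpoint policy")

// restoreState refuses to reuse a checkpoint written under a different policy.
// A nil checkpoint models a missing state file, which is always accepted.
func restoreState(configuredPolicy string, saved *checkpoint) error {
	if saved == nil {
		return nil
	}
	if saved.PolicyName != configuredPolicy {
		return fmt.Errorf("%w: configured %q, checkpoint %q",
			errPolicyMismatch, configuredPolicy, saved.PolicyName)
	}
	return nil
}

func main() {
	// Policy changed from None to Static while the old state file is still present.
	fmt.Println(restoreState("Static", &checkpoint{PolicyName: "None"}))
	// Same change, but the state file was removed before the restart.
	fmt.Println(restoreState("Static", nil))
}
```

In this sketch the first call returns an error and the second does not, which matches the behavior reported here: the restart only succeeds once the old state file is gone.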

Version-Release number of selected component (if applicable):
master

How reproducible:
Always

Steps to Reproduce:
1. Change the memory manager policy via the KubeletConfig and set the reserved memory.
2. Wait for the node to return to the Ready state.

Actual results:
The node stays in the NotReady state indefinitely, with the following error in the kubelet logs:
Aug 16 18:04:51 alukiano-csbfk-worker-a-dcvzf.c.openshift-gce-devel.internal hyperkube[9402]: E0816 18:04:51.711228    9402 memory_manager.go:174] "Could not initialize checkpoint manager, please drain node and remove policy state file" err="could not restore state from checkpoint: [memorymanager] configured policy \"Static\" differs from state checkpoint policy \"None\", please drain this node and delete the memory manager checkpoint file \"/var/lib/kubelet/memory_manager_state\" before restarting Kubelet"


Expected results:
The kubelet should start successfully and the node should return to the Ready state.

Additional info:

Comment 4 errata-xmlrpc 2021-10-18 17:46:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

