Description of problem:

KubeletConfig does not reconcile MachineConfigPool changes.

Version-Release number of selected component (if applicable):

Server Version: 4.7.21

How reproducible:

1. Create a cluster.
2. Add a label to the "worker" MachineConfigPool:
   aro.openshift.io/limits: ""
3. Create a custom KubeletConfig:

   apiVersion: machineconfiguration.openshift.io/v1
   kind: KubeletConfig
   metadata:
     generation: 1
     labels:
       aro.openshift.io/limits: ""
     name: aro-limits
   spec:
     kubeletConfig:
       evictionHard:
         imagefs.available: 15%
         memory.available: 500Mi
         nodefs.available: 10%
         nodefs.inodesFree: 5%
       systemReserved:
         memory: 2000Mi
     machineConfigPoolSelector:
       matchLabels:
         aro.openshift.io/limits: ""

   Wait for it to be applied (the nodes rotate).
4. Add the same label to the "master" MachineConfigPool:
   aro.openshift.io/limits: ""
5. A new MachineConfig is not generated.
6. Try updating the KubeletConfig with arbitrary changes: still nothing.
7. If you delete and re-create the KubeletConfig, it is applied.

This is VERY disruptive on a big cluster.
Any change to a MachineConfigPool should be acted on by the KubeletConfig controller.

Expected result:
Edits or updates to either the MachineConfigPool or the KubeletConfig should trigger regeneration of the MachineConfig.
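One way to see which pools the selector in step 3 is expected to cover is to list the MachineConfigPools that carry the label. The following is a minimal sketch, assuming `oc` is logged in to the cluster; the function name `pools_matching_label` is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: print the names of the MachineConfigPools that
# match a label selector (for example 'aro.openshift.io/limits').
pools_matching_label() {
    # "-o name" prints "<resource>/<name>"; sed keeps only the text
    # after the last slash, i.e. the pool name.
    oc get mcp -l "$1" -o name | sed 's|.*/||'
}
```

After step 4, `pools_matching_label 'aro.openshift.io/limits'` should list both `master` and `worker`, which is the set of pools that would each be expected to receive a generated MachineConfig.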
(In reply to Mangirdas Judeikis from comment #0)
> 5. New MachineConfig is not generated.
>
> 6. Try updating KubeletConfig with arbitrary changes, still nothing.

What is the status of the nodes after step 6? I tried to reproduce this: after step 5, no new MachineConfig was generated, as described above. But after editing the kubeletconfig, a new MachineConfig, 99-master-generated-kubelet, was generated on my cluster, and it left one of the master nodes stuck in Ready,SchedulingDisabled status.

What does "still nothing" mean in detail: was no MachineConfig created after step 6?
@mjudeiki Could you also provide the must-gather logs?
Did not get completed this sprint. Waiting for responses.
I don't have a cluster alive for this anymore, so I can't provide a must-gather. But from what you wrote, it looks like you partially recreated it, to the point where the issue can be observed. I suspect the node not being ready is a separate issue, not related to the issue above.
Set up a PR to have a new machineconfig generated after setting the label on the master pool and editing the kubeletconfig. Verified that the kubeletconfig can be applied to both the master pool and the worker pool.

1. Apply the kubeletconfig to the worker pool, using the example kubeletconfig from https://github.com/openshift/machine-config-operator/blob/master/examples/kubeletconfig.crd.yaml

$ oc label mcp worker custom-kubelet=small-pods
machineconfigpool.machineconfiguration.openshift.io/worker labeled
$ oc apply -f /home/qiwan/test-crds/kubeletconfig.yml
kubeletconfig.machineconfiguration.openshift.io/set-max-pods created
$ oc get mc 99-worker-generated-kubelet

2. Tag the master pool with the same label as above and edit the kubeletconfig.

$ oc label mcp master custom-kubelet=small-pods
machineconfigpool.machineconfiguration.openshift.io/master labeled
$ oc edit kubeletconfig/set-max-pods
kubeletconfig.machineconfiguration.openshift.io/set-max-pods edited
$ oc get mc 99-master-generated-kubelet 99-worker-generated-kubelet

3. Debug into a node and check that /etc/kubernetes/kubelet.conf has been updated.
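The check in step 2 can be scripted. The following is a minimal sketch, assuming the generated MachineConfigs follow the `99-<pool>-generated-kubelet` naming seen above; the function name `check_generated_mcs` is hypothetical. It reads MachineConfig names on standard input, so it can be tried without a cluster.

```shell
#!/bin/sh
# Hypothetical helper: read MachineConfig names (one per line) on stdin
# and report any pool, given as an argument, that has no
# 99-<pool>-generated-kubelet entry. Returns non-zero if any is missing.
check_generated_mcs() {
    names=$(cat)
    missing=0
    for pool in "$@"; do
        # -x matches the whole line, -F treats the name as a fixed string.
        if ! printf '%s\n' "$names" | grep -qxF "99-${pool}-generated-kubelet"; then
            echo "missing: 99-${pool}-generated-kubelet"
            missing=1
        fi
    done
    return "$missing"
}
```

Against a live cluster it could be fed with `oc get mc -o name | sed 's|.*/||' | check_generated_mcs worker master`; with the fix in place, both names should be present after step 2.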
https://github.com/openshift/machine-config-operator/pull/2759
Verified on 4.7.0-0.nightly-2021-10-15-152957, using the test steps from comment 5.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.7.36 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:3931