Bug 2009682
| Summary: | machine-config-daemon automatically applies new render config and reverts it back on reboot | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Himanshu Madhavani <himadhav> |
| Component: | Machine Config Operator | Assignee: | Yu Qi Zhang <jerzhang> |
| Machine Config Operator sub component: | Machine Config Operator | QA Contact: | Rio Liu <rioliu> |
| Status: | CLOSED INSUFFICIENT_DATA | Docs Contact: | |
| Severity: | medium | ||
| Priority: | medium | CC: | aos-bugs, apjagtap, harpatil, jerzhang, mkrejci, nagrawal, palonsor, skumari |
| Version: | 4.6 | ||
| Target Milestone: | --- | ||
| Target Release: | 4.10.0 | ||
| Hardware: | Unspecified | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-01-17 21:53:17 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Himanshu Madhavani
2021-10-01 10:10:05 UTC
Must-gather data is available in supportshell, attached to the case. To add some more information on the above issue:
System was updated to platform 4.6.40 on 28/08/2021.
The machine-config-controller had been restarted about 50 minutes prior to applying the kubeletconfig.
****************************************
A new kubeletconfig changing the systemReserved memory of the master nodes was applied to the master MCP. Once the master nodes had successfully rebooted, the infra and router-pre nodes unexpectedly started to reboot as well.
****************************************
The following YAML was applied:
~~~
$ cat set-allocatable-master.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-allocatable-master
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet-systemreservation: reserved-resources-master
  kubeletConfig:
    systemReserved:
      cpu: 500m
      memory: 5.5Gi
~~~
~~~
$ oc get mcp
NAME         CONFIG                                                UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
infra        rendered-infra-cb54d95db7c87bafa2af73265130cab7       True      False      False      3              3                   3                     0                      245d
master       rendered-master-3e92983a28b6609e55b69ea3653982d5      True      False      False      3              3                   3                     0                      246d
router-pre   rendered-router-pre-c88a1ae5ac364578c5bbd26ffeb3c545  True      False      False      3              3                   3                     0                      50d
worker       rendered-worker-cb54d95db7c87bafa2af73265130cab7      True      False      False      0              0                   0                     0                      246d
worker-pre   rendered-worker-pre-cb54d95db7c87bafa2af73265130cab7  False     True       False      47             27                  47                    0                      233d
worker-pro   rendered-worker-pro-cb54d95db7c87bafa2af73265130cab7  False     True       False      34             15                  34                    0                      233d
$ oc get kubeletconfig
NAME                     AGE
logrotation              158d
set-allocatable          14d
set-allocatable-master   76m
~~~
After that, the controller behaved as if the correct rendered config for router-pre were rendered-router-pre-c3356e1eb7c5a1f950dbdd5c52b16d0c and the correct rendered config for infra were rendered-infra-296e595dd79dd8e6fb76da54d26b7393, so it started rolling them out. Those configs were created on 2021-08-27 (both the creation timestamps and the logs of non-rebooted nodes show this clearly).
~~~
2021-09-24T23:31:02.499863462Z I0924 23:31:02.499822 1 node_controller.go:414] Pool infra: Setting node ocjc4-pqh6j-infra-0-mpmzr target to rendered-infra-296e595dd79dd8e6fb76da54d26b7393
2021-09-24T23:31:02.499994502Z I0924 23:31:02.499981 1 node_controller.go:414] Pool router-pre: Setting node ocjc4-pqh6j-router-pre-6b2w9 target to rendered-router-pre-c3356e1eb7c5a1f950dbdd5c52b16d0c
~~~
Shortly after, it realized that those were not the right rendered configs, so it switched to the correct rendered-router-pre-c88a1ae5ac364578c5bbd26ffeb3c545 and rendered-infra-cb54d95db7c87bafa2af73265130cab7, which were created on 2021-09-10, when set-allocatable was created.
~~~
2021-09-24T23:31:02.609997730Z I0924 23:31:02.609949 1 render_controller.go:522] Pool worker-pro: now targeting: rendered-worker-pro-cb54d95db7c87bafa2af73265130cab7
2021-09-24T23:31:02.632968231Z I0924 23:31:02.632910 1 render_controller.go:522] Pool infra: now targeting: rendered-infra-cb54d95db7c87bafa2af73265130cab7
2021-09-24T23:31:02.680398232Z I0924 23:31:02.680343 1 render_controller.go:522] Pool router-pre: now targeting: rendered-router-pre-c88a1ae5ac364578c5bbd26ffeb3c545
2021-09-24T23:31:07.525145041Z I0924 23:31:07.525088 1 status.go:90] Pool worker: All nodes are updated with rendered-worker-cb54d95db7c87bafa2af73265130cab7
~~~
To add to the confusion, if you diff both files (a visual diff tool such as meld is recommended), the only relevant difference corresponds to the changes introduced by the logrotation kubeletconfig (not set-allocatable, whose dates match, nor set-allocatable-master, which is the one changed at the time of interest).
In short, the machine-config-controller misbehaves during startup with regard to kubeletconfig rendering. These glitches can switch the rendered config of the MCP for long enough to trigger a spurious update.
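For the diff described above, the rendered configs first have to be decoded: rendered MachineConfigs embed file contents as Ignition `data:` URLs with percent-encoding. A minimal decoding sketch follows; note this is an assumption about the storage format (the sample data URL below is fabricated for illustration and is not taken from this cluster):

```python
# Sketch: decode an Ignition-style "data:,<percent-encoded>" URL as found in
# a rendered MachineConfig's .spec.config.storage.files[].contents.source.
# The data URL below is a made-up example, not content from this bug.
from urllib.parse import unquote

def decode_data_url(source):
    """Return the decoded file contents of a plain data: URL."""
    # Split off the "data:...," prefix, then percent-decode the payload.
    _, payload = source.split(",", 1)
    return unquote(payload)

sample = "data:,%7B%22containerLogMaxSize%22%3A%20%2250Mi%22%7D"
print(decode_data_url(sample))  # {"containerLogMaxSize": "50Mi"}
```

Against a live cluster, the `source` string would come from something like `oc get mc <rendered-name> -o jsonpath=...` (exact jsonpath left out here; base64-encoded data URLs would also need a base64 step).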
Below are the reboot times for the router-pre nodes:
----------------------
ocjc4-pqh6j-router-pre-6b2w9
-----------------------------
$ omg logs machine-config-daemon-wrnj6 -c machine-config-daemon|grep reboot
2021-09-24T23:31:54.047424691Z I0924 23:31:54.047319 3481 update.go:1678] initiating reboot: Node will reboot into config rendered-router-pre-c3356e1eb7c5a1f950dbdd5c52b16d0c
2021-09-24T23:35:36.312608646Z I0924 23:35:36.312561 3444 update.go:1678] initiating reboot: Node will reboot into config rendered-router-pre-c88a1ae5ac364578c5bbd26ffeb3c545
----------------------------
ocjc4-pqh6j-router-pre-8prp2
-----------------------------
This node did not reboot as per the logs
$ omg logs machine-config-daemon-msk6d -c machine-config-daemon|grep reboot
2021-08-27T23:53:56.674494856Z I0827 23:53:56.674420 3123886 update.go:1678] initiating reboot: Node will reboot into config rendered-router-pre-c3356e1eb7c5a1f950dbdd5c52b16d0c
2021-09-10T23:45:59.846527148Z I0910 23:45:59.846458 3902 update.go:1678] initiating reboot: Node will reboot into config rendered-router-pre-c88a1ae5ac364578c5bbd26ffeb3c545
============================
ocjc4-pqh6j-router-pre-gwrrs
------------------------------
$ omg logs machine-config-daemon-bn748 -c machine-config-daemon|grep reboot
2021-09-24T23:26:42.381245275Z I0924 23:26:42.381168 3250 update.go:1678] initiating reboot: Node will reboot into config rendered-router-pre-c3356e1eb7c5a1f950dbdd5c52b16d0c
2021-09-24T23:39:42.001537137Z I0924 23:39:42.001485 3376 update.go:1678] initiating reboot: Node will reboot into config rendered-router-pre-c88a1ae5ac364578c5bbd26ffeb3c545
=================================================================================
Infra nodes
--------------------------
ocjc4-pqh6j-infra-0-6n6tj
--------------------------
$ omg logs machine-config-daemon-bt4sr -c machine-config-daemon|grep reboot
2021-09-24T23:26:54.172792871Z I0924 23:26:54.172752 3536 update.go:1678] initiating reboot: Node will reboot into config rendered-infra-296e595dd79dd8e6fb76da54d26b7393
2021-09-24T23:38:08.085630499Z I0924 23:38:08.085600 4041 update.go:1678] initiating reboot: Node will reboot into config rendered-infra-cb54d95db7c87bafa2af73265130cab7
-----------------------------
ocjc4-pqh6j-infra-0-m6s9t
-----------------------------
This machine did not reboot on 24th as per the logs
$ omg logs machine-config-daemon-tsxxv -c machine-config-daemon|grep reboot
2021-08-27T23:59:21.960996406Z I0827 23:59:21.960939 4043758 update.go:1678] initiating reboot: Node will reboot into config rendered-infra-296e595dd79dd8e6fb76da54d26b7393
2021-09-10T23:50:14.607557463Z I0910 23:50:14.607519 4314 update.go:1678] initiating reboot: Node will reboot into config rendered-infra-cb54d95db7c87bafa2af73265130cab7
--------------------------
ocjc4-pqh6j-infra-0-mpmzr
--------------------------
$ omg logs machine-config-daemon-xg9h7 -c machine-config-daemon|grep reboot
2021-09-24T23:32:24.210514498Z I0924 23:32:24.210435 3765 update.go:1678] initiating reboot: Node will reboot into config rendered-infra-296e595dd79dd8e6fb76da54d26b7393
2021-09-24T23:42:38.389579170Z I0924 23:42:38.389548 3996 update.go:1678] initiating reboot: Node will reboot into config rendered-infra-cb54d95db7c87bafa2af73265130cab7
From Comment #3, it appears that the issue started after applying the kubeletconfig. Moving to the node team to analyze whether the problem is related to the kubeletconfig controller.
A few notes:
1. I can see that the controller did at some point attempt to flip like you mentioned, but unfortunately due to the fact that the controller restarted about 50 minutes prior to the must-gather, we no longer have those logs as to why it did the flip.
2. The user in this case has multiple kubeletconfigs per pool. This is not supported in 4.6 due to the inability to merge them or properly handle overwrites. If you decode the rendered configs, you'll see that one of them has
~~~
"containerLogMaxSize": "50Mi",
"systemReserved": {
  "cpu": "500m",
  "ephemeral-storage": "1Gi",
  "memory": "13Gi"
},
~~~
and the other has
~~~
"containerLogMaxSize": "100Mi",
"containerLogMaxFiles": 7,
"systemReserved": {
  "cpu": "500m",
  "ephemeral-storage": "1Gi",
  "memory": "1Gi"
},
~~~
The user has both of those as separate kubeletconfigs: logrotation and set-allocatable (logrotation and set-allocatable-master for the master pool). Right now only one can take effect at any given time; assuming the desired end result is for both to apply together, the user should merge them into one kubeletconfig per pool.
For masters, that would look like:
~~~
spec:
  kubeletConfig:
    systemReserved:
      cpu: 500m
      memory: 5.5Gi
    containerLogMaxFiles: 7
    containerLogMaxSize: 100Mi
~~~
For workers (which propagates to all other pools), it should look like:
~~~
spec:
  kubeletConfig:
    systemReserved:
      cpu: 500m
      memory: 13Gi
    containerLogMaxFiles: 7
    containerLogMaxSize: 100Mi
~~~
Note that in newer versions the overwrite behaviour is less buggy, but there is still no merge, so the above is still required if you want both configs to apply.
My best guess as to what is happening: because there are multiple configs, the controller at some point fetches them in a different order (or reads them one after the other), causing the kubeletconfig controller to update the base kubelet machineconfig with the old config before realizing it should be on the new one. I can't be certain, but it should no longer happen once there is only one kubeletconfig per pool.
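The order sensitivity guessed at above can be sketched in a few lines. This is illustrative only, not MCO code: it assumes 4.6's behaviour is effectively "base config overwritten by whichever KubeletConfig was handled last, no merging", and the `base` values are made up for the example (the logrotation/set-allocatable values are taken from the decoded configs in this bug):

```python
# Illustrative sketch (not MCO code): with no merge semantics, the rendered
# kubelet config is the base overwritten by the last-handled KubeletConfig,
# so an unstable processing order can flip a pool between two renders.

def render_effective(base, kubeletconfigs):
    """Each KubeletConfig replaces the previous one; last writer wins."""
    effective = dict(base)
    for kc in kubeletconfigs:
        effective = {**base, **kc}  # overwrite the base; earlier kcs are lost
    return effective

base = {"systemReserved": {"cpu": "500m", "memory": "1Gi"}}  # hypothetical default
logrotation = {"containerLogMaxSize": "100Mi", "containerLogMaxFiles": 7}
set_allocatable = {"systemReserved": {"cpu": "500m", "memory": "13Gi"}}

# Two sync orders yield two different rendered configs:
print(render_effective(base, [logrotation, set_allocatable]))
print(render_effective(base, [set_allocatable, logrotation]))
```

If the controller's sync order changes across a restart, the pool briefly targets the "other" render, which matches the flip-and-revert reboots seen in the logs.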
Also, I just remembered: I think this is technically non-deterministic in 4.6 (with multiple kubeletconfigs). It depends on the syncing of the object to the MCC, which can be triggered by e.g. a timestamp update or a controller update, I believe.
Sorry I didn't get a chance to get back to this earlier. Could you please follow the steps in this comment: https://bugzilla.redhat.com/show_bug.cgi?id=2009682#c5 and let me know if that works? I think we consider this a mix of user error and buggy behaviour. Please only have one kubeletconfig per pool.
Hi, given that there has been no response, I am going to close this as INSUFFICIENT_DATA for now. If you are able to apply the fix (i.e. ensure there is only one kubeletconfig per pool, merging the necessary kubeletconfigs together) and still see issues with the cluster, please provide updated information and reopen the bug. Thank you.
It was not possible to reproduce it again, so it is OK to leave this closed until we can reproduce. Thanks for your help.