Bug 2009682 - machine-config-daemon automatically applies new render config and reverts it back on reboot
Summary: machine-config-daemon automatically applies new render config and reverts it ...
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.6
Hardware: Unspecified
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: Yu Qi Zhang
QA Contact: Rio Liu
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-10-01 10:10 UTC by Himanshu Madhavani
Modified: 2022-01-18 08:24 UTC (History)
CC List: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-01-17 21:53:17 UTC
Target Upstream Version:
Embargoed:



Description Himanshu Madhavani 2021-10-01 10:10:05 UTC
Note:
1. If you are dealing with Machines or MachineSet objects, please select the component as "Cloud Compute" under the same product.
2. If you are dealing with kubelet / kubeletconfigs / container runtime configs, please select the component as "Node" under the same product.

Description of problem:

The MCD applied a new rendered config even though the change was not targeted at this node's labels or pool, and then reverted it back to the original config after a reboot.

2021-09-10T23:45:59.652474153Z I0910 23:45:59.652406    3807 update.go:1678] initiating reboot: Node will reboot into config rendered-infra-cb54d95db7c87bafa2af73265130cab7
2021-09-24T23:26:54.172792871Z I0924 23:26:54.172752    3536 update.go:1678] initiating reboot: Node will reboot into config rendered-infra-296e595dd79dd8e6fb76da54d26b7393
2021-09-24T23:38:08.085630499Z I0924 23:38:08.085600    4041 update.go:1678] initiating reboot: Node will reboot into config rendered-infra-cb54d95db7c87bafa2af73265130cab7

Looking at the log above, the node rebooted from rendered-infra-cb54d95db7c87bafa2af73265130cab7 into rendered-infra-296e595dd79dd8e6fb76da54d26b7393, and then reverted back to rendered-infra-cb54d95db7c87bafa2af73265130cab7.

We did not apply this change ourselves; it happened automatically. We would like to understand whether this is a bug.
We have must-gather data available. 

If more verbose logs are needed, what would be the ideal steps to collect them?

Version-Release number of MCO (Machine Config Operator) (if applicable):

Platform (AWS, VSphere, Metal, etc.):

Are you certain that the root cause of the issue being reported is the MCO (Machine Config Operator)?
(Y/N/Not sure):

How reproducible:

Actual results:

- MCO applied changes to a pool whose labels did not match the new kubeletconfig.
- MCO later reverted the config back to the original, but it is unclear why this happened in the first place.

Expected results:
- MCO should not apply changes to pools whose labels do not match.


Additional info:

1. Please consider attaching a must-gather archive (via oc adm must-gather). Please review must-gather contents for sensitive information before attaching any must-gathers to a Bugzilla report. You may also mark the bug private if you wish.

Attaching must gather data

Comment 2 Himanshu Madhavani 2021-10-04 17:56:32 UTC
Must-gather data is available in supportshell to the attached case.

Comment 3 Himanshu Madhavani 2021-10-20 12:45:04 UTC
Just to add some more information on the above issue.

The cluster was updated to OpenShift 4.6.40 on 2021-08-28.

The machine-config-controller had been restarted about 50 minutes before the kubeletconfig below was applied.


****************************************
A new kubeletconfig to change the systemReserved memory of the master nodes was applied; I applied it to the master MCP. Once the master nodes had successfully rebooted, the infra and router-pre nodes started to reboot.
****************************************

Below yaml was applied

$ cat set-allocatable-master.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-allocatable-master
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet-systemreservation: reserved-resources-master
  kubeletConfig:
    systemReserved:
      cpu: 500m
      memory: 5.5Gi
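
For this selector to have any effect, the master MCP must carry the matching label. A hedged sketch of how that label would typically be set (the label key and value come from the machineConfigPoolSelector above; whether the label was already present on this cluster is not shown here):

$ oc label mcp master custom-kubelet-systemreservation=reserved-resources-master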

$ oc get mcp
NAME         CONFIG                                                 UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
infra        rendered-infra-cb54d95db7c87bafa2af73265130cab7        True      False      False      3              3                   3                     0                      245d
master       rendered-master-3e92983a28b6609e55b69ea3653982d5       True      False      False      3              3                   3                     0                      246d
router-pre   rendered-router-pre-c88a1ae5ac364578c5bbd26ffeb3c545   True      False      False      3              3                   3                     0                      50d
worker       rendered-worker-cb54d95db7c87bafa2af73265130cab7       True      False      False      0              0                   0                     0                      246d
worker-pre   rendered-worker-pre-cb54d95db7c87bafa2af73265130cab7   False     True       False      47             27                  47                    0                      233d
worker-pro   rendered-worker-pro-cb54d95db7c87bafa2af73265130cab7   False     True       False      34             15                  34                    0                      233d

$ oc get kubeletconfig
NAME                     AGE
logrotation              158d
set-allocatable          14d
set-allocatable-master   76m
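
A hedged way to confirm which MachineConfigs (including the generated kubelet one) feed the rendered config a pool is currently targeting (the pool name is from the output above; .status.configuration.source is the standard MCP status field listing the source MachineConfigs):

$ oc get mcp infra -o jsonpath='{range .status.configuration.source[*]}{.name}{"\n"}{end}'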



After that, the controller behaved as if the correct rendered config for router-pre were rendered-router-pre-c3356e1eb7c5a1f950dbdd5c52b16d0c and the correct rendered config for infra were rendered-infra-296e595dd79dd8e6fb76da54d26b7393, so it started rolling them out. Those configs were created on 2021-08-27 (both their creation timestamps and the logs of the non-rebooted nodes show this clearly).


~~~
2021-09-24T23:31:02.499863462Z I0924 23:31:02.499822       1 node_controller.go:414] Pool infra: Setting node ocjc4-pqh6j-infra-0-mpmzr target to rendered-infra-296e595dd79dd8e6fb76da54d26b7393
2021-09-24T23:31:02.499994502Z I0924 23:31:02.499981       1 node_controller.go:414] Pool router-pre: Setting node ocjc4-pqh6j-router-pre-6b2w9 target to rendered-router-pre-c3356e1eb7c5a1f950dbdd5c52b16d0c
~~~

Shortly after, it realized that those were not the right rendered configs, so it switched to the correct rendered-router-pre-c88a1ae5ac364578c5bbd26ffeb3c545 and rendered-infra-cb54d95db7c87bafa2af73265130cab7, which were created on 2021-09-10 when set-allocatable was created.

~~~
2021-09-24T23:31:02.609997730Z I0924 23:31:02.609949       1 render_controller.go:522] Pool worker-pro: now targeting: rendered-worker-pro-cb54d95db7c87bafa2af73265130cab7
2021-09-24T23:31:02.632968231Z I0924 23:31:02.632910       1 render_controller.go:522] Pool infra: now targeting: rendered-infra-cb54d95db7c87bafa2af73265130cab7
2021-09-24T23:31:02.680398232Z I0924 23:31:02.680343       1 render_controller.go:522] Pool router-pre: now targeting: rendered-router-pre-c88a1ae5ac364578c5bbd26ffeb3c545
2021-09-24T23:31:07.525145041Z I0924 23:31:07.525088       1 status.go:90] Pool worker: All nodes are updated with rendered-worker-cb54d95db7c87bafa2af73265130cab7
~~~

To add to the confusion, if you diff both rendered configs (I'd recommend a visual diff tool like meld), you can see that the only relevant difference corresponds to the changes introduced by the logrotation kubeletconfig, not by set-allocatable (whose dates match) or set-allocatable-master (which is the one changed at the time of interest).
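
For reference, a hedged sketch of how the two rendered configs can be pulled and compared (the rendered config names are the ones discussed in this report; meld is just one possible diff tool):

$ oc get machineconfig rendered-infra-296e595dd79dd8e6fb76da54d26b7393 -o yaml > infra-old.yaml
$ oc get machineconfig rendered-infra-cb54d95db7c87bafa2af73265130cab7 -o yaml > infra-new.yaml
$ meld infra-old.yaml infra-new.yaml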

So in short, the machine-config-controller behaves erratically during its startup with regard to kubeletconfig rendering. Those glitches may switch the rendered config of an MCP for long enough that a spurious update (and reboot) is triggered.
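
A hedged way to catch such a temporary switch on a live cluster is to compare the pool's target with a node's desired config (the pool and node names are taken from this report; the annotation is the standard MCO node annotation):

$ oc get mcp infra -o jsonpath='{.spec.configuration.name}{"\n"}'
$ oc get node ocjc4-pqh6j-infra-0-mpmzr -o jsonpath='{.metadata.annotations.machineconfiguration\.openshift\.io/desiredConfig}{"\n"}'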


Below are the reboot times for the router-pre nodes:

----------------------
ocjc4-pqh6j-router-pre-6b2w9
-----------------------------

$ omg logs machine-config-daemon-wrnj6 -c machine-config-daemon|grep reboot

2021-09-24T23:31:54.047424691Z I0924 23:31:54.047319    3481 update.go:1678] initiating reboot: Node will reboot into config rendered-router-pre-c3356e1eb7c5a1f950dbdd5c52b16d0c
2021-09-24T23:35:36.312608646Z I0924 23:35:36.312561    3444 update.go:1678] initiating reboot: Node will reboot into config rendered-router-pre-c88a1ae5ac364578c5bbd26ffeb3c545


----------------------------
ocjc4-pqh6j-router-pre-8prp2
-----------------------------

This node did not reboot as per the logs


$ omg logs machine-config-daemon-msk6d -c machine-config-daemon|grep reboot
2021-08-27T23:53:56.674494856Z I0827 23:53:56.674420 3123886 update.go:1678] initiating reboot: Node will reboot into config rendered-router-pre-c3356e1eb7c5a1f950dbdd5c52b16d0c
2021-09-10T23:45:59.846527148Z I0910 23:45:59.846458    3902 update.go:1678] initiating reboot: Node will reboot into config rendered-router-pre-c88a1ae5ac364578c5bbd26ffeb3c545

----------------------------
ocjc4-pqh6j-router-pre-gwrrs
-----------------------------


$ omg logs machine-config-daemon-bn748 -c machine-config-daemon|grep reboot
2021-09-24T23:26:42.381245275Z I0924 23:26:42.381168    3250 update.go:1678] initiating reboot: Node will reboot into config rendered-router-pre-c3356e1eb7c5a1f950dbdd5c52b16d0c
2021-09-24T23:39:42.001537137Z I0924 23:39:42.001485    3376 update.go:1678] initiating reboot: Node will reboot into config rendered-router-pre-c88a1ae5ac364578c5bbd26ffeb3c545


=================================================================================

Infra nodes

--------------------------
ocjc4-pqh6j-infra-0-6n6tj
--------------------------

$ omg logs machine-config-daemon-bt4sr -c machine-config-daemon|grep reboot

2021-09-24T23:26:54.172792871Z I0924 23:26:54.172752    3536 update.go:1678] initiating reboot: Node will reboot into config rendered-infra-296e595dd79dd8e6fb76da54d26b7393
2021-09-24T23:38:08.085630499Z I0924 23:38:08.085600    4041 update.go:1678] initiating reboot: Node will reboot into config rendered-infra-cb54d95db7c87bafa2af73265130cab7


-----------------------------
ocjc4-pqh6j-infra-0-m6s9t
-----------------------------

This machine did not reboot on the 24th, as per the logs.

$ omg logs machine-config-daemon-tsxxv -c machine-config-daemon|grep reboot
2021-08-27T23:59:21.960996406Z I0827 23:59:21.960939 4043758 update.go:1678] initiating reboot: Node will reboot into config rendered-infra-296e595dd79dd8e6fb76da54d26b7393
2021-09-10T23:50:14.607557463Z I0910 23:50:14.607519    4314 update.go:1678] initiating reboot: Node will reboot into config rendered-infra-cb54d95db7c87bafa2af73265130cab7


--------------------------
ocjc4-pqh6j-infra-0-mpmzr
--------------------------

$ omg logs machine-config-daemon-xg9h7 -c machine-config-daemon|grep reboot
2021-09-24T23:32:24.210514498Z I0924 23:32:24.210435    3765 update.go:1678] initiating reboot: Node will reboot into config rendered-infra-296e595dd79dd8e6fb76da54d26b7393
2021-09-24T23:42:38.389579170Z I0924 23:42:38.389548    3996 update.go:1678] initiating reboot: Node will reboot into config rendered-infra-cb54d95db7c87bafa2af73265130cab7

Comment 4 Sinny Kumari 2021-10-25 10:23:16 UTC
From comment #3, it appears that the issue started after applying the kubeletconfig. Moving to the Node team to analyze whether the problem is related to the kubeletconfig controller.

Comment 5 Yu Qi Zhang 2021-11-02 19:26:55 UTC
So a few notes:

1. I can see that the controller did at some point attempt to flip configs as you mentioned, but unfortunately, because the controller restarted about 50 minutes before the must-gather was collected, we no longer have the logs showing why it flipped.

2. The user in this case has multiple kubeletconfigs per pool. This is not actually supported in 4.6, due to the inability to merge them or properly handle overwriting. If you decode the rendered configs, you'll see that one of them has

  "containerLogMaxSize": "50Mi",
  "systemReserved": {
    "cpu": "500m",
    "ephemeral-storage": "1Gi",
    "memory": "13Gi"
  },

and one of them has 

  "containerLogMaxSize": "100Mi",
  "containerLogMaxFiles": 7,
  "systemReserved": {
    "cpu": "500m",
    "ephemeral-storage": "1Gi",
    "memory": "1Gi"
  },

The user has both of those as separate kubeletconfigs: logrotation and set-allocatable (logrotation and set-allocatable-master for the master pool). Right now only one can take effect at any given time, so assuming the desired end result is that both apply together, the user should merge them into one kubeletconfig per pool.

For masters, that would look like:

spec:
  kubeletConfig:
    systemReserved:
      cpu: 500m
      memory: 5.5Gi
    containerLogMaxFiles: 7
    containerLogMaxSize: 100Mi

For workers (which propagates to all other nodes), it should look like:

spec:
  kubeletConfig:
    systemReserved:
      cpu: 500m
      memory: 13Gi
    containerLogMaxFiles: 7
    containerLogMaxSize: 100Mi

Note that in newer versions the overwrite behaviour is less buggy, but there is still no merge, so you must still combine the configs as above if you want both to apply.
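
For clarity, a complete merged KubeletConfig for the master pool might look like the sketch below. This is not a manifest taken from the cluster: the name and machineConfigPoolSelector are reused from the set-allocatable-master yaml in comment 3, and the values are the merged ones listed above.

apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-allocatable-master
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet-systemreservation: reserved-resources-master
  kubeletConfig:
    systemReserved:
      cpu: 500m
      memory: 5.5Gi
    containerLogMaxFiles: 7
    containerLogMaxSize: 100Mi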

My best guess as to what is happening is that, because there are multiple configs, the controller is at some point fetching them in a different order (or reading them one after the other), causing the kubeletconfig controller to update the base kubelet machineconfig with the old one before realizing it should be on the new one. I can't be certain, but it should no longer happen once there is only one kubeletconfig per pool.

Comment 6 Yu Qi Zhang 2021-11-02 19:41:21 UTC
Also, I just remembered: having multiple kubeletconfigs is technically not deterministic in 4.6. Which one wins depends on when the objects are synced to the MCC, which can be triggered by e.g. a timestamp update or a controller update, I believe.
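
As a hedged way to see which kubeletconfig is currently winning for a pool, you can list the generated MachineConfigs (these are typically named 99-<pool>-generated-kubelet, though the exact naming can differ by version):

$ oc get machineconfig | grep generated-kubelet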

Comment 8 Yu Qi Zhang 2021-12-06 22:11:30 UTC
Sorry I didn't get a chance to get back to this earlier. Could you please follow the steps in this comment: https://bugzilla.redhat.com/show_bug.cgi?id=2009682#c5 and let me know if that works? I think we consider this a mix of user error and buggy behaviour. Please have only one kubeletconfig per pool.

Comment 9 Yu Qi Zhang 2022-01-17 21:53:17 UTC
Hi, given that there has been no response, I am going to close this as INSUFFICIENT_DATA for now. If you are able to apply the fix (i.e. ensure there is only one kubeletconfig per pool, merging the necessary kubeletconfigs together) and still see issues with the cluster, please provide updated information and reopen the bug. Thank you.

Comment 10 Pablo Alonso Rodriguez 2022-01-18 08:24:27 UTC
It was not possible to reproduce the issue again, so it is OK to leave this closed until we can reproduce it.

Thanks for your help.

