Description of problem:

machine-config cluster operator degraded due to controller version mismatch

~~~
$ omg get co machine-config -o yaml
...
  - lastTransitionTime: '2021-04-22T22:41:00Z'
    message: 'Unable to apply 4.6.25: timed out waiting for the condition during syncRequiredMachineConfigPools: pool master has not progressed to latest configuration: controller version mismatch for 98-master-generated-kubelet expected d5dc2b519aed5b3ed6a6ab9e7f70f33740f9f8af has 14a2b82d9f4c4d8b423f8f05f6926778ef36870d: all 3 nodes are at latest configuration rendered-master-381b6c37f8f8020f2e740ba44a1460a2, retrying'
    reason: RequiredPoolsFailed
    status: 'True'
    type: Degraded
...
  extension:
    lastSyncError: 'pool master has not progressed to latest configuration: controller version mismatch for 98-master-generated-kubelet expected d5dc2b519aed5b3ed6a6ab9e7f70f33740f9f8af has 14a2b82d9f4c4d8b423f8f05f6926778ef36870d: all 3 nodes are at latest configuration rendered-master-381b6c37f8f8020f2e740ba44a1460a2, retrying'
    master: all 3 nodes are at latest configuration rendered-master-381b6c37f8f8020f2e740ba44a1460a2
    worker: all 13 nodes are at latest configuration rendered-worker-e08dcb17ae6631b16767bdd8b61c8e93
...
~~~

Version-Release number of selected component (if applicable):

Version: 4.6.25
Version: 4.6.18

Steps to Reproduce:

1. Upgrade to 4.6.25 from 4.6.18

Actual results:

~~~
$ omg get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             True        True          2m52s   Unable to apply 4.6.25: the cluster operator machine-config has not yet successfully rolled out
~~~

~~~
$ omg get co
NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
machine-config   4.6.18    False       True          True       21h
~~~

Expected results:

Upgrade to 4.6.25 successfully.

Additional info:

Attached the "01-master-kubelet_content.json", "98-master-generated-kubelet_content.json" and "machine-config-operator-57c965559d-66sl2.log" files
Hi,

The linked error

```
'Unable to apply 4.6.25: timed out waiting for the condition during syncRequiredMachineConfigPools: pool master has not progressed to latest configuration: controller version mismatch for 98-master-generated-kubelet expected d5dc2b519aed5b3ed6a6ab9e7f70f33740f9f8af has 14a2b82d9f4c4d8b423f8f05f6926778ef36870d: all 3 nodes are at latest configuration rendered-master-381b6c37f8f8020f2e740ba44a1460a2, retrying'
```

is basically saying that a previous version of the MCO created a machineconfig based on a kubeletconfig, but the new version did not regenerate it, as seen in the output of your later command:

    98-master-generated-kubelet   14a2b82d9f4c4d8b423f8f05f6926778ef36870d   3.1.0   10d
    98-worker-generated-kubelet   eab9c35dfbeb0d21be6e1db3887acbbb93592d34   3.1.0   10d

That is very odd: neither the master nor the worker generated-kubelet machineconfig was ever regenerated by the new version (d5dc2b519aed5b3ed6a6ab9e7f70f33740f9f8af), as all the other non-rendered configs were.

I have a few questions:

1. Were those ever modified manually?
2. Could you post the kubeletconfigs on the system?
3. Could you post the machine-config-controller pod logs? (oc logs -n openshift-machine-config-operator machine-config-controller-xxx)
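In case it helps with triage, here is a minimal sketch of how the mismatch above could be checked across all machineconfigs at once. It assumes the controller hash shown in the listing comes from the `machineconfiguration.openshift.io/generated-by-controller-version` annotation, and that the input is the JSON printed by `oc get machineconfig -o json`. The function name `find_stale_machineconfigs` and the trimmed sample data are hypothetical and only for illustration; the `01-master-kubelet` entry in particular is a made-up example of a config that was regenerated.

```python
# Annotation assumed to carry the hash of the controller that generated the config.
ANNOTATION = "machineconfiguration.openshift.io/generated-by-controller-version"


def find_stale_machineconfigs(mc_list, expected_version):
    """Return (name, version) pairs for non-rendered machineconfigs whose
    generated-by annotation differs from expected_version."""
    stale = []
    for item in mc_list.get("items", []):
        meta = item.get("metadata", {})
        name = meta.get("name", "")
        # Rendered configs are skipped: old ones legitimately keep old hashes.
        if name.startswith("rendered-"):
            continue
        version = (meta.get("annotations") or {}).get(ANNOTATION)
        # Configs without the annotation (e.g. user-created ones) are ignored.
        if version is not None and version != expected_version:
            stale.append((name, version))
    return stale


# Hypothetical, trimmed stand-in for `oc get machineconfig -o json` output.
EXPECTED = "d5dc2b519aed5b3ed6a6ab9e7f70f33740f9f8af"
SAMPLE = {
    "items": [
        {"metadata": {"name": "01-master-kubelet",
                      "annotations": {ANNOTATION: EXPECTED}}},
        {"metadata": {"name": "98-master-generated-kubelet",
                      "annotations": {ANNOTATION: "14a2b82d9f4c4d8b423f8f05f6926778ef36870d"}}},
        {"metadata": {"name": "98-worker-generated-kubelet",
                      "annotations": {ANNOTATION: "eab9c35dfbeb0d21be6e1db3887acbbb93592d34"}}},
    ]
}

for name, version in find_stale_machineconfigs(SAMPLE, EXPECTED):
    print(f"{name}: generated by {version}, expected {EXPECTED}")
```

On a live cluster the same function could be fed `json.load(sys.stdin)` with the `oc` output piped in; any name it prints is a machineconfig the current controller has not regenerated.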