Description of problem:

The environment has only 5 kubelet configs, but when I try to create an additional KubeletConfig I get an error: "max number of supported kubelet config (10) has been reached. Please delete old kubelet configs before retrying".

Version-Release number of selected component (if applicable):
master

How reproducible:
Always

Steps to Reproduce:
1. Create 10 KubeletConfigs.
2. Delete a kubeletconfig with the lowest suffix (machineconfiguration.openshift.io/mc-name-suffix: 0).
3. Try to create an additional KubeletConfig.
4. Check the generated machine configs.

Actual results:
Got the error: max number of supported kubelet config (10) has been reached. Please delete old kubelet configs before retrying

Expected results:
The controller should succeed in creating a machine config.

Additional info:
The problem is that the code does not account for the possibility that some kubeletconfigs may have been deleted. Relevant code:

	// If we are here, this means that a new kubelet config was created, so we have to calculate the suffix value for its MC name
	suffixNum := 0
	// Go through the list of kubelet config objects created and get the max suffix value currently created
	for _, item := range kcList.Items {
		val, ok := item.GetAnnotations()[ctrlcommon.MCNameSuffixAnnotationKey]
		if ok {
			// Convert the suffix value to int so we can look through the list and grab the max suffix created so far
			intVal, err := strconv.Atoi(val)
			if err != nil {
				return "", fmt.Errorf("error converting %s to int: %v", val, err)
			}
			if intVal > suffixNum {
				suffixNum = intVal
			}
		}
	}
	// The max suffix value that we can go till with this logic is 9 - this means that a user can create up to 10 different kubelet config CRs.
	// However, if there is a kc-1 mapping to mc-1 and kc-2 mapping to mc-2 and the user deletes kc-1, it will delete mc-1 but
	// then if the user creates a kc-new it will map to mc-3. This is what we want as the latest kubelet config created should be higher in priority
	// so that those changes can be rolled out to the nodes. But users will have to be mindful of how many kubelet config CRs they create. Don't think
	// anyone should ever have the need to create 10 when they can simply update an existing kubelet config unless it is to apply to another pool.
	if suffixNum+1 > 9 {
		return "", fmt.Errorf("max number of supported kubelet config (10) has been reached. Please delete old kubelet configs before retrying")
	}
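For illustration only, a rough sketch of one possible direction for a fix, reusing the names from the snippet above (this is not a tested patch, and note the priority caveat in the trailing comment): instead of taking max(existing suffixes)+1, pick the smallest suffix not used by any live KubeletConfig, so deleted configs free their slot.

	const maxSuffix = 9

	// Collect the suffixes still in use by live kubelet configs.
	used := map[int]bool{}
	for _, item := range kcList.Items {
		if val, ok := item.GetAnnotations()[ctrlcommon.MCNameSuffixAnnotationKey]; ok {
			intVal, err := strconv.Atoi(val)
			if err != nil {
				return "", fmt.Errorf("error converting %s to int: %v", val, err)
			}
			used[intVal] = true
		}
	}
	// Take the first free slot, so a suffix freed by a deletion is reused.
	suffixNum := -1
	for i := 0; i <= maxSuffix; i++ {
		if !used[i] {
			suffixNum = i
			break
		}
	}
	if suffixNum == -1 {
		return "", fmt.Errorf("max number of supported kubelet config (10) has been reached. Please delete old kubelet configs before retrying")
	}
	// Caveat: a reused low suffix produces an MC that sorts *before* existing
	// higher-suffix MCs, so the newest kubelet config would no longer win on
	// priority. That trade-off would need to be resolved in a real fix.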
Are we not able to have more than (6) individual, unique MCPs, each with a PAO (Performance Addon Operator) profile? I am able to trigger this bug in my environment as well, by adding (2) additional MCPs/PAOs to the (4) I had before.

Output from before creating the new MCPs/PAOs:

$ oc get mc | grep kube
01-master-kubelet                           29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   18d
01-worker-kubelet                           29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   18d
99-ran-du-fec2-smci00-generated-kubelet-7   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   4d9h
99-ran-du-fec3-dell03-generated-kubelet-8   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   3d20h
99-ran-du-ldc1-smci01-generated-kubelet-5   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   6d11h
99-ran-du-ldc1-smci02-generated-kubelet-6   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   4d15h

$ oc get kubeletconfig
NAME                                      AGE
performance-ran-du-fec2-smci00-profile0   4d9h
performance-ran-du-fec3-dell03-profile0   3d20h
performance-ran-du-ldc1-smci01-profile0   6d11h
performance-ran-du-ldc1-smci02-profile0   4d15h

After creating (1) additional MCP/PAO, one called "ran-du-fec4-dell10", everything looks OK:

$ oc get mc | grep kubelet
01-master-kubelet                           29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   18d
01-worker-kubelet                           29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   18d
99-ran-du-fec2-smci00-generated-kubelet-7   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   4d10h
99-ran-du-fec3-dell03-generated-kubelet-8   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   3d22h
99-ran-du-fec4-dell10-generated-kubelet-9   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   3s
99-ran-du-ldc1-smci01-generated-kubelet-5   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   6d12h
99-ran-du-ldc1-smci02-generated-kubelet-6   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   4d16h

$ oc get kubeletconfig
NAME                                      AGE
performance-ran-du-fec2-smci00-profile0   4d11h
performance-ran-du-fec3-dell03-profile0   3d22h
performance-ran-du-fec4-dell10-profile0   5s
performance-ran-du-ldc1-smci01-profile0   6d13h
performance-ran-du-ldc1-smci02-profile0   4d16h

After creating (1) more MCP/PAO, one called "ran-du-fec5-dell11", I notice there is no new generated MC (no fec5):

$ oc get mc | grep kubelet
01-master-kubelet                           29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   18d
01-worker-kubelet                           29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   18d
99-ran-du-fec2-smci00-generated-kubelet-7   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   4d10h
99-ran-du-fec3-dell03-generated-kubelet-8   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   3d22h
99-ran-du-fec4-dell10-generated-kubelet-9   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   3m9s
99-ran-du-ldc1-smci01-generated-kubelet-5   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   6d12h
99-ran-du-ldc1-smci02-generated-kubelet-6   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   4d16h

There is a new kubeletconfig:

$ oc get kubeletconfig
NAME                                      AGE
performance-ran-du-fec2-smci00-profile0   4d11h
performance-ran-du-fec3-dell03-profile0   3d22h
performance-ran-du-fec4-dell10-profile0   55s
performance-ran-du-fec5-dell11-profile0   4s
performance-ran-du-ldc1-smci01-profile0   6d13h
performance-ran-du-ldc1-smci02-profile0   4d16h

However, the new PAO profile exhibits the problem described in the summary:

$ oc describe performanceprofiles.performance.openshift.io ran-du-fec5-dell11-profile0
Name:         ran-du-fec5-dell11-profile0
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  performance.openshift.io/v2
Kind:         PerformanceProfile
Metadata:
  Creation Timestamp:  2021-08-16T13:19:25Z
  Finalizers:
    foreground-deletion
  Generation:  1
  Managed Fields:
    API Version:  performance.openshift.io/v2
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:cpu:
          .:
          f:isolated:
          f:reserved:
        f:hugepages:
          .:
          f:defaultHugepagesSize:
          f:pages:
        f:net:
          .:
          f:devices:
          f:userLevelNetworking:
        f:nodeSelector:
          .:
          f:node-role.kubernetes.io/ran-du-fec5-dell11:
        f:numa:
          .:
          f:topologyPolicy:
        f:realTimeKernel:
          .:
          f:enabled:
    Manager:      kubectl-create
    Operation:    Update
    Time:         2021-08-16T13:19:25Z
    API Version:  performance.openshift.io/v2
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .:
          v:"foreground-deletion":
      f:status:
        .:
        f:conditions:
        f:runtimeClass:
        f:tuned:
    Manager:         performance-operator
    Operation:       Update
    Time:            2021-08-16T13:19:25Z
  Resource Version:  15845626
  UID:               92c13f6c-797b-405b-8419-d7515d86481d
Spec:
  Cpu:
    Isolated:  3-31,35-63
    Reserved:  0-2,32-34
  Hugepages:
    Default Hugepages Size:  1G
    Pages:
      Count:  16
      Node:   0
      Size:   1G
  Net:
    Devices:
      Interface Name:  ens3f0
      Interface Name:  ens3f1
      Interface Name:  ens4f0
      Interface Name:  ens4f1
      Interface Name:  ens4f2
      Interface Name:  ens4f3
      Interface Name:  ens4f4
      Interface Name:  ens4f5
      Interface Name:  ens4f6
      Interface Name:  ens4f7
      Interface Name:  eno8303np0
      Interface Name:  eno8403np1
      Interface Name:  eno8503np2
      Interface Name:  eno8603np3
    User Level Networking:  true
  Node Selector:
    node-role.kubernetes.io/ran-du-fec5-dell11:
  Numa:
    Topology Policy:  best-effort
  Real Time Kernel:
    Enabled:  true
Status:
  Conditions:
    Last Heartbeat Time:   2021-08-16T13:19:53Z
    Last Transition Time:  2021-08-16T13:19:53Z
    Status:                False
    Type:                  Available
    Last Heartbeat Time:   2021-08-16T13:19:53Z
    Last Transition Time:  2021-08-16T13:19:53Z
    Status:                False
    Type:                  Upgradeable
    Last Heartbeat Time:   2021-08-16T13:19:53Z
    Last Transition Time:  2021-08-16T13:19:53Z
    Status:                False
    Type:                  Progressing
    Last Heartbeat Time:   2021-08-16T13:19:53Z
    Last Transition Time:  2021-08-16T13:19:53Z
    Message:               could not get kubelet config key: max number of supported kubelet config (10) has been reached. Please delete old kubelet configs before retrying
    Reason:                KubeletConfig failure
    Status:                True
    Type:                  Degraded
  Runtime Class:  performance-ran-du-fec5-dell11-profile0
  Tuned:          openshift-cluster-node-tuning-operator/openshift-node-performance-ran-du-fec5-dell11-profile0
Events:
  Type    Reason              Age                From                            Message
  ----    ------              ---                ----                            -------
  Normal  Creation succeeded  0s (x13 over 89s)  performance-profile-controller  Succeeded to create all components

Any workarounds?
I attempted a workaround in my environment as suggested by Artyom:

1. Scale down the PAO operator replicas to 0.
2. Pause all MCPs.
3. Delete all kubeletconfigs created by PAO.
4. Scale the PAO replicas back up to 1.
5. Wait until it re-creates all the kubeletconfigs.
6. Unpause all MCPs.

This reset the suffix numbering to 0, which unblocked me:

$ oc get mc | grep kubelet
01-master-kubelet                           29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   18d
01-worker-kubelet                           29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   18d
99-ran-du-fec2-smci00-generated-kubelet-3   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   5m9s
99-ran-du-fec3-dell03-generated-kubelet     29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   5m12s
99-ran-du-ldc1-smci01-generated-kubelet-1   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   5m12s
99-ran-du-ldc1-smci02-generated-kubelet-2   29813c845a4a3ee8e6856713c585aca834e0bf1e   3.2.0   5m10s

Still, there will be environments where we may have almost 500 different MCPs/PAOs, given different hardware types and configurations. This should really be scalable beyond 9 on a per-cluster basis.
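In case it helps someone else, the commands were roughly as follows. The PAO deployment and namespace names are from my environment and may differ in yours, and the grep assumes the PAO-generated kubeletconfigs are the "performance-*" ones, as in the listings above:

$ oc scale deployment/performance-operator --replicas=0 -n openshift-performance-addon-operator
$ for pool in $(oc get mcp -o name); do oc patch "$pool" --type merge -p '{"spec":{"paused":true}}'; done
$ oc delete $(oc get kubeletconfig -o name | grep performance-)
$ oc scale deployment/performance-operator --replicas=1 -n openshift-performance-addon-operator
$ oc get kubeletconfig -w   # wait until all kubeletconfigs are re-created
$ for pool in $(oc get mcp -o name); do oc patch "$pool" --type merge -p '{"spec":{"paused":false}}'; done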
Hi,

When you create a kubelet config, a corresponding machine config (MC) is created for it and applied to the nodes matching the specified pool. MCs are applied in alphanumeric order, so if I have mc-1 and mc-2, mc-2 takes priority and you will only see the configuration from mc-2 applied to your nodes. That is why the number of kubeletconfigs is limited: the max suffix the corresponding MC can go to is "-9". We have documented that when you hit the limit of 10 and need to create more kubeletconfigs, you need to delete your kubeletconfigs in reverse order, i.e. from most recent to oldest. So you would first delete the kubeletconfig that created the "kubelet-mc-9" MC, and so forth. This is how MCs and kubeletconfigs are designed, and the same is true for containerruntimeconfigs. You can find the documentation on this here: https://docs.openshift.com/container-platform/4.7/post_installation_configuration/machine-configuration-tasks.html#create-a-kubeletconfig-crd-to-edit-kubelet-parameters_post-install-machine-configuration-tasks.

Needing 500 different configurations for one cluster seems like a very rare case. A possible workaround could be to create separate machine configs that directly change the kubelet.conf files on the nodes, since kubeletconfig has a limit of 10.
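To make the ordering concrete, here is a small standalone Go illustration with hypothetical MC names (this is not controller code):

	package main

	import (
		"fmt"
		"sort"
	)

	func main() {
		// Rendered MCs are merged in alphanumeric name order, so the name
		// that sorts last wins for any field that multiple MCs set.
		mcs := []string{
			"99-worker-generated-kubelet",   // first kubeletconfig (no suffix)
			"99-worker-generated-kubelet-9", // tenth kubeletconfig
			"99-worker-generated-kubelet-2", // third kubeletconfig
		}
		sort.Strings(mcs)
		fmt.Println(mcs)
		// Output:
		// [99-worker-generated-kubelet 99-worker-generated-kubelet-2 99-worker-generated-kubelet-9]
		// Note that a hypothetical suffix of 10 would sort *before* 2
		// ("-10" < "-2" lexicographically), which is why the suffix stops at 9.
	}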
(In reply to Urvashi Mohnani from comment #5)
> When you create a kubelet config, a corresponding machine config (MC) is
> created for it and applied to the nodes matching the specified pool. MCs
> are applied in alphanumeric order, so if I have mc-1 and mc-2, mc-2 takes
> priority and you will only see the configuration from mc-2 applied to your
> nodes. That is why the number of kubeletconfigs is limited: the max suffix
> the corresponding MC can go to is "-9". We have documented that when you
> hit the limit of 10 and need to create more kubeletconfigs, you need to
> delete your kubeletconfigs in reverse order, i.e. from most recent to
> oldest. So you would first delete the kubeletconfig that created the
> "kubelet-mc-9" MC, and so forth. This is how MCs and kubeletconfigs are
> designed, and the same is true for containerruntimeconfigs.

Does the Performance Addon Operator know how to handle this?
Not really. We could handle it in PAO, but IMHO it should be supported by the KubeletConfig controller; I do not see a reason for such a limitation. Additionally, the controller could enforce the limit of 10 per MCP rather than across all of them. A rough sketch of that idea follows.
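Something like this (a rough sketch only, not a proposed patch; poolName() is a hypothetical helper that resolves which MCP a KubeletConfig targets, newKubeletConfig is the config being created, and kcList and the annotation key are as in the snippet from comment 0):

	// Compute the suffix only over kubelet configs that target the same
	// pool, so each MCP gets its own 0-9 range.
	suffixNum := 0
	for _, item := range kcList.Items {
		if poolName(item) != poolName(newKubeletConfig) {
			continue // other pools' configs don't consume this pool's suffixes
		}
		if val, ok := item.GetAnnotations()[ctrlcommon.MCNameSuffixAnnotationKey]; ok {
			intVal, err := strconv.Atoi(val)
			if err != nil {
				return "", fmt.Errorf("error converting %s to int: %v", val, err)
			}
			if intVal > suffixNum {
				suffixNum = intVal
			}
		}
	}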
Checked on 4.9.0-0.nightly-2021-09-06-055314; created 10 kubeletconfigs.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-09-06-055314   True        False         5h6m    Cluster version is 4.9.0-0.nightly-2021-09-06-055314

$ oc get kubeletconfig
NAME              AGE
set-max-pods-1    4h56m
set-max-pods-10   14m
set-max-pods-2    4h49m
set-max-pods-3    4h41m
set-max-pods-4    4h35m
set-max-pods-5    4h27m
set-max-pods-6    138m
set-max-pods-7    122m
set-max-pods-8    85m
set-max-pods-9    30m

$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             5h28m
00-worker                                          2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             5h28m
01-master-container-runtime                        2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             5h28m
01-master-kubelet                                  2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             5h28m
01-worker-container-runtime                        2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             5h28m
01-worker-kubelet                                  2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             5h28m
99-master-generated-registries                     2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             5h28m
99-master-ssh                                                                                 3.2.0             5h31m
99-worker-generated-kubelet                        2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h56m
99-worker-generated-kubelet-1                      2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h49m
99-worker-generated-kubelet-2                      2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h41m
99-worker-generated-kubelet-3                      2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h35m
99-worker-generated-kubelet-4                      2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h27m
99-worker-generated-kubelet-5                      2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             138m
99-worker-generated-kubelet-6                      2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             122m
99-worker-generated-kubelet-7                      2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             85m
99-worker-generated-kubelet-8                      2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             30m
99-worker-generated-kubelet-9                      2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             14m
99-worker-generated-registries                     2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             5h28m
99-worker-ssh                                                                                 3.2.0             5h31m
rendered-master-2e7cd6479c24109f2e0f5d021c69d103   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             5h28m
rendered-worker-0fc3dd7ff9b7d0204f320d9d303a7c9f   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h35m
rendered-worker-48fd4be36648379fbb4eb1665b9cbf00   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h56m
rendered-worker-4cf6bbd9199d7589b055ddee88619a8e   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             30m
rendered-worker-5e90bda2302d7ef2077d3d4d4833cd72   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h27m
rendered-worker-80b39c8e9894054014bbd4d036b12d73   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             85m
rendered-worker-86113f9fcec81bf6fac7785e8df98168   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             14m
rendered-worker-88d901113a41e0c32c0514c7af3964f6   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             122m
rendered-worker-95f00be90ae402a404a50286ad371f9b   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             138m
rendered-worker-980cc08b06917c8c82c33254181f0092   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             5h28m
rendered-worker-ac354bf54e90d30b86457b82a5e5bed3   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h41m
rendered-worker-d7a533c322c051fcb4066552be840661   2ec816a4aa741821e664fa512ab02f465926c0ab   3.2.0             4h49m

$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-2e7cd6479c24109f2e0f5d021c69d103   True      False      False      3              3                   3                     0                      5h30m
worker   rendered-worker-86113f9fcec81bf6fac7785e8df98168   True      False      False      3              3                   3                     0                      5h30m

$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-156-0.us-east-2.compute.internal     Ready    master   5h31m   v1.22.0-rc.0+75ee307
ip-10-0-157-43.us-east-2.compute.internal    Ready    worker   5h24m   v1.22.0-rc.0+75ee307
ip-10-0-160-231.us-east-2.compute.internal   Ready    master   5h30m   v1.22.0-rc.0+75ee307
ip-10-0-189-205.us-east-2.compute.internal   Ready    worker   5h24m   v1.22.0-rc.0+75ee307
ip-10-0-197-226.us-east-2.compute.internal   Ready    worker   5h24m   v1.22.0-rc.0+75ee307
ip-10-0-201-4.us-east-2.compute.internal     Ready    master   5h30m   v1.22.0-rc.0+75ee307
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759