Description of problem:
When enabling the HighNodeUtilization scheduler profile, the scheduler pods go into CrashLoopBackOff with the error:

"profiles[0].plugins.score.enabled[6]: Invalid value: "NodeResourcesMostAllocated": was removed in version "kubescheduler.config.k8s.io/v1beta2" (KubeSchedulerConfiguration is version "kubescheduler.config.k8s.io/v1beta2")"

Version-Release number of selected component (if applicable):
4.10.0-0.nightly-2021-12-20-231053

How reproducible:
Always

Steps to Reproduce:
1. Install the latest 4.10 cluster.
2. Enable the HighNodeUtilization scheduler profile:
   oc patch Scheduler cluster --type='json' -p='[{"op": "add", "path": "/spec/profile", "value":"HighNodeUtilization"}]'

Actual results:
The scheduler pod goes into CrashLoopBackOff with the error above:

openshift-kube-scheduler-master-00.knarra2712.qe.devcluster.openshift.com   2/3   CrashLoopBackOff   8 (2m51s ago)   19m
openshift-kube-scheduler-master-01.knarra2712.qe.devcluster.openshift.com   3/3   Running            0               80m
openshift-kube-scheduler-master-02.knarra2712.qe.devcluster.openshift.com   3/3   Running            0               81m

Expected results:
The scheduler pod should not go into CrashLoopBackOff, or the correct way to enable HighNodeUtilization from 4.10 on should be documented.

Additional info:
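The crash comes from API validation rejecting a score plugin name that no longer exists in v1beta2. A minimal sketch of that kind of check (the plugin set and error text here are modeled on the message above, not taken from the real validation code):

```go
package main

import "fmt"

// removedInV1beta2 lists score plugins dropped from the v1beta2 config API
// (illustrative subset; see KEP 2458 for the full picture).
var removedInV1beta2 = map[string]bool{
	"NodeResourcesMostAllocated":  true,
	"NodeResourcesLeastAllocated": true,
}

// validateScorePlugins returns an error for the first enabled score plugin
// that was removed from the config API version, mimicking the failure mode
// seen in the scheduler pod logs.
func validateScorePlugins(enabled []string) error {
	for i, name := range enabled {
		if removedInV1beta2[name] {
			return fmt.Errorf("profiles[0].plugins.score.enabled[%d]: Invalid value: %q: was removed in version %q",
				i, name, "kubescheduler.config.k8s.io/v1beta2")
		}
	}
	return nil
}

func main() {
	// A v1beta2 profile that still lists the removed plugin fails validation.
	if err := validateScorePlugins([]string{"ImageLocality", "NodeResourcesMostAllocated"}); err != nil {
		fmt.Println(err)
	}
}
```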
The NodeResourcesMostAllocated plugin was removed as part of https://github.com/kubernetes/kubernetes/pull/101822. For more detail, see the https://github.com/kubernetes/enhancements/tree/master/keps/sig-scheduling/2458-node-resource-score-strategy enhancement.
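Under that enhancement, the dedicated MostAllocated/LeastAllocated score plugins are folded into NodeResourcesFit via a scoringStrategy argument. A minimal v1beta2 fragment expressing the old NodeResourcesMostAllocated behavior would look roughly like this (weights and resource list are illustrative, not the operator's exact defaults):

```
apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: default-scheduler
  pluginConfig:
  - name: NodeResourcesFit
    args:
      scoringStrategy:
        type: MostAllocated
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1
```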
From https://github.com/kubernetes/kubernetes/blob/1727cea64c1d53f7badbc03b0ca77543283e6157/pkg/scheduler/apis/config/v1beta2/default_plugins.go:

```
Score: v1beta2.PluginSet{
	Enabled: []v1beta2.Plugin{
		{Name: names.NodeResourcesBalancedAllocation, Weight: pointer.Int32Ptr(1)},
		{Name: names.ImageLocality, Weight: pointer.Int32Ptr(1)},
		{Name: names.InterPodAffinity, Weight: pointer.Int32Ptr(1)},
		{Name: names.NodeResourcesFit, Weight: pointer.Int32Ptr(1)},
		{Name: names.NodeAffinity, Weight: pointer.Int32Ptr(1)},
		// Weight is doubled because:
		// - This is a score coming from user preference.
		// - It makes its signal comparable to NodeResourcesFit.LeastAllocated.
		{Name: names.PodTopologySpread, Weight: pointer.Int32Ptr(2)},
		{Name: names.TaintToleration, Weight: pointer.Int32Ptr(1)},
	},
},
```

From https://github.com/kubernetes/kubernetes/blob/release-1.22/pkg/scheduler/apis/config/v1beta1/default_plugins.go:

```
Score: &v1beta1.PluginSet{
	Enabled: []v1beta1.Plugin{
		{Name: names.NodeResourcesBalancedAllocation, Weight: pointer.Int32Ptr(1)},
		{Name: names.ImageLocality, Weight: pointer.Int32Ptr(1)},
		{Name: names.InterPodAffinity, Weight: pointer.Int32Ptr(1)},
		{Name: names.NodeResourcesLeastAllocated, Weight: pointer.Int32Ptr(1)},
		{Name: names.NodeAffinity, Weight: pointer.Int32Ptr(1)},
		{Name: names.NodePreferAvoidPods, Weight: pointer.Int32Ptr(10000)},
		// Weight is doubled because:
		// - This is a score coming from user preference.
		// - It makes its signal comparable to NodeResourcesLeastAllocated.
		{Name: names.PodTopologySpread, Weight: pointer.Int32Ptr(2)},
		{Name: names.TaintToleration, Weight: pointer.Int32Ptr(1)},
	},
},
```

NodeResourcesLeastAllocated turns into NodeResourcesFit.
kubescheduler.config.k8s.io/v1beta2 after applying the fix:

```
...
profiles:
- pluginConfig:
  ...
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      kind: NodeResourcesFitArgs
      scoringStrategy:
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1
        type: MostAllocated
    name: NodeResourcesFit
  ...
  plugins:
    ...
    score:
      enabled:
      - name: ImageLocality
        weight: 1
      - name: InterPodAffinity
        weight: 1
      - name: NodeAffinity
        weight: 1
      - name: PodTopologySpread
        weight: 2
      - name: TaintToleration
        weight: 1
      - name: NodeResourcesFit
        weight: 5
    ...
```

No sign of NodeResourcesBalancedAllocation as in the previous case. NodeResourcesLeastAllocated is completely gone; only NodeResourcesFit is kept, with the "type: MostAllocated" configuration.
In the 4.9 case with the HighNodeUtilization profile on:

```
profiles:
- pluginConfig:
  ...
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      kind: NodeResourcesFitArgs
      scoringStrategy:
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1
        type: LeastAllocated
    name: NodeResourcesFit
  ...
  plugins:
    ...
    score:
      enabled:
      - name: NodeResourcesBalancedAllocation
        weight: 1
      - name: ImageLocality
        weight: 1
      - name: InterPodAffinity
        weight: 1
      - name: NodeAffinity
        weight: 1
      - name: NodePreferAvoidPods
        weight: 10000
      - name: PodTopologySpread
        weight: 2
      - name: TaintToleration
        weight: 1
      - name: NodeResourcesMostAllocated
        weight: 0
    ...
```

The NodeResourcesMostAllocated weight is 0, making it appear disabled. However, based on https://github.com/openshift/kubernetes/blob/release-4.9/pkg/scheduler/framework/runtime/framework.go#L293-L299:

```
for _, e := range profile.Plugins.Score.Enabled {
	// a weight of zero is not permitted, plugins can be disabled explicitly
	// when configured.
	f.scorePluginWeight[e.Name] = int(e.Weight)
	if f.scorePluginWeight[e.Name] == 0 {
		f.scorePluginWeight[e.Name] = 1
	}
```

the plugin is enabled as expected. Thus, there is no need to backport the change to 4.9. The NodeResourcesBalancedAllocation plugin is still enabled because https://github.com/openshift/cluster-kube-scheduler-operator/pull/379 has not merged yet.
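The defaulting quoted above can be exercised in isolation. This sketch (with a simplified Plugin type standing in for the real framework type) shows why a score plugin listed with weight 0 is still effectively enabled:

```go
package main

import "fmt"

// Plugin is a simplified stand-in for the scheduler framework's plugin
// entry; only the fields relevant to score weighting are kept.
type Plugin struct {
	Name   string
	Weight int32
}

// scorePluginWeights reproduces the defaulting in framework.go: a configured
// weight of zero is bumped to 1, so listing a plugin with weight 0 does not
// disable it.
func scorePluginWeights(enabled []Plugin) map[string]int {
	weights := make(map[string]int)
	for _, e := range enabled {
		weights[e.Name] = int(e.Weight)
		if weights[e.Name] == 0 {
			weights[e.Name] = 1
		}
	}
	return weights
}

func main() {
	w := scorePluginWeights([]Plugin{
		{Name: "NodeResourcesMostAllocated", Weight: 0},
		{Name: "PodTopologySpread", Weight: 2},
	})
	// The zero weight is defaulted to 1, so the plugin still scores nodes.
	fmt.Println(w["NodeResourcesMostAllocated"], w["PodTopologySpread"])
}
```

This matches the observation above: the 4.9 config's "weight: 0" entry looks disabled but is not, which is why no backport is needed.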
Due to higher priority tasks I have not been able to resolve this issue in time. Moving it to the next sprint.
Tested with the latest nightly build (4.10.0-0.nightly-2022-01-18-044014) and I still see NodeResourcesBalancedAllocation, which is not expected to be present after the bug fix here. As suggested by Jan and Maciej, I am going to wait until bug https://bugzilla.redhat.com/show_bug.cgi?id=2033751 moves to ON_QA before verifying this bug.
Have tried verifying the bug with the build below but still see the NodeResourcesBalancedAllocation parameters, so moving the bug back to ASSIGNED.

```
[knarra@knarra ~]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-22-102609   True        False         3h13m   Cluster version is 4.10.0-0.nightly-2022-01-22-102609
```

```
profiles:
- pluginConfig:
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      kind: DefaultPreemptionArgs
      minCandidateNodesAbsolute: 100
      minCandidateNodesPercentage: 10
    name: DefaultPreemption
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      hardPodAffinityWeight: 1
      kind: InterPodAffinityArgs
    name: InterPodAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      kind: NodeAffinityArgs
    name: NodeAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      kind: NodeResourcesBalancedAllocationArgs
      resources:
      - name: cpu
        weight: 1
      - name: memory
        weight: 1
    name: NodeResourcesBalancedAllocation
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      kind: NodeResourcesFitArgs
      scoringStrategy:
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1
        type: LeastAllocated
    name: NodeResourcesFit
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      defaultingType: System
      kind: PodTopologySpreadArgs
    name: PodTopologySpread
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      bindTimeoutSeconds: 600
      kind: VolumeBindingArgs
    name: VolumeBinding
  plugins:
    bind:
      enabled:
      - name: DefaultBinder
        weight: 0
    filter:
      enabled:
      - name: NodeUnschedulable
        weight: 0
      - name: NodeName
        weight: 0
      - name: TaintToleration
        weight: 0
      - name: NodeAffinity
        weight: 0
      - name: NodePorts
        weight: 0
      - name: NodeResourcesFit
        weight: 0
      - name: VolumeRestrictions
        weight: 0
      - name: EBSLimits
        weight: 0
      - name: GCEPDLimits
        weight: 0
      - name: NodeVolumeLimits
        weight: 0
      - name: AzureDiskLimits
        weight: 0
      - name: VolumeBinding
        weight: 0
      - name: VolumeZone
        weight: 0
      - name: PodTopologySpread
        weight: 0
      - name: InterPodAffinity
        weight: 0
    multiPoint: {}
    permit: {}
    postBind: {}
    postFilter:
      enabled:
      - name: DefaultPreemption
        weight: 0
    preBind:
      enabled:
      - name: VolumeBinding
        weight: 0
    preFilter:
      enabled:
      - name: NodeResourcesFit
        weight: 0
      - name: NodePorts
        weight: 0
      - name: VolumeRestrictions
        weight: 0
      - name: PodTopologySpread
        weight: 0
      - name: InterPodAffinity
        weight: 0
      - name: VolumeBinding
        weight: 0
      - name: NodeAffinity
        weight: 0
    preScore:
      enabled:
      - name: InterPodAffinity
        weight: 0
      - name: PodTopologySpread
        weight: 0
      - name: TaintToleration
        weight: 0
      - name: NodeAffinity
        weight: 0
    queueSort:
      enabled:
      - name: PrioritySort
        weight: 0
    reserve:
      enabled:
      - name: VolumeBinding
        weight: 0
    score:
      enabled:
      - name: NodeResourcesBalancedAllocation
        weight: 1
      - name: ImageLocality
        weight: 1
      - name: InterPodAffinity
        weight: 1
      - name: NodeResourcesFit
        weight: 1
      - name: NodeAffinity
        weight: 1
      - name: PodTopologySpread
        weight: 2
      - name: TaintToleration
        weight: 1
  schedulerName: default-scheduler
```
------------------------------------Configuration File Contents End Here---------------------------------
Verified with the build below. I could successfully enable the HighNodeUtilization profile and did not see any kube-scheduler crash while enabling it.

```
[knarra@knarra ~]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-24-070025   True        False         5h24m   Cluster version is 4.10.0-0.nightly-2022-01-24-070025
```

-------------------------Configuration File Contents Start Here----------------------
```
apiVersion: kubescheduler.config.k8s.io/v1beta3
clientConnection:
  acceptContentTypes: ""
  burst: 100
  contentType: application/vnd.kubernetes.protobuf
  kubeconfig: /etc/kubernetes/static-pod-resources/configmaps/scheduler-kubeconfig/kubeconfig
  qps: 50
enableContentionProfiling: true
enableProfiling: true
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: true
  leaseDuration: 2m17s
  renewDeadline: 1m47s
  resourceLock: configmaps
  resourceName: kube-scheduler
  resourceNamespace: openshift-kube-scheduler
  retryPeriod: 26s
parallelism: 16
percentageOfNodesToScore: 0
podInitialBackoffSeconds: 1
podMaxBackoffSeconds: 10
profiles:
- pluginConfig:
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: DefaultPreemptionArgs
      minCandidateNodesAbsolute: 100
      minCandidateNodesPercentage: 10
    name: DefaultPreemption
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      hardPodAffinityWeight: 1
      kind: InterPodAffinityArgs
    name: InterPodAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeAffinityArgs
    name: NodeAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeResourcesBalancedAllocationArgs
      resources:
      - name: cpu
        weight: 1
      - name: memory
        weight: 1
    name: NodeResourcesBalancedAllocation
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeResourcesFitArgs
      scoringStrategy:
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1
        type: MostAllocated
    name: NodeResourcesFit
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      defaultingType: System
      kind: PodTopologySpreadArgs
    name: PodTopologySpread
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      bindTimeoutSeconds: 600
      kind: VolumeBindingArgs
    name: VolumeBinding
  plugins:
    bind: {}
    filter: {}
    multiPoint:
      enabled:
      - name: PrioritySort
        weight: 0
      - name: NodeUnschedulable
        weight: 0
      - name: NodeName
        weight: 0
      - name: TaintToleration
        weight: 3
      - name: NodeAffinity
        weight: 2
      - name: NodePorts
        weight: 0
      - name: NodeResourcesFit
        weight: 1
      - name: VolumeRestrictions
        weight: 0
      - name: EBSLimits
        weight: 0
      - name: GCEPDLimits
        weight: 0
      - name: NodeVolumeLimits
        weight: 0
      - name: AzureDiskLimits
        weight: 0
      - name: VolumeBinding
        weight: 0
      - name: VolumeZone
        weight: 0
      - name: PodTopologySpread
        weight: 2
      - name: InterPodAffinity
        weight: 2
      - name: DefaultPreemption
        weight: 0
      - name: NodeResourcesBalancedAllocation
        weight: 1
      - name: ImageLocality
        weight: 1
      - name: DefaultBinder
        weight: 0
    permit: {}
    postBind: {}
    postFilter: {}
    preBind: {}
    preFilter: {}
    preScore: {}
    queueSort: {}
    reserve: {}
    score:
      disabled:
      - name: NodeResourcesBalancedAllocation
        weight: 0
      enabled:
      - name: NodeResourcesFit
        weight: 5
  schedulerName: default-scheduler
```
------------------------------------Configuration File Contents End Here---------------------------------

Enabled the LowNodeUtilization profile and saw no errors with it.
-------------------------Configuration File Contents Start Here----------------------
```
apiVersion: kubescheduler.config.k8s.io/v1beta3
clientConnection:
  acceptContentTypes: ""
  burst: 100
  contentType: application/vnd.kubernetes.protobuf
  kubeconfig: /etc/kubernetes/static-pod-resources/configmaps/scheduler-kubeconfig/kubeconfig
  qps: 50
enableContentionProfiling: true
enableProfiling: true
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: true
  leaseDuration: 2m17s
  renewDeadline: 1m47s
  resourceLock: configmaps
  resourceName: kube-scheduler
  resourceNamespace: openshift-kube-scheduler
  retryPeriod: 26s
parallelism: 16
percentageOfNodesToScore: 0
podInitialBackoffSeconds: 1
podMaxBackoffSeconds: 10
profiles:
- pluginConfig:
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: DefaultPreemptionArgs
      minCandidateNodesAbsolute: 100
      minCandidateNodesPercentage: 10
    name: DefaultPreemption
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      hardPodAffinityWeight: 1
      kind: InterPodAffinityArgs
    name: InterPodAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeAffinityArgs
    name: NodeAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeResourcesBalancedAllocationArgs
      resources:
      - name: cpu
        weight: 1
      - name: memory
        weight: 1
    name: NodeResourcesBalancedAllocation
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeResourcesFitArgs
      scoringStrategy:
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1
        type: LeastAllocated
    name: NodeResourcesFit
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      defaultingType: System
      kind: PodTopologySpreadArgs
    name: PodTopologySpread
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      bindTimeoutSeconds: 600
      kind: VolumeBindingArgs
    name: VolumeBinding
  plugins:
    bind: {}
    filter: {}
    multiPoint:
      enabled:
      - name: PrioritySort
        weight: 0
      - name: NodeUnschedulable
        weight: 0
      - name: NodeName
        weight: 0
      - name: TaintToleration
        weight: 3
      - name: NodeAffinity
        weight: 2
      - name: NodePorts
        weight: 0
      - name: NodeResourcesFit
        weight: 1
      - name: VolumeRestrictions
        weight: 0
      - name: EBSLimits
        weight: 0
      - name: GCEPDLimits
        weight: 0
      - name: NodeVolumeLimits
        weight: 0
      - name: AzureDiskLimits
        weight: 0
      - name: VolumeBinding
        weight: 0
      - name: VolumeZone
        weight: 0
      - name: PodTopologySpread
        weight: 2
      - name: InterPodAffinity
        weight: 2
      - name: DefaultPreemption
        weight: 0
      - name: NodeResourcesBalancedAllocation
        weight: 1
      - name: ImageLocality
        weight: 1
      - name: DefaultBinder
        weight: 0
    permit: {}
    postBind: {}
    postFilter: {}
    preBind: {}
    preFilter: {}
    preScore: {}
    queueSort: {}
    reserve: {}
    score: {}
  schedulerName: default-scheduler
```
------------------------------------Configuration File Contents End Here---------------------------------

Enabled the "NoScoring" profile and did not see any issues with it.

-------------------------Configuration File Contents Start Here----------------------
```
apiVersion: kubescheduler.config.k8s.io/v1beta3
clientConnection:
  acceptContentTypes: ""
  burst: 100
  contentType: application/vnd.kubernetes.protobuf
  kubeconfig: /etc/kubernetes/static-pod-resources/configmaps/scheduler-kubeconfig/kubeconfig
  qps: 50
enableContentionProfiling: true
enableProfiling: true
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: true
  leaseDuration: 2m17s
  renewDeadline: 1m47s
  resourceLock: configmaps
  resourceName: kube-scheduler
  resourceNamespace: openshift-kube-scheduler
  retryPeriod: 26s
parallelism: 16
percentageOfNodesToScore: 0
podInitialBackoffSeconds: 1
podMaxBackoffSeconds: 10
profiles:
- pluginConfig:
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: DefaultPreemptionArgs
      minCandidateNodesAbsolute: 100
      minCandidateNodesPercentage: 10
    name: DefaultPreemption
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      hardPodAffinityWeight: 1
      kind: InterPodAffinityArgs
    name: InterPodAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeAffinityArgs
    name: NodeAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeResourcesBalancedAllocationArgs
      resources:
      - name: cpu
        weight: 1
      - name: memory
        weight: 1
    name: NodeResourcesBalancedAllocation
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeResourcesFitArgs
      scoringStrategy:
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1
        type: LeastAllocated
    name: NodeResourcesFit
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      defaultingType: System
      kind: PodTopologySpreadArgs
    name: PodTopologySpread
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      bindTimeoutSeconds: 600
      kind: VolumeBindingArgs
    name: VolumeBinding
  plugins:
    bind: {}
    filter: {}
    multiPoint:
      enabled:
      - name: PrioritySort
        weight: 0
      - name: NodeUnschedulable
        weight: 0
      - name: NodeName
        weight: 0
      - name: TaintToleration
        weight: 3
      - name: NodeAffinity
        weight: 2
      - name: NodePorts
        weight: 0
      - name: NodeResourcesFit
        weight: 1
      - name: VolumeRestrictions
        weight: 0
      - name: EBSLimits
        weight: 0
      - name: GCEPDLimits
        weight: 0
      - name: NodeVolumeLimits
        weight: 0
      - name: AzureDiskLimits
        weight: 0
      - name: VolumeBinding
        weight: 0
      - name: VolumeZone
        weight: 0
      - name: PodTopologySpread
        weight: 2
      - name: InterPodAffinity
        weight: 2
      - name: DefaultPreemption
        weight: 0
      - name: NodeResourcesBalancedAllocation
        weight: 1
      - name: ImageLocality
        weight: 1
      - name: DefaultBinder
        weight: 0
    permit: {}
    postBind: {}
    postFilter: {}
    preBind: {}
    preFilter: {}
    preScore:
      disabled:
      - name: '*'
        weight: 0
    queueSort: {}
    reserve: {}
    score:
      disabled:
      - name: '*'
        weight: 0
  schedulerName: default-scheduler
```
------------------------------------Configuration File Contents End Here---------------------------------

Based on the above, moving the bug to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056