Bug 2035927
Summary: | Cannot enable HighNodeUtilization scheduler profile | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | RamaKasturi <knarra> |
Component: | kube-scheduler | Assignee: | Jan Chaloupka <jchaloup> |
Status: | CLOSED ERRATA | QA Contact: | RamaKasturi <knarra> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 4.10 | CC: | aos-bugs, mfojtik |
Target Milestone: | --- | ||
Target Release: | 4.10.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2022-03-10 16:36:47 UTC | Type: | Bug |
Description
RamaKasturi
2021-12-28 16:35:56 UTC
The NodeResourcesMostAllocated plugin was removed as part of https://github.com/kubernetes/kubernetes/pull/101822. For more detail see the https://github.com/kubernetes/enhancements/tree/master/keps/sig-scheduling/2458-node-resource-score-strategy enhancement.

From https://github.com/kubernetes/kubernetes/blob/1727cea64c1d53f7badbc03b0ca77543283e6157/pkg/scheduler/apis/config/v1beta2/default_plugins.go:

```
Score: v1beta2.PluginSet{
	Enabled: []v1beta2.Plugin{
		{Name: names.NodeResourcesBalancedAllocation, Weight: pointer.Int32Ptr(1)},
		{Name: names.ImageLocality, Weight: pointer.Int32Ptr(1)},
		{Name: names.InterPodAffinity, Weight: pointer.Int32Ptr(1)},
		{Name: names.NodeResourcesFit, Weight: pointer.Int32Ptr(1)},
		{Name: names.NodeAffinity, Weight: pointer.Int32Ptr(1)},
		// Weight is doubled because:
		// - This is a score coming from user preference.
		// - It makes its signal comparable to NodeResourcesFit.LeastAllocated.
		{Name: names.PodTopologySpread, Weight: pointer.Int32Ptr(2)},
		{Name: names.TaintToleration, Weight: pointer.Int32Ptr(1)},
	},
},
```

From https://github.com/kubernetes/kubernetes/blob/release-1.22/pkg/scheduler/apis/config/v1beta1/default_plugins.go:

```
Score: &v1beta1.PluginSet{
	Enabled: []v1beta1.Plugin{
		{Name: names.NodeResourcesBalancedAllocation, Weight: pointer.Int32Ptr(1)},
		{Name: names.ImageLocality, Weight: pointer.Int32Ptr(1)},
		{Name: names.InterPodAffinity, Weight: pointer.Int32Ptr(1)},
		{Name: names.NodeResourcesLeastAllocated, Weight: pointer.Int32Ptr(1)},
		{Name: names.NodeAffinity, Weight: pointer.Int32Ptr(1)},
		{Name: names.NodePreferAvoidPods, Weight: pointer.Int32Ptr(10000)},
		// Weight is doubled because:
		// - This is a score coming from user preference.
		// - It makes its signal comparable to NodeResourcesLeastAllocated.
		{Name: names.PodTopologySpread, Weight: pointer.Int32Ptr(2)},
		{Name: names.TaintToleration, Weight: pointer.Int32Ptr(1)},
	},
},
```

NodeResourcesLeastAllocated turns into NodeResourcesFit.
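To illustrate the rename described above, here is a minimal Go sketch of how the legacy score plugin names might be translated into NodeResourcesFit plus a scoringStrategy type. The `fitStrategy` table and the `Migrate` helper are hypothetical names made up for this illustration; they are not part of the scheduler or the operator.

```go
package main

import "fmt"

// fitStrategy maps legacy score plugin names to the scoringStrategy type
// that NodeResourcesFit takes over. Hypothetical table, for illustration only.
var fitStrategy = map[string]string{
	"NodeResourcesLeastAllocated": "LeastAllocated",
	"NodeResourcesMostAllocated":  "MostAllocated",
}

// Migrate returns the plugin name to enable and the scoringStrategy type to
// set on it. Names that are not legacy resource-scoring plugins pass through
// unchanged with an empty strategy.
func Migrate(name string) (plugin, strategy string) {
	if s, ok := fitStrategy[name]; ok {
		return "NodeResourcesFit", s
	}
	return name, ""
}

func main() {
	for _, n := range []string{
		"NodeResourcesLeastAllocated",
		"NodeResourcesMostAllocated",
		"ImageLocality",
	} {
		p, s := Migrate(n)
		fmt.Printf("%s -> plugin=%s strategy=%q\n", n, p, s)
	}
}
```

The point of the sketch is only that HighNodeUtilization can no longer enable a separate NodeResourcesMostAllocated plugin; it has to configure NodeResourcesFit with the MostAllocated strategy instead.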
kubescheduler.config.k8s.io/v1beta2 after applying the fix:

```
...
profiles:
- pluginConfig:
  ...
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      kind: NodeResourcesFitArgs
      scoringStrategy:
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1
        type: MostAllocated
    name: NodeResourcesFit
  ...
  plugins:
    ...
    score:
      enabled:
      - name: ImageLocality
        weight: 1
      - name: InterPodAffinity
        weight: 1
      - name: NodeAffinity
        weight: 1
      - name: PodTopologySpread
        weight: 2
      - name: TaintToleration
        weight: 1
      - name: NodeResourcesFit
        weight: 5
...
```

No sign of NodeResourcesBalancedAllocation as in the previous case. NodeResourcesLeastAllocated is completely gone; only NodeResourcesFit is kept, with the "type: MostAllocated" configuration.

In the 4.9 case with the HighNodeUtilization profile on:

```
profiles:
- pluginConfig:
  ...
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      kind: NodeResourcesFitArgs
      scoringStrategy:
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1
        type: LeastAllocated
    name: NodeResourcesFit
  ...
  plugins:
    ...
    score:
      enabled:
      - name: NodeResourcesBalancedAllocation
        weight: 1
      - name: ImageLocality
        weight: 1
      - name: InterPodAffinity
        weight: 1
      - name: NodeAffinity
        weight: 1
      - name: NodePreferAvoidPods
        weight: 10000
      - name: PodTopologySpread
        weight: 2
      - name: TaintToleration
        weight: 1
      - name: NodeResourcesMostAllocated
        weight: 0
...
```

The NodeResourcesMostAllocated weight is 0, which makes the plugin appear disabled. However, based on https://github.com/openshift/kubernetes/blob/release-4.9/pkg/scheduler/framework/runtime/framework.go#L293-L299:

```
	for _, e := range profile.Plugins.Score.Enabled {
		// a weight of zero is not permitted, plugins can be disabled explicitly
		// when configured.
		f.scorePluginWeight[e.Name] = int(e.Weight)
		if f.scorePluginWeight[e.Name] == 0 {
			f.scorePluginWeight[e.Name] = 1
		}
```

the plugin is enabled as expected. Thus, there is no need to backport the change to 4.9.
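The zero-weight defaulting from the framework.go excerpt above can be reproduced in isolation. This is a simplified sketch, not the scheduler's own code: `ScorePlugin` and `EffectiveWeights` are stand-in names assumed for the example, and only the "zero becomes one" rule is taken from the excerpt.

```go
package main

import "fmt"

// ScorePlugin is a simplified stand-in for the scheduler's plugin config
// entry (assumed shape, for illustration only).
type ScorePlugin struct {
	Name   string
	Weight int32
}

// EffectiveWeights applies the same rule as the quoted excerpt: a configured
// weight of zero is not permitted and is replaced with one.
func EffectiveWeights(enabled []ScorePlugin) map[string]int {
	weights := make(map[string]int, len(enabled))
	for _, e := range enabled {
		weights[e.Name] = int(e.Weight)
		if weights[e.Name] == 0 {
			weights[e.Name] = 1
		}
	}
	return weights
}

func main() {
	w := EffectiveWeights([]ScorePlugin{
		{Name: "PodTopologySpread", Weight: 2},
		{Name: "NodeResourcesMostAllocated", Weight: 0},
	})
	// Prints "2 1": the zero weight was defaulted to one.
	fmt.Println(w["PodTopologySpread"], w["NodeResourcesMostAllocated"])
}
```

So the 4.9 dump showing "weight: 0" for NodeResourcesMostAllocated still yields an effective weight of 1, which matches the conclusion that the plugin is enabled.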
The NodeResourcesBalancedAllocation plugin is enabled since https://github.com/openshift/cluster-kube-scheduler-operator/pull/379 has not merged yet.

Due to higher priority tasks I have not been able to resolve this issue in time. Moving to the next sprint.

Tested with the latest nightly build, 4.10.0-0.nightly-2022-01-18-044014, and I still see NodeResourcesBalancedAllocation, which is not expected to be present after the bug fix here. As suggested by Jan and Maciej, I am going to wait until the bug https://bugzilla.redhat.com/show_bug.cgi?id=2033751 moves to ON_QA to verify this bug.

I tried verifying the bug with the build below but still see the NodeResourcesBalancedAllocation parameters, so I am moving the bug to the ASSIGNED state.

[knarra@knarra ~]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-22-102609   True        False         3h13m   Cluster version is 4.10.0-0.nightly-2022-01-22-102609

profiles:
- pluginConfig:
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      kind: DefaultPreemptionArgs
      minCandidateNodesAbsolute: 100
      minCandidateNodesPercentage: 10
    name: DefaultPreemption
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      hardPodAffinityWeight: 1
      kind: InterPodAffinityArgs
    name: InterPodAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      kind: NodeAffinityArgs
    name: NodeAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      kind: NodeResourcesBalancedAllocationArgs
      resources:
      - name: cpu
        weight: 1
      - name: memory
        weight: 1
    name: NodeResourcesBalancedAllocation
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      kind: NodeResourcesFitArgs
      scoringStrategy:
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1
        type: LeastAllocated
    name: NodeResourcesFit
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      defaultingType: System
      kind: PodTopologySpreadArgs
    name: PodTopologySpread
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta2
      bindTimeoutSeconds: 600
      kind: VolumeBindingArgs
    name: VolumeBinding
  plugins:
    bind:
      enabled:
      - name: DefaultBinder
        weight: 0
    filter:
      enabled:
      - name: NodeUnschedulable
        weight: 0
      - name: NodeName
        weight: 0
      - name: TaintToleration
        weight: 0
      - name: NodeAffinity
        weight: 0
      - name: NodePorts
        weight: 0
      - name: NodeResourcesFit
        weight: 0
      - name: VolumeRestrictions
        weight: 0
      - name: EBSLimits
        weight: 0
      - name: GCEPDLimits
        weight: 0
      - name: NodeVolumeLimits
        weight: 0
      - name: AzureDiskLimits
        weight: 0
      - name: VolumeBinding
        weight: 0
      - name: VolumeZone
        weight: 0
      - name: PodTopologySpread
        weight: 0
      - name: InterPodAffinity
        weight: 0
    multiPoint: {}
    permit: {}
    postBind: {}
    postFilter:
      enabled:
      - name: DefaultPreemption
        weight: 0
    preBind:
      enabled:
      - name: VolumeBinding
        weight: 0
    preFilter:
      enabled:
      - name: NodeResourcesFit
        weight: 0
      - name: NodePorts
        weight: 0
      - name: VolumeRestrictions
        weight: 0
      - name: PodTopologySpread
        weight: 0
      - name: InterPodAffinity
        weight: 0
      - name: VolumeBinding
        weight: 0
      - name: NodeAffinity
        weight: 0
    preScore:
      enabled:
      - name: InterPodAffinity
        weight: 0
      - name: PodTopologySpread
        weight: 0
      - name: TaintToleration
        weight: 0
      - name: NodeAffinity
        weight: 0
    queueSort:
      enabled:
      - name: PrioritySort
        weight: 0
    reserve:
      enabled:
      - name: VolumeBinding
        weight: 0
    score:
      enabled:
      - name: NodeResourcesBalancedAllocation
        weight: 1
      - name: ImageLocality
        weight: 1
      - name: InterPodAffinity
        weight: 1
      - name: NodeResourcesFit
        weight: 1
      - name: NodeAffinity
        weight: 1
      - name: PodTopologySpread
        weight: 2
      - name: TaintToleration
        weight: 1
  schedulerName: default-scheduler
------------------------------------Configuration File Contents End Here---------------------------------

Verified with the build below and I could successfully enable the HighNodeUtilization profile; I did not see any kube-scheduler crash while enabling this profile.
[knarra@knarra ~]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-24-070025   True        False         5h24m   Cluster version is 4.10.0-0.nightly-2022-01-24-070025

-------------------------Configuration File Contents Start Here----------------------
apiVersion: kubescheduler.config.k8s.io/v1beta3
clientConnection:
  acceptContentTypes: ""
  burst: 100
  contentType: application/vnd.kubernetes.protobuf
  kubeconfig: /etc/kubernetes/static-pod-resources/configmaps/scheduler-kubeconfig/kubeconfig
  qps: 50
enableContentionProfiling: true
enableProfiling: true
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: true
  leaseDuration: 2m17s
  renewDeadline: 1m47s
  resourceLock: configmaps
  resourceName: kube-scheduler
  resourceNamespace: openshift-kube-scheduler
  retryPeriod: 26s
parallelism: 16
percentageOfNodesToScore: 0
podInitialBackoffSeconds: 1
podMaxBackoffSeconds: 10
profiles:
- pluginConfig:
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: DefaultPreemptionArgs
      minCandidateNodesAbsolute: 100
      minCandidateNodesPercentage: 10
    name: DefaultPreemption
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      hardPodAffinityWeight: 1
      kind: InterPodAffinityArgs
    name: InterPodAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeAffinityArgs
    name: NodeAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeResourcesBalancedAllocationArgs
      resources:
      - name: cpu
        weight: 1
      - name: memory
        weight: 1
    name: NodeResourcesBalancedAllocation
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeResourcesFitArgs
      scoringStrategy:
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1
        type: MostAllocated
    name: NodeResourcesFit
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      defaultingType: System
      kind: PodTopologySpreadArgs
    name: PodTopologySpread
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      bindTimeoutSeconds: 600
      kind: VolumeBindingArgs
    name: VolumeBinding
  plugins:
    bind: {}
    filter: {}
    multiPoint:
      enabled:
      - name: PrioritySort
        weight: 0
      - name: NodeUnschedulable
        weight: 0
      - name: NodeName
        weight: 0
      - name: TaintToleration
        weight: 3
      - name: NodeAffinity
        weight: 2
      - name: NodePorts
        weight: 0
      - name: NodeResourcesFit
        weight: 1
      - name: VolumeRestrictions
        weight: 0
      - name: EBSLimits
        weight: 0
      - name: GCEPDLimits
        weight: 0
      - name: NodeVolumeLimits
        weight: 0
      - name: AzureDiskLimits
        weight: 0
      - name: VolumeBinding
        weight: 0
      - name: VolumeZone
        weight: 0
      - name: PodTopologySpread
        weight: 2
      - name: InterPodAffinity
        weight: 2
      - name: DefaultPreemption
        weight: 0
      - name: NodeResourcesBalancedAllocation
        weight: 1
      - name: ImageLocality
        weight: 1
      - name: DefaultBinder
        weight: 0
    permit: {}
    postBind: {}
    postFilter: {}
    preBind: {}
    preFilter: {}
    preScore: {}
    queueSort: {}
    reserve: {}
    score:
      disabled:
      - name: NodeResourcesBalancedAllocation
        weight: 0
      enabled:
      - name: NodeResourcesFit
        weight: 5
  schedulerName: default-scheduler
------------------------------------Configuration File Contents End Here---------------------------------

Enabled LowNodeUtilization profile and no error seen with the same.
-------------------------Configuration File Contents Start Here----------------------
apiVersion: kubescheduler.config.k8s.io/v1beta3
clientConnection:
  acceptContentTypes: ""
  burst: 100
  contentType: application/vnd.kubernetes.protobuf
  kubeconfig: /etc/kubernetes/static-pod-resources/configmaps/scheduler-kubeconfig/kubeconfig
  qps: 50
enableContentionProfiling: true
enableProfiling: true
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: true
  leaseDuration: 2m17s
  renewDeadline: 1m47s
  resourceLock: configmaps
  resourceName: kube-scheduler
  resourceNamespace: openshift-kube-scheduler
  retryPeriod: 26s
parallelism: 16
percentageOfNodesToScore: 0
podInitialBackoffSeconds: 1
podMaxBackoffSeconds: 10
profiles:
- pluginConfig:
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: DefaultPreemptionArgs
      minCandidateNodesAbsolute: 100
      minCandidateNodesPercentage: 10
    name: DefaultPreemption
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      hardPodAffinityWeight: 1
      kind: InterPodAffinityArgs
    name: InterPodAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeAffinityArgs
    name: NodeAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeResourcesBalancedAllocationArgs
      resources:
      - name: cpu
        weight: 1
      - name: memory
        weight: 1
    name: NodeResourcesBalancedAllocation
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeResourcesFitArgs
      scoringStrategy:
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1
        type: LeastAllocated
    name: NodeResourcesFit
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      defaultingType: System
      kind: PodTopologySpreadArgs
    name: PodTopologySpread
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      bindTimeoutSeconds: 600
      kind: VolumeBindingArgs
    name: VolumeBinding
  plugins:
    bind: {}
    filter: {}
    multiPoint:
      enabled:
      - name: PrioritySort
        weight: 0
      - name: NodeUnschedulable
        weight: 0
      - name: NodeName
        weight: 0
      - name: TaintToleration
        weight: 3
      - name: NodeAffinity
        weight: 2
      - name: NodePorts
        weight: 0
      - name: NodeResourcesFit
        weight: 1
      - name: VolumeRestrictions
        weight: 0
      - name: EBSLimits
        weight: 0
      - name: GCEPDLimits
        weight: 0
      - name: NodeVolumeLimits
        weight: 0
      - name: AzureDiskLimits
        weight: 0
      - name: VolumeBinding
        weight: 0
      - name: VolumeZone
        weight: 0
      - name: PodTopologySpread
        weight: 2
      - name: InterPodAffinity
        weight: 2
      - name: DefaultPreemption
        weight: 0
      - name: NodeResourcesBalancedAllocation
        weight: 1
      - name: ImageLocality
        weight: 1
      - name: DefaultBinder
        weight: 0
    permit: {}
    postBind: {}
    postFilter: {}
    preBind: {}
    preFilter: {}
    preScore: {}
    queueSort: {}
    reserve: {}
    score: {}
  schedulerName: default-scheduler
------------------------------------Configuration File Contents End Here---------------------------------

Enabled "NoScoring" profile and do not see any issues with the same.

-------------------------Configuration File Contents Start Here----------------------
apiVersion: kubescheduler.config.k8s.io/v1beta3
clientConnection:
  acceptContentTypes: ""
  burst: 100
  contentType: application/vnd.kubernetes.protobuf
  kubeconfig: /etc/kubernetes/static-pod-resources/configmaps/scheduler-kubeconfig/kubeconfig
  qps: 50
enableContentionProfiling: true
enableProfiling: true
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: true
  leaseDuration: 2m17s
  renewDeadline: 1m47s
  resourceLock: configmaps
  resourceName: kube-scheduler
  resourceNamespace: openshift-kube-scheduler
  retryPeriod: 26s
parallelism: 16
percentageOfNodesToScore: 0
podInitialBackoffSeconds: 1
podMaxBackoffSeconds: 10
profiles:
- pluginConfig:
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: DefaultPreemptionArgs
      minCandidateNodesAbsolute: 100
      minCandidateNodesPercentage: 10
    name: DefaultPreemption
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      hardPodAffinityWeight: 1
      kind: InterPodAffinityArgs
    name: InterPodAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeAffinityArgs
    name: NodeAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeResourcesBalancedAllocationArgs
      resources:
      - name: cpu
        weight: 1
      - name: memory
        weight: 1
    name: NodeResourcesBalancedAllocation
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      kind: NodeResourcesFitArgs
      scoringStrategy:
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1
        type: LeastAllocated
    name: NodeResourcesFit
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      defaultingType: System
      kind: PodTopologySpreadArgs
    name: PodTopologySpread
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta3
      bindTimeoutSeconds: 600
      kind: VolumeBindingArgs
    name: VolumeBinding
  plugins:
    bind: {}
    filter: {}
    multiPoint:
      enabled:
      - name: PrioritySort
        weight: 0
      - name: NodeUnschedulable
        weight: 0
      - name: NodeName
        weight: 0
      - name: TaintToleration
        weight: 3
      - name: NodeAffinity
        weight: 2
      - name: NodePorts
        weight: 0
      - name: NodeResourcesFit
        weight: 1
      - name: VolumeRestrictions
        weight: 0
      - name: EBSLimits
        weight: 0
      - name: GCEPDLimits
        weight: 0
      - name: NodeVolumeLimits
        weight: 0
      - name: AzureDiskLimits
        weight: 0
      - name: VolumeBinding
        weight: 0
      - name: VolumeZone
        weight: 0
      - name: PodTopologySpread
        weight: 2
      - name: InterPodAffinity
        weight: 2
      - name: DefaultPreemption
        weight: 0
      - name: NodeResourcesBalancedAllocation
        weight: 1
      - name: ImageLocality
        weight: 1
      - name: DefaultBinder
        weight: 0
    permit: {}
    postBind: {}
    postFilter: {}
    preBind: {}
    preFilter: {}
    preScore:
      disabled:
      - name: '*'
        weight: 0
    queueSort: {}
    reserve: {}
    score:
      disabled:
      - name: '*'
        weight: 0
  schedulerName: default-scheduler
------------------------------------Configuration File Contents End Here---------------------------------

Based on the above moving bug to verified state.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056