Bug 1820253
| Summary: | [Descheduler] RemovePodsViolatingNodeAffinity does not evict pods even when there are viable nodes which can fit them | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | RamaKasturi <knarra> | |
| Component: | kube-scheduler | Assignee: | Mike Dame <mdame> | |
| Status: | CLOSED ERRATA | QA Contact: | RamaKasturi <knarra> | |
| Severity: | high | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 4.4 | CC: | aos-bugs, maszulik, mdame, mfojtik | |
| Target Milestone: | --- | |||
| Target Release: | 4.5.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: |
Cause: In the NodeAffinity strategy, the descheduler exited early from the loop that checks nodes to determine whether a pod is evictable.
Consequence: Pods that were evictable (because they would fit on some other node) might not be evicted.
Fix: The break condition of the node-checking loop was changed.
Result: All nodes are considered when checking evictability.
|
Story Points: | --- | |
| Clone Of: | ||||
| : | 1821772 (view as bug list) | Environment: | ||
| Last Closed: | 2020-07-13 17:25:01 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1821772 | |||
Upstream PR which I believe will fix this: https://github.com/kubernetes-sigs/descheduler/pull/256

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409
Description of problem:
RemovePodsViolatingNodeAffinity does not evict pods even when there are viable nodes which can fit them.

Version-Release number of selected component (if applicable):
4.4.0-0.nightly-2020-04-01-080616

How reproducible:
Always

Steps to Reproduce:
1. Configure the descheduler operator on a cluster with 3 worker nodes and make sure that all of them are schedulable.
2. Apply the strategy "RemovePodsViolatingNodeAffinity".
3. Create pods with the command "oc run hello --image=openshift/hello-openshift:latest --replicas=2".
4. Edit the dc and apply the node affinity spec below:

       affinity:
         nodeAffinity:
           requiredDuringSchedulingIgnoredDuringExecution:
             nodeSelectorTerms:
             - matchExpressions:
               - key: e2e-az-NorthSouth
                 operator: In
                 values:
                 - e2e-az-North
                 - e2e-az-South

5. Label a node with "oc label node nodeA e2e-az-NorthSouth=e2e-az-North".
6. Make sure that the pod starts running on nodeA, where the label was added.
7. Remove the label from nodeA and add it to nodeB.

Actual results:
The pod does not get evicted and does not run on nodeB, but the descheduler log shows that "Pod does not fit on Node A":

    I0402 14:24:49.791129 1 node_affinity.go:41] Executing for nodeAffinityType: requiredDuringSchedulingIgnoredDuringExecution
    I0402 14:24:49.791158 1 node_affinity.go:46] Processing node: "ip-10-0-143-136.us-east-2.compute.internal"
    I0402 14:24:49.825872 1 node_affinity.go:46] Processing node: "ip-10-0-149-239.us-east-2.compute.internal"
    I0402 14:24:49.869452 1 node_affinity.go:46] Processing node: "ip-10-0-151-123.us-east-2.compute.internal"
    I0402 14:24:49.965682 1 node_affinity.go:46] Processing node: "ip-10-0-168-150.us-east-2.compute.internal"
    I0402 14:24:50.065238 1 node_affinity.go:46] Processing node: "ip-10-0-170-132.us-east-2.compute.internal"
    I0402 14:24:50.167936 1 node_affinity.go:46] Processing node: "ip-10-0-141-59.us-east-2.compute.internal"
    I0402 14:24:50.286970 1 node.go:158] Pod hello-2-lmcns does not fit on node ip-10-0-141-59.us-east-2.compute.internal
    I0402 14:24:50.287027 1 node.go:158] Pod hello-2-mj6st does not fit on node ip-10-0-141-59.us-east-2.compute.internal
    I0402 14:24:50.287069 1 node_affinity.go:73] Evicted 0 pods
    I0402 14:25:50.287284 1 node_affinity.go:41] Executing for nodeAffinityType: requiredDuringSchedulingIgnoredDuringExecution

Expected results:
Pods should get evicted and should be scheduled on nodeB.

Additional info:
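
To illustrate the cause and fix summarized in the Doc Text, here is a minimal Go sketch of a node-checking loop that gives up too early, next to one that considers every node. This is not the descheduler's actual code: the `Node` type and the functions `matches`, `fitsAnyNodeEarlyExit`, and `fitsAnyNode` are hypothetical names invented for this example, and the exact form of the early exit in the real code may differ. Only the label key and values are taken from the reproduction steps.

```go
package main

import "fmt"

// Node is a simplified, hypothetical stand-in for a cluster node.
type Node struct {
	Name          string
	Labels        map[string]string
	Unschedulable bool
}

// matches reports whether the node carries the label key with one of the
// allowed values, mimicking a single "In" match expression.
func matches(n Node, key string, values []string) bool {
	v, ok := n.Labels[key]
	if !ok {
		return false
	}
	for _, allowed := range values {
		if v == allowed {
			return true
		}
	}
	return false
}

// fitsAnyNodeEarlyExit shows the failure mode: the first node that does not
// match ends the search, so later viable nodes are never examined.
func fitsAnyNodeEarlyExit(nodes []Node, key string, values []string) bool {
	for _, n := range nodes {
		if !matches(n, key, values) {
			return false // assumed bug shape: should move on to the next node
		}
		if !n.Unschedulable {
			return true
		}
	}
	return false
}

// fitsAnyNode examines every node and reports false only after all of them
// have been ruled out.
func fitsAnyNode(nodes []Node, key string, values []string) bool {
	for _, n := range nodes {
		if matches(n, key, values) && !n.Unschedulable {
			return true
		}
	}
	return false
}

func main() {
	// State after step 7: the label has moved from nodeA to nodeB.
	nodes := []Node{
		{Name: "nodeA", Labels: map[string]string{}},
		{Name: "nodeB", Labels: map[string]string{"e2e-az-NorthSouth": "e2e-az-North"}},
	}
	key := "e2e-az-NorthSouth"
	values := []string{"e2e-az-North", "e2e-az-South"}

	fmt.Println(fitsAnyNodeEarlyExit(nodes, key, values)) // false: stops at nodeA
	fmt.Println(fitsAnyNode(nodes, key, values))          // true: finds nodeB
}
```

In this sketch, a pod is treated as evictable only when some node can host it, so the early-exit variant reports no viable node and nothing is evicted, which is consistent with the "Evicted 0 pods" line in the log above.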