Description of problem:
The descheduler evicts a pod that has neither an ownerRef nor the descheduler.alpha.kubernetes.io/evict annotation set.

Version-Release number of selected component (if applicable):
[knarra@knarra openshift-client-linux-4.7.0-0.nightly-2021-01-10-070949]$ ./oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-fc.2   True        False         9h      Cluster version is 4.7.0-fc.2

How reproducible:
Always

Steps to Reproduce:
1. Install the descheduler on a 4.7 cluster.
2. Label the worker nodes using the commands below:
   oc label node1 knarra-zone=zoneA
   oc label node2 knarra-zone=zoneB
   oc label node3 knarra-zone=zonec
3. Cordon all worker nodes except the first one.
4. Create a pod using the yaml file below:

[knarra@knarra openshift-client-linux-4.7.0-0.nightly-2021-01-10-070949]$ cat /tmp/constrained-pod.yaml
kind: Pod
apiVersion: v1
metadata:
  name: mypod-constrained
  labels:
    foo: bar
spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: knarra-zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        foo: bar
  containers:
  - name: pause
    image: quay.io/openshifttest/hello-openshift@sha256:aaea76ff622d2f8bcb32e538e7b3cd0ef6d291953f3e7c9f556c1ba5baf47e2e

5. Create two more pods on node1 using the yaml file below:

[knarra@knarra openshift-client-linux-4.7.0-0.nightly-2021-01-10-070949]$ cat /tmp/demo-pod.yaml
kind: Pod
apiVersion: v1
metadata:
  generateName: mypod
  labels:
    foo: bar
spec:
  containers:
  - name: pause
    image: quay.io/openshifttest/hello-openshift@sha256:aaea76ff622d2f8bcb32e538e7b3cd0ef6d291953f3e7c9f556c1ba5baf47e2e

6. Cordon node1 and uncordon node2.
7. Create a pod on node2 using the yaml file from step 5.
8. Edit the kubedescheduler cluster resource and change IntervalSeconds to 60.

Actual results:
The pod with the topology spread constraint set is evicted:

[knarra@knarra openshift-client-linux-4.7.0-0.nightly-2021-01-10-070949]$ ./oc logs -f cluster-65d59dc468-2hbhk -n openshift-kube-descheduler-operator
I0112 14:15:15.486932       1 node.go:46] "Node lister returned empty list, now fetch directly"
I0112 14:15:15.579853       1 topologyspreadconstraint.go:109] "Processing namespaces for topology spread constraints"
I0112 14:15:15.771832       1 evictions.go:117] "Evicted pod" pod="default/mypod-constrained" reason=" (PodTopologySpread)"

Expected results:
The pods created have neither an ownerRef nor the eviction annotation, so they should not be evicted.

Additional info:
Upstream PR to fix this: https://github.com/kubernetes-sigs/descheduler/pull/484
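The gist of the problem: the PodTopologySpread strategy should only consider a bare pod (one without an ownerRef) for eviction when it explicitly opts in through the descheduler.alpha.kubernetes.io/evict annotation. Below is a minimal standalone sketch of that guard using plain Kubernetes API types; the function name and wiring are illustrative only and are not taken from the descheduler code or the linked PR.

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// evictAnnotationKey is the opt-in annotation quoted in this report.
const evictAnnotationKey = "descheduler.alpha.kubernetes.io/evict"

// isEvictionCandidate is a hypothetical helper: a pod qualifies for
// eviction only if it is managed by a controller (has an ownerRef) or
// explicitly opts in through the annotation.
func isEvictionCandidate(pod *corev1.Pod) bool {
	if len(pod.OwnerReferences) > 0 {
		return true
	}
	_, optedIn := pod.Annotations[evictAnnotationKey]
	return optedIn
}

func main() {
	// Bare pod, like mypod-constrained from the reproduction steps:
	// no ownerRef, no annotation -> must not be evicted.
	bare := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "mypod-constrained"},
	}
	fmt.Println(isEvictionCandidate(bare)) // false

	// Same pod with the opt-in annotation -> may be evicted.
	bare.Annotations = map[string]string{evictAnnotationKey: ""}
	fmt.Println(isEvictionCandidate(bare)) // true
}

In manifest terms, opting in simply means adding the annotation under metadata.annotations on the pod, which is what the verification below exercises.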
Verified the bug with the payload below. When the "descheduler.alpha.kubernetes.io/evict": "" annotation is not present, the pods do not get evicted; when the annotation is present, the pods get evicted to restore the topology spread constraint.

[knarra@knarra openshift-misc]$ oc get csv -n openshift-kube-descheduler-operator
NAME                                                    DISPLAY                     VERSION                 REPLACES   PHASE
clusterkubedescheduleroperator.4.7.0-202101160343.p0    Kube Descheduler Operator   4.7.0-202101160343.p0              Succeeded

[knarra@knarra openshift-misc]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-01-18-000316   True        False         5h19m   Cluster version is 4.7.0-0.nightly-2021-01-18-000316

Descheduler log when the descheduler annotation is not present on the pods:
===========================================================================
[knarra@knarra verification-tests]$ oc logs -f cluster-68fd5d5976-xllt2 -n openshift-kube-descheduler-operator
I0119 12:16:07.974680       1 node.go:46] "Node lister returned empty list, now fetch directly"
I0119 12:16:08.169078       1 topologyspreadconstraint.go:109] "Processing namespaces for topology spread constraints"
I0119 12:16:08.376539       1 duplicates.go:94] "Processing node" node="compute-0"
I0119 12:16:08.457438       1 duplicates.go:94] "Processing node" node="compute-1"
I0119 12:16:08.482481       1 duplicates.go:94] "Processing node" node="control-plane-0"
I0119 12:16:08.509079       1 duplicates.go:94] "Processing node" node="control-plane-1"
I0119 12:16:08.532517       1 duplicates.go:94] "Processing node" node="control-plane-2"
I0119 12:17:08.557448       1 node.go:46] "Node lister returned empty list, now fetch directly"
I0119 12:17:08.669790       1 topologyspreadconstraint.go:109] "Processing namespaces for topology spread constraints"
I0119 12:17:08.770414       1 duplicates.go:94] "Processing node" node="compute-0"
I0119 12:17:08.794727       1 duplicates.go:94] "Processing node" node="compute-1"
I0119 12:17:08.857392       1 duplicates.go:94] "Processing node" node="control-plane-0"
I0119 12:17:08.881861       1 duplicates.go:94] "Processing node" node="control-plane-1"
I0119 12:17:08.903355       1 duplicates.go:94] "Processing node" node="control-plane-2"

Descheduler log when the descheduler annotation is present on the pod:
======================================================================
[knarra@knarra verification-tests]$ oc logs -f cluster-55b679cd94-bf65m -n openshift-kube-descheduler-operator
I0119 12:19:47.873990       1 node.go:46] "Node lister returned empty list, now fetch directly"
I0119 12:19:48.059908       1 topologyspreadconstraint.go:109] "Processing namespaces for topology spread constraints"
I0119 12:19:48.297513       1 evictions.go:117] "Evicted pod" pod="knarra/mypodsv4sp" reason=" (PodTopologySpread)"
I0119 12:19:48.297657       1 duplicates.go:94] "Processing node" node="compute-0"
I0119 12:19:48.324803       1 duplicates.go:94] "Processing node" node="compute-1"
I0119 12:19:48.347550       1 duplicates.go:94] "Processing node" node="control-plane-0"
I0119 12:19:48.381212       1 duplicates.go:94] "Processing node" node="control-plane-1"
I0119 12:19:48.457461       1 duplicates.go:94] "Processing node" node="control-plane-2"
I0119 12:20:48.497487       1 node.go:46] "Node lister returned empty list, now fetch directly"
I0119 12:20:48.507606       1 duplicates.go:94] "Processing node" node="compute-0"
I0119 12:20:48.527653       1 duplicates.go:94] "Processing node" node="compute-1"
I0119 12:20:48.548893       1 duplicates.go:94] "Processing node" node="control-plane-0"
I0119 12:20:48.567381       1 duplicates.go:94] "Processing node" node="control-plane-1"
I0119 12:20:48.589618       1 duplicates.go:94] "Processing node" node="control-plane-2"
I0119 12:20:48.659071       1 topologyspreadconstraint.go:109] "Processing namespaces for topology spread constraints"
I0119 12:20:48.668271       1 topologyspreadconstraint.go:183] "Skipping topology constraint because it is already balanced" constraint={MaxSkew:1 TopologyKey:knarra-zone WhenUnsatisfiable:DoNotSchedule LabelSelector:&LabelSelector{MatchLabels:map[string]string{foo: bar,},MatchExpressions:[]LabelSelectorRequirement{},}}
I0119 12:21:48.902531       1 node.go:46] "Node lister returned empty list, now fetch directly"
I0119 12:21:48.958797       1 topologyspreadconstraint.go:109] "Processing namespaces for topology spread constraints"
I0119 12:21:48.968011       1 topologyspreadconstraint.go:183] "Skipping topology constraint because it is already balanced" constraint={MaxSkew:1 TopologyKey:knarra-zone WhenUnsatisfiable:DoNotSchedule LabelSelector:&LabelSelector{MatchLabels:map[string]string{foo: bar,},MatchExpressions:[]LabelSelectorRequirement{},}}
I0119 12:21:48.980690       1 duplicates.go:94] "Processing node" node="compute-0"
I0119 12:21:49.007296       1 duplicates.go:94] "Processing node" node="compute-1"
I0119 12:21:49.030910       1 duplicates.go:94] "Processing node" node="control-plane-0"
I0119 12:21:49.083342       1 duplicates.go:94] "Processing node" node="control-plane-1"
I0119 12:21:49.129958       1 duplicates.go:94] "Processing node" node="control-plane-2"

Based on the above, moving the bug to the verified state.
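Note for anyone re-running this verification: the only difference between the two scenarios above is the opt-in annotation on the otherwise bare pod. A hedged sketch of creating such a pod programmatically with client-go follows; the kubeconfig path is hypothetical, and the same effect can be had by simply adding the annotation under metadata.annotations in the demo pod yaml from the reproduction steps before creating it.

package main

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Hypothetical kubeconfig path; adjust for the cluster under test.
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			GenerateName: "mypod",
			Namespace:    "knarra", // namespace seen in the verification logs
			Labels:       map[string]string{"foo": "bar"},
			// The opt-in that makes a bare pod eligible for eviction.
			Annotations: map[string]string{
				"descheduler.alpha.kubernetes.io/evict": "",
			},
		},
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{{
				Name:  "pause",
				Image: "quay.io/openshifttest/hello-openshift@sha256:aaea76ff622d2f8bcb32e538e7b3cd0ef6d291953f3e7c9f556c1ba5baf47e2e",
			}},
		},
	}

	if _, err := client.CoreV1().Pods(pod.Namespace).Create(context.TODO(), pod, metav1.CreateOptions{}); err != nil {
		panic(err)
	}
}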
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633