Description of problem:
When running the descheduler in non-dry-run mode, no pods are evicted because max-pods-to-evict-per-node is zero.

Version-Release number of selected component (if applicable):
openshift3/ose-descheduler:v3.10.0-0.16.0

How reproducible:
Always

Steps to Reproduce:
1. Set up OpenShift with the descheduler in non-dry-run mode
2. Create some pods for the descheduler to consume:
   oc run hello --image=openshift/hello-openshift:latest --replicas=10
3. Check whether any pods are evicted and inspect the descheduler pod log

Actual results:
3. No pods are evicted

Expected results:
3. At least one pod should be evicted

Additional info:
Cloned from upstream issue https://github.com/kubernetes-incubator/descheduler/issues/86 Fix PR: https://github.com/kubernetes-incubator/descheduler/pull/87
openshift/descheduler PR: https://github.com/openshift/descheduler/pull/2
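The upstream fix concerns how the max-pods-to-evict-per-node flag's zero value is interpreted when counting evictions per node. As a hedged illustration only (the names below are hypothetical, not the descheduler's actual API), the guard presumably needs to treat zero as "no limit" rather than "evict nothing":

```go
// Illustrative sketch of a per-node eviction limit where maxPodsToEvict == 0
// means "unlimited" rather than "block all evictions" (the reported bug).
// Names are hypothetical, not taken from the descheduler codebase.
package main

import "fmt"

type podEvictor struct {
	maxPodsToEvict int            // 0 means no limit
	evicted        map[string]int // evictions so far, keyed by node name
}

// canEvict reports whether another pod may be evicted from the given node.
func (e *podEvictor) canEvict(node string) bool {
	if e.maxPodsToEvict == 0 {
		return true // zero limit interpreted as unlimited
	}
	return e.evicted[node] < e.maxPodsToEvict
}

// evict records an eviction if the per-node limit allows it.
func (e *podEvictor) evict(node, pod string) bool {
	if !e.canEvict(node) {
		return false
	}
	e.evicted[node]++
	return true
}

func main() {
	e := &podEvictor{maxPodsToEvict: 0, evicted: map[string]int{}}
	fmt.Println(e.evict("node-1", "hello-1-abc")) // true: zero limit no longer blocks eviction
}
```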
Checked on:

# oc version
oc v3.10.0-0.46.0
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://ip-172-18-14-127.ec2.internal:8443
openshift v3.10.0-0.46.0
kubernetes v1.10.0+b81c8f8

The issue cannot be reproduced.

CronJob spec info:

spec:
  containers:
  - args:
    - --policy-config-file=/policy-dir/policy.yaml
    - --v=5
    - --dry-run

# oc logs -f descheduler-cronjob-1526460000-6zmxk -n openshift-descheduler
I0516 08:40:09.248704 1 reflector.go:202] Starting reflector *v1.Node (1h0m0s) from github.com/kubernetes-incubator/descheduler/pkg/descheduler/node/node.go:84
I0516 08:40:09.248833 1 reflector.go:240] Listing and watching *v1.Node from github.com/kubernetes-incubator/descheduler/pkg/descheduler/node/node.go:84
I0516 08:40:09.348921 1 duplicates.go:50] Processing node: "ip-172-18-0-241.ec2.internal"
I0516 08:40:09.400252 1 duplicates.go:54] "ReplicationController/hello-1"
I0516 08:40:09.400275 1 duplicates.go:65] Evicted pod: "hello-1-7h7xn" (<nil>)
I0516 08:40:09.400281 1 duplicates.go:65] Evicted pod: "hello-1-fxz7m" (<nil>)
I0516 08:40:09.400286 1 duplicates.go:65] Evicted pod: "hello-1-gp6nr" (<nil>)
I0516 08:40:09.400290 1 duplicates.go:65] Evicted pod: "hello-1-k7wzk" (<nil>)
I0516 08:40:09.400295 1 duplicates.go:65] Evicted pod: "hello-1-ls5zr" (<nil>)
I0516 08:40:09.400299 1 duplicates.go:65] Evicted pod: "hello-1-r78rr" (<nil>)
I0516 08:40:09.400304 1 duplicates.go:50] Processing node: "ip-172-18-14-127.ec2.internal"
I0516 08:40:09.407324 1 duplicates.go:54] "ReplicationController/hello-1"
I0516 08:40:09.407346 1 duplicates.go:65] Evicted pod: "hello-1-ds57x" (<nil>)
I0516 08:40:09.407352 1 duplicates.go:65] Evicted pod: "hello-1-nt9dm" (<nil>)
I0516 08:40:09.407356 1 duplicates.go:65] Evicted pod: "hello-1-rndqq" (<nil>)
I0516 08:40:09.407361 1 duplicates.go:65] Evicted pod: "hello-1-zchpj" (<nil>)
I0516 08:40:09.407365 1 duplicates.go:65] Evicted pod: "hello-1-zm8dp" (<nil>)
I0516 08:40:09.407370 1 duplicates.go:50] Processing node: "ip-172-18-5-71.ec2.internal"
I0516 08:40:09.413700 1 duplicates.go:54] "ReplicationController/hello-1"
I0516 08:40:09.413721 1 duplicates.go:65] Evicted pod: "hello-1-b5pzk" (<nil>)
I0516 08:40:09.413730 1 duplicates.go:65] Evicted pod: "hello-1-jwtwt" (<nil>)
I0516 08:40:09.413736 1 duplicates.go:65] Evicted pod: "hello-1-r9s75" (<nil>)
I0516 08:40:09.413743 1 duplicates.go:65] Evicted pod: "hello-1-tlkp6" (<nil>)
I0516 08:40:09.413749 1 duplicates.go:65] Evicted pod: "hello-1-vhzzf" (<nil>)
I0516 08:40:09.413774 1 duplicates.go:65] Evicted pod: "hello-1-wgswg" (<nil>)
I0516 08:40:09.432356 1 lownodeutilization.go:141] Node "ip-172-18-14-127.ec2.internal" is under utilized with usage: api.ResourceThresholds{"cpu":7.5, "memory":3.8011597496777925, "pods":6.4}
I0516 08:40:09.432400 1 lownodeutilization.go:149] allPods:16, nonRemovablePods:9, bePods:6, bPods:1, gPods:0
I0516 08:40:09.432442 1 lownodeutilization.go:141] Node "ip-172-18-5-71.ec2.internal" is under utilized with usage: api.ResourceThresholds{"memory":9.654945764181592, "pods":5.2, "cpu":10}
I0516 08:40:09.432463 1 lownodeutilization.go:149] allPods:13, nonRemovablePods:5, bePods:7, bPods:1, gPods:0
I0516 08:40:09.432504 1 lownodeutilization.go:141] Node "ip-172-18-0-241.ec2.internal" is under utilized with usage: api.ResourceThresholds{"memory":8.033117604319068, "pods":6, "cpu":7.5}
I0516 08:40:09.432520 1 lownodeutilization.go:149] allPods:15, nonRemovablePods:3, bePods:10, bPods:2, gPods:0
I0516 08:40:09.432525 1 lownodeutilization.go:65] Criteria for a node under utilization: CPU: 40, Mem: 40, Pods: 40
I0516 08:40:09.432532 1 lownodeutilization.go:72] Total number of underutilized nodes: 3
I0516 08:40:09.432537 1 lownodeutilization.go:80] all nodes are underutilized, nothing to do here
I0516 08:40:09.432543 1 pod_antiaffinity.go:45] Processing node: "ip-172-18-0-241.ec2.internal"
I0516 08:40:09.438660 1 pod_antiaffinity.go:45] Processing node: "ip-172-18-14-127.ec2.internal"
I0516 08:40:09.445160 1 pod_antiaffinity.go:45] Processing node: "ip-172-18-5-71.ec2.internal"
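The LowNodeUtilization lines in the log explain why that strategy did nothing during verification: every node's cpu, memory, and pods usage was below the configured 40/40/40 thresholds, so all three nodes counted as underutilized and there was no overutilized node to drain. A minimal Go sketch of that kind of threshold check, using the usage figures from the log (illustrative names, not the descheduler's actual code):

```go
// Sketch of an "is this node underutilized?" check: a node qualifies when
// its usage percentage is at or below the threshold for every tracked
// resource. Names are illustrative, not taken from the descheduler codebase.
package main

import "fmt"

type resourceThresholds map[string]float64

// isNodeUnderutilized reports whether every resource's usage is within
// its threshold; exceeding any single threshold disqualifies the node.
func isNodeUnderutilized(usage, thresholds resourceThresholds) bool {
	for resource, limit := range thresholds {
		if usage[resource] > limit {
			return false
		}
	}
	return true
}

func main() {
	thresholds := resourceThresholds{"cpu": 40, "memory": 40, "pods": 40}
	// Usage taken from the log line for ip-172-18-14-127.ec2.internal.
	usage := resourceThresholds{"cpu": 7.5, "memory": 3.8, "pods": 6.4}
	fmt.Println(isNodeUnderutilized(usage, thresholds)) // true
}
```

When this check returns true for every node, as in the verification run above, the strategy logs "all nodes are underutilized, nothing to do here" and evicts nothing.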
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816