Bug 1566920 - [Pod_public_851][descheduler] max-pods-to-evict-per-node should not be 0 by default
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.10.0
Assignee: Avesh Agarwal
QA Contact: weiwei jiang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-04-13 07:32 UTC by weiwei jiang
Modified: 2018-07-30 19:13 UTC (History)
CC List: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-07-30 19:12:49 UTC
Target Upstream Version:
Embargoed:




Links
System ID                               Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata RHBA-2018:1816   0        None      None    None     2018-07-30 19:13:33 UTC

Description weiwei jiang 2018-04-13 07:32:55 UTC
Description of problem:
When running the descheduler in non-dry-run mode, no pods are evicted because max-pods-to-evict-per-node defaults to zero

Version-Release number of selected component (if applicable):
openshift3/ose-descheduler:v3.10.0-0.16.0

How reproducible:
always

Steps to Reproduce:
1. Set up OpenShift with the descheduler in non-dry-run mode (see the policy sketch after this list)
2. Create some pods for the descheduler to work on:
oc run hello --image=openshift/hello-openshift:latest --replicas=10
3. Check whether any pods are evicted and inspect the descheduler pod log
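
A minimal descheduler policy sketch for step 1, enabling the same strategies that appear in the QA log further down (RemoveDuplicates, LowNodeUtilization, RemovePodsViolatingInterPodAntiAffinity); the threshold values are illustrative assumptions, not taken from the affected cluster:

# policy.yaml - mounted into the descheduler pod and passed via --policy-config-file
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "RemoveDuplicates":
    enabled: true
  "RemovePodsViolatingInterPodAntiAffinity":
    enabled: true
  "LowNodeUtilization":
    enabled: true
    params:
      nodeResourceUtilizationThresholds:
        # nodes below all of these percentages count as underutilized (illustrative values)
        thresholds:
          "cpu": 40
          "memory": 40
          "pods": 40
        # nodes above any of these percentages are candidates to evict from (illustrative values)
        targetThresholds:
          "cpu": 60
          "memory": 60
          "pods": 60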

Actual results:
3. No pods are evicted

Expected results:
3. At least one pod should be evicted

Additional info:
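As a workaround until the default changes, the per-node eviction limit can be set explicitly in the descheduler container args (e.g. in the CronJob spec). A minimal sketch, assuming the policy path shown in the QA comment below; the limit of 10 is an illustrative value:

        spec:
          containers:
          - args:
            - --policy-config-file=/policy-dir/policy.yaml
            - --max-pods-to-evict-per-node=10  # explicit limit; the default of 0 evicts nothing
            - --v=5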

Comment 2 Seth Jennings 2018-04-26 02:12:50 UTC
openshift/descheduler PR:
https://github.com/openshift/descheduler/pull/2

Comment 4 weiwei jiang 2018-05-16 08:44:46 UTC
Checked on 
# oc version 
oc v3.10.0-0.46.0
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-14-127.ec2.internal:8443
openshift v3.10.0-0.46.0
kubernetes v1.10.0+b81c8f8

The issue can no longer be reproduced.

CronJob spec info:
        spec:
          containers:
          - args:
            - --policy-config-file=/policy-dir/policy.yaml
            - --v=5
            - --dry-run
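
Since the spec above still includes --dry-run, the "Evicted pod" lines in the log below are only reported, not actual API evictions. A rough way to confirm real evictions in a non-dry-run check (assuming the hello pods from the reproduction steps) is to compare pod names across a descheduler run:

# before the next descheduler CronJob run
oc get pods -o name > /tmp/pods-before.txt
# ... wait for the job to fire, then ...
oc get pods -o name > /tmp/pods-after.txt
# pods evicted by the descheduler are recreated by their RC under new names
diff /tmp/pods-before.txt /tmp/pods-after.txt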


# oc logs -f descheduler-cronjob-1526460000-6zmxk -n openshift-descheduler 
I0516 08:40:09.248704       1 reflector.go:202] Starting reflector *v1.Node (1h0m0s) from github.com/kubernetes-incubator/descheduler/pkg/descheduler/node/node.go:84
I0516 08:40:09.248833       1 reflector.go:240] Listing and watching *v1.Node from github.com/kubernetes-incubator/descheduler/pkg/descheduler/node/node.go:84
I0516 08:40:09.348921       1 duplicates.go:50] Processing node: "ip-172-18-0-241.ec2.internal"
I0516 08:40:09.400252       1 duplicates.go:54] "ReplicationController/hello-1"
I0516 08:40:09.400275       1 duplicates.go:65] Evicted pod: "hello-1-7h7xn" (<nil>)
I0516 08:40:09.400281       1 duplicates.go:65] Evicted pod: "hello-1-fxz7m" (<nil>)
I0516 08:40:09.400286       1 duplicates.go:65] Evicted pod: "hello-1-gp6nr" (<nil>)
I0516 08:40:09.400290       1 duplicates.go:65] Evicted pod: "hello-1-k7wzk" (<nil>)
I0516 08:40:09.400295       1 duplicates.go:65] Evicted pod: "hello-1-ls5zr" (<nil>)
I0516 08:40:09.400299       1 duplicates.go:65] Evicted pod: "hello-1-r78rr" (<nil>)
I0516 08:40:09.400304       1 duplicates.go:50] Processing node: "ip-172-18-14-127.ec2.internal"
I0516 08:40:09.407324       1 duplicates.go:54] "ReplicationController/hello-1"
I0516 08:40:09.407346       1 duplicates.go:65] Evicted pod: "hello-1-ds57x" (<nil>)
I0516 08:40:09.407352       1 duplicates.go:65] Evicted pod: "hello-1-nt9dm" (<nil>)
I0516 08:40:09.407356       1 duplicates.go:65] Evicted pod: "hello-1-rndqq" (<nil>)
I0516 08:40:09.407361       1 duplicates.go:65] Evicted pod: "hello-1-zchpj" (<nil>)
I0516 08:40:09.407365       1 duplicates.go:65] Evicted pod: "hello-1-zm8dp" (<nil>)
I0516 08:40:09.407370       1 duplicates.go:50] Processing node: "ip-172-18-5-71.ec2.internal"
I0516 08:40:09.413700       1 duplicates.go:54] "ReplicationController/hello-1"
I0516 08:40:09.413721       1 duplicates.go:65] Evicted pod: "hello-1-b5pzk" (<nil>)
I0516 08:40:09.413730       1 duplicates.go:65] Evicted pod: "hello-1-jwtwt" (<nil>)
I0516 08:40:09.413736       1 duplicates.go:65] Evicted pod: "hello-1-r9s75" (<nil>)
I0516 08:40:09.413743       1 duplicates.go:65] Evicted pod: "hello-1-tlkp6" (<nil>)
I0516 08:40:09.413749       1 duplicates.go:65] Evicted pod: "hello-1-vhzzf" (<nil>)
I0516 08:40:09.413774       1 duplicates.go:65] Evicted pod: "hello-1-wgswg" (<nil>)
I0516 08:40:09.432356       1 lownodeutilization.go:141] Node "ip-172-18-14-127.ec2.internal" is under utilized with usage: api.ResourceThresholds{"cpu":7.5, "memory":3.8011597496777925, "pods":6.4}
I0516 08:40:09.432400       1 lownodeutilization.go:149] allPods:16, nonRemovablePods:9, bePods:6, bPods:1, gPods:0
I0516 08:40:09.432442       1 lownodeutilization.go:141] Node "ip-172-18-5-71.ec2.internal" is under utilized with usage: api.ResourceThresholds{"memory":9.654945764181592, "pods":5.2, "cpu":10}
I0516 08:40:09.432463       1 lownodeutilization.go:149] allPods:13, nonRemovablePods:5, bePods:7, bPods:1, gPods:0
I0516 08:40:09.432504       1 lownodeutilization.go:141] Node "ip-172-18-0-241.ec2.internal" is under utilized with usage: api.ResourceThresholds{"memory":8.033117604319068, "pods":6, "cpu":7.5}
I0516 08:40:09.432520       1 lownodeutilization.go:149] allPods:15, nonRemovablePods:3, bePods:10, bPods:2, gPods:0
I0516 08:40:09.432525       1 lownodeutilization.go:65] Criteria for a node under utilization: CPU: 40, Mem: 40, Pods: 40
I0516 08:40:09.432532       1 lownodeutilization.go:72] Total number of underutilized nodes: 3
I0516 08:40:09.432537       1 lownodeutilization.go:80] all nodes are underutilized, nothing to do here
I0516 08:40:09.432543       1 pod_antiaffinity.go:45] Processing node: "ip-172-18-0-241.ec2.internal"
I0516 08:40:09.438660       1 pod_antiaffinity.go:45] Processing node: "ip-172-18-14-127.ec2.internal"
I0516 08:40:09.445160       1 pod_antiaffinity.go:45] Processing node: "ip-172-18-5-71.ec2.internal"

Comment 6 errata-xmlrpc 2018-07-30 19:12:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816

