Bug 1903186

Summary: [Descheduler] cluster logs should report some info when PodTopologySpreadConstraints strategy is enabled
Product: OpenShift Container Platform
Reporter: RamaKasturi <knarra>
Component: kube-scheduler
Assignee: Jan Chaloupka <jchaloup>
Status: CLOSED ERRATA
QA Contact: RamaKasturi <knarra>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.7
CC: aos-bugs, jchaloup, maszulik, mfojtik
Target Milestone: ---
Target Release: 4.7.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-02-24 15:37:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description RamaKasturi 2020-12-01 15:23:17 UTC
Description of problem:
Whenever a strategy is enabled in the descheduler, it is reported in the cluster logs so that the user knows what is enabled. Unlike the other strategies, PodTopologySpreadConstraints does not report any information when it is enabled.

Version-Release number of selected component (if applicable):
[knarra@knarra openshift-client-linux-4.7.0-0.nightly-2020-11-30-172451]$ ./oc get csv
NAME                                                   DISPLAY                     VERSION                 REPLACES   PHASE
clusterkubedescheduleroperator.4.7.0-202011261728.p0   Kube Descheduler Operator   4.7.0-202011261728.p0              Succeeded


How reproducible:
Always

Steps to Reproduce:
1. Enable the PodTopologySpreadConstraints strategy
2. Run oc logs -f <cluster_pod>

Actual results:
The cluster logs do not report any information indicating that the strategy is enabled.

Expected results:
The cluster logs should report some information so that the user knows which strategy is enabled.

Additional info:
This information is already reported for other strategies:
[knarra@knarra verification-tests]$ oc logs -f cluster-75986fc4cd-pc6v4 -n openshift-kube-descheduler-operator
E1126 15:19:37.026755       1 server.go:50] "failed to validate server configuration" err="unsupported log format: "
I1126 15:19:37.131627       1 node.go:46] "Node lister returned empty list, now fetch directly"
I1126 15:19:37.219802       1 pod_antiaffinity.go:72] "Processing node" node="ip-10-0-151-196.us-east-2.compute.internal"
I1126 15:19:37.424290       1 pod_antiaffinity.go:72] "Processing node" node="ip-10-0-155-5.us-east-2.compute.internal"
I1126 15:19:37.525074       1 pod_antiaffinity.go:72] "Processing node" node="ip-10-0-165-213.us-east-2.compute.internal"
I1126 15:19:37.723996       1 pod_antiaffinity.go:72] "Processing node" node="ip-10-0-181-80.us-east-2.compute.internal"
I1126 15:19:37.824916       1 pod_antiaffinity.go:72] "Processing node" node="ip-10-0-193-210.us-east-2.compute.internal"
I1126 15:19:38.020387       1 pod_antiaffinity.go:72] "Processing node" node="ip-10-0-212-237.us-east-2.compute.internal"

Comment 1 Jan Chaloupka 2020-12-01 23:18:14 UTC
Upstream PR merged: https://github.com/kubernetes-sigs/descheduler/pull/448

Comment 3 RamaKasturi 2020-12-04 07:21:52 UTC
Did not find the changes in the latest CSV, clusterkubedescheduleroperator.4.7.0-202012031911.p0. Will wait for the next operator respin before moving the bug to ASSIGNED.

Comment 4 RamaKasturi 2020-12-07 16:43:08 UTC
Verified the bug in the latest payload; the fix is present.

[knarra@knarra openshift-client-linux-4.7.0-0.nightly-2020-12-04-013308]$ ./oc get csv
NAME                                                   DISPLAY                            VERSION                 REPLACES   PHASE
clusterkubedescheduleroperator.4.7.0-202012050255.p0   Kube Descheduler Operator          4.7.0-202012050255.p0              Succeeded

[knarra@knarra openshift-client-linux-4.7.0-0.nightly-2020-12-04-013308]$ ./oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2020-12-04-013308   True        False         13h     Cluster version is 4.7.0-0.nightly-2020-12-04-013308

registry.redhat.io/openshift4/ose-descheduler@sha256:1da501059d77a6fa72e6d10b0b1a7a0cc50f2abdffa07daef742b77c889964ea
registry.redhat.io/openshift4/ose-cluster-kube-descheduler-operator@sha256:3585a22428dd6fb2cd3b363667b134e1374dd250a6bc381ff665003e9a303381



[knarra@knarra openshift-client-linux-4.7.0-0.nightly-2020-12-04-013308]$ ./oc logs -f cluster-847dc7fdb6-7pmxz -n openshift-kube-descheduler-operator
E1207 16:32:48.493237       1 server.go:50] "failed to validate server configuration" err="unsupported log format: "
I1207 16:32:48.699381       1 node.go:46] "Node lister returned empty list, now fetch directly"
I1207 16:32:48.890832       1 topologyspreadconstraint.go:109] "Processing namespaces for topology spread constraints"
I1207 16:32:49.002049       1 duplicates.go:83] "Processing node" node="ip-10-0-53-84.us-east-2.compute.internal"
I1207 16:32:49.031328       1 duplicates.go:83] "Processing node" node="ip-10-0-54-110.us-east-2.compute.internal"
I1207 16:32:49.111166       1 duplicates.go:83] "Processing node" node="ip-10-0-58-68.us-east-2.compute.internal"
I1207 16:32:49.138253       1 duplicates.go:83] "Processing node" node="ip-10-0-59-152.us-east-2.compute.internal"
I1207 16:32:49.328224       1 duplicates.go:83] "Processing node" node="ip-10-0-64-62.us-east-2.compute.internal"
I1207 16:32:49.558115       1 duplicates.go:83] "Processing node" node="ip-10-0-66-131.us-east-2.compute.internal"
I1207 16:33:49.726696       1 node.go:46] "Node lister returned empty list, now fetch directly"

Moving the bug to VERIFIED: when PodTopologySpreadConstraints (i.e. TopologyAndDuplicates) is enabled, the descheduler now reports it in the logs, as shown by the topologyspreadconstraint.go line above.
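
The manual check above can also be scripted. The following is a minimal sketch, not part of the descheduler or its operator: it assumes the klog text header seen in the logs in this report (severity letter, MMDD, time, thread id, file:line]) and collects the source files that logged at least once. The names KLOG_LINE and sources_seen are hypothetical, chosen only for illustration.

```python
import re

# Assumed klog header layout, taken from the log lines in this report:
#   <severity><MMDD> <HH:MM:SS.micros> <thread id> <file>.go:<line>] <message>
KLOG_LINE = re.compile(
    r"^[IWEF]\d{4} \d{2}:\d{2}:\d{2}\.\d+\s+\d+ (?P<file>[^\s:]+)\.go:\d+\] (?P<msg>.*)$"
)


def sources_seen(log_lines):
    """Return the set of Go source file names (without .go) that logged at least once."""
    seen = set()
    for line in log_lines:
        match = KLOG_LINE.match(line)
        if match:
            seen.add(match.group("file"))
    return seen


# Sample lines copied from the verification log in this report.
sample = [
    'I1207 16:32:48.699381       1 node.go:46] "Node lister returned empty list, now fetch directly"',
    'I1207 16:32:48.890832       1 topologyspreadconstraint.go:109] "Processing namespaces for topology spread constraints"',
    'I1207 16:32:49.002049       1 duplicates.go:83] "Processing node" node="ip-10-0-53-84.us-east-2.compute.internal"',
]

seen = sources_seen(sample)
print(sorted(seen))
print("topologyspreadconstraint" in seen)
```

To use it against a live cluster, the output of oc logs for the cluster pod could be saved to a file and its lines passed to sources_seen. Matching on the file name is only a heuristic; it says that the strategy's code logged something, not that the strategy evicted any pods.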

Comment 7 errata-xmlrpc 2021-02-24 15:37:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform
4.7.0 security, bug fix, and enhancement update), and where to find the
updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633