Bug 1949972

Summary: Descheduler metrics: populate build info data and make the metrics entries more readable
Product: OpenShift Container Platform
Reporter: Jan Chaloupka <jchaloup>
Component: kube-scheduler
Assignee: Jan Chaloupka <jchaloup>
Status: CLOSED ERRATA
QA Contact: RamaKasturi <knarra>
Severity: medium
Priority: medium
Version: 4.8
CC: aos-bugs, mfojtik
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Doc Type: No Doc Update
Last Closed: 2021-07-27 23:01:13 UTC
Type: Bug

Description Jan Chaloupka 2021-04-15 14:01:40 UTC
While writing documentation for the Descheduler metrics in https://github.com/openshift/openshift-docs/pull/31583, a few bits turned out to be still missing or unclear:
- incomplete build info: https://github.com/openshift/descheduler/pull/59
- ambiguous namespace label: https://github.com/openshift/cluster-kube-descheduler-operator/pull/186

Comment 2 RamaKasturi 2021-04-19 07:23:51 UTC
No new operator yet.

Comment 3 RamaKasturi 2021-04-20 09:10:33 UTC
Hi Jan,

   I am trying to verify the bug but have one question.

1) Regarding descheduler_build_info: when I run `oc get csv` I see the version below, but Prometheus reports a different value for DeschedulerVersion. Is that expected? If so, what value is that label meant to show?

`oc get csv` from command line: clusterkubedescheduleroperator.4.8.0-202104190907.p0
descheduler_build_info from prometheus: descheduler_build_info{DeschedulerVersion="4.8.0-202104162013.p0-860ea3a", GitSha1="860ea3a2f44660ccfd7448ad415d7edfbaf07a7c", GoVersion="go1.16.1", endpoint="https", instance="10.129.2.88:10258", job="metrics", namespace="openshift-kube-descheduler-operator", pod="cluster-54bb5d8bbc-xjxdm", service="metrics"}

Comment 4 Jan Chaloupka 2021-04-20 11:14:39 UTC
CLI: 4.8.0-202104190907.p0
Prometheus: 4.8.0-202104162013.p0-860ea3a

CLI: the time the descheduler "operator" was built
Prometheus: the time the descheduler "operand" was built
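
The two version strings can be decoded to make the difference concrete. This is a minimal sketch, assuming the 12-digit field after the release number is a build timestamp in YYYYMMDDHHMM form and that an optional trailing field is a short git commit hash. The helper name `parse_build_version` is hypothetical and not part of any OpenShift tooling.

```python
import re
from datetime import datetime

# Assumed layout: <release>-<YYYYMMDDHHMM>.p<N>[-<short sha>]
VERSION_RE = re.compile(
    r"^(?P<release>\d+\.\d+\.\d+)-(?P<stamp>\d{12})\.p\d+(?:-(?P<sha>[0-9a-f]+))?$"
)

def parse_build_version(version):
    """Split a build version string into release, build time, and short sha."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    built = datetime.strptime(match.group("stamp"), "%Y%m%d%H%M")
    return match.group("release"), built, match.group("sha")

# Operator version as reported by `oc get csv`.
operator = parse_build_version("4.8.0-202104190907.p0")
# Operand version as reported by the DeschedulerVersion label in Prometheus.
operand = parse_build_version("4.8.0-202104162013.p0-860ea3a")

print(operator[1].isoformat(), operand[1].isoformat(), operand[2])
```

Under these assumptions the operator build is dated 2021-04-19 09:07 and the operand build 2021-04-16 20:13, so the two values are expected to differ whenever the images are built at different times.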

Comment 5 RamaKasturi 2021-04-20 16:32:54 UTC
Verified the bug with the payload below; descheduler_build_info and the pod_namespace label are both present in the Prometheus UI for the descheduler metrics.

[knarra@knarra openshift-client-linux-4.8.0-0.nightly-2021-04-18-203506]$ ./oc get csv
NAME                                                   DISPLAY                     VERSION                 REPLACES   PHASE
clusterkubedescheduleroperator.4.8.0-202104190907.p0   Kube Descheduler Operator   4.8.0-202104190907.p0              Succeeded

descheduler_build_info from prometheus: descheduler_build_info{DeschedulerVersion="4.8.0-202104162013.p0-860ea3a", GitSha1="860ea3a2f44660ccfd7448ad415d7edfbaf07a7c", GoVersion="go1.16.1", endpoint="https", instance="10.129.2.88:10258", job="metrics", namespace="openshift-kube-descheduler-operator", pod="cluster-54bb5d8bbc-xjxdm", service="metrics"}

The DeschedulerVersion shown here is the time the descheduler "operand" was built, as described in comment 4. The metric is meant to report information about the descheduler itself, not the descheduler operator.

descheduler_pods_evicted from prometheus: descheduler_pods_evicted{endpoint="https", exported_namespace="knarra", instance="10.129.2.77:10258", job="metrics", namespace="openshift-kube-descheduler-operator", pod="cluster-9977f7bb8-j2fkg", pod_namespace="knarra", result="success", service="metrics", strategy="RemoveDuplicatePods"}

The pod_namespace label is present alongside exported_namespace.

Based on the above info moving bug to verified state.
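
The checks above can also be scripted against samples copied as text from the Prometheus UI. This is a minimal sketch: `parse_labels` is a hypothetical helper that handles only simple label values without escaped quotes, not a full exposition-format parser, and the samples are abbreviated to the labels that matter here.

```python
import re

def parse_labels(sample):
    """Return (metric name, label dict) for a sample like name{k="v", ...}."""
    name, _, rest = sample.partition("{")
    body = rest.rsplit("}", 1)[0]
    # Only simple values are handled: no escaped quotes inside label values.
    labels = dict(re.findall(r'(\w+)="([^"]*)"', body))
    return name.strip(), labels

build_info = (
    'descheduler_build_info{DeschedulerVersion="4.8.0-202104162013.p0-860ea3a", '
    'GitSha1="860ea3a2f44660ccfd7448ad415d7edfbaf07a7c", GoVersion="go1.16.1", '
    'namespace="openshift-kube-descheduler-operator"}'
)
evicted = (
    'descheduler_pods_evicted{exported_namespace="knarra", '
    'namespace="openshift-kube-descheduler-operator", pod_namespace="knarra", '
    'result="success", strategy="RemoveDuplicatePods"}'
)

_, info = parse_labels(build_info)
_, evict = parse_labels(evicted)

# The short hash at the end of DeschedulerVersion should be a prefix of GitSha1.
short_sha = info["DeschedulerVersion"].rsplit("-", 1)[-1]
sha_matches = info["GitSha1"].startswith(short_sha)

# pod_namespace should be present alongside exported_namespace.
has_pod_namespace = "pod_namespace" in evict and "exported_namespace" in evict

print(sha_matches, has_pod_namespace)
```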

Comment 8 errata-xmlrpc 2021-07-27 23:01:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438