Bug 1907936 - NTO is not reporting nto_profile_set_total metrics correctly after reboot
Summary: NTO is not reporting nto_profile_set_total metrics correctly after reboot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node Tuning Operator
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Jiří Mencák
QA Contact: Simon
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-12-15 14:42 UTC by Jiří Mencák
Modified: 2021-02-24 15:44 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:44:28 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-node-tuning-operator pull 189 | 0 | None | closed | Bug 1907936: Switch to nto_profile_calculated_total. | 2021-01-07 19:57:51 UTC
Red Hat Product Errata | RHSA-2020:5633 | 0 | None | None | None | 2021-02-24 15:44:50 UTC

Description Jiří Mencák 2020-12-15 14:42:40 UTC
Description of problem:
In OCP 4.7, NTO implemented a set of metrics that are reported by the operator. One of them, nto_profile_set_total, is not reported correctly after the node the operator runs on is rebooted.

Version-Release number of selected component (if applicable):
OCP 4.7

How reproducible:
Always

Steps to Reproduce:
1. Reboot a node NTO runs on.
2. oc project openshift-cluster-node-tuning-operator
3. oc rsh cluster-node-tuning-operator-<id>
4. sh-4.4$ curl --insecure https://localhost:60000/metrics

Actual results:
nto_profile_set_total is missing from the metrics output.

Expected results:
The nto_profile_set_total metric, or a similar one (nto_profile_calculated_total), is reported.
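
The check in step 4 can be scripted rather than read by eye. The following is a minimal sketch, assuming the body returned by the /metrics endpoint has already been saved as text (for example from the curl command above). The helper name reported_metrics is hypothetical and is not part of NTO.

```python
def reported_metrics(metrics_text, names):
    """Return the subset of `names` that has at least one sample line."""
    found = set()
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines, so a
        # metric that is only described but has no samples does not count.
        if not line or line.startswith("#"):
            continue
        # A sample line starts with the metric name, followed either by
        # "{" (labels) or by a space (no labels).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name in names:
            found.add(name)
    return found
```

Calling it with the set {"nto_profile_set_total", "nto_profile_calculated_total"} returns an empty set when neither metric has any samples, which is the symptom described in this report.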

Additional info:
https://github.com/openshift/cluster-node-tuning-operator/pull/189

Comment 3 Simon 2020-12-21 20:14:04 UTC
Cluster version: 4.7.0-0.nightly-2020-12-20-055006

$ oc project openshift-cluster-node-tuning-operator
Now using project "openshift-cluster-node-tuning-operator" on server "https://api.skordas1218a.qe.devcluster.openshift.com:6443".

$ oc get pods
NAME                                            READY   STATUS    RESTARTS   AGE
cluster-node-tuning-operator-7d89b84b6c-m5xz7   1/1     Running   0          4h1m
tuned-8kdkg                                     1/1     Running   0          4h19m
tuned-8lqsn                                     1/1     Running   0          4h19m
tuned-b6lm4                                     1/1     Running   0          4h24m
tuned-k9ms6                                     1/1     Running   0          4h24m
tuned-kzq5s                                     1/1     Running   0          4h24m
tuned-wjcsx                                     1/1     Running   0          4h19m

$ oc rsh cluster-node-tuning-operator-7d89b84b6c-m5xz7
sh-4.4$ curl --insecure https://localhost:60000/metrics
# HELP nto_build_info A metric with a constant '1' value labeled version from which Node Tuning Operator was built.
# TYPE nto_build_info gauge
nto_build_info{version="v4.7.0-202012190243.p0-0-g5c99b95-dirty"} 1
# HELP nto_degraded_info Indicates whether the Node Tuning Operator is degraded.
# TYPE nto_degraded_info gauge
nto_degraded_info 0
# HELP nto_pod_labels_used_info Is the Pod label functionality turned on (1) or off (0)?
# TYPE nto_pod_labels_used_info gauge
nto_pod_labels_used_info 0
# HELP nto_profile_calculated_total The number of times a Tuned profile was calculated for a given node.
# TYPE nto_profile_calculated_total counter
nto_profile_calculated_total{node="ip-10-0-129-18.us-east-2.compute.internal",profile="openshift-node"} 3
nto_profile_calculated_total{node="ip-10-0-147-65.us-east-2.compute.internal",profile="openshift-control-plane"} 3
nto_profile_calculated_total{node="ip-10-0-165-118.us-east-2.compute.internal",profile="openshift-control-plane"} 3
nto_profile_calculated_total{node="ip-10-0-167-70.us-east-2.compute.internal",profile="openshift-node"} 3
nto_profile_calculated_total{node="ip-10-0-203-237.us-east-2.compute.internal",profile="openshift-control-plane"} 3
nto_profile_calculated_total{node="ip-10-0-216-106.us-east-2.compute.internal",profile="openshift-node"} 3
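
Output like the above can also be checked per node. This is a rough sketch, assuming the simple label layout shown here (node and profile labels, with no escaped quotes or braces in label values). The helper name calculated_totals is hypothetical and not part of NTO.

```python
import re

# Matches one sample line of the counter: name, {labels}, value.
SAMPLE_RE = re.compile(
    r'^nto_profile_calculated_total\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
# Matches a single key="value" label pair.
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def calculated_totals(metrics_text):
    """Map (node, profile) -> counter value for nto_profile_calculated_total."""
    totals = {}
    for line in metrics_text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match:
            # Comment lines and other metrics are ignored.
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        key = (labels.get("node"), labels.get("profile"))
        totals[key] = float(match.group("value"))
    return totals
```

Comparing the node names in the returned keys against the cluster's node list would show whether any node is missing a counter after a reboot.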

Comment 5 errata-xmlrpc 2021-02-24 15:44:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

