Bug 2214033

Summary: [UI] Resources utilization in ODF Topology does not match metrics
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Daniel Osypenko <dosypenk>
Component: management-console    Assignee: Bipul Adhikari <badhikar>
Status: VERIFIED --- QA Contact: Daniel Osypenko <dosypenk>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.13    CC: ebenahar, odf-bz-bot, skatiyar, tdesala
Target Milestone: ---   
Target Release: ODF 4.14.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 4.14.0-110 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Embargoed:

Description Daniel Osypenko 2023-06-11 08:08:53 UTC
Created attachment 1970282 [details]
odf-console-resources-mem-cpu-utilization-topology

Description of problem (please be as detailed as possible and provide log
snippets):

When opening ODF Topology and navigating to any deployment of any node, the Resources tab shows utilization values that do not match the ones shown in the full Metrics view.
If the numbers are correct, we may need to rename Resources to something more specific; otherwise, we need to make them match the Metrics data (see attachments).

Version of all relevant components (if applicable):

OC version:
Client Version: 4.12.0-202208031327
Kustomize Version: v4.5.4
Server Version: 4.13.0-0.nightly-2023-06-03-031200
Kubernetes Version: v1.26.5+7a891f0

OCS version:
ocs-operator.v4.13.0-207.stable              OpenShift Container Storage   4.13.0-207.stable              Succeeded

Cluster version
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.13.0-0.nightly-2023-06-03-031200   True        False         3d      Cluster version is 4.13.0-0.nightly-2023-06-03-031200

Rook version:
rook: v4.13.0-0.e5648f0a2577b9bfd2aa256d4853dc3e8d94862a
go: go1.19.6

Ceph version:
ceph version 17.2.6-50.el9cp (c202ddb5589554af0ce43432ff07cd7ce8f35243) quincy (stable)


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?
no

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
yes

Can this issue be reproduced from the UI?
yes

If this is a regression, please provide more details to justify this:
-

Steps to Reproduce:
1. Deploy on-premise cluster, login to management console
2. Navigate to Storage / Data Foundation / Topology, select any node from the cluster, and click the Resources tab in the sidebar.
Alternatively:
Navigate to Storage / Data Foundation / Topology, select any node from the cluster, select any deployment on that node, and click the Resources tab in the sidebar.
3. Compare the Memory and CPU utilization numbers with the Metrics of the pod from that deployment (a query sketch for this comparison is given below)
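
For a manual cross-check in step 3, the values rendered under Workloads / Pods / Metrics can be approximated with pod-scoped queries against the cluster Prometheus (for example from Observe / Metrics). This is only a sketch for comparison, not the confirmed expressions the console uses; <pod-name> is a placeholder for the pod under test:

# Approximate memory shown for the pod: working-set bytes summed over its containers
sum(container_memory_working_set_bytes{namespace="openshift-storage", pod="<pod-name>", container!="", image!=""})

# Approximate CPU shown for the pod: usage rate in cores over the last 5 minutes
sum(rate(container_cpu_usage_seconds_total{namespace="openshift-storage", pod="<pod-name>", container!="", image!=""}[5m]))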


Actual results:
Memory and CPU values for the deployment's pod do not match the values reported for the same pod under Metrics (Workloads / Pods / Metrics)

* For instance, the Memory value shown for an osd pod in the Topology Resources tab was more than twice the value rendered in the Metrics view for the same osd pod

Expected results:
Memory values for pods under ODF Topology / any deployment / Resources should match the memory shown in Metrics. Values should reflect the correct state of memory and CPU utilization.

Additional info:
A snippet of the ws requests made from the web browser:

{
                  "name":"instance:node_load1_per_cpu:ratio",
                  "query":"(node_load1{job=\"node-exporter\"} / instance:node_num_cpu:sum{job=\"node-exporter\"})",
                  "labels":{
                     "prometheus":"openshift-monitoring/k8s"
                  },
                  "health":"ok",
                  "evaluationTime":0.000620566,
                  "lastEvaluation":"2023-06-05T09:20:14.339329283Z",
                  "type":"recording"
               },
               {
                  "name":"instance:node_memory_utilisation:ratio",
                  "query":"1 - ((node_memory_MemAvailable_bytes{job=\"node-exporter\"} or (node_memory_Buffers_bytes{job=\"node-exporter\"} + node_memory_Cached_bytes{job=\"node-exporter\"} + node_memory_MemFree_bytes{job=\"node-exporter\"} + node_memory_Slab_bytes{job=\"node-exporter\"})) / node_memory_MemTotal_bytes{job=\"node-exporter\"})",
                  "labels":{
                     "prometheus":"openshift-monitoring/k8s"
                  },
                  "health":"ok",
                  "evaluationTime":0.000801105,
                  "lastEvaluation":"2023-06-05T09:20:14.339953693Z",
                  "type":"recording"
               },
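
Note: the recording rules captured above (instance:node_load1_per_cpu:ratio and instance:node_memory_utilisation:ratio) are node-level node-exporter ratios rather than pod- or deployment-scoped metrics, which would be consistent with the Resources tab showing different numbers than the pod Metrics view. As an illustration only (not the confirmed console query), the node-level rule can be queried directly to see which values the tab appears to track:

# Node-level memory utilisation ratio per node, as produced by the recording rule above
instance:node_memory_utilisation:ratio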

Comment 8 Daniel Osypenko 2023-08-17 14:12:05 UTC
Fixed, see attachment # 1983829 [details]