Bug 2214033

Summary: [UI] Resources utilization in ODF Topology does not match metrics
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Daniel Osypenko <dosypenk>
Component: management-console    Assignee: Bipul Adhikari <badhikar>
Status: VERIFIED --- QA Contact: Daniel Osypenko <dosypenk>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.13    CC: ebenahar, odf-bz-bot, skatiyar, tdesala
Target Milestone: ---   
Target Release: ODF 4.14.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 4.14.0-110 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Embargoed:

Description Daniel Osypenko 2023-06-11 08:08:53 UTC
Created attachment 1970282 [details]
odf-console-resources-mem-cpu-utilization-topology

Description of problem (please be as detailed as possible and provide log
snippets):

When opening ODF Topology and navigating to any deployment of any node, the Resources tab shows utilization values that do not match the ones shown in the full Metrics view.
If the numbers are correct, we may need to rename Resources to something more specific; otherwise, we need to make them match the Metrics data (see attachments).

Version of all relevant components (if applicable):

OC version:
Client Version: 4.12.0-202208031327
Kustomize Version: v4.5.4
Server Version: 4.13.0-0.nightly-2023-06-03-031200
Kubernetes Version: v1.26.5+7a891f0

OCS version:
ocs-operator.v4.13.0-207.stable              OpenShift Container Storage   4.13.0-207.stable              Succeeded

Cluster version
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.13.0-0.nightly-2023-06-03-031200   True        False         3d      Cluster version is 4.13.0-0.nightly-2023-06-03-031200

Rook version:
rook: v4.13.0-0.e5648f0a2577b9bfd2aa256d4853dc3e8d94862a
go: go1.19.6

Ceph version:
ceph version 17.2.6-50.el9cp (c202ddb5589554af0ce43432ff07cd7ce8f35243) quincy (stable)


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?
no

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
yes

Can this issue be reproduced from the UI?
yes

If this is a regression, please provide more details to justify this:
-

Steps to Reproduce:
1. Deploy on-premise cluster, login to management console
2. Navigate to Storage / Data Foundation / Topology, select any node from the cluster, and click the Resources tab in the sidebar.
Alternatively:
Navigate to Storage / Data Foundation / Topology, select any node from the cluster, select any deployment on that node, and click the Resources tab in the sidebar.
3. Compare the Memory and CPU utilization numbers with the Metrics of the pod from that deployment (a query sketch for this comparison is given below)
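
For a manual cross-check in step 3, the values rendered under Workloads / Pods / Metrics can be approximated with pod-scoped queries against the cluster Prometheus (for example from Observe / Metrics). This is only a sketch for comparison, not the confirmed expressions the console uses; <pod-name> is a placeholder for the pod under test:

# Approximate memory shown for the pod: working-set bytes summed over its containers
sum(container_memory_working_set_bytes{namespace="openshift-storage", pod="<pod-name>", container!="", image!=""})

# Approximate CPU shown for the pod: usage rate in cores over the last 5 minutes
sum(rate(container_cpu_usage_seconds_total{namespace="openshift-storage", pod="<pod-name>", container!="", image!=""}[5m]))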


Actual results:
Memory and CPU values for the deployment's pod do not match the values reported for the same pod under Metrics (Workloads / Pods / Metrics)

* For instance, the Memory value shown for an osd pod in the Topology Resources tab was more than twice the value rendered in the Metrics view for the same osd pod

Expected results:
Memory values for pods under ODF Topology / any deployment / Resources should match the memory shown in Metrics. Values should reflect the correct state of memory and CPU utilization.

Additional info:
A snippet of the ws requests made from the web browser:

{
                  "name":"instance:node_load1_per_cpu:ratio",
                  "query":"(node_load1{job=\"node-exporter\"} / instance:node_num_cpu:sum{job=\"node-exporter\"})",
                  "labels":{
                     "prometheus":"openshift-monitoring/k8s"
                  },
                  "health":"ok",
                  "evaluationTime":0.000620566,
                  "lastEvaluation":"2023-06-05T09:20:14.339329283Z",
                  "type":"recording"
               },
               {
                  "name":"instance:node_memory_utilisation:ratio",
                  "query":"1 - ((node_memory_MemAvailable_bytes{job=\"node-exporter\"} or (node_memory_Buffers_bytes{job=\"node-exporter\"} + node_memory_Cached_bytes{job=\"node-exporter\"} + node_memory_MemFree_bytes{job=\"node-exporter\"} + node_memory_Slab_bytes{job=\"node-exporter\"})) / node_memory_MemTotal_bytes{job=\"node-exporter\"})",
                  "labels":{
                     "prometheus":"openshift-monitoring/k8s"
                  },
                  "health":"ok",
                  "evaluationTime":0.000801105,
                  "lastEvaluation":"2023-06-05T09:20:14.339953693Z",
                  "type":"recording"
               },
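
Note: the recording rules captured above (instance:node_load1_per_cpu:ratio and instance:node_memory_utilisation:ratio) are node-level node-exporter ratios rather than pod- or deployment-scoped metrics, which would be consistent with the Resources tab showing different numbers than the pod Metrics view. As an illustration only (not the confirmed console query), the node-level rule can be queried directly to see which values the tab appears to track:

# Node-level memory utilisation ratio per node, as produced by the recording rule above
instance:node_memory_utilisation:ratio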

Comment 8 Daniel Osypenko 2023-08-17 14:12:05 UTC
Fixed, see attachment # 1983829 [details]