Bug 1703414 - Pod Overview in cluster console, resource usage is mismatched
Summary: Pod Overview in cluster console, resource usage is mismatched
Keywords:
Status: CLOSED DUPLICATE of bug 1712912
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.2.0
Assignee: Sergiusz Urbaniak
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-04-26 11:04 UTC by Aditya Deshpande
Modified: 2019-08-26 09:26 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-08-26 09:26:38 UTC
Target Upstream Version:
Embargoed:


Attachments
cluster-console pod overview (91.95 KB, image/png)
2019-04-26 11:05 UTC, Aditya Deshpande
grafana K8S compute resources pod dashboard (60.45 KB, image/png)
2019-04-26 11:06 UTC, Aditya Deshpande
metrics (98.96 KB, image/png)
2019-04-26 11:07 UTC, Aditya Deshpande


Links
System: Red Hat Knowledge Base (Solution)
ID: 4177321
Private: 0
Priority: Performance tune
Status: None
Summary: Pod metrics in the cluster console show duplicate data in OCP 3.11
Last Updated: 2019-05-28 16:21:00 UTC

Description Aditya Deshpande 2019-04-26 11:04:52 UTC
Description of problem:
Related to bug https://bugzilla.redhat.com/show_bug.cgi?id=1669718: CPU usage appears doubled in the cluster console compared with the metrics shown in Grafana and by the top command.

The pod metrics from oc adm top pod and Grafana match each other, but the cluster console shows double the usage.

Version-Release number of selected component (if applicable):
OCP v3.11


Actual results:

# oc adm top pod
router-6-9b664             4m           46Mi  

Attaching screenshots of the cluster console, Grafana (K8s / Compute Resources / Pod dashboard for the same pod), and metrics.
 
Expected results:
The cluster console output should match the output from Grafana and metrics.

Comment 1 Aditya Deshpande 2019-04-26 11:05:26 UTC
Created attachment 1559029 [details]
cluster-console pod overview

Comment 2 Aditya Deshpande 2019-04-26 11:06:39 UTC
Created attachment 1559030 [details]
grafana K8S compute resources pod dashboard

Comment 3 Aditya Deshpande 2019-04-26 11:07:27 UTC
Created attachment 1559031 [details]
metrics

Comment 4 Samuel Padgett 2019-04-26 13:20:07 UTC
Do you see this consistently for the router pod and other pods? If you click on the chart, it should take you to the Prometheus UI with the same query. Do you see the same data as console there?

Comment 5 Aditya Deshpande 2019-04-29 09:13:55 UTC
Yes, I see this consistently for the router and other pods. Clicking the chart takes me to the Prometheus console with the same query, which shows the same wrong data.
This query is actually addressed in https://bugzilla.redhat.com/show_bug.cgi?id=1669410. I already asked about it there and opened this separate bug as the engineer advised.
I will also add this information to https://bugzilla.redhat.com/show_bug.cgi?id=1669410 and ask about the wrong CPU usage there as well.

Comment 15 Frederic Branczyk 2019-05-28 11:24:31 UTC
We have three related BZs in this area. I'd suggest we fix all of them at once by introducing a recording rule for memory/CPU that is used universally across the stack, so that we have consistency:

* https://bugzilla.redhat.com/show_bug.cgi?id=1712912
* https://bugzilla.redhat.com/show_bug.cgi?id=1703414
* https://bugzilla.redhat.com/show_bug.cgi?id=1701856

I'd expect this to be solved in the 4.2 time frame.
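
A minimal sketch of the "one shared rule" idea, not the actual fix. It assumes, purely for illustration, that a pod exposes one CPU series per container plus a pod-level aggregate series with an empty container label. This bug does not confirm that as the cause. The sample values and the names sum_all_series and shared_rule are hypothetical; only the pod name and the 4m figure come from the oc adm top pod output above.

```python
# Hypothetical samples for one pod, in CPU cores (4m = 0.004 cores).
# Assumption: one series per container plus a pod-level aggregate
# series whose container label is empty.
SAMPLES = [
    {"pod": "router-6-9b664", "container": "router", "cpu": 0.004},
    {"pod": "router-6-9b664", "container": "", "cpu": 0.004},
]


def sum_all_series(samples, pod):
    """Ad hoc query (assumed): sums every series for the pod,
    including the aggregate, so the pod is counted twice."""
    return sum(s["cpu"] for s in samples if s["pod"] == pod)


def shared_rule(samples, pod):
    """Single shared definition: sums only named containers.
    Every consumer calling this gets the same number."""
    return sum(
        s["cpu"] for s in samples
        if s["pod"] == pod and s["container"]
    )


if __name__ == "__main__":
    pod = "router-6-9b664"
    print("ad hoc sum: ", sum_all_series(SAMPLES, pod))
    print("shared rule:", shared_rule(SAMPLES, pod))
```

If the console, Grafana, and oc adm top pod all read from one definition like shared_rule, they cannot disagree with each other, whatever that definition is.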

Comment 20 Sergiusz Urbaniak 2019-08-26 09:26:38 UTC

*** This bug has been marked as a duplicate of bug 1712912 ***

