Bug 1712912

Summary: Cluster console shows double container ram used in OCP 3.11
Product: OpenShift Container Platform Reporter: hgomes
Component: Monitoring Assignee: Sergiusz Urbaniak <surbania>
Status: CLOSED ERRATA QA Contact: Junqi Zhao <juzhao>
Severity: low Docs Contact:
Priority: unspecified    
Version: 3.11.0 CC: adeshpan, anpicker, cruhm, erooth, lcosic, mloibl, pkrupa, rbost, surbania
Target Milestone: ---   
Target Release: 4.2.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-10-16 06:29:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description               Flags
Cluster console double    none
Prometheus view           none

Description hgomes 2019-05-22 13:10:57 UTC
Created attachment 1571989 [details]
Cluster console double

Description of problem:

Pod metrics in the cluster console show duplicate data for container_memory_usage_bytes, so the displayed usage is double the actual usage.


Version-Release number of selected component (if applicable):

3.11.88

How reproducible:


Steps to Reproduce:
1. Open the Prometheus dashboard and run a query for the metric, e.g.
container_memory_usage_bytes{pod_name='logging-fluentd-p5f2q',namespace='openshift-logging'}
2. Observe that multiple results are returned.
3. Check the pod metrics in the OpenShift UI. The memory usage shown is the sum of the two values from the Prometheus results.
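
A minimal sketch of how the doubling in step 3 can arise. It assumes (the report does not confirm this) that one of the returned series is a pod-level aggregate with an empty container_name label and that the console sums every series it gets back. The sample values and the `fluentd-elasticsearch` container name are invented for illustration only.

```python
# Hypothetical samples for one pod; the values are invented for illustration.
samples = [
    # Assumed pod-level aggregate series: empty container_name label.
    {"labels": {"pod_name": "logging-fluentd-p5f2q", "container_name": ""},
     "value": 200_000_000},
    # Assumed per-container series (container name is hypothetical).
    {"labels": {"pod_name": "logging-fluentd-p5f2q",
                "container_name": "fluentd-elasticsearch"},
     "value": 200_000_000},
]


def total_bytes(series, skip_empty_container=False):
    """Sum sample values, optionally dropping series whose container_name is empty."""
    return sum(
        s["value"]
        for s in series
        if not (skip_empty_container
                and s["labels"].get("container_name", "") == "")
    )


naive = total_bytes(samples)                                # sums both series
filtered = total_bytes(samples, skip_empty_container=True)  # per-container only
print(naive, filtered)
```

Under these assumptions the unfiltered sum is twice the filtered one, which matches the symptom described above.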

Actual results:


Expected results:

I would expect the cluster console to apply a filter similar to the one used in the Prometheus alert rules, e.g. container_memory_usage_bytes{container_name!=""}
https://github.com/openshift/cluster-monitoring-operator/blob/master/assets/prometheus-k8s/rules.yaml#L22
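
As a rough sketch of what such a filtered query could look like when sent to the Prometheus HTTP API instant-query endpoint (`/api/v1/query`): the helper names, the use of `sum(...)`, and the `prometheus.example.com` host are assumptions for illustration, not the console's actual query.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def filtered_query(pod_name, namespace):
    """Build a PromQL expression that skips series with an empty container_name."""
    return (
        "sum(container_memory_usage_bytes{"
        f'pod_name="{pod_name}",namespace="{namespace}",container_name!=""'
        "})"
    )


def query_url(base_url, expr):
    """Return an instant-query URL with the expression URL-encoded."""
    return f"{base_url}/api/v1/query?{urlencode({'query': expr})}"


expr = filtered_query("logging-fluentd-p5f2q", "openshift-logging")
url = query_url("https://prometheus.example.com", expr)  # hypothetical host
print(expr)
print(url)
```

The `container_name!=""` matcher is the one quoted above from the alert rules; everything else in the expression is a guess at how a console-side query might use it.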

Additional info:

Comment 1 hgomes 2019-05-22 13:12:24 UTC
Created attachment 1571990 [details]
Prometheus view

Comment 2 Frederic Branczyk 2019-05-28 11:24:08 UTC
We have three related BZs in this area. I'd suggest we fix all of them at once by introducing a recording rule for memory/CPU that is used universally across the stack, so that we have consistency:

* https://bugzilla.redhat.com/show_bug.cgi?id=1712912
* https://bugzilla.redhat.com/show_bug.cgi?id=1703414
* https://bugzilla.redhat.com/show_bug.cgi?id=1701856

I'd expect this to be solved in the 4.2 time frame.

Comment 6 Sergiusz Urbaniak 2019-08-26 09:26:38 UTC
*** Bug 1703414 has been marked as a duplicate of this bug. ***

Comment 11 errata-xmlrpc 2019-10-16 06:29:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922