Bug 1804455 - openshift-monitoring memory stats don't match output of 'free' command on a node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.11.z
Assignee: Pawel Krupa
QA Contact: Junqi Zhao
URL:
Whiteboard:
Duplicates: 1809092
Depends On:
Blocks:
Reported: 2020-02-18 21:28 UTC by Luke Stanton
Modified: 2023-09-07 21:56 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-20 00:12:43 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-monitoring-operator pull 674 0 None closed Bug 1804455: Better memory calculations 2021-02-04 11:39:47 UTC
Red Hat Product Errata RHBA-2020:0793 0 None None None 2020-03-20 00:12:55 UTC

Description Luke Stanton 2020-02-18 21:28:28 UTC
Description of problem:

When comparing the output of some memory related Prometheus rules for a node against the 'free' command buff/cache value on the same node, the numbers don't match. For example:

#--- 'free' output ---#
$ free
            total      used       free   shared   buff/cache   available
Mem:     32764040   6135268    1233868     3704     25394904    25970860
Swap:     4194300    269492    3924808

#--- Prometheus buff/cache output ---#
(node_memory_Cached{job="node-exporter", instance="xx.xx.xx.xx:9100"} + node_memory_Buffers{job="node-exporter", instance="xx.xx.xx.xx:9100"})/1024

--> value returned is: 4483656

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
'free' buff/cache output: 25394904
       Prometheus output: 4483656
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

How reproducible: Appears to be consistent


Steps to Reproduce:
Compare the above Prometheus rule output with the 'free' buff/cache value on the same node.


Actual results: Values don't match.


Expected results: Values would match or be very close.
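
A minimal sketch of the comparison, for anyone reproducing this on a node. It assumes that 'free' (procps-ng) reports buff/cache as Buffers + Cached + SReclaimable from /proc/meminfo, so a query that adds only node_memory_Cached and node_memory_Buffers would come up short by the reclaimable slab amount. The helper names and the SAMPLE figures are illustrative only and are not taken from the affected node.

```python
# Sketch: compute buff/cache from /proc/meminfo-style text in two ways.
# Assumption: 'free' sums Buffers + Cached + SReclaimable for buff/cache.

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: value} (values in KiB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name: value" shape
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def buff_cache_kib(fields, include_sreclaimable=True):
    """Return Buffers + Cached, optionally plus SReclaimable, in KiB."""
    total = fields["Buffers"] + fields["Cached"]
    if include_sreclaimable:
        total += fields.get("SReclaimable", 0)
    return total


# Illustrative numbers only (hypothetical, not from the report).
SAMPLE = """\
MemTotal:       1000000 kB
Buffers:          10000 kB
Cached:          200000 kB
SReclaimable:    300000 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on non-Linux systems
    fields = parse_meminfo(text)
    print("Buffers + Cached               :", buff_cache_kib(fields, False), "KiB")
    print("Buffers + Cached + SReclaimable:", buff_cache_kib(fields, True), "KiB")
```

Running this on the node and comparing the second line with the buff/cache column of 'free' should show whether SReclaimable accounts for the gap.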

Comment 6 Pawel Krupa 2020-03-02 12:29:35 UTC
*** Bug 1809092 has been marked as a duplicate of this bug. ***

Comment 9 Junqi Zhao 2020-03-09 10:47:54 UTC
Tested with cluster-monitoring-operator-v3.11.187-3. There is not much difference between the 'free' buff/cache value and node_memory_Cached + node_memory_Buffers + node_memory_SReclaimable:
# free -b
              total        used        free      shared  buff/cache   available
Mem:     3973369856  2115792896   120659968     4231168  1736916992  1576497152
Swap:             0           0           0

(1736916992 - (node_memory_Cached{instance="10.0.150.172:9100"}) - (node_memory_Buffers{instance="10.0.150.172:9100"}) - (node_memory_SReclaimable{instance="10.0.150.172:9100"})) / 1024 /1024
Element 	Value
{endpoint="https",instance="10.0.150.172:9100",job="node-exporter",namespace="openshift-monitoring",pod="node-exporter-lxq64",service="node-exporter"}	2.12890625
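
To put the residual above in context, here is a small worked calculation using only the figures in this comment (the buff/cache value from 'free -b' and the 2.12890625 MiB query result). The variable names are illustrative; the implied metric sum is derived from those two numbers, not read from Prometheus.

```python
# Worked arithmetic for the verification above (all inputs from this comment).
MIB = 1024 * 1024

free_buff_cache_bytes = 1736916992   # buff/cache column of 'free -b'
residual_mib = 2.12890625            # value returned by the query

# Convert the residual back to bytes (exact: the value is a multiple of 1/256).
residual_bytes = int(residual_mib * MIB)

# Implied sum of Cached + Buffers + SReclaimable at query time.
metrics_sum_bytes = free_buff_cache_bytes - residual_bytes

# Residual as a fraction of the 'free' buff/cache value.
relative_gap = residual_bytes / free_buff_cache_bytes

print("residual bytes     :", residual_bytes)
print("implied metric sum :", metrics_sum_bytes)
print("relative gap       : {:.2%}".format(relative_gap))
```

The residual is a fraction of a percent of buff/cache, which is plausible for two readings taken at slightly different moments.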

Comment 11 errata-xmlrpc 2020-03-20 00:12:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0793

