Bug 1874028 - Node filesystem used and total calculations are wrong
Summary: Node filesystem used and total calculations are wrong
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: ralpert
QA Contact: Yadan Pei
URL:
Whiteboard:
Duplicates: 1852770 1877136 (view as bug list)
Depends On:
Blocks: 1883177
 
Reported: 2020-08-31 11:20 UTC by Pablo Alonso Rodriguez
Modified: 2020-10-27 16:36 UTC (History)
8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The query was wrong. Consequence: The data was wrong. Fix: Updated the query. Result: The data is as expected.
Clone Of:
Clones: 1883177 (view as bug list)
Environment:
Last Closed: 2020-10-27 16:36:11 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift console pull 6571 0 None closed Bug 1874028: Update node queries to exclude fstype, mountpoint 2021-02-12 14:43:56 UTC
Github openshift console pull 6652 0 None closed Bug 1874028: Add sum to query 2021-02-12 14:43:56 UTC
Red Hat Knowledge Base (Solution) 5398661 0 None None None 2020-09-15 13:58:13 UTC
Red Hat Product Errata RHBA-2020:4196 0 None None None 2020-10-27 16:36:39 UTC

Description Pablo Alonso Rodriguez 2020-08-31 11:20:39 UTC
Description of problem:

The filesystem used and total column of the nodes page under compute section seem to use the following queries[1]:

- usedStorage: sum by (instance) (node_filesystem_size_bytes - node_filesystem_free_bytes)
- totalStorage: sum by(instance) (instance:node_cpu:rate:sum)

This basically means that every filesystem reported by node-exporter is included in the sum.

This is wrong for a number of reasons:
- RHCOS (like any OSTree-based OS) has at least 4 mount points for the main hard disk: "/", "/sysroot", "/etc" and "/var". This is expected (due to OSTree internals), but it means that the main filesystem is summed 4 times.
- Any tmpfs mounted on the node is added to the sum, which is not correct.
- Potentially any filesystem of a mounted persistent volume could be added to the sum.
- In addition, `/boot` (and `/boot/efi` on EFI systems) is added to the sum. It is debatable whether this one is correct, but I think it should not count, at least for RHCOS (one should not have to care about /boot size on RHCOS).

We are not talking about a minor delta of some megabytes: the filesystem reported in the console is at least 4 times bigger than the real one.
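A minimal sketch of the over-counting (hypothetical sizes, not measured on a real node), simulating in Python what the unfiltered `sum by (instance)` query effectively does:

```python
# Hypothetical node-exporter samples for an OSTree node: the same 100 GiB
# root device is exported under four mountpoints, plus tmpfs and /boot.
GIB = 1024 ** 3
samples = [
    {"mountpoint": "/", "fstype": "xfs", "size": 100 * GIB},
    {"mountpoint": "/sysroot", "fstype": "xfs", "size": 100 * GIB},
    {"mountpoint": "/etc", "fstype": "xfs", "size": 100 * GIB},
    {"mountpoint": "/var", "fstype": "xfs", "size": 100 * GIB},
    {"mountpoint": "/boot", "fstype": "ext4", "size": 1 * GIB},
    {"mountpoint": "/dev/shm", "fstype": "tmpfs", "size": 16 * GIB},
]

# Summing every series, as the unfiltered query does, counts the root
# device once per mountpoint and includes tmpfs and /boot as well.
naive_total = sum(s["size"] for s in samples)
print(naive_total // GIB)  # 417 GiB reported for a 100 GiB disk
```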

Version-Release number of selected component (if applicable):

4.5.7

How reproducible:

Always, even in a freshly installed cluster.

Steps to Reproduce:
1. Install a cluster
2. Look at the node filesystem size reported at console vs the ones you see with df

Actual results:

Wrong sum

Expected results:

Matching filesystem sizes

Additional info:

[1] - https://github.com/openshift/console/blob/release-4.5/frontend/packages/console-app/src/components/nodes/NodesPage.tsx#L232

Comment 1 Pawel Krupa 2020-08-31 13:58:47 UTC
I am working under the assumption that `instance:node_cpu:rate:sum` was meant to be `instance:node_filesystem_usage:sum`.

`instance:node_filesystem_usage:sum` shows only the usage of the `/` mountpoint, and as such we removed this recording rule in 4.6+. The `node_filesystem_size_bytes - node_filesystem_avail_bytes` expression should be used instead. More on the topic: https://www.robustperception.io/filesystem-metrics-from-the-node-exporter

Additionally, any usage of node_filesystem_* metrics should be restricted by mountpoint and/or filesystem type. Such restrictions can be applied with the `fstype` and/or `mountpoint` labels. For example, this would show all available storage space excluding `/boot`, tmpfs, and squashfs: `node_filesystem_avail_bytes{fstype!~"tmpfs|squashfs",mountpoint!="/boot"}`
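As a hedged sketch (hypothetical label sets, filtering simulated in Python), PromQL's `!~` matcher anchors the regex against the whole label value, which is what makes this kind of exclusion precise:

```python
import re

# Hypothetical label sets as node-exporter might expose them.
series = [
    {"fstype": "xfs", "mountpoint": "/"},
    {"fstype": "tmpfs", "mountpoint": "/dev/shm"},
    {"fstype": "ext4", "mountpoint": "/boot"},
]

# Mirror of {fstype!~"tmpfs|squashfs",mountpoint!="/boot"}: `!~` is a
# negative regex match anchored to the full label value, `!=` is exact
# inequality, so re.fullmatch models the PromQL semantics.
def keep(s):
    return (re.fullmatch("tmpfs|squashfs", s["fstype"]) is None
            and s["mountpoint"] != "/boot")

kept = [s["mountpoint"] for s in series if keep(s)]
print(kept)  # ['/']
```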

Comment 2 Jakub Hadvig 2020-08-31 14:06:43 UTC
Thanks Pawel for the insight! Much appreciated.

Comment 4 ralpert 2020-09-04 20:52:44 UTC
Is this okay to close given Pawel's insight? We're using the queries he mentioned.

Comment 5 Pablo Alonso Rodriguez 2020-09-07 07:33:33 UTC
No, it is not OK.

First of all, this issue does reproduce in 4.5, so even if it did not reproduce in 4.6, we would still need to get the fix into 4.5.

However, I have been having a look at the current 4.6 cluster monitoring operator code. The --collector.filesystem.ignored-mount-points option has been expanded, but it is not enough:

- In 4.5: --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
- In 4.6: --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)

With that change, only volumes under /var/lib/kubelet/pods would be ignored (pod volume mounts, emptydirs, etc.). However, the following would still be counted incorrectly:
- The mounts on / , /usr and /var would still cause the root filesystem to be counted 3 times
- tmpfs filesystems outside /var/lib/kubelet/pods (like /tmp or /dev/shm) would still be added to the count.

So no, not ok to close, we still need a fix.

Comment 7 Pawel Krupa 2020-09-09 06:49:15 UTC
Sorry for not being clear enough. Yes, my suggestion is to use metric labels as filters and remove unwanted data from the query.

For example, to calculate used storage with the filter applied, the query would need to look as follows:

sum by (instance) (node_filesystem_size_bytes{fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"} - node_filesystem_free_bytes{fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"})
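A hedged simulation of that query's arithmetic (hypothetical series data; real evaluation happens inside Prometheus):

```python
import re

GIB = 1024 ** 3
# Hypothetical size/free samples per mountpoint on one instance.
series = [
    {"mountpoint": "/", "fstype": "xfs", "size": 100 * GIB, "free": 60 * GIB},
    {"mountpoint": "/var", "fstype": "xfs", "size": 100 * GIB, "free": 60 * GIB},
    {"mountpoint": "/dev/shm", "fstype": "tmpfs", "size": 16 * GIB, "free": 16 * GIB},
]

# Mirror of fstype!~"tmpfs|squashfs" and mountpoint!~"/usr|/var"
# (PromQL regex matchers are anchored, hence fullmatch).
def keep(s):
    return (re.fullmatch("tmpfs|squashfs", s["fstype"]) is None
            and re.fullmatch("/usr|/var", s["mountpoint"]) is None)

# size - free per surviving series, then summed: the "used storage" figure.
used = sum(s["size"] - s["free"] for s in series if keep(s))
print(used // GIB)  # 40: only "/" contributes after filtering
```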

Comment 8 Jakub Hadvig 2020-09-09 08:35:09 UTC
*** Bug 1877136 has been marked as a duplicate of this bug. ***

Comment 10 Yadan Pei 2020-09-16 03:13:08 UTC
  [NodeQueries.FILESYSTEM_USAGE]: _.template(
    `sum(node_filesystem_size_bytes{instance="<%= node %>",fstype!=""} - node_filesystem_avail_bytes{instance="<%= node %>",fstype!=""})`,
    `sum(node_filesystem_size_bytes{instance="<%= node %>",fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"} - node_filesystem_avail_bytes{instance="<%= node %>",fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"})`,
  ),
  [NodeQueries.FILESYSTEM_TOTAL]: _.template(
    `node_filesystem_size_bytes{instance='<%= node %>',fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"}`,

I think for NodeQueries.FILESYSTEM_TOTAL, the correct query should be `sum(node_filesystem_size_bytes{instance='<%= node %>',fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"})`
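A small sketch of why the outer sum() matters (hypothetical series; PromQL aggregation simulated in Python): without it, the selector returns one sample per matching mountpoint instead of the single value the Total column expects.

```python
GIB = 1024 ** 3
# Hypothetical per-mountpoint sizes still matched by the selector
# after the fstype/mountpoint exclusions.
sizes = {"/": 100 * GIB, "/sysroot": 100 * GIB, "/boot": 1 * GIB}

# Without sum(): an instant vector with one element per series.
vector = list(sizes.values())
print(len(vector))  # 3 separate samples

# With sum(): a single aggregated value, suitable for a single table cell.
total = sum(vector)
print(total // GIB)  # 201
```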

Assigning back to confirm

Comment 11 Yadan Pei 2020-09-16 03:14:53 UTC
Sorry, this is the current PR fix (comment 10 had one extra line):

  [NodeQueries.FILESYSTEM_USAGE]: _.template(
    `sum(node_filesystem_size_bytes{instance="<%= node %>",fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"} - node_filesystem_avail_bytes{instance="<%= node %>",fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"})`,
  ),

  [NodeQueries.FILESYSTEM_TOTAL]: _.template(
    `node_filesystem_size_bytes{instance='<%= node %>',fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"}`,

Comment 13 Yadan Pei 2020-09-27 02:44:34 UTC
The Node Overview page and the Nodes list page show the same Filesystem Usage and Total values.
Filesystem Total also matches the value returned by the query sum(node_filesystem_size_bytes{instance="<node>",fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"})


Moving to VERIFIED

Verified on     4.6.0-0.nightly-2020-09-26-202331

Comment 14 Samuel Padgett 2020-10-01 12:22:49 UTC
*** Bug 1852770 has been marked as a duplicate of this bug. ***

Comment 17 errata-xmlrpc 2020-10-27 16:36:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

