Bug 1874028 - Node filesystem used and total calculations are wrong
Summary: Node filesystem used and total calculations are wrong
Keywords:
Status: ON_QA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: ralpert
QA Contact: Yadan Pei
URL:
Whiteboard:
Duplicates: 1877136
Depends On:
Blocks:
Reported: 2020-08-31 11:20 UTC by Pablo Alonso Rodriguez
Modified: 2020-09-22 17:52 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:




Links
System | ID | Priority | Status | Summary | Last Updated
Github | openshift console pull 6571 | None | closed | Bug 1874028: Update node queries to exclude fstype, mountpoint | 2020-09-20 10:19:09 UTC
Github | openshift console pull 6652 | None | open | Bug 1874028: Add sum to query | 2020-09-21 12:44:31 UTC
Red Hat Knowledge Base (Solution) | 5398661 | None | None | None | 2020-09-15 13:58:13 UTC

Description Pablo Alonso Rodriguez 2020-08-31 11:20:39 UTC
Description of problem:

The filesystem used and total columns of the Nodes page under the Compute section seem to use the following queries[1]:

- usedStorage: sum by (instance) (node_filesystem_size_bytes - node_filesystem_free_bytes)
- totalStorage: sum by(instance) (instance:node_cpu:rate:sum)

Which basically means that every file system reported by node-exporter is considered. 

This is wrong for a number of reasons:
- RHCOS (as any OSTree-based OS) has at least 4 mount points for the main hard disk: "/", "/sysroot", "/etc" and "/var" and this is expected (due to OSTree internals). This means that the main file system is summed 4 times
- Any tmpfs mounted in the node is added to the SUM, which is not correct.
- Potentially any filesystem of a mounted persistent volume could be added to the sum.
- In addition, `/boot` (and `/boot/efi` on EFI systems) are added to the sum. It would be worth discussing whether this is correct, but I think it is not, at least for RHCOS (one should not need to care about the /boot size on RHCOS).

We are not talking about a minor delta of a few megabytes, but about the filesystem being reported in the console as at least 4 times bigger than the real one.
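To make the overcounting concrete, here is a minimal sketch with made-up numbers. The `FsSample` shape, the device names and the sizes are all assumptions for illustration; they are not taken from a real node or from the console code. Counting each device once is used here only to show the size of the error; it is not the approach proposed later in this bug.

```typescript
// Hypothetical sample shape: node-exporter exposes one series per mount point.
interface FsSample {
  device: string;
  mountpoint: string;
  fstype: string;
  sizeBytes: number;
}

const GiB = 1024 * 1024 * 1024;

// Illustrative values only (assumed): one 100 GiB disk mounted four times plus a tmpfs.
const samples: FsSample[] = [
  { device: '/dev/sda4', mountpoint: '/', fstype: 'xfs', sizeBytes: 100 * GiB },
  { device: '/dev/sda4', mountpoint: '/sysroot', fstype: 'xfs', sizeBytes: 100 * GiB },
  { device: '/dev/sda4', mountpoint: '/etc', fstype: 'xfs', sizeBytes: 100 * GiB },
  { device: '/dev/sda4', mountpoint: '/var', fstype: 'xfs', sizeBytes: 100 * GiB },
  { device: 'tmpfs', mountpoint: '/run', fstype: 'tmpfs', sizeBytes: 8 * GiB },
];

// Equivalent of summing every reported series for the instance.
function naiveTotal(all: FsSample[]): number {
  return all.reduce((acc, s) => acc + s.sizeBytes, 0);
}

// Count each backing device once and skip tmpfs entirely.
function dedupedTotal(all: FsSample[]): number {
  const seen: { [device: string]: boolean } = {};
  let total = 0;
  for (const s of all) {
    if (s.fstype === 'tmpfs' || seen[s.device]) {
      continue;
    }
    seen[s.device] = true;
    total += s.sizeBytes;
  }
  return total;
}
```

With these assumed values the naive sum is 408 GiB, while the disk itself is 100 GiB.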

Version-Release number of selected component (if applicable):

4.5.7

How reproducible:

Always, even in a freshly installed cluster.

Steps to Reproduce:
1. Install a cluster
2. Look at the node filesystem size reported at console vs the ones you see with df
3.

Actual results:

Wrong sum

Expected results:

Matching filesystem sizes

Additional info:

[1] - https://github.com/openshift/console/blob/release-4.5/frontend/packages/console-app/src/components/nodes/NodesPage.tsx#L232

Comment 1 Pawel Krupa 2020-08-31 13:58:47 UTC
I am working under the assumption that `instance:node_cpu:rate:sum` was meant as `instance:node_filesystem_usage:sum`.

`instance:node_filesystem_usage:sum` shows only the usage of the `/` mountpoint, and as such we removed this recording rule in 4.6+. The expression `node_filesystem_size_bytes - node_filesystem_avail_bytes` should be used instead. More on the topic: https://www.robustperception.io/filesystem-metrics-from-the-node-exporter

Additionally, any use of node_filesystem_* metrics should be restricted by mountpoint and/or filesystem type, using the `fstype` and/or `mountpoint` labels. For example, this would show all available storage space excluding `/boot`, tmpfs, and squashfs: `node_filesystem_avail_bytes{fstype!~"tmpfs|squashfs",mountpoint!="/boot"}`
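As a rough illustration of how those label matchers behave, the sketch below mimics the selector from the example in plain TypeScript. The `FsLabels` type and the function names are hypothetical. The one PromQL fact relied on is that regex matchers (`=~`, `!~`) are fully anchored, while `!=` is an exact string comparison.

```typescript
// Hypothetical label set for a single node_filesystem_* series.
interface FsLabels {
  fstype: string;
  mountpoint: string;
}

// PromQL regex matchers are fully anchored, so wrap the pattern in ^(?:...)$.
function regexMatches(pattern: string, value: string): boolean {
  return new RegExp('^(?:' + pattern + ')$').test(value);
}

// Mirrors {fstype!~"tmpfs|squashfs",mountpoint!="/boot"}.
function isSelected(labels: FsLabels): boolean {
  return !regexMatches('tmpfs|squashfs', labels.fstype) && labels.mountpoint !== '/boot';
}
```

Note that under these semantics `mountpoint!="/boot"` drops only `/boot` itself; a series for `/boot/efi` would still be selected.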

Comment 2 Jakub Hadvig 2020-08-31 14:06:43 UTC
Thanks Pawel for more insight! Much appreciated.

Comment 4 ralpert 2020-09-04 20:52:44 UTC
Is this okay to close given Pawel's insight? We're using the queries he mentioned.

Comment 5 Pablo Alonso Rodriguez 2020-09-07 07:33:33 UTC
No, it is not OK.

First of all, this issue does reproduce in 4.5, so even if it didn't reproduce in 4.6, we would need to get the fix into 4.5.

However, I have been having a look at the current 4.6 cluster monitoring operator code. The --collector.filesystem.ignored-mount-points flag has been expanded, but that is not enough:

- In 4.5: --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
- In 4.6: --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)

With that change, only volumes under /var/lib/kubelet/pods would be ignored (pod volume mounts, emptydirs, etc.). However, the following would still be counted incorrectly:
- The mounts in /, /usr and /var would still cause the root filesystem to be counted 3 times
- tmpfs filesystems outside /var/lib/kubelet/pods (like /tmp or /dev/shm) would still be added to the count.

So no, not ok to close, we still need a fix.
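A quick way to check which mount points each pattern leaves in place is to run both regular expressions over a list of paths. This is only a sketch: the two patterns are copied from the flags above, but the mount list and the `stillCounted` helper are made up for illustration, and JavaScript's regex engine is assumed to behave like node-exporter's for these simple patterns.

```typescript
// Patterns copied from the 4.5 and 4.6 flag values quoted above.
const ignored45 = new RegExp('^/(dev|proc|sys|var/lib/docker/.+)($|/)');
const ignored46 = new RegExp('^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)');

// Hypothetical helper: returns the mount points that are NOT ignored by the pattern.
function stillCounted(mounts: string[], ignored: RegExp): string[] {
  return mounts.filter((m) => !ignored.test(m));
}

// Illustrative mount list (assumed, not taken from a real node).
const mounts: string[] = [
  '/',
  '/usr',
  '/var',
  '/tmp',
  '/proc',
  '/var/lib/kubelet/pods/abc/volumes/kubernetes.io~secret/token',
];

const counted45 = stillCounted(mounts, ignored45);
const counted46 = stillCounted(mounts, ignored46);
```

With this list, the 4.6 pattern additionally drops the pod volume path, while `/`, `/usr`, `/var` and `/tmp` remain under both patterns.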

Comment 7 Pawel Krupa 2020-09-09 06:49:15 UTC
Sorry for not being clear enough. Yes, my suggestion is to use metric labels as filters and remove unwanted data from the query.

For example, to calculate used storage with the filter applied, the query would need to look as follows:

sum by (instance) (node_filesystem_size_bytes{fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"} - node_filesystem_free_bytes{fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"})
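Since the same selector has to appear on both sides of the subtraction, one option is to build it once. The helper below is a hypothetical sketch, not the console's actual code; it only assembles the query string shown above from two lists of exclusions.

```typescript
// Hypothetical helper: builds the used-storage query with a shared label selector.
function usedStorageQuery(excludedFsTypes: string[], excludedMountpoints: string[]): string {
  const selector =
    '{fstype!~"' + excludedFsTypes.join('|') + '",mountpoint!~"' + excludedMountpoints.join('|') + '"}';
  return (
    'sum by (instance) (node_filesystem_size_bytes' + selector +
    ' - node_filesystem_free_bytes' + selector + ')'
  );
}

// Reproduces the query from this comment.
const exampleQuery = usedStorageQuery(['tmpfs', 'squashfs'], ['/usr', '/var']);
```

Keeping the selector in one place avoids the two sides drifting apart, which would make the subtraction match different sets of series.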

Comment 8 Jakub Hadvig 2020-09-09 08:35:09 UTC
*** Bug 1877136 has been marked as a duplicate of this bug. ***

Comment 10 Yadan Pei 2020-09-16 03:13:08 UTC
  [NodeQueries.FILESYSTEM_USAGE]: _.template(
    `sum(node_filesystem_size_bytes{instance="<%= node %>",fstype!=""} - node_filesystem_avail_bytes{instance="<%= node %>",fstype!=""})`,
    `sum(node_filesystem_size_bytes{instance="<%= node %>",fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"} - node_filesystem_avail_bytes{instance="<%= node %>",fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"})`,
  ),
  [NodeQueries.FILESYSTEM_TOTAL]: _.template(
    `node_filesystem_size_bytes{instance='<%= node %>',fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"}`,

I think for NodeQueries.FILESYSTEM_TOTAL, the correct query should be `sum(node_filesystem_size_bytes{instance='<%= node %>',fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"})`

Assigning back to confirm
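To illustrate the concern about the missing `sum`: after filtering, the selector can still match more than one series per node (for example `/` and `/boot`), so an unaggregated query returns several values. The sketch below uses a made-up `Series` shape and made-up sizes; how the console actually consumes a multi-series result is an assumption here, not something confirmed in this bug.

```typescript
// Hypothetical shape of one element of an instant-vector result.
interface Series {
  mountpoint: string;
  value: number;
}

const GiB = 1024 * 1024 * 1024;

// Illustrative result of the unaggregated query (assumed values).
const result: Series[] = [
  { mountpoint: '/', value: 100 * GiB },
  { mountpoint: '/boot', value: 1 * GiB },
];

// What a consumer reading only the first series would see (assumption).
function firstValue(series: Series[]): number | undefined {
  const first = series[0];
  return first ? first.value : undefined;
}

// What wrapping the query in sum(...) yields: one value per node.
function summedValue(series: Series[]): number {
  return series.reduce((acc, s) => acc + s.value, 0);
}
```

With these values, reading the first series gives 100 GiB, while the summed total is 101 GiB.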

Comment 11 Yadan Pei 2020-09-16 03:14:53 UTC
Sorry, this is the current PR fix (comment 10 had one extra line):

  [NodeQueries.FILESYSTEM_USAGE]: _.template(
    `sum(node_filesystem_size_bytes{instance="<%= node %>",fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"} - node_filesystem_avail_bytes{instance="<%= node %>",fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"})`,
  ),

  [NodeQueries.FILESYSTEM_TOTAL]: _.template(
    `node_filesystem_size_bytes{instance='<%= node %>',fstype!~"tmpfs|squashfs",mountpoint!~"/usr|/var"}`,

