+++ This bug was initially created as a clone of Bug #1955467 +++

Description of problem:
We've identified that on some clusters, the node_mountstats_nfs_* metrics account for more than half of the total metrics stored in Prometheus. These metrics aren't actually used anywhere (neither in rules nor in dashboards), and storing them in Prometheus significantly increases memory usage on clusters that have nodes configured with NFS.

Version-Release number of selected component (if applicable):
4.6

How reproducible:
Always

Steps to Reproduce:
Check the definition of the node-exporter daemonset:
oc describe -n openshift-monitoring daemonset node-exporter

Actual results:
The '--collector.mountstats' flag is listed in the node-exporter container's argument list.

Expected results:
The '--collector.mountstats' flag isn't set.

Additional info:
The mountstats collector was enabled in [1] following a customer request for enhancement. But looking at the history, the customer was asking for the kubelet_volume_* metrics, which weren't supported by their storage provider at that time (this has been fixed since then [2]). The mountstats metrics don't fill the same need and are superfluous.

[1] https://github.com/openshift/cluster-monitoring-operator/pull/409
[2] https://github.com/NetApp/trident/issues/134
Test with PR

$ oc describe -n openshift-monitoring daemonset node-exporter
...
  Containers:
   node-exporter:
    Image:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:1c2a456aa6dc253f47f67d1aeb55b0781173a36b78e33a794cd1644c40dbd852
    Port:       <none>
    Host Port:  <none>
    Args:
      --web.listen-address=127.0.0.1:9100
      --path.sysfs=/host/sys
      --path.rootfs=/host/root
      --no-collector.wifi
      --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)
      --collector.netclass.ignored-devices=^(veth.*)$
      --collector.netdev.device-exclude=^(veth.*)$
      --collector.cpu.info
      --collector.textfile.directory=/var/node_exporter/textfile
...
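
The manual check above (scanning the Args list for the flag) could be scripted roughly as follows. This is only a sketch: check_mountstats_flag is a hypothetical helper name, and the commented-out oc invocation assumes the daemonset and namespace names shown in this report.

```shell
# Hypothetical helper: reads container arguments on stdin and reports
# whether the '--collector.mountstats' flag is among them.
check_mountstats_flag() {
    # -F matches a fixed string; -e keeps the leading dashes from being
    # parsed as grep options. '--no-collector.mountstats' does not match,
    # because it does not contain the two dashes directly before 'collector'.
    if grep -qF -e '--collector.mountstats'; then
        echo "present"
        return 0
    fi
    echo "absent"
    return 1
}

# Possible usage against a live cluster (assumes the names used in this bug):
# oc get daemonset node-exporter -n openshift-monitoring \
#   -o jsonpath='{.spec.template.spec.containers[*].args}' | check_mountstats_flag
```

With the fix applied, the helper would be expected to print "absent" for the argument list shown above.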
$ token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/label/__name__/values' | jq | grep -e node_mountstats_nfs_
no results
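
For repeated verification, the grep step could be turned into a count so that "no results" becomes an explicit 0. The following is a minimal sketch, not part of the original test: count_metrics_with_prefix is a hypothetical helper, and it assumes the label-values API response is piped in as JSON and that metric names contain no commas or quotes.

```shell
# Hypothetical helper: counts metric names starting with the given prefix
# in a Prometheus label-values JSON response read from stdin.
count_metrics_with_prefix() {
    # Put each array element on its own line, then count the lines that
    # contain a quoted name beginning with the prefix. '|| true' keeps the
    # exit status at 0 when grep finds nothing (it still prints 0).
    tr ',' '\n' | grep -cF -e "\"$1" || true
}

# Possible usage, replacing '| jq | grep -e node_mountstats_nfs_' in the
# command above:
# ... | count_metrics_with_prefix node_mountstats_nfs_
```

A result of 0 corresponds to the "no results" outcome reported here.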
Created an NFS storage class and a PVC based on it, then checked again: there is no metric name with the prefix node_mountstats_nfs_.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.12 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:1561