+++ This bug was initially created as a clone of Bug #1955469 +++

+++ This bug was initially created as a clone of Bug #1955467 +++

Description of problem:
We've identified that on some clusters, the node_mountstats_nfs_* metrics account for more than half of the total metrics stored in Prometheus. These metrics aren't actually used anywhere (neither in rules nor in dashboards), and storing them in Prometheus increases memory usage considerably for clusters that have nodes configured with NFS.

Version-Release number of selected component (if applicable):
4.6

How reproducible:
Always

Steps to Reproduce:
Check the definition of the node-exporter daemonset:
oc describe -n openshift-monitoring daemonset node-exporter

Actual results:
The '--collector.mountstats' flag is listed in the node-exporter container's argument list.

Expected results:
The '--collector.mountstats' flag isn't set.

Additional info:
The mountstats collector was enabled in [1] following a customer request for enhancement. Looking at the history, however, the customer was asking for the kubelet_volume_* metrics, which weren't supported by their storage provider at that time (this has since been fixed [2]). The mountstats metrics don't fill the same need and are superfluous.

[1] https://github.com/openshift/cluster-monitoring-operator/pull/409
[2] https://github.com/NetApp/trident/issues/134
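The check in "Steps to Reproduce" can be scripted rather than read off the `oc describe` output by eye. The following is only a minimal sketch: the helper name `has_mountstats_flag` is hypothetical (it is not part of oc or node-exporter), and it assumes the daemonset manifest is piped in as YAML with one argument per line.

```shell
#!/bin/sh
# has_mountstats_flag (hypothetical helper): reads a daemonset manifest
# on stdin and exits 0 if the --collector.mountstats argument is present,
# non-zero otherwise. The trailing group keeps longer flag names that
# merely start with the same text (e.g. --collector.mountstats.foo)
# from matching.
has_mountstats_flag() {
    grep -Eq -e '--collector\.mountstats([[:space:]]|$)'
}
```

Assumed usage against a live cluster: `oc -n openshift-monitoring get daemonset node-exporter -o yaml | has_mountstats_flag && echo "mountstats collector still enabled"`. On a fixed cluster the function returns non-zero and nothing is printed.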
Created an NFS storage class and a PVC based on it, then checked that no metric name has the prefix node_mountstats_nfs_:

# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/label/__name__/values' | jq | grep node_mountstats_nfs
no result

# oc -n openshift-monitoring get ds node-exporter -oyaml | grep "\--collector"
        - --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)
        - --collector.cpu.info
        - --collector.textfile.directory=/var/node_exporter/textfile
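For repeated verification, the "no result" check above can be turned into a count that is easy to assert on. This is a rough sketch under stated assumptions: the function name `count_mountstats_metrics` is hypothetical, and it expects the JSON body returned by the Prometheus `/api/v1/label/__name__/values` endpoint on stdin. It uses plain grep rather than jq so that it has no extra dependencies.

```shell
#!/bin/sh
# count_mountstats_metrics (hypothetical helper): reads the JSON response
# of /api/v1/label/__name__/values on stdin and prints how many metric
# names start with node_mountstats_nfs_.
count_mountstats_metrics() {
    # grep -o emits one line per matching quoted name; "|| true" keeps
    # the pipeline from failing when there are no matches at all.
    { grep -o '"node_mountstats_nfs_[^"]*"' || true; } | wc -l | tr -d ' '
}
```

Assumed usage: append `| count_mountstats_metrics` to the curl command above in place of `| jq | grep node_mountstats_nfs`. A result of 0 corresponds to the "no result" outcome reported in this comment.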
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.31 bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2100