Description of problem:
Our cluster keeps firing the KubePersistentVolumeFullInFourDays alert repeatedly. These are false positives caused by the alert's sensitivity. The behaviour is similar to the one reported in
https://github.com/kubernetes-monitoring/kubernetes-mixin/issues/262

It seems to be fixed upstream and in OCP 4.x by a modification to the alert definition ("for" clause):
https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/288

  - alert: KubePersistentVolumeFullInFourDays
    annotations:
      message: Based on recent sampling, the persistent volume claimed by {{ $labels.persistentvolumeclaim }} in namespace {{ $labels.namespace }} is expected to fill up within four days. Currently {{ $value }} bytes are available.
    expr: |
      kubelet_volume_stats_available_bytes{namespace=~"(openshift-.*|kube-.*|default|logging)",job="kubelet"}
      and
      predict_linear(kubelet_volume_stats_available_bytes{namespace=~"(openshift-.*|kube-.*|default|logging)",job="kubelet"}[6h], 4 * 24 * 3600) < 0
    for: 5m
    labels:
      severity: critical

OCP 3.11 - https://github.com/openshift/cluster-monitoring-operator/blob/release-3.11/assets/prometheus-k8s/rules.yaml
OCP 4.3 - https://github.com/openshift/cluster-monitoring-operator/blob/release-4.3/assets/prometheus-k8s/rules.yaml

Is it possible to backport the fix from 4.x to 3.11?

Version-Release number of selected component (if applicable):
3.11.x

How reproducible:
Monitor the storage of an application that behaves as described in
https://github.com/kubernetes-monitoring/kubernetes-mixin/issues/262

Actual results:
KubePersistentVolumeFullInFourDays fires and then resolves on its own because of the alert's sensitivity.

Expected results:
KubePersistentVolumeFullInFourDays should not fire on short storage spikes, since it is a long-term alert.

Additional info:
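To illustrate why a short usage burst can satisfy the predict_linear(...[6h], 4 * 24 * 3600) < 0 condition, here is a minimal Python sketch. It is not Prometheus code: the predict_linear function below is a plain least-squares extrapolation that only approximates the PromQL function of the same name, and the 60-second scrape interval, the GiB figures, and the make_series helper are all hypothetical values chosen for the illustration.

```python
def predict_linear(samples, horizon_s):
    """Fit a least-squares line to (timestamp_s, value) samples and
    extrapolate horizon_s seconds past the last sample.
    Approximation of PromQL predict_linear, for illustration only."""
    t_end = samples[-1][0]
    xs = [t - t_end for t, _ in samples]   # time relative to the last sample
    ys = [v for _, v in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x    # fitted value at the last sample
    return intercept + slope * horizon_s


WINDOW_S = 6 * 3600        # the [6h] range from the alert expression
HORIZON_S = 4 * 24 * 3600  # four days, as in the alert expression
STEP_S = 60                # hypothetical scrape interval


def make_series(steady_gib, burst_gib, burst_s):
    """Available space over the window: flat at steady_gib, then a linear
    drop of burst_gib spread over the last burst_s seconds (hypothetical)."""
    series = []
    for t in range(-WINDOW_S, 1, STEP_S):
        used_by_burst = burst_gib * max(0, t + burst_s) / burst_s if burst_s else 0.0
        series.append((t, steady_gib - used_by_burst))
    return series


flat = make_series(30.0, 0.0, 0)        # no activity at all
spike = make_series(30.0, 10.0, 1800)   # 10 GiB written in the last 30 minutes

flat_prediction = predict_linear(flat, HORIZON_S)
spike_prediction = predict_linear(spike, HORIZON_S)

print(f"flat:  {flat_prediction:.1f} GiB predicted in four days")
print(f"spike: {spike_prediction:.1f} GiB predicted in four days")
print("alert condition met after the spike:", spike_prediction < 0)
```

Under these assumptions, a single 30-minute burst is enough to push the four-day extrapolation below zero even though the volume still has 20 GiB free, which matches the firing pattern described above.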
Yes, it's possible to backport, but I can't promise it will be ASAP. Assigning to Serg.
*** Bug 1810838 has been marked as a duplicate of this bug. ***
This alert was improved in 4.1 [1] and we have no plans to backport it.

[1]: https://github.com/openshift/cluster-monitoring-operator/blob/release-4.1/assets/prometheus-k8s/rules.yaml#L777-L789
Hello,

Is there an option to plan a backport of this alert to 3.11?
Targeting this to 4.2.x; we can then create additional bugs for backporting to the other versions.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2477