4.9 had a guard that kept it from firing, so we were reducing the number of false positives, and the old bug summary made sense there. But 4.8 doesn't have that guard today, so in 4.8 we are reducing the number of false positives. I'm adjusting the summary to reflect that. It will still be the same alert logic that we brought to 4.9; the summary change just reflects the different starting point that each branch is moving from.
> 4.9 had a guard that kept it from firing, so we were reducing the number of false positives...

Hit send too soon. I meant we were reducing false negatives in 4.9.
Tested with the PR. The KubePodCrashLooping expr is changed to the rule below. I used the deployment file from https://bugzilla.redhat.com/show_bug.cgi?id=2006767#c1 and watched for a few minutes; the result of the expr is continuous, with no gaps (see the picture).

    - alert: KubePodCrashLooping
      annotations:
        description: 'Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container }}) is in waiting state (reason: "CrashLoopBackOff").'
        summary: Pod is crash looping.
      expr: |
        max_over_time(kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff", namespace=~"(openshift-.*|kube-.*|default|logging)",job="kube-state-metrics"}[5m]) >= 1
      for: 15m
      labels:
        severity: warning
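For anyone wondering why the max_over_time(...[5m]) wrapper matters for the "no gaps" observation, here is a rough Python sketch, not the Prometheus implementation. It assumes a hypothetical 30-second scrape interval and made-up sample values in which the waiting-reason metric briefly drops to 0 while the container restarts. The helper name and the left-open window handling are simplifications of my own. The point it illustrates is that a raw `>= 1` comparison would go empty at those dips (which resets the `for: 15m` timer), while the 5-minute maximum stays at 1 throughout.

```python
# Sketch only: simulates a 5-minute max_over_time window over made-up samples.
# The scrape interval, sample values, and helper name are assumptions.

SCRAPE_INTERVAL = 30   # seconds (assumed)
WINDOW = 300           # seconds, matches the [5m] range in the rule

# Timestamps 0..600s; the metric is 1 while the pod is in CrashLoopBackOff
# and drops to 0 for a scrape or two whenever the container restarts.
times = list(range(0, 601, SCRAPE_INTERVAL))
dips = {120, 150, 390}
samples = {t: (0 if t in dips else 1) for t in times}


def max_over_time(series, window, eval_times):
    """Return {t: max of samples in (t - window, t]} for each evaluation time.

    Evaluation times with no samples in the window are omitted, which is
    how an empty result would look to the alerting rule.
    """
    result = {}
    for t in eval_times:
        in_window = [v for ts, v in series.items() if t - window < ts <= t]
        if in_window:
            result[t] = max(in_window)
    return result


# Raw comparison: the expression only returns a result where the sample is >= 1.
raw_active = [t for t in times if samples[t] >= 1]

# Smoothed comparison: take the 5-minute maximum first, then compare.
smoothed = max_over_time(samples, WINDOW, times)
smoothed_active = [t for t, v in smoothed.items() if v >= 1]

raw_gaps = [t for t in times if t not in raw_active]
smoothed_gaps = [t for t in times if t not in smoothed_active]

print("gaps without max_over_time:", raw_gaps)
print("gaps with max_over_time:   ", smoothed_gaps)
```

With these made-up samples the raw series has gaps exactly at the restart dips, while the smoothed series has none, which is consistent with the continuous graph seen in testing.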
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.8.39 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:1427