Bug 2030698
Summary: | KubePodCrashLooping may fire when pod is not in CrashLoopBackOff | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Philip Gough <pgough> |
Component: | Monitoring | Assignee: | Arunprasad Rajkumar <arajkuma> |
Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao> |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | 4.8 | CC: | amuller, anpicker, aos-bugs, erooth, juzhao, shishika, wking |
Target Milestone: | --- | ||
Target Release: | 4.8.z | ||
Hardware: | All | ||
OS: | All | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | 2013617 | Environment: | |
Last Closed: | 2022-04-27 11:46:15 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 2013617 | ||
Bug Blocks: |
Comment 1
W. Trevor King
2021-12-10 00:08:52 UTC
> 4.9 had a guard that kept it from firing, so we were reducing the number of false positives...
Hit send too soon, I meant we were reducing false negatives in 4.9.
Tested with the PR. The KubePodCrashLooping expr is changed to the one below. I used the deployment file from https://bugzilla.redhat.com/show_bug.cgi?id=2006767#c1 and watched for a few minutes; the result for the expr is continuous, with no gaps (see the picture).

- alert: KubePodCrashLooping
  annotations:
    description: 'Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container }}) is in waiting state (reason: "CrashLoopBackOff").'
    summary: Pod is crash looping.
  expr: |
    max_over_time(kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff", namespace=~"(openshift-.*|kube-.*|default|logging)",job="kube-state-metrics"}[5m]) >= 1
  for: 15m
  labels:
    severity: warning

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.39 bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1427
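
For reference, the "continuous, no gaps" observation about the KubePodCrashLooping expr quoted above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how Prometheus is implemented: it assumes one sample per minute, treats the `[5m]` range as the five most recent samples, and models `for: 15m` as 15 consecutive true evaluations after the alert first goes pending. The crash pattern (three minutes in CrashLoopBackOff, then one minute out of it) and the helper names are hypothetical.

```python
from typing import List, Optional


def max_over_time(samples: List[int], window: int) -> List[int]:
    """Simplified max_over_time: max of the trailing `window` samples at each step."""
    return [max(samples[max(0, i - window + 1): i + 1]) for i in range(len(samples))]


def first_firing_step(condition: List[bool], for_steps: int) -> Optional[int]:
    """Return the first step at which the condition has held for `for_steps`
    steps since it went pending, or None if that never happens.
    A single false evaluation resets the pending state."""
    pending_since: Optional[int] = None
    for i, ok in enumerate(condition):
        if not ok:
            pending_since = None
            continue
        if pending_since is None:
            pending_since = i
        if i - pending_since >= for_steps:
            return i
    return None


# Hypothetical gauge values for kube_pod_container_status_waiting_reason
# {reason="CrashLoopBackOff"}: 1 while waiting in back-off, 0 while the
# container is briefly running again. One sample per minute, 40 minutes.
raw = [1, 1, 1, 0] * 10

# Smooth the gauge with a 5-sample window, mirroring the [5m] range in the expr.
smoothed = max_over_time(raw, window=5)

# Instantaneous gauge: the condition is interrupted every fourth minute,
# so the 15-step hold time is never satisfied.
print(first_firing_step([v >= 1 for v in raw], for_steps=15))       # None

# Smoothed gauge: every window contains at least one 1, so the condition
# holds continuously and the hold time is reached.
print(first_firing_step([v >= 1 for v in smoothed], for_steps=15))  # 15
```

In this model the brief drops to 0 between restarts keep resetting the pending state when the raw gauge is compared directly, while the `max_over_time` form keeps the condition true as long as CrashLoopBackOff was seen at least once in each window.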