Bug 1992493
| Summary: | 3 alerts have no annotations summary and description | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | hongyan li <hongyli> |
| Component: | Monitoring | Assignee: | Philip Gough <pgough> |
| Status: | CLOSED ERRATA | QA Contact: | hongyan li <hongyli> |
| Severity: | low | Docs Contact: | |
| Priority: | medium | ||
| Version: | 4.9 | CC: | amuller, anpicker, aos-bugs, erooth, jfajersk, juzhao, pgough |
| Target Milestone: | --- | ||
| Target Release: | 4.9.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-10-18 17:45:51 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Checked with 4.9.0-0.nightly-2021-08-19-184748: the reported alerts now have "summary" and "description" annotations.
# oc get prometheusrules -n openshift-monitoring -oyaml |grep -E -A10 'ClusterMonitoringOperatorReconciliationErrors|AlertmanagerReceiversNotConfigured|MultipleContainersOOMKilled'
- alert: ClusterMonitoringOperatorReconciliationErrors
  annotations:
    description: Errors are occurring during reconciliation cycles. Inspect
      the cluster-monitoring-operator log for potential root causes.
    summary: Cluster Monitoring Operator is experiencing unexpected reconciliation
      errors.
  expr: max_over_time(cluster_monitoring_operator_last_reconciliation_successful[5m])
    == 0
  for: 1h
  labels:
    severity: warning
- alert: AlertmanagerReceiversNotConfigured
  annotations:
    description: Alerts are not configured to be sent to a notification system,
      meaning that you may not be notified in a timely fashion when important
      failures occur. Check the OpenShift documentation to learn how to configure
      notifications with Alertmanager.
    summary: Receivers (notification integrations) are not configured on Alertmanager
  expr: cluster:alertmanager_integrations:max == 0
  for: 10m
  labels:
    severity: warning
--
- alert: MultipleContainersOOMKilled
  annotations:
    description: Multiple containers were out of memory killed within the past
      15 minutes. There are many potential causes of OOM errors, however issues
      on a specific node or containers breaching their limits is common.
    summary: Containers are being killed due to OOM
  expr: sum(max by(namespace, container, pod) (increase(kube_pod_container_status_restarts_total[12m]))
    and max by(namespace, container, pod) (kube_pod_container_status_last_terminated_reason{reason="OOMKilled"})
    == 1) > 5
  for: 15m
  labels:
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:3759
Description of problem:
3 alerts have no "summary" and "description" annotations.

Version-Release number of selected component (if applicable):
4.9.0-0.nightly-2021-08-07-175228

How reproducible:
always

Steps to Reproduce:
$ oc get prometheusrules -n openshift-monitoring -oyaml |grep -E -A5 'ClusterMonitoringOperatorReconciliationErrors|AlertmanagerReceiversNotConfigured|MultipleContainersOOMKilled'
- alert: ClusterMonitoringOperatorReconciliationErrors
  annotations:
    message: Cluster Monitoring Operator is experiencing unexpected reconciliation errors. Inspect the cluster-monitoring-operator log for potential root causes.
  expr: max_over_time(cluster_monitoring_operator_last_reconciliation_successful[5m])
--
- alert: AlertmanagerReceiversNotConfigured
  annotations:
    message: Alerts are not configured to be sent to a notification system, meaning that you may not be notified in a timely fashion when important failures occur. Check the OpenShift documentation to learn how to configure notifications with Alertmanager.
--
- alert: MultipleContainersOOMKilled
  annotations:
    message: Multiple containers were out of memory killed within the past 15 minutes.
  expr: sum(max by(namespace, container, pod) (increase(kube_pod_container_status_restarts_total[12m])) and max by(namespace, container, pod) (kube_pod_container_status_last_terminated_reason{reason="OOMKilled"})

Actual results:

Expected results:

Additional info:
For all alerts shipped by CMO, the "summary" and "description" annotations should comply with the OpenShift alerting guidelines:
https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md#documentation-required
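
The grep above only surfaces the three named alerts. A rough way to check every alerting rule for the two annotations is sketched below in Python. It is not part of CMO or of this fix; it only assumes the standard PrometheusRule layout (`items[].spec.groups[].rules[]`) as returned by `oc get prometheusrules -n openshift-monitoring -o json`. The function name `alerts_missing_annotations`, the `SAMPLE` data, and the group and record names in it are hypothetical; the two sample alerts are abbreviated from the outputs in this report.

```python
import json

# Annotations expected on every alert, as requested in this bug report.
REQUIRED = ("summary", "description")


def alerts_missing_annotations(rule_list, required=REQUIRED):
    """Map alert name -> list of required annotations that are absent or empty.

    `rule_list` is the parsed JSON of a PrometheusRule list
    (items[].spec.groups[].rules[]). Recording rules are skipped.
    """
    missing = {}
    for item in rule_list.get("items", []):
        for group in item.get("spec", {}).get("groups", []):
            for rule in group.get("rules", []):
                name = rule.get("alert")
                if name is None:
                    # No "alert" key means this is a recording rule.
                    continue
                annotations = rule.get("annotations") or {}
                absent = [key for key in required if not annotations.get(key)]
                if absent:
                    missing[name] = absent
    return missing


# Hypothetical sample: one alert in its pre-fix shape ("message" only),
# one in its post-fix shape, and one recording rule.
SAMPLE = {
    "items": [
        {
            "spec": {
                "groups": [
                    {
                        "name": "example-group",  # hypothetical group name
                        "rules": [
                            {
                                "alert": "MultipleContainersOOMKilled",
                                "annotations": {
                                    "message": "Multiple containers were out of "
                                    "memory killed within the past 15 minutes."
                                },
                            },
                            {
                                "alert": "AlertmanagerReceiversNotConfigured",
                                "annotations": {
                                    "summary": "Receivers (notification integrations) "
                                    "are not configured on Alertmanager",
                                    "description": "Alerts are not configured to be "
                                    "sent to a notification system.",
                                },
                            },
                            # Hypothetical recording rule, ignored by the check.
                            {"record": "example:recording_rule", "expr": "vector(1)"},
                        ],
                    }
                ]
            }
        }
    ]
}

if __name__ == "__main__":
    print(json.dumps(alerts_missing_annotations(SAMPLE), indent=2))
```

To run the same check against a live cluster, one option is to replace `SAMPLE` with `json.load(sys.stdin)` (after `import sys`) and pipe the `oc get ... -o json` output into the script. An empty result would mean every alert carries both annotations.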