Hello,

The OpenShift Monitoring Team has published a set of guidelines for writing alerting rules in OpenShift, including a basic style guide. You can find these here:

https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md
https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md#style-guide

A subset of these are now being enforced in OpenShift End-to-End tests [1], with temporary exceptions for existing non-compliant rules. This component was found to have the following issues:

* Alerts without summary and/or description annotations:
  - KubeControllerManagerDown
  - PodDisruptionBudgetAtLimit
  - PodDisruptionBudgetLimit

Alerts MUST include summary and description annotations. Think of summary as the first line of a commit message, or an email subject line. It should be brief but informative. The description is the longer, more detailed explanation of the alert. The enhancement document linked above has examples of alerts with these annotations.

* Alerts found to not include a namespace label:
  - KubeControllerManagerDown

Alerts SHOULD include a namespace label indicating the alert's source. This requirement originally comes from our SRE team, as they use the namespace label as the first means of routing alerts. Many alerts already include a namespace label as a result of the PromQL expressions used; others may require a static label.

Example of a change to PromQL to include a namespace label:
https://github.com/openshift/cluster-monitoring-operator/commit/52d1f05#diff-9024dcef0fd244c0267c46858da24fbd1f45633515fafae0f98781b20805ff1dL22-R22

Example of adding a static namespace label:
https://github.com/openshift/cluster-monitoring-operator/commit/52d1f05#diff-352702e71122d34a1be04c0588356cd8cb8a10df547f1c3c39fec18fa75b1593R304

If you have questions about how best to modify your alerting rules to include a namespace label, please reach out to the OpenShift Monitoring Team in the #forum-monitoring channel on Slack, or on our mailing list: team-monitoring

Thank you!

Repo: openshift/cluster-kube-controller-manager-operator

[1]: https://github.com/openshift/origin/commit/097e7a6
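For anyone who wants to check rules locally before opening a PR, the two requirements above can be approximated with a small script. This is only a rough sketch and not the End-to-End test referenced in [1]: the function name check_alert_rule and the example rule ExampleDown are hypothetical, the rule is assumed to be already loaded into a plain dict, and the namespace check is a crude text heuristic (a static namespace label, or the word "namespace" appearing in the PromQL expression).

```python
from typing import Any, Dict, List

# Annotations the guidelines say every alert MUST carry.
REQUIRED_ANNOTATIONS = ("summary", "description")


def check_alert_rule(rule: Dict[str, Any]) -> List[str]:
    """Return a list of guideline problems for one alerting rule (hypothetical helper)."""
    problems: List[str] = []

    # MUST: non-empty summary and description annotations.
    annotations = rule.get("annotations") or {}
    for key in REQUIRED_ANNOTATIONS:
        value = annotations.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing {key} annotation")

    # SHOULD: a namespace label, either static or produced by the expression.
    # Assumption: searching the expression text for "namespace" is only a
    # heuristic; it does not evaluate PromQL.
    labels = rule.get("labels") or {}
    static_ns = labels.get("namespace")
    has_static_label = isinstance(static_ns, str) and bool(static_ns.strip())
    expr_mentions_ns = "namespace" in str(rule.get("expr", ""))
    if not (has_static_label or expr_mentions_ns):
        problems.append("no namespace label found")

    return problems


if __name__ == "__main__":
    # Hypothetical rule with neither annotations nor a namespace label.
    example = {
        "name": "ExampleDown",
        "expr": 'absent(up{job="example"} == 1)',
        "labels": {"severity": "critical"},
    }
    for problem in check_alert_rule(example):
        print(f"{example['name']}: {problem}")
```

A rule that aggregates by namespace in its expression, or that sets a static namespace label, passes the second check; a rule reported clean by this sketch may still need review against the full style guide.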
PR is up. Thanks for the explanations.
*** Bug 1992537 has been marked as a duplicate of this bug. ***
Confirmed with the latest OCP build; the issue is fixed:

[root@localhost ~]# oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-11-22-195410   True        False         130m    Cluster version is 4.10.0-0.nightly-2021-11-22-195410

name: PodDisruptionBudgetAtLimit
expr: max by(namespace, poddisruptionbudget) (kube_poddisruptionbudget_status_current_healthy == kube_poddisruptionbudget_status_desired_healthy)
for: 1h
labels:
  severity: warning
annotations:
  description: The pod disruption budget is at minimum disruptions allowed level. The number of current healthy pods is equal to desired healthy pods.
  summary: The pod disruption budget is preventing further disruption to pods.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056