Hello,

The OpenShift Monitoring Team has published a set of guidelines for writing alerting rules in OpenShift, including a basic style guide. You can find these here:

https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md
https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md#style-guide

A subset of these is now being enforced in OpenShift End-to-End tests [1], with temporary exceptions for existing non-compliant rules.

This component was found to have the following issues:

* Alerts without summary and/or description annotations:
  - ClusterProxyApplySlow
  - NodeProxyApplySlow
  - NodeProxyApplyStale
  - NodeWithoutSDNPod
  - SDNPodNotReady

Alerts MUST include summary and description annotations. Think of summary as the first line of a commit message, or an email subject line. It should be brief but informative. The description is the longer, more detailed explanation of the alert. The enhancement document linked above has examples of alerts with these annotations.

* Alerts found to not include a namespace label:
  - ClusterProxyApplySlow

Alerts SHOULD include a namespace label indicating the alert's source. This requirement originally comes from our SRE team, as they use the namespace label as the first means of routing alerts. Many alerts already include a namespace label as a result of the PromQL expressions used; others may require a static label.

Example of a change to PromQL to include a namespace label:
https://github.com/openshift/cluster-monitoring-operator/commit/52d1f05#diff-9024dcef0fd244c0267c46858da24fbd1f45633515fafae0f98781b20805ff1dL22-R22

Example of adding a static namespace label:
https://github.com/openshift/cluster-monitoring-operator/commit/52d1f05#diff-352702e71122d34a1be04c0588356cd8cb8a10df547f1c3c39fec18fa75b1593R304

If you have questions about how best to modify your alerting rules to include a namespace label, please reach out to the OpenShift Monitoring Team in the #forum-monitoring channel on Slack, or on our mailing list: team-monitoring

Thank you!

Repo: openshift/cluster-network-operator

[1]: https://github.com/openshift/origin/commit/097e7a6
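For anyone who wants to pre-check rules locally before the End-to-End tests run, the two requirements above can be approximated with a small script. This is only a rough sketch, not the logic used by the origin tests: the function name check_alert_rule, the dict layout (a rule already parsed from YAML), and the placeholder expression are all assumptions made for illustration.

```python
from typing import Any, Dict, List

# Annotations the guidelines say every alert MUST carry.
REQUIRED_ANNOTATIONS = ("summary", "description")


def check_alert_rule(rule: Dict[str, Any]) -> List[str]:
    """Return guideline violations for one parsed alerting rule (hypothetical helper)."""
    problems: List[str] = []
    annotations = rule.get("annotations") or {}
    labels = rule.get("labels") or {}

    for key in REQUIRED_ANNOTATIONS:
        value = annotations.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing {key} annotation")

    # Only a static label is visible here. A namespace label that comes from
    # the PromQL expression cannot be detected without evaluating the query.
    namespace = labels.get("namespace")
    if not isinstance(namespace, str) or not namespace.strip():
        problems.append("no static namespace label")

    return problems


# Illustrative rule: the alert name is from this report, but the expression
# and labels are placeholders, not the operator's real definition.
rule = {
    "alert": "ClusterProxyApplySlow",
    "expr": "vector(1)",
    "labels": {"severity": "warning"},
    "annotations": {},
}
print(check_alert_rule(rule))
```

A "no static namespace label" result is only a hint: if the expression already yields a namespace label, the rule satisfies the guideline without a static one.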
*** Bug 2035364 has been marked as a duplicate of this bug. ***
@memodi could you help verify this bug? Thanks.
This PR can help you verify: https://github.com/openshift/origin/pull/26687
@zzhao - sure, I'll work on it.
Verified on OCP nightly build 4.11.0-0.nightly-2022-03-27-140854, based on:

1. Visually inspected the changes in the OCP Web Console under Observe -> Alerting -> Alert rules and verified that the new summary and description are reflected.

2. Checked out the above PR and ran the dev tests for alerting rules:

memodi@memodi-mac:/Users/memodi/workspaces/repos/openshift/origin (remove_sdn_ovn_prom_rule) $ ./openshift-tests run all --dry-run | egrep "OpenShift alerting rules" | ./openshift-tests run -f -
openshift-tests version: v4.1.0-4663-g65dcdaa
started: (0/1/3) "[sig-instrumentation][Late] OpenShift alerting rules should have description and summary annotations [Skipped:Disconnected] [Suite:openshift/conformance/parallel]"
started: (0/2/3) "[sig-instrumentation][Late] OpenShift alerting rules should have a runbook_url annotation if the alert is critical [Skipped:Disconnected] [Suite:openshift/conformance/parallel]"
started: (0/3/3) "[sig-instrumentation][Late] OpenShift alerting rules should have a valid severity label [Skipped:Disconnected] [Suite:openshift/conformance/parallel]"
passed: (31.7s) 2022-03-28T18:33:09 "[sig-instrumentation][Late] OpenShift alerting rules should have description and summary annotations [Skipped:Disconnected] [Suite:openshift/conformance/parallel]"
passed: (32.2s) 2022-03-28T18:33:10 "[sig-instrumentation][Late] OpenShift alerting rules should have a valid severity label [Skipped:Disconnected] [Suite:openshift/conformance/parallel]"
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069