Hello,

The OpenShift Monitoring Team has published a set of guidelines for writing alerting rules in OpenShift, including a basic style guide. You can find these here:

https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md
https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md#style-guide

A subset of these is now being enforced in OpenShift End-to-End tests [1], with temporary exceptions for existing non-compliant rules.

This component was found to have the following issues:

* Alerts without summary and/or description annotations:
  - NetworkPodsCrashLooping
  - NoOvnMasterLeader
  - NoRunningOvnMaster
  - NodeWithoutOVNKubeNodePodRunning
  - NorthboundStale
  - SouthboundStale
  - V4SubnetAllocationThresholdExceeded
  - V6SubnetAllocationThresholdExceeded

Alerts MUST include summary and description annotations. Think of summary as the first line of a commit message, or an email subject line. It should be brief but informative. The description is the longer, more detailed explanation of the alert. The enhancement document linked above has examples of alerts with these annotations.

Thank you!

Repo: openshift/cluster-network-operator (ovn-kubernetes subcomponent)

[1]: https://github.com/openshift/origin/commit/097e7a6
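For anyone who wants to check rules locally before the End-to-End test does, the requirement above (every alert carries both a summary and a description annotation) can be sketched as a small Python check. This is only an illustration, not the actual origin test: the function name, the rule dictionaries, and the `vector(1)` expressions are hypothetical placeholders, and the rules are assumed to be already parsed into dictionaries shaped like Prometheus rule entries.

```python
from typing import Dict, List

# Annotations the style guide says every alert MUST include.
REQUIRED_ANNOTATIONS = ("summary", "description")


def alerts_missing_annotations(rules: List[dict]) -> Dict[str, List[str]]:
    """Map each alert name to the required annotations it lacks.

    Hypothetical helper: `rules` is assumed to be a list of parsed rule
    entries. Entries without an "alert" key (for example recording rules)
    are skipped.
    """
    missing: Dict[str, List[str]] = {}
    for rule in rules:
        name = rule.get("alert")
        if name is None:
            continue  # not an alerting rule
        annotations = rule.get("annotations") or {}
        absent = []
        for key in REQUIRED_ANNOTATIONS:
            value = annotations.get(key)
            # Treat absent, non-string, and blank values as missing.
            if not (isinstance(value, str) and value.strip()):
                absent.append(key)
        if absent:
            missing[name] = absent
    return missing


# Illustrative input only: the alert names come from this report, but the
# expressions and annotation texts are placeholders, not the real rules.
example_rules = [
    {"alert": "NoRunningOvnMaster", "expr": "vector(1)",
     "labels": {"severity": "warning"}},
    {"alert": "NorthboundStale", "expr": "vector(1)",
     "annotations": {"summary": "placeholder summary",
                     "description": "placeholder description"}},
    {"record": "example:recording_rule", "expr": "vector(1)"},
]

if __name__ == "__main__":
    for alert, keys in alerts_missing_annotations(example_rules).items():
        print(f"{alert}: missing {', '.join(keys)}")
```

Blank or whitespace-only annotation values are reported as missing here; that is a choice made for this sketch, not something the guidelines spell out.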
Do we need to rename existing metrics so that they are prefixed with the component name? I see that's a requirement in the style guide. Doing so could break client consumer scripts.
No, that's not a strict requirement. No need to change existing metric names in this case.
@mifiedle hey, I didn't see it in the change log either, but I launched the latest nightly (4.10.0-0.nightly-2022-01-21-074618) and could see the results of my PR in OpenShift console -> observability -> alerts -> alerts rules.
Verified on 4.10.0-0.nightly-2022-01-25-023600 via Alert Rule inspection in the console. Note: the bz description references NetworkPodsCrashLooping, but no rule with that name exists in the console, and none was changed in the PR for this bug.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056