Bug 2010359 - OpenShift Alerting Rules Style-Guide Compliance
Summary: OpenShift Alerting Rules Style-Guide Compliance
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: 4.10.0
Assignee: Michael McCune
QA Contact: Huali Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-10-04 13:56 UTC by Brad Ison
Modified: 2022-04-11 08:33 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:16:32 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-machine-approver pull 138 | 0 | None | open | Bug 2010359: add summary and description to alerts | 2021-11-02 19:46:00 UTC
Red Hat Product Errata | RHSA-2022:0056 | 0 | None | None | None | 2022-03-10 16:16:43 UTC

Description Brad Ison 2021-10-04 13:56:34 UTC
Hello,

The OpenShift Monitoring Team has published a set of guidelines for
writing alerting rules in OpenShift, including a basic style guide.
You can find these here:

  https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md
  https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md#style-guide

A subset of these are now being enforced in OpenShift End-to-End
tests [1], with temporary exceptions for existing non-compliant rules.

This component was found to have the following issues:

* Alerts without summary and/or description annotations:

  - MachineApproverMaxPendingCSRsReached

Alerts MUST include summary and description annotations.

Think of summary as the first line of a commit message, or an email
subject line. It should be brief but informative. The description is
the longer, more detailed explanation of the alert.

The enhancement document linked above has examples of alerts with
these annotations.
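
As a rough illustration of the requirement above, the following is a minimal sketch of a check for missing annotations. It is not the actual End-to-End test referenced in [1]; the function name `missing_annotations` and the constant `REQUIRED_ANNOTATIONS` are hypothetical, and the input is assumed to be a PrometheusRule object already parsed into a Python dict.

```python
# Hypothetical sketch, not the upstream e2e test: report alerts in a
# PrometheusRule (parsed into a dict) that lack required annotations.
REQUIRED_ANNOTATIONS = ("summary", "description")


def missing_annotations(prometheus_rule):
    """Return {alert name: [missing annotation names]} for non-compliant alerts."""
    problems = {}
    groups = prometheus_rule.get("spec", {}).get("groups", [])
    for group in groups:
        for rule in group.get("rules", []):
            name = rule.get("alert")
            if name is None:
                continue  # recording rules have no "alert" key; skip them
            annotations = rule.get("annotations") or {}
            # Treat an absent or blank annotation as missing.
            missing = [
                key
                for key in REQUIRED_ANNOTATIONS
                if not str(annotations.get(key, "")).strip()
            ]
            if missing:
                problems[name] = missing
    return problems


# Illustrative input: the alert named in this report, with no annotations.
non_compliant = {
    "spec": {
        "groups": [
            {
                "name": "cluster-machine-approver.rules",
                "rules": [
                    {
                        "alert": "MachineApproverMaxPendingCSRsReached",
                        "expr": "mapi_current_pending_csr > mapi_max_pending_csr",
                        "for": "5m",
                        "labels": {"severity": "warning"},
                    }
                ],
            }
        ]
    }
}

print(missing_annotations(non_compliant))
```

One way to feed this a live rule would be to load the JSON produced by `oc get prometheusrule <name> -n <namespace> -o json` with `json.load` and pass the resulting dict to the function.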

Thank you!

Repo: openshift/cluster-machine-approver

[1]: https://github.com/openshift/origin/commit/097e7a6

Comment 3 Huali Liu 2021-11-22 07:06:08 UTC
Set up a cluster using cluster-bot with https://github.com/openshift/cluster-machine-approver/pull/138.

Verified that the MachineApproverMaxPendingCSRsReached alert now has summary and description annotations.

liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.ci.test-2021-11-22-061614-ci-ln-w81ycxb-latest   True        False         81s     Cluster version is 4.10.0-0.ci.test-2021-11-22-061614-ci-ln-w81ycxb-latest
liuhuali@Lius-MacBook-Pro huali-test % oc get prometheusrule  machineapprover-rules -n openshift-cluster-machine-approver -o yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  annotations:
    exclude.release.openshift.io/internal-openshift-hosted: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
  creationTimestamp: "2021-11-22T06:24:00Z"
  generation: 1
  labels:
    prometheus: k8s
    role: alert-rules
  name: machineapprover-rules
  namespace: openshift-cluster-machine-approver
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: 822e2653-acf9-4662-9520-9012064ab66f
  resourceVersion: "1782"
  uid: eb426fcf-c5ff-47a1-8ec5-71a4ec4b2118
spec:
  groups:
  - name: cluster-machine-approver.rules
    rules:
    - alert: MachineApproverMaxPendingCSRsReached
      annotations:
        description: |
          The number of pending CertificateSigningRequests has exceeded the
          maximum threshold (current number of machine + 100). Check the
          pending CSRs to determine which machines need approval, also check
          that the nodelink controller is running in the openshift-machine-api
          namespace.
        summary: max pending CSRs threshold reached.
      expr: |
        mapi_current_pending_csr > mapi_max_pending_csr
      for: 5m
      labels:
        severity: warning
liuhuali@Lius-MacBook-Pro huali-test %

Comment 8 errata-xmlrpc 2022-03-10 16:16:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

