Bug 1992560 - all the alert rules' annotations "summary" and "description" should comply with the OpenShift alerting guidelines
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node Tuning Operator
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.9.0
Assignee: Jiří Mencák
QA Contact: Simon
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-08-11 10:16 UTC by hongyan li
Modified: 2021-10-18 17:46 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-18 17:45:51 UTC
Target Upstream Version:


Links
System                  ID                                               Private  Priority  Status  Summary  Last Updated
Github                  openshift cluster-node-tuning-operator pull 263  0        None      None    None     2021-08-18 13:08:27 UTC
Red Hat Product Errata  RHSA-2021:3759                                   0        None      None    None     2021-10-18 17:46:09 UTC

Description hongyan li 2021-08-11 10:16:22 UTC
Description of problem:
All alert rules should carry "summary" and "description" annotations that comply with the OpenShift alerting guidelines. The current rules only have a "message" annotation.

Version-Release number of selected component (if applicable):
4.9.0-0.nightly-2021-08-07-175228

How reproducible:
always

Steps to Reproduce:
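1. List the PrometheusRule objects in the openshift-cluster-node-tuning-operator namespace (command under "Actual results").
2. Inspect the annotations of each alert rule.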

Actual results:
$ oc get prometheusrules -n openshift-cluster-node-tuning-operator -oyaml
apiVersion: v1
items:
- apiVersion: monitoring.coreos.com/v1
  kind: PrometheusRule
  metadata:
    annotations:
      include.release.openshift.io/ibm-cloud-managed: "true"
      include.release.openshift.io/self-managed-high-availability: "true"
      include.release.openshift.io/single-node-developer: "true"
    creationTimestamp: "2021-08-10T23:12:00Z"
    generation: 1
    labels:
      role: alert-rules
    name: node-tuning-operator
    namespace: openshift-cluster-node-tuning-operator
    ownerReferences:
    - apiVersion: config.openshift.io/v1
      kind: ClusterVersion
      name: version
      uid: 9fc7b5b6-6c23-4335-be07-ecfe1b9a142f
    resourceVersion: "1692"
    uid: 7ed69e51-adfa-4bc0-9f2c-6ccf026e030a
  spec:
    groups:
    - name: node-tuning-operator.rules
      rules:
      - alert: NTOPodsNotReady
        annotations:
          message: Pod {{ $labels.pod }} on node {{ $labels.node }} is not ready.
        expr: |
          kube_pod_status_ready{namespace='openshift-cluster-node-tuning-operator', condition='true'} == 0
        for: 30m
        labels:
          severity: warning
      - alert: NTODegraded
        annotations:
          message: |
            The Node Tuning Operator is degraded. Review the "node-tuning" ClusterOperator object for further details.
        expr: nto_degraded_info == 1
        for: 2h
        labels:
          severity: warning
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""


Expected results:
Every alert rule has "summary" and "description" annotations.

Additional info:
The "summary" and "description" annotations should comply with the OpenShift alerting guidelines [1].

[1] https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md#documentation-required
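
The expected result can be checked mechanically. The following is a minimal Python sketch, not part of the operator or of any OpenShift tooling: the helper name `missing_annotations` and the `REQUIRED` tuple are hypothetical, and the input is assumed to be the parsed JSON form of the PrometheusRule list (for example from `oc get prometheusrules ... -o json`). It only tests that the two annotations are present, not that their wording follows the guidelines.

```python
# Annotations this bug expects on every alert rule.
# Assumption: only presence is checked, not the wording.
REQUIRED = ("summary", "description")


def missing_annotations(rule_list):
    """Map each alert name to the required annotations it lacks."""
    problems = {}
    for item in rule_list.get("items", []):
        for group in item.get("spec", {}).get("groups", []):
            for rule in group.get("rules", []):
                if "alert" not in rule:
                    continue  # recording rules are not alerts, skip them
                annotations = rule.get("annotations") or {}
                missing = [key for key in REQUIRED if key not in annotations]
                if missing:
                    problems[rule["alert"]] = missing
    return problems


# Trimmed copy of the rules shown under "Actual results".
SAMPLE = {
    "items": [{
        "kind": "PrometheusRule",
        "spec": {"groups": [{
            "name": "node-tuning-operator.rules",
            "rules": [
                {"alert": "NTOPodsNotReady",
                 "annotations": {
                     "message": "Pod {{ $labels.pod }} on node "
                                "{{ $labels.node }} is not ready."}},
                {"alert": "NTODegraded",
                 "annotations": {
                     "message": 'The Node Tuning Operator is degraded. '
                                'Review the "node-tuning" ClusterOperator '
                                'object for further details.\n'}},
            ],
        }]},
    }],
}

if __name__ == "__main__":
    for alert, missing in missing_annotations(SAMPLE).items():
        print(f"{alert}: missing {', '.join(missing)}")
```

Run against the rules from this report, the sketch flags both NTOPodsNotReady and NTODegraded, since each defines only "message".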

Comment 2 Simon 2021-08-23 13:02:08 UTC
$ oc get clusterversions.config.openshift.io 
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-08-22-070405   True        False         54m     Cluster version is 4.9.0-0.nightly-2021-08-22-070405

$ oc get prometheusrules -n openshift-cluster-node-tuning-operator -o yaml
apiVersion: v1
items:
- apiVersion: monitoring.coreos.com/v1
  kind: PrometheusRule
  metadata:
    annotations:
      include.release.openshift.io/ibm-cloud-managed: "true"
      include.release.openshift.io/self-managed-high-availability: "true"
      include.release.openshift.io/single-node-developer: "true"
    creationTimestamp: "2021-08-23T11:40:55Z"
    generation: 1
    labels:
      role: alert-rules
    name: node-tuning-operator
    namespace: openshift-cluster-node-tuning-operator
    ownerReferences:
    - apiVersion: config.openshift.io/v1
      kind: ClusterVersion
      name: version
      uid: 6b989aab-65d4-4c21-b305-f908bae291b8
    resourceVersion: "1560"
    uid: b36eff4b-64fb-4f8f-9a57-13fa526b7257
  spec:
    groups:
    - name: node-tuning-operator.rules
      rules:
      - alert: NTOPodsNotReady
        annotations:
          description: |
            Pod {{ $labels.pod }} is not ready.
            Review the "Event" objects in "openshift-cluster-node-tuning-operator" namespace for further details.
          summary: Pod {{ $labels.pod }} is not ready.
        expr: |
          kube_pod_status_ready{namespace='openshift-cluster-node-tuning-operator', condition='true'} == 0
        for: 30m
        labels:
          severity: warning
      - alert: NTODegraded
        annotations:
          description: The Node Tuning Operator is degraded. Review the "node-tuning"
            ClusterOperator object for further details.
          summary: The Node Tuning Operator is degraded.
        expr: nto_degraded_info == 1
        for: 2h
        labels:
          severity: warning
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
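
The verification above can also be scripted rather than read by eye. This is a minimal sketch under stated assumptions: `oc` is on the PATH and logged in to the cluster, the function names `fetch_rules`, `annotation_keys`, and `main` are hypothetical, and `-o json` is used in place of `-o yaml` so the Python standard library can parse the output.

```python
import json
import subprocess

NAMESPACE = "openshift-cluster-node-tuning-operator"


def fetch_rules(namespace=NAMESPACE):
    """Return the PrometheusRule list of a namespace as parsed JSON (needs `oc`)."""
    result = subprocess.run(
        ["oc", "get", "prometheusrules", "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def annotation_keys(rule_list):
    """Map each alert name to the sorted list of annotation keys it defines."""
    keys = {}
    for item in rule_list.get("items", []):
        for group in item.get("spec", {}).get("groups", []):
            for rule in group.get("rules", []):
                if "alert" in rule:
                    keys[rule["alert"]] = sorted(rule.get("annotations") or {})
    return keys


def main():
    """Print the annotation keys of every alert in the namespace."""
    for alert, keys in annotation_keys(fetch_rules()).items():
        print(alert, keys)
```

Calling `main()` against a cluster with the rules shown in this comment should list `description` and `summary` for both NTOPodsNotReady and NTODegraded, with no remaining `message` key.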

Comment 5 errata-xmlrpc 2021-10-18 17:45:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

