Bug 1992537

Summary: all the alert rules' annotations "summary" and "description" should comply with the OpenShift alerting guidelines
Product: OpenShift Container Platform
Reporter: hongyan li <hongyli>
Component: kube-controller-manager
Assignee: Filip Krepinsky <fkrepins>
Status: CLOSED DUPLICATE
QA Contact: zhou ying <yinzhou>
Severity: medium
Docs Contact:
Priority: low
Version: 4.9
CC: aos-bugs, mfojtik
Target Milestone: ---
Flags: mfojtik: needinfo?
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-11-16 20:01:35 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description hongyan li 2021-08-11 09:41:36 UTC
Description of problem:
All alert rules should carry "summary" and "description" annotations that comply with the OpenShift alerting guidelines. The rules shipped by this operator currently define only a "message" annotation.

Version-Release number of selected component (if applicable):
4.9.0-0.nightly-2021-08-07-175228

How reproducible:
always

Steps to Reproduce:
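1. List the PrometheusRule objects in the openshift-kube-controller-manager-operator namespace (command shown under Actual results).
2. Inspect the annotations of each alert rule.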

Actual results:
$ oc get prometheusrules -n openshift-kube-controller-manager-operator -oyaml
apiVersion: v1
items:
- apiVersion: monitoring.coreos.com/v1
  kind: PrometheusRule
  metadata:
    annotations:
      exclude.release.openshift.io/internal-openshift-hosted: "true"
      include.release.openshift.io/self-managed-high-availability: "true"
      include.release.openshift.io/single-node-developer: "true"
    creationTimestamp: "2021-08-10T23:12:09Z"
    generation: 1
    name: kube-controller-manager-operator
    namespace: openshift-kube-controller-manager-operator
    ownerReferences:
    - apiVersion: config.openshift.io/v1
      kind: ClusterVersion
      name: version
      uid: 9fc7b5b6-6c23-4335-be07-ecfe1b9a142f
    resourceVersion: "1982"
    uid: 72b3a19d-4267-45b0-b596-9dc83edba1f7
  spec:
    groups:
    - name: cluster-version
      rules:
      - alert: KubeControllerManagerDown
        annotations:
          message: KubeControllerManager has disappeared from Prometheus target discovery.
        expr: |
          absent(up{job="kube-controller-manager"} == 1)
        for: 15m
        labels:
          severity: critical
      - alert: PodDisruptionBudgetAtLimit
        annotations:
          message: The pod disruption budget is preventing further disruption to pods
            because it is at the minimum allowed level.
        expr: |
          max by(namespace, poddisruptionbudget) (kube_poddisruptionbudget_status_current_healthy == kube_poddisruptionbudget_status_desired_healthy)
        for: 60m
        labels:
          severity: warning
      - alert: PodDisruptionBudgetLimit
        annotations:
          message: The pod disruption budget is below the minimum number allowed pods.
        expr: |
          max by (namespace, poddisruptionbudget) (kube_poddisruptionbudget_status_current_healthy < kube_poddisruptionbudget_status_desired_healthy)
        for: 15m
        labels:
          severity: critical
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""


Expected results:
All alert rules have "summary" and "description" annotations.

Additional info:
The "summary" and "description" annotations should comply with the OpenShift alerting guidelines [1].

[1] https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md#documentation-required
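
For illustration, a rule that carries both annotations might look like the sketch below. The annotation wording is hypothetical and is not taken from any actual fix; the expr, for, and labels fields are copied unchanged from the KubeControllerManagerDown rule in the output above.

```yaml
# Hypothetical sketch: the summary/description wording is illustrative only.
- alert: KubeControllerManagerDown
  annotations:
    # Short, one-line statement of the problem.
    summary: Target disappeared from Prometheus target discovery.
    # Longer explanation; reuses the text of the original "message" annotation.
    description: KubeControllerManager has disappeared from Prometheus target discovery.
  expr: |
    absent(up{job="kube-controller-manager"} == 1)
  for: 15m
  labels:
    severity: critical
```

The same pattern would apply to PodDisruptionBudgetAtLimit and PodDisruptionBudgetLimit, replacing the single "message" annotation in each.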

Comment 1 Gabe Montero 2021-08-11 13:35:23 UTC
These rules come from the manifests at https://github.com/openshift/cluster-kube-controller-manager-operator/tree/master/manifests, which do not belong to the openshift-controller-manager component.

Moving the bug to the correct Bugzilla component.

Comment 2 Filip Krepinsky 2021-09-03 16:37:16 UTC
I’m adding UpcomingSprint, because I was occupied by fixing bugs with higher priority/severity, developing new features with higher priority, or developing new features to improve stability at a macro level. I will revisit this bug next sprint.

Comment 3 Michal Fojtik 2021-09-10 14:27:56 UTC
This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen into Whiteboard if you think this bug should never be marked as stale. Please consult with bug assignee before you do that.

Comment 4 Michal Fojtik 2021-10-10 20:58:34 UTC
(Automated LifecycleStale notice, identical to the text of Comment 3.)

Comment 5 hongyan li 2021-10-11 02:00:21 UTC
I just checked version 4.9.0-0.nightly-2021-10-08-232649 and the issue is still there: the alert rules' annotations still have no "summary" or "description".

Comment 6 Michal Fojtik 2021-11-14 15:35:16 UTC
(Automated LifecycleStale notice, identical to the text of Comment 3.)

Comment 7 Filip Krepinsky 2021-11-16 20:01:35 UTC
Thanks for reporting the issue. I posted a fix under https://bugzilla.redhat.com/show_bug.cgi?id=2010352, which is a superset of this issue, so I am closing this one.

*** This bug has been marked as a duplicate of bug 2010352 ***