Description of problem: all the alert rules' annotations "summary" and "description" should comply with the OpenShift alerting guidelines Version-Release number of selected component (if applicable): 4.9.0-0.nightly-2021-08-07-175228 How reproducible: always Steps to Reproduce: 1. 2. 3. Actual results: $ oc get prometheusrules -n openshift-kube-scheduler-operator -oyaml apiVersion: v1 items: - apiVersion: monitoring.coreos.com/v1 kind: PrometheusRule metadata: annotations: exclude.release.openshift.io/internal-openshift-hosted: "true" include.release.openshift.io/self-managed-high-availability: "true" include.release.openshift.io/single-node-developer: "true" creationTimestamp: "2021-08-10T23:12:04Z" generation: 1 name: kube-scheduler-operator namespace: openshift-kube-scheduler-operator ownerReferences: - apiVersion: config.openshift.io/v1 kind: ClusterVersion name: version uid: 9fc7b5b6-6c23-4335-be07-ecfe1b9a142f resourceVersion: "1798" uid: 5b4fb182-09ca-4606-98d2-cd2db004e218 spec: groups: - name: cluster-version rules: - alert: KubeSchedulerDown annotations: message: KubeScheduler has disappeared from Prometheus target discovery. expr: | absent(up{job="scheduler"} == 1) for: 15m labels: severity: critical - name: scheduler-legacy-policy-deprecated rules: - alert: SchedulerLegacyPolicySet annotations: message: The scheduler is currently configured to use a legacy scheduler policy API. Use of the policy API is deprecated and removed in 4.10. expr: | cluster_legacy_scheduler_policy > 0 for: 60m labels: severity: warning kind: List metadata: resourceVersion: "" selfLink: "" Expected results: alert rules have annotations "summary" and "description" Additional info: the "summary" and "description" annotations comply with the OpenShift alerting guidelines [1] [1] https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md#documentation-required
Ross sync with Mike about those changes, he knows the code in https://github.com/openshift/cluster-kube-scheduler-operator/ While at it also check if the alerts in https://github.com/openshift/cluster-kube-controller-manager-operator/ are following these rules.
This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen into Whiteboard if you think this bug should never be marked as stale. Please consult with bug assignee before you do that.
just checked, the the alert rules' annotations "summary" and "description" still not comply with the OpenShift alerting guidelines, it should be like this ``` - alert: KubeAPIDown annotations: summary: Target disappeared from Prometheus target discovery. description: KubeAPI has disappeared from Prometheus target discovery. runbook_url: https://github.com/openshift/runbooks/blob/master/alerts/cluster-monitoring-operator/KubeAPIDown.md expr: ```
KCM addressed as well in https://bugzilla.redhat.com/show_bug.cgi?id=2010352
Only critical fixes as backported to 4.9.