+++ This bug was initially created as a clone of Bug #1948701 +++

Description of problem:
CVO already is responsible for alerting on whether its operands are unhealthy. No need for CCO to have its own alert.

Version-Release number of selected component (if applicable):
4.7

How reproducible:
100%

Steps to Reproduce:
1. Put CCO into an unhealthy state.

Actual results:
Witness CVO and CCO alerts reporting the same information.

Expected results:
Only need a single alert.

Additional info:
A PR to fix this is open and under review: https://github.com/openshift/cloud-credential-operator/pull/324
*** Bug 1957424 has been marked as a duplicate of this bug. ***
Verified on 4.7.0-0.nightly-2021-05-07-004616

1. Log in to the Prometheus console and check that the CloudCredentialOperatorDown alert has been removed from the CloudCredentialOperator rules.
2. Create an invalid CredentialsRequest and check that, when CCO is down, CVO fires the alerts.

$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://alertmanager-main.openshift-monitoring.svc:9094/api/v1/alerts' | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  5740    0  5740    0     0   400k      0 --:--:-- --:--:-- --:--:--  400k
{
  "status": "success",
  "data": [
    {
      "labels": {
        "alertname": "CloudCredentialOperatorTargetNamespaceMissing",
        "condition": "MissingTargetNamespace",
        "container": "kube-rbac-proxy",
        "endpoint": "metrics",
        "instance": "10.129.0.69:8443",
        "job": "cco-metrics",
        "namespace": "openshift-cloud-credential-operator",
        "pod": "cloud-credential-operator-7fd7b8c7d5-8t5fv",
        "prometheus": "openshift-monitoring/k8s",
        "service": "cco-metrics",
        "severity": "warning"
      },
      "annotations": {
        "message": "CredentialsRequest(s) pointing to non-existent namespace"
      },
      "startsAt": "2021-05-07T04:33:42.851Z",
      "endsAt": "2021-05-07T05:02:12.851Z",
      "generatorURL": "https://prometheus-k8s-openshift-monitoring.apps.lwan47bug.qe.devcluster.openshift.com/graph?g0.expr=cco_credentials_requests_conditions%7Bcondition%3D%22MissingTargetNamespace%22%7D+%3E+0&g0.tab=1",
      "status": {
        "state": "active",
        "silencedBy": [],
        "inhibitedBy": []
      },
      "receivers": [
        "Default"
      ],
      "fingerprint": "06b742835ceb6c49"
    },
    {
      "labels": {
        "alertname": "ClusterOperatorDown",
        "endpoint": "metrics",
        "instance": "10.0.162.255:9099",
        "job": "cluster-version-operator",
        "name": "cloud-credential",
        "namespace": "openshift-cluster-version",
        "pod": "cluster-version-operator-84676c6b47-hp54f",
        "prometheus": "openshift-monitoring/k8s",
        "service": "cluster-version-operator",
        "severity": "critical",
        "version": "4.7.0-0.nightly-2021-05-07-004616"
      },
      "annotations": {
        "message": "Cluster operator cloud-credential has not been available for 10 minutes. Operator may be down or disabled, cluster will not be kept up to date and upgrades will not be possible."
      },
      "startsAt": "2021-05-07T04:36:59.213Z",
      "endsAt": "2021-05-07T05:01:59.213Z",
      "generatorURL": "https://prometheus-k8s-openshift-monitoring.apps.lwan47bug.qe.devcluster.openshift.com/graph?g0.expr=cluster_operator_up%7Bjob%3D%22cluster-version-operator%22%7D+%3D%3D+0&g0.tab=1",
      "status": {
        "state": "active",
        "silencedBy": [],
        "inhibitedBy": []
      },
      "receivers": [
        "Critical"
      ],
      "fingerprint": "bc22e2964c0ab173"
    },
    {
      "labels": {
        "alertname": "ClusterOperatorDegraded",
        "condition": "Degraded",
        "endpoint": "metrics",
        "instance": "10.0.162.255:9099",
        "job": "cluster-version-operator",
        "name": "cloud-credential",
        "namespace": "openshift-cluster-version",
        "pod": "cluster-version-operator-84676c6b47-hp54f",
        "prometheus": "openshift-monitoring/k8s",
        "reason": "CredentialsFailing",
        "service": "cluster-version-operator",
        "severity": "critical"
      },
      "annotations": {
        "message": "Cluster operator cloud-credential has been degraded for 10 minutes. Operator is degraded because CredentialsFailing and cluster upgrades will be unstable."
      },
      "startsAt": "2021-05-07T04:36:59.213Z",
      "endsAt": "2021-05-07T05:01:59.213Z",
      "generatorURL": "https://prometheus-k8s-openshift-monitoring.apps.lwan47bug.qe.devcluster.openshift.com/graph?g0.expr=cluster_operator_conditions%7Bcondition%3D%22Degraded%22%2Cjob%3D%22cluster-version-operator%22%7D+%3D%3D+1&g0.tab=1",
      "status": {
        "state": "active",
        "silencedBy": [],
        "inhibitedBy": []
      },
      "receivers": [
        "Critical"
      ],
      "fingerprint": "d0b00c0a6b1e0e75"
    },
}
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.7.11 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:1550
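For anyone re-running the verification from the earlier comment, the manual check of the Alertmanager output could be scripted roughly as follows. This is only a sketch, not part of the fix: the function name summarize_alerts and the abbreviated SAMPLE payload are hypothetical, and it assumes the response keeps the {"status", "data": [...]} shape shown in the pasted output. It reports whether the removed CloudCredentialOperatorDown alert is absent and which CVO alerts are firing for the cloud-credential operator.

```python
import json

# Alert that the fix is expected to remove from the CCO rules.
REMOVED_ALERT = "CloudCredentialOperatorDown"
# CVO alerts expected to cover an unhealthy operand instead.
CVO_ALERTS = {"ClusterOperatorDown", "ClusterOperatorDegraded"}

# Abbreviated, hand-trimmed payload based on the output pasted in this bug
# (only the labels needed for the check are kept).
SAMPLE = {
    "status": "success",
    "data": [
        {"labels": {"alertname": "CloudCredentialOperatorTargetNamespaceMissing",
                    "job": "cco-metrics"}},
        {"labels": {"alertname": "ClusterOperatorDown",
                    "job": "cluster-version-operator",
                    "name": "cloud-credential"}},
        {"labels": {"alertname": "ClusterOperatorDegraded",
                    "job": "cluster-version-operator",
                    "name": "cloud-credential"}},
    ],
}


def summarize_alerts(payload, operator="cloud-credential"):
    """Summarize an Alertmanager alerts response for the given operator.

    Hypothetical helper: assumes each alert carries a "labels" dict with
    an "alertname" key, as in the output pasted above.
    """
    alerts = payload.get("data", [])
    names = {alert["labels"]["alertname"] for alert in alerts}
    cvo_firing = {
        alert["labels"]["alertname"]
        for alert in alerts
        if alert["labels"].get("job") == "cluster-version-operator"
        and alert["labels"].get("name") == operator
        and alert["labels"]["alertname"] in CVO_ALERTS
    }
    return {
        "removed_alert_absent": REMOVED_ALERT not in names,
        "cvo_alerts_firing": sorted(cvo_firing),
    }


if __name__ == "__main__":
    print(json.dumps(summarize_alerts(SAMPLE), indent=2))
```

To use it against a live cluster, the JSON returned by the curl command in the verification comment would be loaded with json.loads and passed in place of SAMPLE.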