Hello,

The OpenShift Monitoring Team has published a set of guidelines for writing alerting rules in OpenShift, including a basic style guide. You can find these here:

https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md
https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md#style-guide

A subset of these is now being enforced in OpenShift End-to-End tests [1], with temporary exceptions for existing non-compliant rules.

This component was found to have the following issues:

* Alerts without summary and/or description annotations:
  - CloudCredentialOperatorDeprovisioningFailed
  - CredentialOperatorDeprovisioningFailed
  - CloudCredentialOperatorInsufficientCloudCreds
  - CloudCredentialOperatorProvisioningFailed
  - CloudCredentialOperatorTargetNamespaceMissing

Alerts MUST include summary and description annotations. Think of summary as the first line of a commit message, or an email subject line. It should be brief but informative. The description is the longer, more detailed explanation of the alert. The enhancement document linked above has examples of alerts with these annotations.

Thank you!

Repo: openshift/cloud-credential-operator

[1]: https://github.com/openshift/origin/commit/097e7a6
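As a rough illustration of the requirement above, the following Python sketch checks a single rule for the two annotations. It is not the End-to-End test referenced in [1]; the helper name `missing_annotations` and the dict layout (one alerting rule with `alert` and `annotations` keys) are assumptions made for illustration only.

```python
REQUIRED_ANNOTATIONS = ("summary", "description")


def missing_annotations(rule):
    """Return the required annotation names that are absent or empty.

    `rule` is assumed to be a dict shaped like one alerting rule,
    e.g. {"alert": "...", "annotations": {...}}. Hypothetical helper.
    """
    annotations = rule.get("annotations") or {}
    missing = []
    for name in REQUIRED_ANNOTATIONS:
        value = annotations.get(name)
        # Treat non-string or whitespace-only values as missing.
        if not isinstance(value, str) or not value.strip():
            missing.append(name)
    return missing


# A rule that carries only a `message` annotation does not satisfy the guideline.
message_only_rule = {
    "alert": "CloudCredentialOperatorProvisioningFailed",
    "annotations": {"message": "CredentialsRequest(s) unable to be fulfilled"},
}
print(missing_annotations(message_only_rule))  # ['summary', 'description']
```

An empty list means the rule has both annotations; anything else names what still needs to be added.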
Looks like a new alert that we missed before also needs the annotations:

  - CloudCredentialOperatorStaleCredentials
Verified with 4.10.0-0.nightly-2021-10-09-022511

1. Log in to the Prometheus web page with an OpenShift account. All CCO alerts have description and summary annotations.

https://prometheus-k8s-openshift-monitoring.apps.jshu-1009-test3.qe.devcluster.openshift.com/alerts

/etc/prometheus/rules/prometheus-k8s-rulefiles-0/openshift-cloud-credential-operator-cloud-credential-operator-alerts.yaml > CloudCredentialOperator

CloudCredentialOperatorTargetNamespaceMissing (0 active)
name: CloudCredentialOperatorTargetNamespaceMissing
expr: cco_credentials_requests_conditions{condition="MissingTargetNamespace"} > 0
for: 5m
labels:
  severity: warning
annotations:
  description: At least one CredentialsRequest custom resource has specified in its .spec.secretRef.namespace field a namespace which does not presently exist. This means the Cloud Credential Operator in the openshift-cloud-credential-operator namespace cannot process the CredentialsRequest resource. Check the conditions of all CredentialsRequests with 'oc get credentialsrequest -A' to find any CredentialsRequest(s) with a .status.condition showing a condition type of MissingTargetNamespace set to True.
  message: CredentialsRequest(s) pointing to non-existent namespace
  summary: One ore more CredentialsRequest CRs are asking to save credentials to a non-existent namespace.

CloudCredentialOperatorProvisioningFailed (0 active)
name: CloudCredentialOperatorProvisioningFailed
expr: cco_credentials_requests_conditions{condition="CredentialsProvisionFailure"} > 0
for: 5m
labels:
  severity: warning
annotations:
  description: While processing a CredentialsRequest, the Cloud Credential Operator encountered an issue. Check the conditions of all CredentialsRequets with 'oc get credentialsrequest -A' to find any CredentialsRequest(s) with a .stats.condition showing a condition type of CredentialsProvisionFailure set to True for more details on the issue.
  message: CredentialsRequest(s) unable to be fulfilled
  summary: One or more CredentialsRequest CRs are unable to be processed.

CloudCredentialOperatorDeprovisioningFailed (0 active)
name: CloudCredentialOperatorDeprovisioningFailed
expr: cco_credentials_requests_conditions{condition="CredentialsDeprovisionFailure"} > 0
for: 5m
labels:
  severity: warning
annotations:
  description: While processing a CredentialsRequest marked for deletion, the Cloud Credential Operator encountered an issue. Check the conditions of all CredentialsRequests with 'oc get credentialsrequest -A' to find any CredentialsRequest(s) with a .status.condition showing a condition type of CredentialsDeprovisionFailure set to True for more details on the issue.
  message: CredentialsRequest(s) unable to be cleaned up
  summary: One or more CredentialsRequest CRs are unable to be deleted.

CloudCredentialOperatorInsufficientCloudCreds (0 active)
name: CloudCredentialOperatorInsufficientCloudCreds
expr: cco_credentials_requests_conditions{condition="InsufficientCloudCreds"} > 0
for: 5m
labels:
  severity: warning
annotations:
  description: The Cloud Credential Operator has determined that there are insufficient permissions to process one or more CredentialsRequest CRs. Check the conditions of all CredentialsRequests with 'oc get credentialsrequest -A' to find any CredentialsRequest(s) with a .status.condition showing a condition type of InsufficientCloudCreds set to True for more details.
  message: Cluster's cloud credentials insufficient for minting or passthrough
  summary: Problem with the available platform credentials.

CloudCredentialOperatorStaleCredentials (0 active)
name: CloudCredentialOperatorStaleCredentials
expr: cco_credentials_requests_conditions{condition="StaleCredentials"} > 0
for: 5m
labels:
  severity: warning
annotations:
  description: The Cloud Credential Operator (CCO) has detected one or more stale CredentialsRequest CRs that need to be manually deleted. When the CCO is in Manual credentials mode, it will not automatially clean up stale CredentialsRequest CRs (that may no longer be necessary in the present version of OpenShift because it could involve needing to clean up manually created cloud resources. Check the conditions of all CredentialsRequests with 'oc get credentialsrequest -A' to find any CredentialsRequest(s) with a .status.condition showing a condition type of StaleCredentials set to True. Determine the appropriate steps to clean up/deprovision any previously provisioned cloud resources. Finally, delete the CredentialsRequest with an 'oc delete'.
  message: 1 or more credentials requests are stale and should be deleted. Check the status.conditions on CredentialsRequest CRs to identify the stale one(s).
  summary: One or more CredentialsRequest CRs are stale and should be deleted.

2. Create a CredentialsRequest whose target namespace does not exist; the alert CloudCredentialOperatorTargetNamespaceMissing is then generated.

apiVersion: cloudcredential.openshift.io/v1
kind: CredentialsRequest
metadata:
  name: my-cred-request
  namespace: openshift-cloud-credential-operator
spec:
  secretRef:
    name: my-cred-request-secret
    namespace: namespace-does-not-exist
  providerSpec:
    apiVersion: cloudcredential.openshift.io/v1
    kind: AWSProviderSpec
    statementEntries:
    - effect: Allow
      action:
      - s3:CreateBucket
      - s3:DeleteBucket
      resource: "*"

3. Change CCO to "Manual" mode and create a CredentialsRequest with the namespace/name of openshift-cloud-credential-operator/cloud-credential-operator-s3; the alert CloudCredentialOperatorStaleCredentials is then generated.

apiVersion: cloudcredential.openshift.io/v1
kind: CredentialsRequest
metadata:
  name: cloud-credential-operator-s3
  namespace: openshift-cloud-credential-operator
  annotations:
    exclude.release.openshift.io/internal-openshift-hosted: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
spec:
  secretRef:
    name: cloud-credential-operator-s3-creds
    namespace: openshift-cloud-credential-operator
  providerSpec:
    apiVersion: cloudcredential.openshift.io/v1
    kind: AWSProviderSpec
    statementEntries:
    - effect: Allow
      action:
      - s3:CreateBucket
      - s3:PutBucketTagging
      - s3:PutObject
      - s3:PutObjectAcl
      resource: "*"
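Steps 2 and 3 above can also be checked programmatically rather than through the web page. The sketch below assumes the JSON body of the Prometheus HTTP API endpoint `/api/v1/alerts` has already been fetched and decoded (the HTTP call and authentication are left out). `alert_states` is a hypothetical helper name, and the sample payload is illustrative, not output captured from the test cluster.

```python
def alert_states(payload, alert_name):
    """Return the states ("pending" or "firing") of active alerts with this name.

    `payload` is assumed to be the decoded JSON from Prometheus's
    /api/v1/alerts endpoint. Hypothetical helper.
    """
    alerts = payload.get("data", {}).get("alerts", [])
    return [
        alert.get("state")
        for alert in alerts
        if alert.get("labels", {}).get("alertname") == alert_name
    ]


# Illustrative payload only; not captured from the cluster in this report.
sample = {
    "status": "success",
    "data": {
        "alerts": [
            {
                "labels": {
                    "alertname": "CloudCredentialOperatorTargetNamespaceMissing",
                    "severity": "warning",
                },
                "state": "firing",
            }
        ]
    },
}
print(alert_states(sample, "CloudCredentialOperatorTargetNamespaceMissing"))  # ['firing']
print(alert_states(sample, "CloudCredentialOperatorStaleCredentials"))  # []
```

Because the rules use `for: 5m`, an alert is expected to show as pending for five minutes after its expression first matches before it reaches the firing state.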
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056