Description of problem:
While the deployment ships with many useful alerts, there currently isn't a way of knowing if the apiserver is failing to write audit logs. This is needed in some deployments for compliance reasons. One currently has to create an alert manually to do this.

Version-Release number of selected component (if applicable):
All

Additional info:
https://github.com/openshift/cluster-kube-apiserver-operator/pull/1166 originally added this alert, but due to an oversight the alert wasn't being created.
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-08-18-144658   True        False         10m     Cluster version is 4.9.0-0.nightly-2021-08-18-144658

oc -n openshift-kube-apiserver get prometheusrule audit-errors -o yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  creationTimestamp: "2021-08-19T03:38:29Z"
  generation: 1
  managedFields:
  - apiVersion: monitoring.coreos.com/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:groups: {}
    manager: cluster-kube-apiserver-operator
    operation: Update
    time: "2021-08-19T03:38:29Z"
  name: audit-errors
  namespace: openshift-kube-apiserver
  resourceVersion: "6217"
  uid: ac9f627f-d493-4c5f-b691-a0eae17f6799
spec:
  groups:
  - name: apiserver-audit
    rules:
    - alert: AuditLogError
      annotations:
        description: An API Server had an error writing to an audit log.
        summary: |-
          An API Server instance was unable to write audit logs. This could be
          triggered by the node running out of space, or a malicious actor
          tampering with the audit logs.
      expr: |
        sum by (apiserver,instance)(rate(apiserver_audit_error_total{apiserver=~".+-apiserver"}[5m])) / sum by (apiserver,instance) (rate(apiserver_audit_event_total{apiserver=~".+-apiserver"}[5m])) > 0
      for: 1m
      labels:
        severity: warning
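
For anyone reading the rule above, the following is a rough Python sketch of what the AuditLogError expression checks for a single (apiserver, instance) pair: the rate of apiserver_audit_error_total divided by the rate of apiserver_audit_event_total over the window, compared against 0. It is illustrative only and not part of the operator. The names CounterWindow, audit_error_ratio and is_breaching are hypothetical, and the rate calculation is a simplification of what Prometheus does.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterWindow:
    """First and last samples of an increasing counter within one window (hypothetical helper)."""
    first: float
    last: float
    seconds: float

    def rate(self) -> float:
        # Simplified per-second rate. Prometheus's rate() also handles counter
        # resets and extrapolates to the window edges; this sketch ignores both.
        return (self.last - self.first) / self.seconds


def audit_error_ratio(errors: CounterWindow, events: CounterWindow) -> float:
    """Return error rate divided by event rate, using float semantics for a zero divisor."""
    error_rate = errors.rate()
    event_rate = events.rate()
    if event_rate == 0:
        # Mirror floating-point division: 0/0 is NaN, x/0 is +Inf for x > 0.
        return math.inf if error_rate > 0 else math.nan
    return error_rate / event_rate


def is_breaching(errors: CounterWindow, events: CounterWindow) -> bool:
    """True when the ratio is strictly greater than 0 (NaN compares as False)."""
    return audit_error_ratio(errors, events) > 0


if __name__ == "__main__":
    # Made-up sample values for a 5-minute (300 s) window.
    errors = CounterWindow(first=10, last=13, seconds=300)
    events = CounterWindow(first=1000, last=1600, seconds=300)
    print(audit_error_ratio(errors, events), is_breaching(errors, events))
```

Note that this only models the threshold comparison. In the rule itself the condition also has to hold for the duration given by "for: 1m" before the alert fires.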
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759