1994257 – Audit errors alert not created

Bug 1994257 - Audit errors alert not created

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	kube-apiserver
Sub Component:
Version:	4.9
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	4.9.0
Assignee:	Juan Antonio Osorio
QA Contact:	Rahul Gangwar
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2021-08-17 07:24 UTC by Juan Antonio Osorio
Modified:	2021-10-18 17:47 UTC
CC List:	5 users
Fixed In Version:
Doc Type:
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-10-18 17:46:44 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift cluster-kube-apiserver-operator pull 1206	0	None	None	None	2021-08-17 07:29:08 UTC
Red Hat Product Errata	RHSA-2021:3759	0	None	None	None	2021-10-18 17:47:00 UTC

Description Juan Antonio Osorio 2021-08-17 07:24:49 UTC

Description of problem:

While the deployment ships with many useful alerts, there is currently no built-in way to tell whether the apiserver is failing to write audit logs. Some deployments need this for compliance reasons, so today such an alert has to be created manually.

Version-Release number of selected component (if applicable):
All


Additional info:

https://github.com/openshift/cluster-kube-apiserver-operator/pull/1166 originally added this alert, but due to an oversight the alert wasn't being created.

Comment 2 Rahul Gangwar 2021-08-19 04:11:43 UTC


oc get clusterversion

NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-08-18-144658   True        False         10m     Cluster version is 4.9.0-0.nightly-2021-08-18-144658

oc -n openshift-kube-apiserver get prometheusrule audit-errors -o yaml

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  creationTimestamp: "2021-08-19T03:38:29Z"
  generation: 1
  managedFields:
  - apiVersion: monitoring.coreos.com/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:groups: {}
    manager: cluster-kube-apiserver-operator
    operation: Update
    time: "2021-08-19T03:38:29Z"
  name: audit-errors
  namespace: openshift-kube-apiserver
  resourceVersion: "6217"
  uid: ac9f627f-d493-4c5f-b691-a0eae17f6799
spec:
  groups:
  - name: apiserver-audit
    rules:
    - alert: AuditLogError
      annotations:
        description: An API Server had an error writing to an audit log.
        summary: |-
          An API Server instance was unable to write audit logs. This could be
          triggered by the node running out of space, or a malicious actor
          tampering with the audit logs.
      expr: |
        sum by (apiserver,instance)(rate(apiserver_audit_error_total{apiserver=~".+-apiserver"}[5m])) / sum by (apiserver,instance) (rate(apiserver_audit_event_total{apiserver=~".+-apiserver"}[5m])) > 0
      for: 1m
      labels:
        severity: warning
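
For readers unfamiliar with PromQL, the following is a rough Python sketch of what the AuditLogError expression above evaluates. It is an illustration only, not the operator's or Prometheus's implementation: the function name audit_error_ratios and the (labels, rate) sample format are hypothetical, the inputs stand in for the already-computed 5m rates, and the "for: 1m" hold period is not modeled.

```python
import math
import re
from collections import defaultdict


def audit_error_ratios(error_rates, event_rates):
    """Approximate the alert expression for one evaluation instant.

    Both arguments are lists of (labels, per_second_rate) tuples, standing in
    for rate(apiserver_audit_error_total[5m]) and
    rate(apiserver_audit_event_total[5m]). Hypothetical input format.
    """

    def sum_by(samples):
        totals = defaultdict(float)
        for labels, rate in samples:
            # Prometheus regex matchers are fully anchored, hence fullmatch.
            if re.fullmatch(r".+-apiserver", labels.get("apiserver", "")):
                key = (labels["apiserver"], labels.get("instance", ""))
                totals[key] += rate
        return totals

    errors = sum_by(error_rates)
    events = sum_by(event_rates)

    firing = {}
    for key, err in errors.items():
        # A binary operation only yields series present on both sides.
        if key not in events:
            continue
        den = events[key]
        if den == 0:
            # Floating-point division semantics: x/0 is +Inf, 0/0 is NaN.
            ratio = math.inf if err > 0 else math.nan
        else:
            ratio = err / den
        # "> 0" keeps only series with a positive error ratio (NaN is dropped).
        if ratio > 0:
            firing[key] = ratio
    return firing
```

In this sketch, any (apiserver, instance) pair with a non-zero audit error rate ends up in the returned dict, which corresponds to the rule entering the pending state; the real rule fires only after the condition has held for 1m.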

Comment 5 errata-xmlrpc 2021-10-18 17:46:44 UTC

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759
