Bug 1994257 - Audit errors alert not created
Summary: Audit errors alert not created
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.9
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.9.0
Assignee: Juan Antonio Osorio
QA Contact: Rahul Gangwar
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-08-17 07:24 UTC by Juan Antonio Osorio
Modified: 2021-10-18 17:47 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-18 17:46:44 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID                                                   Private  Priority  Status  Summary  Last Updated
Github                  openshift cluster-kube-apiserver-operator pull 1206  0        None      None    None     2021-08-17 07:29:08 UTC
Red Hat Product Errata  RHSA-2021:3759                                       0        None      None    None     2021-10-18 17:47:00 UTC

Description Juan Antonio Osorio 2021-08-17 07:24:49 UTC
Description of problem:

While the deployment ships with many useful alerts, there is currently no built-in way of knowing whether the apiserver is failing to write audit logs. Some deployments need this for compliance reasons, and at present the alert has to be created manually.

Version-Release number of selected component (if applicable):
All


Additional info:

https://github.com/openshift/cluster-kube-apiserver-operator/pull/1166 originally added this alert, but due to an oversight the alert wasn't being created.

Comment 2 Rahul Gangwar 2021-08-19 04:11:43 UTC

NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-08-18-144658   True        False         10m     Cluster version is 4.9.0-0.nightly-2021-08-18-144658

oc -n openshift-kube-apiserver get prometheusrule audit-errors -o yaml

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  creationTimestamp: "2021-08-19T03:38:29Z"
  generation: 1
  managedFields:
  - apiVersion: monitoring.coreos.com/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:groups: {}
    manager: cluster-kube-apiserver-operator
    operation: Update
    time: "2021-08-19T03:38:29Z"
  name: audit-errors
  namespace: openshift-kube-apiserver
  resourceVersion: "6217"
  uid: ac9f627f-d493-4c5f-b691-a0eae17f6799
spec:
  groups:
  - name: apiserver-audit
    rules:
    - alert: AuditLogError
      annotations:
        description: An API Server had an error writing to an audit log.
        summary: |-
          An API Server instance was unable to write audit logs. This could be
          triggered by the node running out of space, or a malicious actor
          tampering with the audit logs.
      expr: |
        sum by (apiserver,instance)(rate(apiserver_audit_error_total{apiserver=~".+-apiserver"}[5m])) / sum by (apiserver,instance) (rate(apiserver_audit_event_total{apiserver=~".+-apiserver"}[5m])) > 0
      for: 1m
      labels:
        severity: warning
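
For anyone reading the expr above, here is a rough Python sketch of what it computes. This is only an illustration, not the operator's or Prometheus's implementation. The function name audit_error_ratio and the sample values are hypothetical; the inputs stand in for the per-(apiserver, instance) 5m rates of apiserver_audit_error_total and apiserver_audit_event_total after the "sum by". The "for: 1m" hold period is not modeled.

```python
import math
from typing import Dict, Tuple

# A series is identified by its (apiserver, instance) label pair,
# mirroring "sum by (apiserver,instance)" in the rule.
Key = Tuple[str, str]


def audit_error_ratio(
    error_rate: Dict[Key, float], event_rate: Dict[Key, float]
) -> Dict[Key, float]:
    """Return the error/event ratio for every series where it is > 0."""
    firing: Dict[Key, float] = {}
    for key, errors in error_rate.items():
        if key not in event_rate:
            # No series with the same labels on the right-hand side,
            # so the division yields no result for this key.
            continue
        events = event_rate[key]
        if events == 0:
            # Assumption: follow floating-point division semantics,
            # x/0 is +Inf for x > 0 and 0/0 is NaN.
            ratio = math.inf if errors > 0 else math.nan
        else:
            ratio = errors / events
        if ratio > 0:  # NaN compares false, so 0/0 never matches
            firing[key] = ratio
    return firing


if __name__ == "__main__":
    # Hypothetical sample rates (events per second).
    errors = {
        ("kube-apiserver", "10.0.0.1:6443"): 0.5,
        ("kube-apiserver", "10.0.0.2:6443"): 0.0,
    }
    events = {
        ("kube-apiserver", "10.0.0.1:6443"): 10.0,
        ("kube-apiserver", "10.0.0.2:6443"): 12.0,
    }
    print(audit_error_ratio(errors, events))
```

In this sketch only the first instance would match, since any non-zero share of failed audit writes satisfies the "> 0" comparison; the alert itself would additionally need that to hold for one minute.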

Comment 5 errata-xmlrpc 2021-10-18 17:46:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform
4.9.0 bug fix and security update), and where to find the updated files,
follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

