Bug 2011083 - Backport audit log silence change
Summary: Backport audit log silence change
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.7
Hardware: All
OS: All
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 4.8.z
Assignee: Kirsten Garrison
QA Contact: Manoj Hans
URL:
Whiteboard:
Depends On: 2011087
Blocks: 2011375
 
Reported: 2021-10-05 22:27 UTC by Matthew Robson
Modified: 2021-10-21 22:01 UTC
CC: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2011087 2011375
Environment:
Last Closed: 2021-10-19 20:35:31 UTC
Target Upstream Version:




Links
- GitHub: openshift/machine-config-operator pull 2793 (open): [release-4.8] Bug 2011083: templates: Silence audit events from container infra by default (last updated 2021-10-05 22:55:16 UTC)
- Red Hat Knowledge Base (Solution): 6387661 (last updated 2021-10-06 01:45:04 UTC)
- Red Hat Product Errata: RHBA-2021:3821 (last updated 2021-10-19 20:35:46 UTC)

Description Matthew Robson 2021-10-05 22:27:02 UTC
Description of problem:

Backport https://github.com/openshift/machine-config-operator/pull/2633 for 4.8 and 4.7

Version-Release number of MCO (Machine Config Operator) (if applicable):
4.7/4.8

Platform (AWS, VSphere, Metal, etc.):
All

Actual results:

A large OCP 4.7.32 cluster is generating roughly 10 million audit log entries per hour during its upgrade from 4.6.25, overwhelming cluster logging and the nodes' local SSDs.
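For context, the upstream PR being backported (openshift/machine-config-operator#2633) silences these container-infra audit events by default. As an illustration only (the actual file name and rule set shipped by the MCO are in the PR, not reproduced here), an auditd exclusion for the noisy record type would look something like:

```
## Illustrative auditd rules fragment; hypothetical, see PR 2633
## for the actual change. The exclude filter drops NETFILTER_CFG
## records, which container networking (veth/iptables churn)
## emits at very high volume.
-a exclude,always -F msg_type=NETFILTER_CFG
```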

Comment 5 Scott Dodson 2021-10-07 20:37:56 UTC
@Manoj,

That's very likely because 4.7 has the same problem. We've backported the fix there too, but we'll want to make sure the version used during that upgrade includes it. I suspect it does not, since that was a stable-4.7 to 4.8 upgrade and these fixes haven't reached the stable stream yet.

Comment 6 Manoj Hans 2021-10-08 09:49:53 UTC
Yes @Scott, that was the reason for the failure. I have validated manually using the latest build, and it is working fine: the audit logs no longer contain NETFILTER_CFG msg=audit entries. Below are the steps:

Upgrade from 4.7.0-0.nightly-2021-10-07-235007 to 4.8.0-0.ci-2021-10-08-041634:
 
oc adm upgrade --to-image=registry.ci.openshift.org/ocp/release:4.8.0-0.ci-2021-10-08-041634 --force --allow-explicit-upgrade

warning: Using by-tag pull specs is dangerous, and while we still allow it in combination with --force for backward compatibility, it would be much safer to pass a by-digest pull spec instead
warning: The requested upgrade image is not one of the available updates.  You have used --allow-explicit-upgrade to the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Updating to release image registry.ci.openshift.org/ocp/release:4.8.0-0.ci-2021-10-08-041634

Execute below command to see the progress:
oc get clusterversion -w
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-10-07-235007   True        True          7s      Working towards registry.ci.openshift.org/ocp/release:4.8.0-0.ci-2021-10-08-041634: downloading update
version   4.7.0-0.nightly-2021-10-07-235007   True        True          15s     Working towards 4.8.0-0.ci-2021-10-08-041634: 11 of 699 done (1% complete)

oc get clusterversion
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.ci-2021-10-08-041634   True        False         8m31s   Cluster version is 4.8.0-0.ci-2021-10-08-041634

oc get node
NAME                                        STATUS   ROLES    AGE    VERSION
ip-10-0-54-25.us-east-2.compute.internal    Ready    worker   121m   v1.21.1+a620f50
ip-10-0-56-14.us-east-2.compute.internal    Ready    master   127m   v1.21.1+a620f50
ip-10-0-58-77.us-east-2.compute.internal    Ready    worker   119m   v1.21.1+a620f50
ip-10-0-61-13.us-east-2.compute.internal    Ready    master   125m   v1.21.1+a620f50
ip-10-0-72-79.us-east-2.compute.internal    Ready    master   127m   v1.21.1+a620f50
ip-10-0-77-154.us-east-2.compute.internal   Ready    worker   122m   v1.21.1+a620f50

oc debug node/ip-10-0-54-25.us-east-2.compute.internal
Starting pod/ip-10-0-54-25us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.54.25
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# vi var/log/audit/audit.log

type=DAEMON_START msg=audit(1633678490.184:75): op=start ver=3.0 format=enriched kernel=4.18.0-305.19.1.el8_4.x86_64 auid=4294967295 pid=1209 uid=0 ses=4294967295 subj=system_u:system_r:auditd_t:s0 res=success^]AUID="unset" UID="root"
type=SERVICE_START msg=audit(1633678490.375:5): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=coreos-update-ca-trust comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'^]UID="root" AUID="unset"
type=CONFIG_CHANGE msg=audit(1633678490.506:6): op=set audit_backlog_limit=8192 old=64 auid=4294967295 ses=4294967295 subj=system_u:system_r:unconfined_service_t:s0 res=1^]AUID="unset"
type=SYSCALL msg=audit(1633678490.506:6): arch=c000003e syscall=44 success=yes exit=56 a0=3 a1=7ffe3699ed60 a2=38 a3=0 items=0 ppid=1213 pid=1232 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="auditctl" exe="/usr/sbin/auditctl" subj=system_u:system_r:unconfined_service_t:s0 key=(null)^]ARCH=x86_64 SYSCALL=sendto AUID="unset" UID="root" GID="root" EUID="root" SUID="root" FSUID="root" EGID="root" SGID="root" FSGID="root"
.................
type=PROCTITLE msg=audit(1633684643.804:55): proctitle=2F7573722F6C6962657865632F706C6174666F726D2D707974686F6E002D4573002F7573722F7362696E2F74756E6564002D2D6E6F2D64627573
type=SERVICE_STOP msg=audit(1633684699.155:56): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=rpm-ostreed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'^]UID="root" AUID="unset"
type=SERVICE_START msg=audit(1633685477.966:57): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-tmpfiles-clean comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'^]UID="root" AUID="unset"
type=SERVICE_STOP msg=audit(1633685477.966:58): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-tmpfiles-clean comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'^]UID="root" AUID="unset"
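A quicker check than paging through the log is to count NETFILTER_CFG records directly; after the fix the count should be 0. The sketch below uses a fabricated two-line sample log at a hypothetical path (/tmp/audit-sample.log) so it runs anywhere; on a real node you would run the same grep against /var/log/audit/audit.log from inside `oc debug node/...` followed by `chroot /host`.

```shell
# Hypothetical sample standing in for /var/log/audit/audit.log
cat > /tmp/audit-sample.log <<'EOF'
type=SERVICE_START msg=audit(1633678490.375:5): pid=1 uid=0 msg='unit=coreos-update-ca-trust res=success'
type=SERVICE_STOP msg=audit(1633685477.966:58): pid=1 uid=0 msg='unit=systemd-tmpfiles-clean res=success'
EOF

# After the fix, no NETFILTER_CFG records should remain.
# "|| true" keeps the exit status 0 when grep finds nothing.
grep -c '^type=NETFILTER_CFG' /tmp/audit-sample.log || true
```

On an unfixed node the same count runs into the millions per day, which is the symptom this bug describes.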

Comment 13 errata-xmlrpc 2021-10-19 20:35:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.15 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3821

Comment 14 Sudarshan Chaudhari 2021-10-21 21:03:31 UTC
Hello, 

I see the issue is fixed in 4.8.15, but this Bugzilla is pointing to OCP 4.7.
I would like to know whether the fix is also available in the latest OCP 4.7 errata. If it is available, can we have the errata link?

Thanks.

Comment 15 W. Trevor King 2021-10-21 22:01:29 UTC
(In reply to Sudarshan Chaudhari from comment #14)
> I am looking to know if the fix is also available in OCP 4.7 latest errata.

Up in this bug's metadata^^, you can see:

  Blocks: 2011375

Clicking through to the 4.7.z bug 2011375, you can see that the fix shipped in 4.7.34 [1].

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=2011375#c10

