1879837 – New audit profiles AllRequestBodies and WriteRequestBodies generate log entries that are too big

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	kube-apiserver
Sub Component:
Version:	4.6
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	4.6.0
Assignee:	Abu Kashem
QA Contact:	Ke Wang
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2020-09-17 06:52 UTC by Frederic Giloux
Modified:	2020-10-27 16:42 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2020-10-27 16:41:53 UTC
Target Upstream Version:
Embargoed:

Attachments
Request record in the audit logs (295.38 KB, text/plain) 2020-09-17 06:52 UTC, Frederic Giloux

Links
System	ID	Priority	Status	Summary	Last Updated
Github	openshift kubernetes pull 375	None	closed	Bug 1879837: UPSTREAM: 94986: drop managed fields from audit entries	2020-12-02 11:27:43 UTC
Red Hat Knowledge Base (Solution)	5449981	None	None	None	2020-09-30 21:54:56 UTC
Red Hat Product Errata	RHBA-2020:4196	None	None	None	2020-10-27 16:42:12 UTC

Description Frederic Giloux 2020-09-17 06:52:02 UTC

Created attachment 1715176
Request record in the audit logs

Description of problem:
The new profiles AllRequestBodies and WriteRequestBodies store an excessive amount of data in the audit logs.
For a single request that lists pods and returns ~30 instances:
$ oc get pods -n openshift-kube-apiserver
296 KB of JSON content was stored in the logs.
List commands happen all the time in a cluster (reconciliation loops), and big clusters under load may process thousands of requests per second, so a significant amount of storage may get consumed very quickly.
SIEM consumers may also not be able to cope.

Some of the stored data is not useful and could be trimmed, for instance:
- Managed fields (most of the payload)
- Column definitions and descriptions
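
To gauge how much of a logged body the first item accounts for, one could measure the serialized size of every "managedFields" value in an audit event. The following is a minimal sketch, not a measurement of the attachment: the sample event is made up and heavily shortened, and the helper name managed_fields_bytes is hypothetical.

```python
import json


def managed_fields_bytes(node):
    """Return the total serialized size of all "managedFields" values under node."""
    total = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "managedFields":
                total += len(json.dumps(value))
            else:
                total += managed_fields_bytes(value)
    elif isinstance(node, list):
        for item in node:
            total += managed_fields_bytes(item)
    return total


# Made-up, heavily shortened audit event; real entries are far larger.
sample_event = {
    "verb": "list",
    "responseObject": {
        "kind": "PodList",
        "items": [
            {
                "metadata": {
                    "name": "example-pod",
                    "managedFields": [
                        {"manager": "example-manager", "operation": "Update"}
                    ],
                }
            }
        ],
    },
}

total_size = len(json.dumps(sample_event))
managed_size = managed_fields_bytes(sample_event)
print(f"managedFields: {managed_size} of {total_size} bytes")
```

Running the same function over a real audit entry would show what share of the payload could be saved by trimming managed fields alone.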

Version-Release number of selected component (if applicable):
4.6.0-fc.5

How reproducible:
Always

Steps to Reproduce:
1. Patch the cluster configuration:
$ oc patch apiservers.config.openshift.io cluster --type='json' -p='[{"op": "replace", "path": "/spec/audit/profile", "value":"AllRequestBodies"}]'
2. Send a test query:
$ date && oc get pods -n openshift-kube-apiserver
3. Retrieve the query in the logs. I used:
$ oc adm node-logs ci-ln-4hgkjxk-f76d1-nrx5r-master-0 --path=kube-apiserver/audit-2020-09-16T16-01-24.053.log | grep "openshift-kube-apiserver.*pods.*list.*system\:admin"
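
As an alternative to the grep pattern in step 3, the saved audit log can be filtered on the event fields themselves, which also gives the size of each matching entry. This is a sketch under assumptions: it expects one JSON audit event per line and relies on the verb, objectRef and user fields of the audit event format; the function name matching_events and the sample lines are made up for illustration.

```python
import json


def matching_events(lines, namespace, resource, verb, username):
    """Yield (size_in_bytes, event) for audit log lines matching all criteria."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict):
            continue
        ref = event.get("objectRef") or {}
        user = event.get("user") or {}
        if (
            event.get("verb") == verb
            and ref.get("namespace") == namespace
            and ref.get("resource") == resource
            and user.get("username") == username
        ):
            yield len(line.encode("utf-8")), event


# Made-up sample lines standing in for a real audit log.
sample_lines = [
    json.dumps({
        "verb": "list",
        "objectRef": {"namespace": "openshift-kube-apiserver", "resource": "pods"},
        "user": {"username": "system:admin"},
    }),
    json.dumps({
        "verb": "get",
        "objectRef": {"namespace": "default", "resource": "configmaps"},
        "user": {"username": "system:admin"},
    }),
    "not json",
]

results = list(
    matching_events(
        sample_lines, "openshift-kube-apiserver", "pods", "list", "system:admin"
    )
)
for size, event in results:
    print(size, event["verb"])
```

With a real log, the output of the oc adm node-logs command could be redirected to a file and the open file object passed as lines.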

Actual results:
See attachment

Expected results:
A slimmer result, at least without the "managedFields".

Additional info:

Comment 1 Stefan Schimanski 2020-09-17 08:42:47 UTC

Managed fields sounds legit.

But other than that, what do you expect when telling the system to log everything?

I tend to close this as NOTABUG. It works exactly as designed.

Comment 2 Frederic Giloux 2020-09-17 08:59:55 UTC

Hi Stefan. As you know, "managedFields" are a new addition (Kubernetes 1.18) that was not available in 3.x. Would you not agree that they don't bring anything to audit logs? I believe the intent of audit profiles is to make audit customization possible in 4.x as it was in 3.x. Given the issue reported, I am not sure that the new profiles could be used in production clusters, which calls the usefulness of the feature into question. If you have evidence to the contrary, feel free to close as NOTABUG or change the bugzilla to an RFE.

Comment 3 Stefan Schimanski 2020-09-17 09:23:52 UTC

No, I agree that the value of managed fields is limited and we should probably not log them.

Comment 4 Stefan Schimanski 2020-09-17 09:25:35 UTC

We will look into the managedFields topic. This sounds easy to do.

Comment 5 Frederic Giloux 2020-09-17 09:28:39 UTC

Great!

Comment 6 Abu Kashem 2020-09-21 17:48:23 UTC

I have discussed a strategy with Stefan. An upstream fix is in the works. Moving it to 4.7 now. If we can get the upstream fix done in time, then we will take a stab at carrying/porting it to 4.6.

Comment 7 Abu Kashem 2020-09-24 18:13:23 UTC

We have opened a PR upstream to discard the managed fields - https://github.com/kubernetes/kubernetes/pull/94986. If it lands in time we may have a chance to get it to 4.6.
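
For readers unfamiliar with the idea, discarding managed fields from a logged object can be pictured roughly as below. This is only an illustrative Python sketch, not the code of the upstream PR: the function name drop_managed_fields and the sample object are made up, and it assumes the object is either a single resource or a list with an "items" array.

```python
import copy


def drop_managed_fields(obj):
    """Return a copy of obj with metadata.managedFields set to None.

    Handles a single object as well as a list object with "items".
    """
    result = copy.deepcopy(obj)
    targets = [result]
    items = result.get("items")
    if isinstance(items, list):
        targets.extend(item for item in items if isinstance(item, dict))
    for target in targets:
        metadata = target.get("metadata")
        if isinstance(metadata, dict) and "managedFields" in metadata:
            metadata["managedFields"] = None
    return result


# Made-up, shortened list object for illustration.
pod_list = {
    "kind": "PodList",
    "metadata": {"resourceVersion": "1"},
    "items": [
        {
            "metadata": {
                "name": "example-pod",
                "managedFields": [{"manager": "example-manager"}],
            }
        }
    ],
}

trimmed = drop_managed_fields(pod_list)
print(trimmed["items"][0]["metadata"])
```

The sketch works on a deep copy so that the object passed in is left untouched and only the copy intended for logging is trimmed.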

Comment 8 Frederic Giloux 2020-09-24 18:18:34 UTC

Very good. Fingers crossed.

Comment 9 Abu Kashem 2020-09-25 13:47:04 UTC

Moving it to 4.6 since we are carrying the upstream PR - https://github.com/openshift/kubernetes/pull/375

Comment 11 Ke Wang 2020-09-27 10:19:31 UTC

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-09-26-202331   True        False         3h56m   Cluster version is 4.6.0-0.nightly-2020-09-26-202331

$ oc get apiserver/cluster -ojson | jq .spec.audit
{
  "profile": "AllRequestBodies"
}

$ oc debug node/kewang27dr1-btdxn-m-0.c.openshift-qe.internal
...
sh-4.4# chroot /host
sh-4.4# cd /var/log/kube-apiserver/
sh-4.4# jq . audit.log > audit.json
sh-4.4# ls
audit-2020-09-27T08-53-40.952.log  audit-2020-09-27T09-03-49.795.log  audit-2020-09-27T09-14-52.678.log  audit-2020-09-27T09-25-26.290.log  termination.log
audit-2020-09-27T08-57-18.934.log  audit-2020-09-27T09-07-27.321.log  audit-2020-09-27T09-18-00.608.log  audit.json
audit-2020-09-27T09-00-50.873.log  audit-2020-09-27T09-11-14.594.log  audit-2020-09-27T09-21-50.698.log  audit.log
sh-4.4# cat audit.json | grep managedFields
      "managedFields": null,
      "managedFields": null,
      "managedFields": null,

The managedFields of request and response bodies are discarded in audit logs as expected, so I am moving the bug to VERIFIED.
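
The grep check above can also be expressed as a scan that counts audit entries still carrying a populated "managedFields" value. This is a sketch, assuming one JSON event per line in the audit log; the function names and the sample lines are made up for illustration.

```python
import json


def has_managed_fields(node):
    """Return True if any "managedFields" key under node has a non-empty value."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "managedFields":
                if value:
                    return True
            elif has_managed_fields(value):
                return True
    elif isinstance(node, list):
        return any(has_managed_fields(item) for item in node)
    return False


def count_offending_lines(lines):
    """Count log lines whose event still contains populated managedFields."""
    count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if has_managed_fields(event):
            count += 1
    return count


# Made-up sample lines: the first is trimmed, the second is not.
sample_lines = [
    json.dumps({"responseObject": {"metadata": {"managedFields": None}}}),
    json.dumps({"responseObject": {"metadata": {"managedFields": [{"manager": "m"}]}}}),
]

offending = count_offending_lines(sample_lines)
print(offending)
```

A count of zero over a real audit.log would correspond to the all-null grep output shown in this comment.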

Comment 14 errata-xmlrpc 2020-10-27 16:41:53 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196