Bug 1879837

Summary: New audit profiles AllRequestBodies and WriteRequestBodies generate log entries that are too big
Product: OpenShift Container Platform Reporter: Frederic Giloux <fgiloux>
Component: kube-apiserver Assignee: Abu Kashem <akashem>
Status: CLOSED ERRATA QA Contact: Ke Wang <kewang>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.6 CC: aos-bugs, jhou, mfojtik, sttts, xxia
Target Milestone: ---   
Target Release: 4.6.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-10-27 16:41:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Request record in the audit logs none

Description Frederic Giloux 2020-09-17 06:52:02 UTC
Created attachment 1715176 [details]
Request record in the audit logs

Description of problem:
The new profiles AllRequestBodies and WriteRequestBodies store an excessive amount of data in the audit logs.
For a single request listing pods and returning ~30 instances:
$  oc get pods -n openshift-kube-apiserver
296 KB of JSON content was stored in the logs.
As list commands happen all the time in a cluster (reconciliation loops), and big clusters under load may process thousands of requests per second, a significant amount of storage may be consumed very quickly.
SIEM consumers may also not be able to cope.

Some of the stored data is not useful and could be trimmed, for instance:
- Managed fields (most of the payload)
- Column definitions and descriptions
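
To get a rough idea of how much of a logged event the managed fields account for, something like the following Python sketch could be run against a single audit record. It is only an illustration: the helper names (strip_managed_fields, managed_fields_share) are made up for this example, and the sample event in the usage note is synthetic, not taken from the attachment.

```python
import json


def strip_managed_fields(node):
    """Return a copy of node with every 'managedFields' key removed, at any depth."""
    if isinstance(node, dict):
        return {
            key: strip_managed_fields(value)
            for key, value in node.items()
            if key != "managedFields"
        }
    if isinstance(node, list):
        return [strip_managed_fields(item) for item in node]
    return node


def managed_fields_share(event):
    """Fraction of the serialized event size that disappears once managedFields is dropped."""
    full = len(json.dumps(event, separators=(",", ":")))
    slim = len(json.dumps(strip_managed_fields(event), separators=(",", ":")))
    return (full - slim) / full
```

For example, with a record loaded via json.loads(line), managed_fields_share(record) returns a value between 0 and 1; the original record is left unmodified.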

Version-Release number of selected component (if applicable):
4.6.0-fc.5

How reproducible:
Always

Steps to Reproduce:
1. Patch the cluster configuration:
$ oc patch apiservers.config.openshift.io cluster --type='json' -p='[{"op": "replace", "path": "/spec/audit/profile", "value":"AllRequestBodies"}]'
2. Send a test query:
$ date & oc get pods -n openshift-kube-apiserver
3. Retrieve the query in the logs. I used:
$ oc adm node-logs ci-ln-4hgkjxk-f76d1-nrx5r-master-0 --path=kube-apiserver/audit-2020-09-16T16-01-24.053.log | grep "openshift-kube-apiserver.*pods.*list.*system\:admin"
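
As an alternative to the grep pattern in step 3, the saved log could be filtered on the structured fields of each audit event. The sketch below assumes one JSON event per line and the audit.k8s.io/v1 field names (verb, objectRef.resource, objectRef.namespace, user.username); the function names are hypothetical.

```python
import json


def matches(event, namespace, resource, verb, username):
    """True if the audit event is the given verb on the given resource by the given user."""
    ref = event.get("objectRef") or {}
    user = event.get("user") or {}
    return (
        ref.get("namespace") == namespace
        and ref.get("resource") == resource
        and event.get("verb") == verb
        and user.get("username") == username
    )


def find_requests(lines, namespace="openshift-kube-apiserver", resource="pods",
                  verb="list", username="system:admin"):
    """Collect matching events from an iterable of JSON lines, skipping unparsable lines."""
    found = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict) and matches(event, namespace, resource, verb, username):
            found.append(event)
    return found
```

Passing an open file object, for example find_requests(open("audit.log")), returns the list of matching events, whose serialized size can then be inspected.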

Actual results:
See attachment

Expected results:
Slimmer result. At least without the "managed fields".

Additional info:

Comment 1 Stefan Schimanski 2020-09-17 08:42:47 UTC
Managed fields sounds legit.

But other than that, what do you expect when telling the system to log everything?

I am inclined to close this as NOTABUG. It works exactly as designed.

Comment 2 Frederic Giloux 2020-09-17 08:59:55 UTC
Hi Stefan. As you know, "managedFields" are a new addition (Kubernetes 1.18) that was not available in 3.x. Would you not agree that they add nothing to the audit logs? I believe the intent of audit profiles is to make audit customization possible in 4.x as it was in 3.x. Given the issue reported here, I am not sure the new profiles can be used in production clusters, which calls the usefulness of the feature into question. If you have evidence to the contrary, feel free to close as NOTABUG or change the bugzilla to an RFE.

Comment 3 Stefan Schimanski 2020-09-17 09:23:52 UTC
No, I agree that the value of managed fields is limited and we should probably not log them.

Comment 4 Stefan Schimanski 2020-09-17 09:25:35 UTC
We will look into the managedFields topic. This sounds easy to do.

Comment 5 Frederic Giloux 2020-09-17 09:28:39 UTC
Great!

Comment 6 Abu Kashem 2020-09-21 17:48:23 UTC
I have discussed a strategy with Stefan. An upstream fix is in the works. Moving it to 4.7 now. If we can get the upstream fix done in time, we will take a stab at carrying/porting it to 4.6.

Comment 7 Abu Kashem 2020-09-24 18:13:23 UTC
We have opened a PR upstream to discard the managed fields - https://github.com/kubernetes/kubernetes/pull/94986. If it lands in time we may have a chance to get it to 4.6.

Comment 8 Frederic Giloux 2020-09-24 18:18:34 UTC
Very good. Fingers crossed.

Comment 9 Abu Kashem 2020-09-25 13:47:04 UTC
Moving it to 4.6 since we are carrying the upstream PR - https://github.com/openshift/kubernetes/pull/375

Comment 11 Ke Wang 2020-09-27 10:19:31 UTC
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-09-26-202331   True        False         3h56m   Cluster version is 4.6.0-0.nightly-2020-09-26-202331

$ oc get apiserver/cluster -ojson | jq .spec.audit
{
  "profile": "AllRequestBodies"
}

$ oc debug node/kewang27dr1-btdxn-m-0.c.openshift-qe.internal
...
sh-4.4# chroot /host
sh-4.4# cd /var/log/kube-apiserver/
sh-4.4# jq . audit.log > audit.json
sh-4.4# ls
audit-2020-09-27T08-53-40.952.log  audit-2020-09-27T09-03-49.795.log  audit-2020-09-27T09-14-52.678.log  audit-2020-09-27T09-25-26.290.log  termination.log
audit-2020-09-27T08-57-18.934.log  audit-2020-09-27T09-07-27.321.log  audit-2020-09-27T09-18-00.608.log  audit.json
audit-2020-09-27T09-00-50.873.log  audit-2020-09-27T09-11-14.594.log  audit-2020-09-27T09-21-50.698.log  audit.log
sh-4.4# cat audit.json | grep managedFields
      "managedFields": null,
      "managedFields": null,
      "managedFields": null,

The managedFields of request and response bodies are discarded in the audit logs as expected, so moving the bug to VERIFIED.
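
The grep check above could also be expressed as a small script that counts events still carrying non-empty managedFields, treating the null values shown in the output as discarded. This is a sketch only, assuming one JSON event per line; the function names are hypothetical.

```python
import json


def has_managed_fields(node):
    """True if any 'managedFields' key under node holds a non-empty value (null counts as discarded)."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "managedFields" and value:
                return True
            if has_managed_fields(value):
                return True
        return False
    if isinstance(node, list):
        return any(has_managed_fields(item) for item in node)
    return False


def count_events_with_managed_fields(lines):
    """Count audit events (one JSON document per line) that still contain managedFields data."""
    count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if has_managed_fields(event):
            count += 1
    return count
```

On a log produced by a fixed build, count_events_with_managed_fields(open("audit.log")) would be expected to return 0.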

Comment 14 errata-xmlrpc 2020-10-27 16:41:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196