Bug 1917847 - The "Limit of total fields [1000] in index [audit-000094] has been exceeded" logs in Elasticsearch.
Summary: The "Limit of total fields [1000] in index [audit-000094] has been exceeded" ...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.7.z
Assignee: Periklis Tsirakidis
QA Contact: Anping Li
URL:
Whiteboard: logging-exploration
Depends On:
Blocks:
 
Reported: 2021-01-19 14:25 UTC by Apurva Nisal
Modified: 2024-12-20 19:32 UTC (History)
CC List: 14 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Elasticsearch keeps its default limit of 1000 fields per index to limit the exponential growth of the data it is indexing. If the audit log policy is configured to `AllRequestBodies`, the audit records carry many more fields, which causes resource overhead. As a result, "Limit of total fields [1000] in index [audit-0000XX] has been exceeded" messages appear in the Elasticsearch pod logs. To work around the issue, see https://access.redhat.com/solutions/5677061.
Clone Of:
Environment:
Last Closed: 2021-11-17 16:02:07 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 5677061 0 None None None 2021-01-26 11:13:02 UTC

Description Apurva Nisal 2021-01-19 14:25:42 UTC
Description of problem:
The following error is logged by Elasticsearch:
~~~
Elasticsearch: java.lang.IllegalArgumentException: Limit of total fields [1000] in index [audit-000094] has been exceeded 
~~~


Additional info:

# oc get csv -n openshift-logging
NAME                                           DISPLAY                                VERSION                 REPLACES                                       PHASE
clusterlogging.4.6.0-202011221454.p0           Cluster Logging                        4.6.0-202011221454.p0   clusterlogging.4.6.0-202011041933.p0           Succeeded
elasticsearch-operator.4.6.0-202011221454.p0   Elasticsearch Operator                 4.6.0-202011221454.p0   elasticsearch-operator.4.6.0-202011051050.p0   Succeeded

#  oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.9     True        False         21d     Cluster version is 4.6.9

Comment 3 Jeff Cantrill 2021-01-20 20:04:20 UTC
@lukas can you advise on the workaround here? I presume it's possible to either update the config or modify/add a template to expand the field limit, with full knowledge of the impact this will have on ES.

Comment 6 Lukas Vlcek 2021-03-02 15:10:16 UTC
Hi,

the "workaround" is to increase the soft limit on number of allowed fields per index. Either via manual command updating the settings against specific index (or all applicable indices at the time of the command execution) or/and by adding index template to setup filed limit to a new value for new indices.

All of this has already been pointed out in the resources above.
I can add a quick link to a similar external resource, e.g.: https://stackoverflow.com/questions/55372330/what-does-limit-of-total-fields-1000-in-index-has-been-exceeded-means-in

We can elaborate more specific instructions if needed (aka an RHKB article).
That was the immediate workaround.
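
For the record, a minimal sketch of the per-index variant for the already existing audit indices, assuming ES 6.8 as shipped with Logging 4.6, the es_util helper available in the elasticsearch container, and 2000 as a purely illustrative value (the label selector may also differ in your deployment):

~~~
# Pick any Elasticsearch pod in the logging namespace.
ES_POD=$(oc -n openshift-logging get pods -l component=elasticsearch -o name | head -n 1)

# Raise the soft limit on all existing audit-* indices (2000 is just an example).
oc -n openshift-logging exec -c elasticsearch $ES_POD -- es_util \
  --query="audit*/_settings" -X PUT \
  -d '{"index.mapping.total_fields.limit": 2000}'

# Verify the new value.
oc -n openshift-logging exec -c elasticsearch $ES_POD -- es_util \
  --query="audit*/_settings/index.mapping.total_fields.limit?pretty"
~~~

Keep in mind this only touches indices that already exist; newly rolled-over audit indices start again at the default of 1000 unless the limit is also raised via an index template.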

But the point is that increasing the limit itself may not be a good long-term solution.
We need to learn more about the data being indexed into that index. If I understood correctly, the audit log level was configured to the highest granularity possible (see comment #1), and we need to understand what this means for the data - how the data model is impacted by it. Chances are it may be better to exclude some portion of the data from indexing if that data has no value in this context.

More specifically, should we record all logs and all log fields when "AllRequestBodies" profile is used? See https://docs.openshift.com/container-platform/4.6/security/audit-log-policy-config.html

- If yes, we need to find what value the field limit needs to be set to (right now I do not know what the new field limit should be, i.e. is 2000 enough?)
- If not, then we need to decide which parts of the data we can ignore (i.e. which fields do not have to be indexed)

In both cases we need to update our index templates.
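
For illustration of the second option, a rough sketch of what excluding a subtree from indexing could look like in an ES 6.8 index template. Note that "requestObject" is only a hypothetical stand-in for whichever part of the audit event turns out to carry no search value, and "_doc" as the mapping type is an assumption about the operator-managed templates:

~~~
# Any elasticsearch pod will do.
ES_POD=$(oc -n openshift-logging get pods -l component=elasticsearch -o name | head -n 1)

# Hypothetical template: requestObject stays in _source but none of its
# subfields are added to the mapping, so they stop counting against the limit.
oc -n openshift-logging exec -c elasticsearch $ES_POD -- es_util \
  --query="_template/audit-trim-mapping" -X PUT -d '{
    "index_patterns": ["audit-*"],
    "order": 100,
    "mappings": {
      "_doc": {
        "properties": {
          "requestObject": { "type": "object", "enabled": false }
        }
      }
    }
  }'
~~~

With "enabled": false the data is still returned from _source, it is just no longer searchable or aggregatable - which is exactly the trade-off that needs a decision here.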

Lukáš

Comment 10 david.gabrysch 2021-03-30 06:02:50 UTC
Hello,

thank you for taking care of this issue. 

We tried to keep increasing "total_fields_limit", but we needed to do this for each new index, which was kind of annoying because it is a rather manual process and we have many indices.

E.g. a max limit of 2000 was not enough at some point.

As for the data, from our point of view, we need all of it since auditing is very important for us.


Kind regards,

David

Comment 11 Lukas Vlcek 2021-03-30 11:50:27 UTC
David,

I believe you should be able to push a custom index template into ES to override just the setting for the total field limit.
See https://www.elastic.co/guide/en/elasticsearch/reference/6.8/mapping.html#mapping-limit-settings
See https://www.elastic.co/guide/en/elasticsearch/reference/6.8/indices-templates.html

(You will want the index template to match only specific indices, e.g. one matching the "audit-*" index name pattern; i.e., do not make the index template match all indices.)
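
For reference, a minimal sketch of such a template (ES 6.8 legacy template syntax; the template name, the order of 100, and the limit of 2000 are placeholders - pick an order high enough to take precedence over any operator-managed template that also sets this value):

~~~
ES_POD=$(oc -n openshift-logging get pods -l component=elasticsearch -o name | head -n 1)

oc -n openshift-logging exec -c elasticsearch $ES_POD -- es_util \
  --query="_template/audit-total-fields" -X PUT -d '{
    "index_patterns": ["audit-*"],
    "order": 100,
    "settings": { "index.mapping.total_fields.limit": 2000 }
  }'
~~~

This only takes effect for audit indices created after the template is installed; indices that already exist still need the per-index _settings update from comment 6.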

Let me know if you need more details; we can turn this into a documentation update.

Lukáš

Comment 12 Rolfe Dlugy-Hegwer 2021-03-30 16:10:23 UTC
I'm tracking a doc task related to this engineering bz in https://issues.redhat.com/browse/RHDEVDOCS-2888

Comment 13 Rolfe Dlugy-Hegwer 2021-04-20 10:32:12 UTC
I have closed https://issues.redhat.com/browse/RHDEVDOCS-2888. Please notify me if and when this issue requires changes to the OpenShift documentation.

Comment 28 David J. M. Karlsen 2022-05-20 18:23:04 UTC
I think the default here is too low - a vanilla cluster will reach this quite fast without any real workload if you add a couple of operators like Knative, GitOps (Argo), etc.

Comment 30 Red Hat Bugzilla 2023-09-15 01:31:55 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days

