Bug 1885667 - Performance tweaking for Elasticsearch
Summary: Performance tweaking for Elasticsearch
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 4.3.z
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.6.z
Assignee: ewolinet
QA Contact: Mike Fiedler
URL:
Whiteboard: logging-exploration
Depends On: 1883444
Blocks:
 
Reported: 2020-10-06 16:20 UTC by OpenShift BugZilla Robot
Modified: 2021-01-26 19:42 UTC
CC: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The Elasticsearch Operator set the primary shard count for an index equal to the number of data nodes in the cluster.
Consequence: Large clusters were heavily over-sharded, and the resulting synchronization of cluster state across nodes degraded performance.
Fix: The Elasticsearch Operator now caps the number of primary shards per index at 5.
Result: No matter how many data nodes are in the Elasticsearch cluster, the Elasticsearch Operator (EO) limits each index to at most 5 primary shards, improving performance. (A minimal sketch of this rule follows the metadata fields below.)
Clone Of:
Environment:
Last Closed: 2021-01-25 20:21:05 UTC
Target Upstream Version:
Embargoed:
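
The rule described in the Doc Text reduces to taking the minimum of the data-node count and a fixed ceiling of 5. A minimal Go sketch of that rule follows; the function and constant names here are hypothetical, and the operator's actual implementation is in the PRs linked under Links below.

package main

import "fmt"

// maxPrimaryShards is the ceiling described in the Doc Text.
const maxPrimaryShards = 5

// primaryShardCount sketches the capping rule: one primary shard per
// data node, never more than maxPrimaryShards. Hypothetical name, not
// taken from the elasticsearch-operator source.
func primaryShardCount(dataNodes int) int {
	if dataNodes > maxPrimaryShards {
		return maxPrimaryShards
	}
	return dataNodes
}

func main() {
	for _, n := range []int{3, 5, 7} {
		fmt.Printf("%d data nodes -> %d primary shards\n", n, primaryShardCount(n))
	}
}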




Links
System ID Private Priority Status Summary Last Updated
Github openshift elasticsearch-operator pull 515 0 None closed [release-4.6] Bug 1885667: capping primary shard count at 5 for performance reasons 2021-01-26 18:14:17 UTC
Github openshift elasticsearch-operator pull 575 0 None closed [release-4.6] Bug 1885667: Updating index template logic to cap primary shards 2021-01-26 18:14:17 UTC
Red Hat Product Errata RHBA-2021:0173 0 None None None 2021-01-25 20:21:12 UTC

Comment 1 Jeff Cantrill 2020-10-07 13:11:42 UTC
This is not an immediate 4.6 blocker; it can go out in a 4.6.z stream release.

Comment 2 Jeff Cantrill 2020-10-23 15:20:23 UTC
Setting the UpcomingSprint keyword, as this could not be resolved before EOD.

Comment 5 Mike Fiedler 2020-11-13 01:05:00 UTC
Using elasticsearch operator 4.6.0-202011051050.p0 and clusterlogging operator 4.6.0-202011041933.p0.

I see the same issue as in https://bugzilla.redhat.com/show_bug.cgi?id=1883444#c23 and https://bugzilla.redhat.com/show_bug.cgi?id=1883444#c24

Initially, the infra and app indices have 7 primary shards on a 7-node ES cluster. When the subsequent indices (e.g. infra-000002 and app-000002) are created, they have 5 primary shards.

Is it OK that the initial indices get n = "number of ES nodes" primary shards while subsequent indices are capped at 5?

Comment 7 ewolinet 2020-11-13 21:30:12 UTC
Opened https://github.com/openshift/elasticsearch-operator/pull/575 to address comment 5
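
For context on that PR: per comment 5, the initial indices (infra-000001, app-000001) take their shard count from the index template, so the cap has to be applied to the template as well, not only at rollover. A hedged Go sketch of what capped template settings could look like; the structure and function name are illustrative, not the operator's actual code, though the settings keys are the standard Elasticsearch index settings.

package main

import "fmt"

// cappedTemplateSettings is illustrative only. It builds index template
// settings with the primary shard count capped at 5, so that indices
// created from the template are capped from the start.
func cappedTemplateSettings(dataNodes int) map[string]interface{} {
	shards := dataNodes
	if shards > 5 {
		shards = 5
	}
	return map[string]interface{}{
		"index": map[string]interface{}{
			"number_of_shards":   shards,
			"number_of_replicas": 1, // replica count is not what this bug changes
		},
	}
}

func main() {
	fmt.Println(cappedTemplateSettings(7)) // number_of_shards is capped to 5
}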

Comment 10 Mike Fiedler 2021-01-19 18:07:33 UTC
Verified on logging/ES 4.6.0-202101162152.p0

No more than 5 primary shards are created per index, regardless of ES cluster size.
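
For anyone re-verifying: one way to check primary shard counts per index is the standard _cat/indices API. A sketch in Go, assuming an unauthenticated endpoint at localhost:9200 (e.g. via a port-forward); the in-cluster ES endpoint actually requires TLS and a token.

package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	// List each index with its primary shard count ("pri" column).
	resp, err := http.Get("http://localhost:9200/_cat/indices?h=index,pri&v")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Print(string(body)) // every "pri" value should be <= 5
}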

Comment 12 errata-xmlrpc 2021-01-25 20:21:05 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.13 extras update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0173

