Bug 1629069
| Summary: | All nodes in elasticsearch cluster not busy simultaneously - busy in sequence when logging from 2000 nodes |
|---|---|
| Product: | OpenShift Container Platform |
| Component: | Logging |
| Version: | 3.11.0 |
| Target Release: | 3.11.z |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Status: | CLOSED WORKSFORME |
| Severity: | medium |
| Priority: | unspecified |
| Reporter: | Mike Fiedler <mifiedle> |
| Assignee: | Jeff Cantrill <jcantril> |
| QA Contact: | Anping Li <anli> |
| CC: | aos-bugs, rmeggins |
| Whiteboard: | aos-scalability-311 |
| Type: | Bug |
| Last Closed: | 2019-04-22 19:44:56 UTC |
**Description** Mike Fiedler 2018-09-14 19:19:08 UTC
The columns are bulk requests active, queued and rejected. These are representative samples, not complete data. For the first third of the test the bulk stats looked like this:

    Fri Sep 14 14:22:02 EDT 2018
    logging-es-data-master-mf4gygmf bulk 0  0   898
    logging-es-data-master-40vugi05 bulk 32 200 55943

This means the server had 55943 rejected bulk indexing operations?

You might want to also capture the number of bulk operations completed (bc) so that we can compare the number of ops successfully completed vs. the number which were rejected; e.g., I'd like to know for logging-es-data-master-mf4gygmf how many successful operations there were to compare against the 898 bulk rejections.

But yes, it is troubling that the load is not being evenly spread out among the cluster. The connection reloading functionality in fluentd should help with this _as long as the haproxy in the logging-es Service is working correctly_. Can we get haproxy statistics from the Service to make sure it is load balancing the connections among all 3 ES nodes?

The other problem may be sharding - the optimal setting for sharding here would be 3 shards per index, e.g. https://github.com/richm/docs/blob/master/increase-number-of-shards-for-index-template.md However, that would only help if there are only 1 or 2 indices which have a large number of documents, and the other indices have almost no documents.

How many namespaces are there in this test? Are there load pods running in all of those namespaces? Are they generating roughly the same amount of traffic?

I will repeat starting with fresh ES pods, capturing bulk completed, with shards set to 3 per index.

> How many namespaces are there in this test?

1000

> Are there load pods running in all of those namespaces? Are they generating roughly the same amount of traffic?

Yes and yes - exactly the same amount of traffic.

Changing the shards to 3 made a huge difference. I'm seeing even distribution of bulk completions (and rejections) over the ES cluster.
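The two actions discussed above - capturing completed bulk operations alongside rejections, and raising the shard count for new indices - can be sketched with curl against the cluster. This is only a sketch: it assumes an ES 5.x cluster as shipped with OCP 3.11 logging, the admin client certificates mounted inside the ES pod (paths may differ in your deployment), and the template name `project-shards` is illustrative.

```shell
# Assumed cert paths from the OpenShift logging ES pod; adjust as needed.
ES_URL="https://localhost:9200"
CURL="curl -s --cacert /etc/elasticsearch/secret/admin-ca \
      --cert /etc/elasticsearch/secret/admin-cert \
      --key /etc/elasticsearch/secret/admin-key"

# Raise primary shards to 3 for *future* project.* indices
# (existing indices are unaffected by a template change).
$CURL -XPUT "$ES_URL/_template/project-shards" \
      -H 'Content-Type: application/json' -d '{
  "template": "project.*",
  "order": 10,
  "settings": { "index": { "number_of_shards": 3 } }
}'

# Poll the bulk thread pool once a minute, including the completed
# count, so rejections can be compared against successful operations.
while true; do
    date
    $CURL "$ES_URL/_cat/thread_pool/bulk?v&h=host,active,queue,rejected,completed"
    sleep 60
done
```

Recording `completed` next to `rejected` is what makes the per-node comparison requested above possible.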
At 50 messages/second/node x 2000 nodes = 100,000 1K messages/second total for the cluster, bulk rejections are running 15-20% of completions. Going to back the rate off some to try to find a sustainable rate with a lower number of bulk rejections. Also trying to reproduce https://bugzilla.redhat.com/show_bug.cgi?id=1629015 with fluentd debug on.

(In reply to Mike Fiedler from comment #3)
> Changing the shards to 3 made a huge difference. I'm seeing even distribution of bulk completions (and rejections) over the ES cluster. At 50 messages/second/node x 2000 nodes = 100,000 1K messages/second total for the cluster, bulk rejections are running 15-20% of completions.

Thanks! Yes, this is the "high performance but unsafe" configuration - if you lose a node, you lose 1/3 of your data, so better back up your /elasticsearch partition somewhere.

> Going to back the rate off some to try to find a sustainable rate with a lower number of bulk rejections.

Yes, it would be nice to find the cutoff to say when to add more infra/ES nodes.

> Also trying to reproduce https://bugzilla.redhat.com/show_bug.cgi?id=1629015 with fluentd debug on.

OK, thanks.

We can resolve by modifying the shard count to the number of data nodes, which will affect every index in the cluster [1]. Alternatively, users can follow the instructions from #c6 to target the operations indices specifically. Closing as WORKSFORME.

[1] https://github.com/openshift/openshift-ansible/blob/release-3.11/roles/openshift_logging_elasticsearch/tasks/main.yaml#L473
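The rates quoted in the comments above are easy to sanity-check. A minimal sketch, using the message rate as a rough proxy for indexing operations (an assumption - actual bulk completions depend on the bulk batch size):

```shell
# Cluster-wide message rate: 50 msgs/sec/node across 2000 nodes.
nodes=2000
rate_per_node=50
total=$((nodes * rate_per_node))
echo "total: $total msgs/sec"    # total: 100000 msgs/sec

# Rejections running 15-20% of completions implies this band,
# treating the message rate as a proxy for operations (assumption):
echo "rejections: $((total * 15 / 100)) - $((total * 20 / 100)) per sec"
```

That band (15,000-20,000 rejections/second at full load) is why backing the rate off to find a sustainable cutoff, as proposed above, is the right next step.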