| Summary: | etcd3 cluster keeps electing new leaders during OpenShift cluster load to 1K namespaces | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Mike Fiedler <mifiedle> |
| Component: | etcd3 | Assignee: | Timothy St. Clair <tstclair> |
| Status: | CLOSED NOTABUG | QA Contact: | Martin Jenner <mjenner> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 7.3 | CC: | jeder, mifiedle, sjr, tstclair, vlaad |
| Target Milestone: | rc | Keywords: | Extras |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | aos-scalability-34 | ||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-01-25 13:56:43 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Mike Fiedler
2016-10-28 17:07:13 UTC
Issue logged upstream: https://github.com/coreos/etcd/issues/6753

The only thing I can think of is write contention on the VMs. Could you check that the etcd nodes are landing on different hypervisors? An easy way to do this in our environment is to make the instance sizes so large that they eat a whole host.

I am running into this problem with a 1000-node cluster when trying to run conformance tests.

(In reply to Vikas Laad from comment #9)
> I am running into this problem with 1000 nodes cluster when trying to run
> conformance tests.

There are other issues related to network etc. in this env; please ignore this comment.

Closing this issue, as we traced the causes to a couple of conditions related to storage subsystem write latency in OpenStack environments:

1. Host anti-affinity is needed if using local storage.
2. On a shared Ceph cluster, write latency issues occur during fsyncs.

Once etcd was put on dedicated storage, the issues were resolved. Please reference the upstream guidelines on deployment: https://github.com/coreos/etcd/blob/master/Documentation/op-guide/hardware.md#hardware-recommendations
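
For anyone who wants a quick sanity check of fsync latency on a candidate etcd data volume before loading the cluster, here is a rough Python sketch. It is not from this bug or from etcd itself: the function names, the sample count, the 2 KiB payload, and the 10 ms p99 threshold are all assumptions for illustration (the threshold is used here only as a rule of thumb), and a purpose-built tool such as fio will give more trustworthy numbers.

```python
import math
import os
import sys
import tempfile
import time


def measure_fsync_latencies(directory, samples=200, payload_size=2048):
    """Append small records to a temp file in `directory`, calling fsync
    after each write, and return the fsync durations in seconds."""
    payload = b"\0" * payload_size
    latencies = []
    # The temp file is removed automatically when the block exits.
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        fd = f.fileno()
        for _ in range(samples):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # force the write down to the storage device
            latencies.append(time.perf_counter() - start)
    return latencies


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical usage: pass the directory that would hold etcd data.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    threshold_ms = 10.0  # assumed rule-of-thumb limit, adjust as needed
    lat = measure_fsync_latencies(target)
    p99_ms = percentile(lat, 99) * 1000
    print(f"fsync p99 on {target}: {p99_ms:.2f} ms")
    if p99_ms > threshold_ms:
        print("p99 exceeds the assumed threshold; storage may be too slow")
```

Running it against both the shared volume and the dedicated one, while the cluster load is in progress, should make the difference described in the closing comment visible. A single idle run says little, since the contention only shows up under load.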