Bug 1806700 - Large number of etcd leader elections on Azure [NEEDINFO]
Summary: Large number of etcd leader elections on Azure
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Etcd
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.4.0
Assignee: Sam Batschelet
QA Contact: ge liu
Duplicates: 1798785 (view as bug list)
Depends On:
Blocks: 1807278 1807279
Reported: 2020-02-24 19:10 UTC by Jim Minter
Modified: 2020-06-10 15:09 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1807278 1807279 (view as bug list)
Last Closed: 2020-05-21 17:59:44 UTC
Target Upstream Version:
bleanhar: needinfo? (sbatsche)

System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-etcd-operator pull 218 | 0 | None | closed | Bug 1806700: pkg/operator/targetconfigcontroller: add platform raft tunables | 2021-02-19 14:10:45 UTC
Red Hat Bugzilla | 1802768 | 0 | high | CLOSED | Azure IPI rafthttp dial tcp i/o timeout prober ROUND_TRIPPER_RAFT_MESSAGE | 2021-03-10 08:48:04 UTC

Description Jim Minter 2020-02-24 19:10:23 UTC
Even when running an idle OCP 4 cluster on Azure there are a lot of etcd leadership elections.  Example:

2020-02-21 01:26:19.140279 I | raft: cf452c7e4ed8ffa9 is starting a new election at term 105
2020-02-21 01:26:20.440293 I | raft: cf452c7e4ed8ffa9 is starting a new election at term 106
2020-02-21 01:26:22.340240 I | raft: cf452c7e4ed8ffa9 is starting a new election at term 107
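
To put a number on "a lot", the election lines can be counted and the gaps between them measured. The following is a minimal sketch that assumes the etcd log format is exactly as quoted above; the function names `parse_elections` and `gaps_seconds` are illustrative and not part of any etcd tooling.

```python
import re
from datetime import datetime

# Matches etcd raft log lines of the form quoted in this report.
ELECTION_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) I \| raft: "
    r"([0-9a-f]+) is starting a new election at term (\d+)$"
)


def parse_elections(lines):
    """Return (timestamp, member_id, term) tuples for election-start lines."""
    events = []
    for line in lines:
        match = ELECTION_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            events.append((ts, match.group(2), int(match.group(3))))
    return events


def gaps_seconds(events):
    """Seconds elapsed between consecutive election starts."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]


# The three lines from the example above.
SAMPLE = """\
2020-02-21 01:26:19.140279 I | raft: cf452c7e4ed8ffa9 is starting a new election at term 105
2020-02-21 01:26:20.440293 I | raft: cf452c7e4ed8ffa9 is starting a new election at term 106
2020-02-21 01:26:22.340240 I | raft: cf452c7e4ed8ffa9 is starting a new election at term 107
""".splitlines()

if __name__ == "__main__":
    events = parse_elections(SAMPLE)
    print("elections:", len(events))
    print("gaps (s):", gaps_seconds(events))
```

On the sample above, the same member starts three elections within roughly three seconds, each at a higher term.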

It seems that the fdatasync time on the Azure storage stack is regularly longer than the etcd heartbeat timeout configured by OCP.
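
One rough way to check this on a node is to time `fdatasync` calls on the disk that backs etcd and compare them with the heartbeat setting. The sketch below makes several assumptions: 100 ms is etcd's upstream default heartbeat interval (the value OCP configures may differ), the 8 KiB payload is arbitrary, and `measure_sync_ms` and `summarize` are hypothetical helper names. It is not a substitute for etcd's own disk metrics.

```python
import os
import tempfile
import time

# Assumption: 100 ms is etcd's upstream default heartbeat interval.
# Pass the value actually configured on the cluster if it differs.
HEARTBEAT_MS = 100.0


def measure_sync_ms(directory, samples=20, payload=b"x" * 8192):
    """Append a small payload and time each sync call, in milliseconds."""
    # Fall back to fsync on platforms that do not expose fdatasync.
    sync = getattr(os, "fdatasync", os.fsync)
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(samples):
            os.write(fd, payload)
            start = time.perf_counter()
            sync(fd)
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


def summarize(latencies, heartbeat_ms=HEARTBEAT_MS):
    """Report the worst case, an approximate p99, and the count over budget."""
    ordered = sorted(latencies)
    p99 = ordered[min(len(ordered) - 1, int(0.99 * len(ordered)))]
    over = sum(1 for value in ordered if value > heartbeat_ms)
    return {"max_ms": ordered[-1], "p99_ms": p99, "over_heartbeat": over}


if __name__ == "__main__":
    # Point this at a directory on the etcd data disk for a meaningful result;
    # the system temp directory is used here only so the sketch runs anywhere.
    print(summarize(measure_sync_ms(tempfile.gettempdir())))
```

A non-zero `over_heartbeat` count on an idle node would be consistent with the behaviour described above, although a short sample like this cannot prove it.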

Please ensure that etcd is tuned appropriately for the characteristics of the underlying storage stack on Azure OCP clusters in order to reduce leadership elections.

I don't know whether anything beyond the heartbeat timeout needs tuning. I also don't know what a suitable heartbeat timeout for Azure would be, or what the tradeoff is between hard-coding an alternative value and making it tunable.
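
For illustration only, a per-platform override could look something like the sketch below. `ETCD_HEARTBEAT_INTERVAL` and `ETCD_ELECTION_TIMEOUT` are the environment-variable forms of etcd's `--heartbeat-interval` and `--election-timeout` flags, and 100 ms / 1000 ms are the upstream defaults. Everything else is an assumption: the Azure numbers are placeholders rather than measured recommendations, `raft_env` is a hypothetical helper, and the 5x ratio check reflects my understanding of etcd's config validation rather than anything stated in this report.

```python
# Upstream etcd defaults, in milliseconds.
ETCD_DEFAULTS = {"heartbeat_ms": 100, "election_ms": 1000}

# Hypothetical per-platform overrides; the Azure values are placeholders
# chosen for illustration, not tested recommendations.
PLATFORM_OVERRIDES = {
    "Azure": {"heartbeat_ms": 500, "election_ms": 2500},
}


def raft_env(platform, overrides=PLATFORM_OVERRIDES):
    """Build the etcd raft timing environment variables for a platform."""
    cfg = {**ETCD_DEFAULTS, **overrides.get(platform, {})}
    # Assumption: etcd rejects an election timeout below 5x the heartbeat.
    if cfg["election_ms"] < 5 * cfg["heartbeat_ms"]:
        raise ValueError(
            "election timeout %dms is less than 5x heartbeat interval %dms"
            % (cfg["election_ms"], cfg["heartbeat_ms"])
        )
    return {
        "ETCD_HEARTBEAT_INTERVAL": str(cfg["heartbeat_ms"]),
        "ETCD_ELECTION_TIMEOUT": str(cfg["election_ms"]),
    }


if __name__ == "__main__":
    print(raft_env("Azure"))
    print(raft_env("AWS"))  # no override, so the defaults apply
```

Keeping the values in a lookup table is a middle ground between hard-coding a single alternative and exposing a fully user-tunable setting. Raising both values trades slower failure detection for fewer spurious elections.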

I also don't know whether any monitoring/alerting configuration changes are needed if the heartbeat timeout is changed.

Please ensure this work goes into 4.3.z.

Comment 8 Sam Batschelet 2020-04-02 21:16:33 UTC
*** Bug 1798785 has been marked as a duplicate of this bug. ***
