Bug 1927942 - etcd should use socket option (SO_REUSEADDR) instead of wait for port release on process restart
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Etcd
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.8.0
Assignee: Sam Batschelet
QA Contact: ge liu
Depends On:
Reported: 2021-02-11 21:59 UTC by Sam Batschelet
Modified: 2021-07-27 22:44 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Feature: Use the socket option SO_REUSEADDR for quick reuse of the ports. Reason: SO_REUSEADDR allows for the port to be reused by the new etcd process without waiting for the old process to die. Result: Reduced delay when etcd is restarted.
Clone Of:
Last Closed: 2021-07-27 22:44:18 UTC
Target Upstream Version:


System ID Private Priority Status Summary Last Updated
Github etcd-io etcd pull 12702 0 None open *: add support for socket options 2021-02-22 14:13:00 UTC
Github openshift cluster-etcd-operator pull 553 0 None open Bug 1927942: pkg/etcdenvvar: enable SO_REUSEADDR 2021-03-10 21:18:58 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 22:44:35 UTC

Description Sam Batschelet 2021-02-11 21:59:01 UTC
Description of problem: Today, etcd waits for the ports[1] held by the previous process to be released before it starts again. While this seems logical, it takes a very long time. SO_REUSEADDR allows the port to be reused by multiple processes, and support has been available since golang 1.11.
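
As a rough illustration of the mechanism described above, here is a minimal Go sketch that sets SO_REUSEADDR through the `Control` hook of `net.ListenConfig` (added in Go 1.11). The helper names `reuseAddrControl` and `listenReuseAddr` are hypothetical and are not taken from the etcd pull request; the sketch assumes a Unix-like system where the file descriptor is an int.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"syscall"
)

// reuseAddrControl is a hypothetical helper. It runs after the socket is
// created and before it is bound, and sets SO_REUSEADDR on it.
func reuseAddrControl(network, address string, c syscall.RawConn) error {
	var sockErr error
	err := c.Control(func(fd uintptr) {
		sockErr = syscall.SetsockoptInt(int(fd), syscall.SOL_SOCKET, syscall.SO_REUSEADDR, 1)
	})
	if err != nil {
		return err
	}
	return sockErr
}

// listenReuseAddr opens a TCP listener whose socket has SO_REUSEADDR set.
func listenReuseAddr(addr string) (net.Listener, error) {
	lc := net.ListenConfig{Control: reuseAddrControl}
	return lc.Listen(context.Background(), "tcp", addr)
}

func main() {
	// Port 0 asks the kernel for any free port; etcd would use its own ports.
	ln, err := listenReuseAddr("127.0.0.1:0")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	fmt.Println("listening on", ln.Addr())
}
```

The point of the hook is that the option is applied before bind, which is the only time it has an effect.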

# example of port wait logging from etcd container

> Waiting for ports 2379, 2380 and 9978 to be released.............................................................ETCD_PORT_2379_TCP_PORT=2379

Each dot here represents a 1-second sleep, so this wait took roughly 62 seconds.

kube-apiserver already has this ability[2]. Care should be taken to reduce the exposure of etcd to reading a data file that is already flocked. While we will block and wait, bugs have existed here in the past, and we can be smarter during init.

# logging of etcd blocking start if another process is holding lock on data file.

> {"level":"info","ts":"2021-02-11T20:02:06.532Z","caller":"etcdserver/backend.go:86","msg":"db file is flocked by another process, or taking too long","path":"/var/lib/etcd/member/snap/db","took":"10.000115249s"}

[1] https://github.com/openshift/cluster-etcd-operator/blob/release-4.7/bindata/etcd/pod.yaml#L135
[2] https://github.com/kubernetes/kubernetes/pull/88893 

Version-Release number of selected component (if applicable):

How reproducible: 100%

Steps to Reproduce:
1. Kill the etcd process (oc -n openshift-etcd rsh -c etcd -T $pod kill 1) and tail the logs of the new etcd process.

Actual results: etcd waits a considerable amount of time before it can start a new process.

Expected results: The etcd process should recover quickly, and in the case of quorum loss (kill a majority of etcd processes) "<5s and p95".

Additional info:

Comment 10 errata-xmlrpc 2021-07-27 22:44:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.
