Bug 1927942

Summary: etcd should use socket option (SO_REUSEADDR) instead of wait for port release on process restart
Product: OpenShift Container Platform
Component: Etcd
Version: 4.6
Reporter: Sam Batschelet <sbatsche>
Assignee: Sam Batschelet <sbatsche>
QA Contact: ge liu <geliu>
CC: travi, wking
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Enhancement
Doc Text: Feature: Use the socket option SO_REUSEADDR for quick reuse of the ports. Reason: SO_REUSEADDR allows for the port to be reused by the new etcd process without waiting for the old process to die. Result: Reduced delay when etcd is restarted.
Last Closed: 2021-07-27 22:44:18 UTC

Description Sam Batschelet 2021-02-11 21:59:01 UTC
Description of problem: Today etcd waits for the ports [1] to be released by the previous process before it starts again. While this seems logical, it can take a very long time. SO_REUSEADDR allows the port to be reused by the new process without waiting for the old one to fully release it, and Go has supported setting it since 1.11.
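
As a rough illustration of the mechanism (not the actual etcd or cluster-etcd-operator code; the helper name here is made up), SO_REUSEADDR can be set before bind() through the Control hook on net.ListenConfig, which Go has exposed since 1.11:

package main

import (
	"context"
	"log"
	"net"
	"syscall"

	"golang.org/x/sys/unix"
)

// listenWithReuseAddr sets SO_REUSEADDR on the socket before bind(), so a
// restarted process can rebind the port without waiting for sockets left
// behind by the previous process (e.g. in TIME_WAIT) to drain.
func listenWithReuseAddr(addr string) (net.Listener, error) {
	lc := net.ListenConfig{
		Control: func(network, address string, c syscall.RawConn) error {
			var sockErr error
			if err := c.Control(func(fd uintptr) {
				sockErr = unix.SetsockoptInt(int(fd), unix.SOL_SOCKET, unix.SO_REUSEADDR, 1)
			}); err != nil {
				return err
			}
			return sockErr
		},
	}
	return lc.Listen(context.Background(), "tcp", addr)
}

func main() {
	// Example: rebind the client port immediately even if the old process just died.
	l, err := listenWithReuseAddr("0.0.0.0:2379")
	if err != nil {
		log.Fatal(err)
	}
	defer l.Close()
}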

# example of port wait logging from etcd container

> Waiting for ports 2379, 2380 and 9978 to be released.............................................................ETCD_PORT_2379_TCP_PORT=2379


Each dot here represents a 1-second sleep, roughly 62 seconds in total.
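
The wait itself is implemented as a shell loop in pod.yaml [1]. Purely as an illustration of the same idea, and assuming a bind-based probe (the real script may check differently), the logic amounts to:

package main

import (
	"fmt"
	"net"
	"time"
)

// waitForPortReleased polls once per second until the port can be bound,
// printing one dot per failed attempt -- roughly the behavior behind the
// "Waiting for ports ..." line above.
func waitForPortReleased(addr string) {
	for {
		l, err := net.Listen("tcp", addr)
		if err == nil {
			l.Close() // port is free now
			return
		}
		fmt.Print(".")
		time.Sleep(time.Second)
	}
}

func main() {
	waitForPortReleased("0.0.0.0:2379")
	fmt.Println("\nport 2379 released")
}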

kube-apiserver already has this ability [2]. Consideration and care should be taken to reduce exposure to etcd attempting to read a data file that is still flocked by the previous process. While etcd will block and wait on the lock, bugs have existed here in the past and we can be smarter during init (see the sketch after the log below).

# logging of etcd blocking start if another process is holding lock on data file.

> {"level":"info","ts":"2021-02-11T20:02:06.532Z","caller":"etcdserver/backend.go:86","msg":"db file is flocked by another process, or taking too long","path":"/var/lib/etcd/member/snap/db","took":"10.000115249s"}


[1] https://github.com/openshift/cluster-etcd-operator/blob/release-4.7/bindata/etcd/pod.yaml#L135
[2] https://github.com/kubernetes/kubernetes/pull/88893 


Version-Release number of selected component (if applicable):


How reproducible: 100%


Steps to Reproduce:
1. Kill the etcd process (oc -n openshift-etcd rsh -c etcd -T $pod kill 1) and tail the logs of the new etcd process.

Actual results: etcd waits a considerable amount of time before it can start the new process.


Expected results: The etcd process should recover quickly; in the case of quorum loss (kill a majority of etcd processes), recovery should be "<5s at p95".


Additional info:

Comment 10 errata-xmlrpc 2021-07-27 22:44:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438