Bug 2063831

Summary:	etcd quorum pods landing on same node
Product:	OpenShift Container Platform	Reporter:	Devan Goodwin <dgoodwin>
Component:	Etcd	Assignee:	Haseeb Tariq <htariq>
Status:	CLOSED ERRATA	QA Contact:	ge liu <geliu>
Severity:	high	Docs Contact:
Priority:	high
Version:	4.11	CC:	htariq, kenzhang, ngirard, wking
Target Milestone:	---
Target Release:	4.11.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:
Clones:	2070783		Environment:	openshift-tests-upgrade.[sig-scheduling][Early] The openshift-etcd pods should be scheduled on different nodes [Suite:openshift/conformance/parallel]
Last Closed:	2022-08-10 10:54:06 UTC	Type:	Bug
Bug Blocks:	2070783

Description Devan Goodwin 2022-03-14 13:36:26 UTC

TRT recently added a test to monitor for this and it exposed that etcd quorum pods are actually landing on the same node for periods of time:

https://sippy.ci.openshift.org/sippy-ng/tests/4.11/analysis?test=openshift-tests-upgrade.[sig-scheduling][Early]%20The%20openshift-etcd%20pods%20should%20be%20scheduled%20on%20different%20nodes%20[Suite:openshift/conformance/parallel]
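
For readers unfamiliar with what such a check looks for, here is a minimal sketch of the idea. It is not the actual openshift-tests implementation. The function name `find_colocated_pods` and the pod and node names are hypothetical, and the input is assumed to be a plain pod-to-node mapping already gathered from the cluster.

```python
from collections import defaultdict


def find_colocated_pods(pod_to_node):
    """Return {node: [pods]} for every node that hosts more than one pod.

    pod_to_node is an assumed input shape: a dict mapping pod name to the
    name of the node the pod is scheduled on.
    """
    pods_by_node = defaultdict(list)
    for pod, node in pod_to_node.items():
        pods_by_node[node].append(pod)
    # Only nodes carrying two or more of the pods violate the spread rule.
    return {
        node: sorted(pods)
        for node, pods in pods_by_node.items()
        if len(pods) > 1
    }


# Hypothetical placement: two quorum pods share master-0.
placement = {
    "etcd-quorum-guard-a": "master-0",
    "etcd-quorum-guard-b": "master-0",
    "etcd-quorum-guard-c": "master-2",
}
print(find_colocated_pods(placement))
```

An empty result means every pod sits on its own node. Any non-empty result corresponds to the failure this bug describes.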

Sample job: 

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-aws-upgrade/1503258288765014016

This seems to be happening alarmingly often:

https://search.ci.openshift.org/?search=The+openshift-etcd+pods+should+be+scheduled+on+different+nodes&maxAge=48h&context=0&type=junit&name=4.11&excludeName=quorum&maxMatches=5&maxBytes=20971520&groupBy=job

Marking severity high, as this has the potential to cause loss of quorum.

Backporting to 4.10 should probably be discussed.

Jan Chaloupka did some work to allow force-assigning PDB pods to nodes instead of relying on the scheduler. It may be a good idea to make use of this for etcd.

Comment 1 Devan Goodwin 2022-03-14 13:37:19 UTC

TRT is double-checking the results to make absolutely sure the test is catching something real.

Comment 2 Ken Zhang 2022-03-14 15:10:50 UTC

I confirmed that for both the HAProxy and etcd cases, the test is catching real problems. There is a bug with image-registry that is being fixed.

Comment 3 Haseeb Tariq 2022-03-14 21:20:55 UTC

Working on an update to replace the etcd-operator's quorum guard controller with the staticpod quorum guard controller.
This would also include a new readyz server sidecar on the etcd pods so that the guard controller can check pod readiness.
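
As a rough illustration of the readyz idea only, and not the sidecar that was actually shipped, the sketch below serves a readiness endpoint that a guard could poll. The `/readyz` path, the 200/503 status mapping, and every name in the code are assumptions made for the example.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(is_ready):
    """Build a handler class that reports readiness via is_ready()."""

    class ReadyzHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/readyz":  # assumed endpoint path
                status = 404
            else:
                status = 200 if is_ready() else 503
            self.send_response(status)
            self.send_header("Content-Length", "0")
            self.end_headers()

        def log_message(self, *args):
            pass  # keep the demo quiet

    return ReadyzHandler


def probe(url):
    """Return the HTTP status code of a GET, including error statuses."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def demo():
    """Start the server on a free local port and probe it twice."""
    state = {"ready": True}
    server = HTTPServer(("127.0.0.1", 0), make_handler(lambda: state["ready"]))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/readyz"
    ready_status = probe(url)
    state["ready"] = False  # simulate the member becoming unready
    not_ready_status = probe(url)
    server.shutdown()
    server.server_close()
    return ready_status, not_ready_status


if __name__ == "__main__":
    print(demo())
```

In this sketch the guard side is just `probe`. A real controller would presumably feed the result into the guard pod's readiness, which is out of scope here.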

Comment 4 W. Trevor King 2022-03-21 22:17:28 UTC

*** Bug 2065454 has been marked as a duplicate of this bug. ***

Comment 11 ge liu 2022-04-21 02:45:19 UTC

Verified with 4.11.0-0.nightly-2022-04-20-045714.
The quorum guard controller has been updated, which I expect resolves this problem:
sh-4.4# crictl ps|grep etcd
cf3e865a0bcb2       d6eace900ed8aa9f2bb76d7f34981a34bf0cad1ee69ff3b05fd9b408d4645349                                                         13 minutes ago      Running             etcd-readyz                                   0                   6bead9a291bc8

Comment 12 Devan Goodwin 2022-04-21 12:15:33 UTC

Haseeb, would you expect this to NEVER happen now? It looks somewhat improved, but it is still happening:

https://sippy.ci.openshift.org/sippy-ng/tests/4.11/analysis?test=openshift-tests-upgrade.[sig-scheduling][Early]%20The%20openshift-etcd%20pods%20should%20be%20scheduled%20on%20different%20nodes%20[Suite:openshift/conformance/parallel]

Current week pass rate: 97.83%
Prev week pass rate:    96.85%

Most of the hits seem to come from jobs that upgrade from 4.10 to 4.11: https://sippy.ci.openshift.org/sippy-ng/jobs/4.11/runs?filters=%7B%22items%22%3A%5B%7B%22columnField%22%3A%22failed_test_names%22%2C%22operatorValue%22%3A%22contains%22%2C%22value%22%3A%22openshift-tests-upgrade.%5Bsig-scheduling%5D%5BEarly%5D%20The%20openshift-etcd%20pods%20should%20be%20scheduled%20on%20different%20nodes%20%5BSuite%3Aopenshift%2Fconformance%2Fparallel%5D%22%7D%5D%7D&sortField=timestamp&sort=desc

Comment 13 Thomas Jungblut 2022-04-27 10:02:34 UTC

*** Bug 2027744 has been marked as a duplicate of this bug. ***

Comment 15 errata-xmlrpc 2022-08-10 10:54:06 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

Comment 16 Red Hat Bugzilla 2023-09-15 01:52:43 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days.