Bug 2070783

Summary:	Add a cleanup controller for static guard pods for 4.11 to 4.10 downgrades
Product:	OpenShift Container Platform	Reporter:	Haseeb Tariq <htariq>
Component:	Etcd	Assignee:	Haseeb Tariq <htariq>
Status:	CLOSED NOTABUG	QA Contact:	ge liu <geliu>
Severity:	high	Docs Contact:
Priority:	high
Version:	4.10	CC:	dgoodwin, geliu, htariq, kenzhang, wking
Target Milestone:	---
Target Release:	4.10.z
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:	2063831	Environment:	openshift-tests-upgrade.[sig-scheduling][Early] The openshift-etcd pods should be scheduled on different nodes [Suite:openshift/conformance/parallel]
Last Closed:	2022-09-07 22:33:52 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	2063831
Bug Blocks:

Description Haseeb Tariq 2022-03-31 23:18:35 UTC

Related to BZ 2063831:
Once the above BZ is fixed [1] with the controller that deploys the static guard pods, we will need a controller that cleans up those guard pods in the event of a downgrade from 4.11 to 4.10.

[1]: https://github.com/openshift/cluster-etcd-operator/pull/763

+++ This bug was initially created as a clone of Bug #2063831 +++

TRT recently added a test to monitor for this and it exposed that etcd quorum pods are actually landing on the same node for periods of time:

https://sippy.ci.openshift.org/sippy-ng/tests/4.11/analysis?test=openshift-tests-upgrade.[sig-scheduling][Early]%20The%20openshift-etcd%20pods%20should%20be%20scheduled%20on%20different%20nodes%20[Suite:openshift/conformance/parallel]
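
For illustration only, the condition this test guards against can be sketched as a check over a pod-to-node mapping. This is a minimal sketch, not the actual test from openshift-tests; the function name `colocatedNodes` and the pod and node names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// colocatedNodes returns the sorted names of nodes that host more than one
// of the given pods. podToNode maps a pod name to the node it runs on.
// Hypothetical helper for illustration; not part of openshift-tests.
func colocatedNodes(podToNode map[string]string) []string {
	counts := make(map[string]int)
	for _, node := range podToNode {
		counts[node]++
	}
	var shared []string
	for node, n := range counts {
		if n > 1 {
			shared = append(shared, node)
		}
	}
	sort.Strings(shared)
	return shared
}

func main() {
	// Hypothetical pod and node names: two pods share master-0.
	pods := map[string]string{
		"quorum-pod-a": "master-0",
		"quorum-pod-b": "master-0",
		"quorum-pod-c": "master-2",
	}
	fmt.Println(colocatedNodes(pods)) // prints [master-0]
}
```

An empty result means every pod sits on its own node. Any non-empty result means that losing a single node would take down more than one quorum-related pod.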

Sample job: 

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-aws-upgrade/1503258288765014016

This seems to be happening alarmingly often:

https://search.ci.openshift.org/?search=The+openshift-etcd+pods+should+be+scheduled+on+different+nodes&maxAge=48h&context=0&type=junit&name=4.11&excludeName=quorum&maxMatches=5&maxBytes=20971520&groupBy=job

Marking severity high, as this has the potential to cause loss of quorum.

Backporting to 4.10 should probably be discussed.

Jan Chaloupka did some work to allow force-assigning PDB pods to nodes instead of relying on the scheduler; it may be a good idea to make use of this for etcd.

--- Additional comment from Devan Goodwin on 2022-03-14 13:37:19 UTC ---

TRT is double checking the results to make absolutely sure the test is catching something real.

--- Additional comment from Ken Zhang on 2022-03-14 15:10:50 UTC ---

I confirmed that in both the HAProxy and etcd cases, the test is catching real problems. There is a bug with image-registry that is being fixed.

--- Additional comment from Haseeb Tariq on 2022-03-14 21:20:55 UTC ---

Working on an update to replace the etcd-operator's quorum guard controller with the static pod quorum guard controller.
This would also include a new readyz server sidecar on the etcd pods so that the guard controller can check pod readiness.

--- Additional comment from W. Trevor King on 2022-03-21 22:17:28 UTC ---

Comment 1 Haseeb Tariq 2022-09-07 22:33:52 UTC

Closing this as we don't officially support minor version downgrades, e.g. 4.11 -> 4.10.