Bug 1805872

Summary: [4.4] improve reliability during upgrade by using a deployment
Product: OpenShift Container Platform Reporter: Maru Newby <mnewby>
Component: openshift-apiserver Assignee: Maru Newby <mnewby>
Status: CLOSED ERRATA QA Contact: Xingxing Xia <xxia>
Severity: high Docs Contact:
Priority: high    
Version: 4.4 CC: aos-bugs, deads, mfojtik, sttts, xxia
Target Milestone: ---   
Target Release: 4.4.0   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: 1804717 Environment:
Last Closed: 2020-05-13 21:59:44 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On: 1804717    
Bug Blocks: 1805388    

Description Maru Newby 2020-02-21 16:51:14 UTC
+++ This bug was initially created as a clone of Bug #1804717 +++

DaemonSets are special-cased during host shutdown and don't get a chance to fail over gracefully.  This can cause problems with both connectivity on a kill (non-graceful shutdown) and sudden loss of the etcd connection.

Switching to a deployment will alleviate the symptoms.  We originally chose a daemonset for spreading and for scaling.  For spreading, we now have anti-affinity rules.  For scaling, we should wire into something like an HPA anyway.
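
As a rough illustration of the spreading point, here is a minimal sketch of a Deployment manifest that uses pod anti-affinity to keep replicas on separate nodes. It is not the actual openshift-apiserver manifest: the helper name, labels, image and replica count are hypothetical, and choosing a required rule over a preferred one is an assumption.

```python
import json


def build_deployment_manifest(name, replicas, image):
    """Return a Deployment manifest (as a dict) whose pods repel each other per node.

    All argument values are illustrative; nothing here is taken from the real
    openshift-apiserver operator.
    """
    labels = {"app": name}  # hypothetical label used for selection and anti-affinity
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "affinity": {
                        "podAntiAffinity": {
                            # Assumption: a hard rule. A softer variant would use
                            # preferredDuringSchedulingIgnoredDuringExecution.
                            "requiredDuringSchedulingIgnoredDuringExecution": [
                                {
                                    "labelSelector": {"matchLabels": labels},
                                    # One pod with these labels per node hostname.
                                    "topologyKey": "kubernetes.io/hostname",
                                }
                            ]
                        }
                    },
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = build_deployment_manifest(
        "example-apiserver", 3, "example.invalid/apiserver:latest"
    )
    print(json.dumps(manifest, indent=2))
```

With a rule like this the scheduler refuses to place two matching pods on the same node, which approximates the one-pod-per-node spreading a daemonset provided while still letting the pods go through normal deployment termination handling.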

The migration will require PRs backported to 4.3 that remove the deployment on downgrade, and PRs in 4.4 that remove the daemonset.

Comment 5 errata-xmlrpc 2020-05-13 21:59:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.