When the OCM pods are updated, they do not release their leader-election lease on shutdown. The kube election library has been updated to make this possible (ReleaseOnCancel, see k8s.io/client-go/examples/leader-election/main.go), but it requires changes to your controllers so that they shut down gracefully before the lock is released. Releasing the lease minimizes the time during which no controller is running.
If this is easy to fix (by ensuring the client is shut down cleanly), we should implement it, because it shortens the time to recover after a failure. If it is complex or requires rewiring the controller, our current logic is fine.
@gabe Is it OK to verify the bug with the following steps? It took about 51 seconds for the new pod to be running.
[wewang@wangwen work]$ oc get configmap openshift-master-controllers -oyaml -n openshift-controller-manager
[wewang@wangwen work]$ date ; oc delete pod controller-manager-srqzr -n openshift-controller-manager ; date; oc get pods -n openshift-controller-manager
Tue May 26 16:08:50 CST 2020
pod "controller-manager-srqzr" deleted
Tue May 26 16:09:41 CST 2020
NAME                       READY   STATUS    RESTARTS   AGE
controller-manager-k5ckk   1/1     Running   0          91m
controller-manager-lxsj8   1/1     Running   0          91m
controller-manager-xgt6h   1/1     Running   0          7s
Perfect @Wen ... looks good
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.