The CSR signer signs kubelet client certs. Without valid kubelet signer certs, the kubelet's signed CSR is not trusted by the kube-apiserver. Without being trusted by the kube-apiserver, the kubelet cannot get a list of pods to create. Without a list of pods to create, the kubelet will never create a new operator pod. Without a new operator pod, the rest of control-plane recovery will not happen. Without the control plane up, the rest of the cluster never comes back.
https://github.com/openshift/cluster-kube-apiserver-operator/pull/469 addresses this.
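The failure chain described above (from the CSR signer through to cluster recovery) can be modeled roughly as an ordered list in which each step depends on the one before it. This is only an illustrative sketch: the step names paraphrase the comment, and `RECOVERY_CHAIN` and `blocked_steps` are hypothetical names, not identifiers from any OpenShift or Kubernetes component.

```python
# Illustrative model only. The step names paraphrase the bug comment;
# nothing here is an identifier from OpenShift or Kubernetes.
RECOVERY_CHAIN = [
    "CSR signer certs are valid",
    "kube-apiserver trusts the kubelet's signed CSR",
    "kubelet gets the list of pods to create",
    "kubelet creates the new operator pod",
    "control-plane recovery completes",
    "rest of the cluster comes back",
]


def blocked_steps(first_failure: str) -> list[str]:
    """Return the failing step plus every step downstream of it.

    Assumes a strictly linear chain: each step needs only the step
    directly before it, so one failure blocks everything after it.
    """
    index = RECOVERY_CHAIN.index(first_failure)  # ValueError if unknown step
    return RECOVERY_CHAIN[index:]


if __name__ == "__main__":
    # An invalid signer sits at the head of the chain, so it blocks all steps.
    for step in blocked_steps("CSR signer certs are valid"):
        print("blocked:", step)
```

Because the signer is the first link, a failure there blocks every later step, which is consistent with this being treated as a blocker for disaster recovery.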
Disaster recovery fix. Making this a 4.1.0 blocker.
Maciej, I see you reviewed the fix PR above. WDYT about the question above regarding the steps to verify this bug? Thank you in advance.
I'll defer to Tomas since he worked on the overall recovery tooling.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.