Description: The serving certs for the cluster-etcd-operator pod's metrics endpoint are provisioned via service.alpha.openshift.io/serving-cert-secret-name: etcd-operator-serving-cert, and they can be rotated. The operator binary needs a suicider so that it exits and is restarted when the metrics serving cert changes.
This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.
Moving to 4.6. The operator does not perform cert rotation today. But Alay is right: certs are not automatically reloaded on every request the way they are in the etcd server, so a cert change would require a restart of the metrics container.
This bug hasn't had any activity for 7 days after it was marked as LifecycleStale, so we are closing this bug as WONTFIX. If you consider this bug still valuable, please reopen it or create a new bug.
We may be able to solve this by adding the `--terminate-on-files` flag to the operator container command in the operator's deployment so that the process is restarted when the certs change.
If we _also_ need to bounce the gRPC metrics proxy in front of etcd itself, we'll have to add logic to the init container that induces an exit when the file changes. I don't know if there's already an established pattern or piece of code we can use in this context.
Having discussed this a little more with Sam, because new cert contents for etcd itself imply a new revision, a restart is also implied and so there's nothing extra to do on the operand side. We do want to cause the operator itself to restart to reload certs.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196