In 4.4, ARO had to recreate a failed master VM. In the process of recreation, the master IP changed. The instructions at [0] are inadequate to recover full etcd membership in this case and it was necessary to manually delete the etcd-peer- Secret for master-0 to cause a new certificate to be created with the new IP address.

There are a couple of asks here:

1- The above use case needs to be added to documentation and QE flows.
2- Ideally, it would be good if the cert signer were able to automatically detect changes of IP and do them. Failing that, better documentation is needed.

[0] https://docs.openshift.com/container-platform/4.4/backup_and_restore/replacing-unhealthy-etcd-member.html#restore-replace-crashlooping-etcd-member_replacing-unhealthy-etcd-member
> 1- The above use case needs to be added to documentation and QE flows.

As the below backport might take a bit to land in 4.4, we will address docs as p[0].

> 2- Ideally, it would be good if the cert signer were able to automatically detect changes of IP and do them.

I 100% agree. We can check the IP SANs [1] and invalidate any cert whose IP SAN does not match the hostIP for the node. This is not blocking 4.6 but will be addressed with high priority. Thanks for the report.

[1] https://github.com/openshift/cluster-etcd-operator/blob/release-4.4/pkg/operator/etcdcertsigner/etcdcertsignercontroller.go#L282
*** Bug 1886771 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633