Description of problem:
In 4.4, the cluster-etcd-operator (CEO) scales the etcd cluster from the single bootstrap member up to a 4-member control plane (the bootstrap member plus one etcd pod on each of the three masters). Sometimes the scaling times out because the CEO pod is unable to talk to the bootstrap etcd in order to add the other etcd nodes as members. The error from the operator logs is:
------
I0201 18:29:56.506190 1 util.go:37] checking against etcd-2.ci-op-1yrd4g86-e4498.origin-ci-int-gce.dev.openshift.com.
W0201 18:29:57.079291 1 clientconn.go:1156] grpc: addrConn.createTransport failed to connect to {https://10.0.0.5:2379 0 <nil>}. Err :connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority". Reconnecting...

How reproducible:
This is probably a major component of bootstrapping failures in CI. Grep for "Err :connection" in [1], [2], and [3].

Expected results:
The operator pod has the correct certs to talk to the bootstrap etcd.

Additional info:
Another quick way to spot this bug in CI is to look at the etcd resource in must-gather. If one member is in the Ready state and the other two are in the Unknown state, the etcd-operator is likely erroring out on auth failures while adding the members to the cluster, for example:
---------
observedConfig:
  cluster:
    members:
    - name: etcd-bootstrap
      peerURLs: https://10.0.0.6:2380
      status: Unknown
    pending:
    - name: etcd-member-ci-op-kd2mp-m-1.c.openshift-gce-devel-ci.internal
      peerURLs: https://etcd-1.ci-op-9d6rs79x-15937.origin-ci-int-gce.dev.openshift.com:2380
      status: Unknown
    - name: etcd-member-ci-op-kd2mp-m-0.c.openshift-gce-devel-ci.internal
      peerURLs: https://etcd-0.ci-op-9d6rs79x-15937.origin-ci-int-gce.dev.openshift.com:2380
      status: Ready
    - name: etcd-member-ci-op-kd2mp-m-2.c.openshift-gce-devel-ci.internal
      peerURLs: https://etcd-2.ci-op-9d6rs79x-15937.origin-ci-int-gce.dev.openshift.com:2380
      status: Unknown

1. https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/2745/pull-ci-openshift-installer-master-e2e-gcp/222/artifacts/e2e-gcp/pods/openshift-etcd-operator_etcd-operator-f78f5b65c-jzqlz_operator.log
2. https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-etcd-operator/68/pull-ci-openshift-cluster-etcd-operator-master-e2e-gcp-upgrade/195/artifacts/e2e-gcp-upgrade/pods/openshift-etcd-operator_etcd-operator-55f94bfd85-hhvck_operator.log
3. https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_release/6986/rehearse-6986-pull-ci-openshift-origin-master-e2e-conformance-k8s/5/artifacts/e2e-conformance-k8s/pods/openshift-etcd-operator_etcd-operator-bbd958bb7-k476j_operator.log
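For context on the error itself: "x509: certificate signed by unknown authority" is a client-side TLS verification failure in Go, meaning the server certificate presented by the bootstrap etcd does not chain to any root in the CA pool the operator dials with. The following is only a minimal sketch of where that check happens when building an etcd v3 client with an explicit root CA bundle; it is not the operator's actual code, and the CA bundle path parameter is hypothetical.
------
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"io/ioutil"
	"time"

	clientv3 "go.etcd.io/etcd/clientv3"
)

// newEtcdClient dials a single etcd endpoint, trusting only the CAs found in
// the PEM bundle at caBundlePath (a hypothetical path, for illustration only).
func newEtcdClient(endpoint, caBundlePath string) (*clientv3.Client, error) {
	caPEM, err := ioutil.ReadFile(caBundlePath)
	if err != nil {
		return nil, err
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, fmt.Errorf("no CA certificates found in %s", caBundlePath)
	}
	return clientv3.New(clientv3.Config{
		Endpoints:   []string{endpoint},
		DialTimeout: 5 * time.Second,
		// If the serving certificate of the bootstrap etcd does not chain to a
		// root in RootCAs, the gRPC dial fails with the same
		// "transport: authentication handshake failed: x509: certificate
		// signed by unknown authority" error seen in the operator logs above.
		TLS: &tls.Config{RootCAs: pool},
	})
}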
Seems like similar failures:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.4/986
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.4/987
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_installer/3059/pull-ci-openshift-installer-master-e2e-gcp/234
I suspect we see the same for oVirt, and it's blocking us: https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/3047/pull-ci-openshift-installer-master-e2e-ovirt/601/artifacts/e2e-ovirt/pods/openshift-etcd-operator_etcd-operator-6fbdf775c5-blkcw_operator.log
*** This bug has been marked as a duplicate of bug 1807169 ***
*** This bug has been marked as a duplicate of bug 1808060 ***