I don't know how common the issue is. It was captured by the scaling test we added. A newly created machine was promoted to a voting member but never made it to the etcd-endpoints configmap.

Timeline:
At 19:37:48: CEO successfully promoted learner member https://10.0.0.7:2380
At ~19:42:33: the newly promoted member (ID: 9fc4382989977f7e) was elected leader at term 8
At ~19:44:22: the test deleted the machine

The machine was never deleted because the removal controller reads data from the etcd-endpoints configmap, which indicated no excess machines.

Link to CI run: https://prow.ci.openshift.org/view/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-gcp-fips-serial/1527713140295340032
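To illustrate the mismatch described above, here is a minimal sketch of the check one could run by hand: compare the peer URLs of the voting members against the addresses recorded in the configmap. This is not the operator's code. The function name, the sample member keys, and the addresses other than 10.0.0.7 are hypothetical, and the sketch assumes the configmap data maps some member key to a bare IP address.

```python
from urllib.parse import urlparse


def voting_members_missing_from_configmap(peer_urls, configmap_data):
    """Return hosts of voting members that have no entry in the configmap.

    peer_urls: peer URLs of the current voting members (e.g. from a member list).
    configmap_data: assumed shape {member_key: ip_address}; hypothetical layout.
    """
    configmap_ips = set(configmap_data.values())
    missing = []
    for url in peer_urls:
        host = urlparse(url).hostname  # strips the scheme and the :2380 port
        if host not in configmap_ips:
            missing.append(host)
    return missing


# Hypothetical data: four voting members, but only three configmap entries.
peer_urls = [
    "https://10.0.0.3:2380",
    "https://10.0.0.4:2380",
    "https://10.0.0.5:2380",
    "https://10.0.0.7:2380",  # the newly promoted member from the timeline
]
configmap_data = {
    "member-a": "10.0.0.3",
    "member-b": "10.0.0.4",
    "member-c": "10.0.0.5",
}

print(voting_members_missing_from_configmap(peer_urls, configmap_data))
```

A non-empty result would mean that a controller relying only on the configmap sees fewer members than etcd actually has, which matches the symptom in this report.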
I think I have another run with this issue: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-multiarch-master-nightly-4.11-ocp-e2e-serial-aws-arm64/1533937747629182976
More results: https://search.ci.openshift.org/?search=unexpected+number+of+voting+members+in+the+openshift-etcd%2Fetcd-endpoints&maxAge=48h&context=1&type=all&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job
After discussing with @tjungblu and @htariq, we decided that this shouldn't be a blocker+ because it isn't perma-failing, it currently doesn't have a reproducer, and we haven't heard anything from the field.