For example, in 4.1.18 -> 4.3.1 CI [1]:

  $ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-azure-upgrade/39/build-log.txt | sort | uniq | grep -c 'etcdserver: leader changed'
  6
  $ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-azure-upgrade/39/build-log.txt | sort | uniq | grep -c 'etcdserver: request timed out'
  46

Some previous discussion of slow Azure disks in [2].  Some previous discussion on the sorts of things that can go wrong when etcd cannot keep up in bug 1775878.

Additional examples: 4.3 [3]:

  $ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-4.3/891/build-log.txt | sort | uniq | grep -c 'etcdserver: leader changed'
  15

and 4.4 [4]:

  $ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-4.4/702/build-log.txt | sort | uniq | grep -c 'etcdserver: leader changed'
  4

Query to search for these:

  $ curl -s 'https://search.svc.ci.openshift.org/search?search=etcdserver%3A+leader+changed&maxAge=336h&context=0&type=build-log' | jq -r '[. | to_entries[] | .hits = ([.value["etcdserver: leader changed"][].context[]] | unique | length)] | sort_by(.hits)[] | (.hits | tostring) + " " + .key' | tail
  5 https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/16355
  5 https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_installer/3036/pull-ci-openshift-installer-release-4.3-e2e-azure/113
  6 https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-azure-upgrade/39
  6 https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/24405/pull-ci-openshift-origin-release-4.3-e2e-azure/196
  8 https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/16457
  10 https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-azure-compact-4.3/105
  10 https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-kube-apiserver-operator/746/pull-ci-openshift-cluster-kube-apiserver-operator-master-e2e-aws-upgrade/1245
  10 https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_installer/2433/pull-ci-openshift-installer-master-e2e-aws-upgrade/4676
  12 https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-ovn-4.3/385
  13 https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-4.3/891

You can see that this is not unique to Azure, but Azure is over-represented.

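The per-job counts above deduplicate log lines before counting, so a message repeated verbatim is only counted once. A minimal sketch of that step as a reusable shell function, assuming the build log arrives on stdin; the name count_unique_matches is made up for illustration and is not part of any CI tooling:

```shell
# count_unique_matches PATTERN
# Hypothetical helper: reads a log on stdin and prints the number of
# distinct lines that contain PATTERN as a fixed string (not a regex).
count_unique_matches() {
  # LC_ALL=C makes sort compare bytes, so only identical lines collapse.
  # sort -u is equivalent to 'sort | uniq'.
  LC_ALL=C sort -u | grep -cF -- "$1"
}
```

Piping one of the build-log.txt downloads into it, for example `curl -s <build-log URL> | count_unique_matches 'etcdserver: leader changed'`, should give the same kind of count as the one-liners above. Note that grep exits non-zero when the count is 0, which matters under `set -e`.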
[1]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-azure-upgrade/39#1:build-log.txt%3A1148
[2]: https://github.com/openshift/installer/pull/2186
[3]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-4.3/891
[4]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-4.4/702
Also in this space is work to alert cluster admins when their hardware underperforms: bug 1793183. That would help with diagnosing problems like this, but would obviously not magically make Azure's disks faster, so it is distinct from this bug ;).
*** This bug has been marked as a duplicate of bug 1806700 ***