Please backport https://github.com/kubernetes/kubernetes/pull/77069 to OCP 3.11. It provides substantial robustness fixes when managing Azure OpenShift clusters using VM scale sets.
Backporting PR: https://github.com/openshift/origin/pull/22742
Additional notes: backporting kubernetes/kubernetes#77069 enables use of a new Microsoft API for VMSS management which was not GA when 1.11 was released. Using this API removes an entire class of race conditions from OpenShift/VMSS and hence ARO cluster management. By inspection, I am confident in the Microsoft team's decision-making with respect to the additional 1.11 fixes that we would like to backport here. They improve upstream robustness; applying them here will remove the delta between us and upstream (having a delta is not in our favour) and make it easier to collaborate with MSFT on any issues raised against this codebase in the future.
We found no regressions on Azure with 3.11.141. Marking as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2580