Description of problem:
Ingress failing on Azure with 'SyncLoadBalancerFailed'. Azure cluster setup fails because ingress is broken. KCM reports:

level=error msg=Cluster operator ingress Degraded is True with IngressControllersDegraded: Some ingresscontrollers are degraded: ingresscontroller "default" is degraded: DegradedConditions: One or more other status conditions indicate a degraded state: LoadBalancerReady=False (SyncLoadBalancerFailed: The service-controller component is reporting SyncLoadBalancerFailed events like: Error syncing load balancer: failed to ensure load balancer: invalid ip config ID /subscriptions/d38f1e38-4bed-438e-b227-833f997adf6a/resourceGroups/ci-op-xsr7hy3v-9b656-xrfd2-rg/providers/Microsoft.Network/networkInterfaces/ci-op-xsr7hy3v-9b656-xrfd2-master0-nic/ipConfigurations/pipConfig

Version-Release number of selected component (if applicable):
4.7

How reproducible:
100%

Additional info:
Example failing job: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-azure-4.7/1339191619219361792
https://search.ci.openshift.org/?search=failed+to+ensure+load+balancer%3A+invalid+ip+config+ID&maxAge=336h&context=1&type=bug%2Bjunit&name=azure&maxMatches=5&maxBytes=20971520&groupBy=job
First appeared shortly after the 1.20 rebase: https://github.com/openshift/kubernetes/pull/471#event-4110268165
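For context, the `ipConfigurations` ID in the error above follows Azure's alternating key/value resource-ID layout, which the Azure cloud provider parses and validates when syncing the load balancer. A minimal, hypothetical Python sketch of splitting such an ID into its segments (`parse_ip_config_id` is illustrative only, not the cloud provider's actual parser):

```python
def parse_ip_config_id(resource_id: str) -> dict:
    """Naively split an Azure resource ID into key/value segments.

    Azure resource IDs alternate key/value pairs, e.g.
    /subscriptions/<id>/resourceGroups/<rg>/providers/<ns>/...
    This hypothetical helper only illustrates the ID structure.
    """
    parts = resource_id.strip("/").split("/")
    if not parts or len(parts) % 2 != 0:
        raise ValueError(f"invalid ip config ID {resource_id}")
    return {parts[i]: parts[i + 1] for i in range(0, len(parts), 2)}


# The exact ID from the error message in this bug report:
example = (
    "/subscriptions/d38f1e38-4bed-438e-b227-833f997adf6a"
    "/resourceGroups/ci-op-xsr7hy3v-9b656-xrfd2-rg"
    "/providers/Microsoft.Network"
    "/networkInterfaces/ci-op-xsr7hy3v-9b656-xrfd2-master0-nic"
    "/ipConfigurations/pipConfig"
)
parsed = parse_ip_config_id(example)
print(parsed["ipConfigurations"])  # pipConfig
```

The ID itself is structurally well-formed, which is consistent with the failure being a behavior change in the cloud provider's validation after the 1.20 rebase rather than a malformed ID.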
Sending this over to the network team, who own the ingress operator, to identify what is missing and needs updating after the k8s 1.20 rebase.
*** Bug 1908052 has been marked as a duplicate of this bug. ***
We created an issue upstream: https://github.com/kubernetes/enhancements/pull/1116 The person who introduced the breaking change has assigned themselves. No timetable yet; we may want a patch to land downstream first, with an upstream fix hopefully in the works.
I've filed [1] upstream with the Availability Set issue.

[1]: https://github.com/kubernetes/kubernetes/issues/97375
*** Bug 1909006 has been marked as a duplicate of this bug. ***
*** Bug 1908489 has been marked as a duplicate of this bug. ***
Commenting for the benefit of build watchers and Sippy to link this BZ to the following tests, which are currently failing because of the failed cluster installation in an Azure environment:

- operator conditions authentication
- operator conditions console
- operator conditions ingress
- operator install authentication
- operator install console
- operator install ingress

Latest failure: https://prow.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-4.7/1346875212452335616
verified with 4.7.0-0.nightly-2021-01-07-080803 and passed.

# oc -n openshift-ingress get svc
NAME                      TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)                      AGE
router-default            LoadBalancer   172.30.104.234   52.252.144.92   80:32233/TCP,443:32292/TCP   36m
router-internal-default   ClusterIP      172.30.208.255   <none>          80/TCP,443/TCP,1936/TCP      36m

# oc get co/ingress
NAME      VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
ingress   4.7.0-0.nightly-2021-01-07-080803   True        False         False      30m

creating one custom ingresscontroller also works well

# oc -n openshift-ingress get svc
NAME                      TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)                      AGE
router-default            LoadBalancer   172.30.104.234   52.252.144.92   80:32233/TCP,443:32292/TCP   38m
router-internal-default   ClusterIP      172.30.208.255   <none>          80/TCP,443/TCP,1936/TCP      38m
router-internal-test      ClusterIP      172.30.211.21    <none>          80/TCP,443/TCP,1936/TCP      53s
router-test               LoadBalancer   172.30.211.65    10.0.32.7       80:30966/TCP,443:32636/TCP   53s
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633