Description of problem: Installation in Azure environment ends with a failure. This is consistently noted across multiple installation attempts with different nightly images made where the following error is most commonly noted in the ingress controller deployment logs: ----- 2020-12-18T01:49:09.371Z INFO operator.ingress_controller controller/controller.go:235 reconciling {"request": "openshift-ingress-operator/default"} 2020-12-18T01:49:09.568Z ERROR operator.ingress_controller controller/controller.go:235 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: LoadBalancerReady=False (SyncLoadBalancerFailed: The service-controller component is reporting SyncLoadBalancerFailed events like: Error syncing load balancer: failed to ensure load balancer: invalid ip config ID /subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/xxia18az-djhkb-rg/providers/Microsoft.Network/networkInterfaces/xxia18az-djhkb-master0-nic/ipConfigurations/pipConfig\nThe kube-controller-manager logs may contain more details.), CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)"} ----- The problem is not seen for other envs like AWS/GCP. Version-Release number of selected component (if applicable): 4.7.0-0.nightly-2020-12-17-224915 4.7.0-0.nightly-2020-12-17-201522 How reproducible: Frequently Steps to Reproduce: 1. Initiate deployment of cluster using latest ocp v4.7 nightly images in Azure environment Actual results: The deployment will end up in a failure and the following errors could be seen in the ingress operator logs: ----- 2020-12-18T04:37:18.108Z ERROR operator.ingress_controller controller/controller.go:235 got retryable error; requeueing {"after": "48.906300155s", "error": "IngressController may become degraded soon: LoadBalancerReady=False, CanaryChecksSucceeding=False"} 2020-12-18T04:38:07.002Z INFO operator.ingress_controller controller/controller.go:235 reconciling {"request": "openshift-ingress-operator/default"} 2020-12-18T04:38:07.118Z ERROR operator.canary_controller wait/wait.go:155 error performing canary route check {"error": "error sending canary HTTP request: DNS erro r: Get \"http://canary-openshift-ingress-canary.apps.hongli-az47.qe.azure.devcluster.openshift.com\": dial tcp: lookup canary-openshift-ingress-canary.apps.hongli-az47.qe.azure.devcluster.op enshift.com on 172.30.0.10:53: no such host"} 2020-12-18T04:38:07.150Z INFO operator.status_controller controller/controller.go:235 Reconciling {"request": "openshift-ingress-operator/default"} 2020-12-18T04:39:07.301Z ERROR operator.ingress_controller controller/controller.go:235 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)"} 2020-12-18T04:40:07.301Z INFO operator.ingress_controller controller/controller.go:235 reconciling {"request": "openshift-ingress-operator/default"} 2020-12-18T04:40:07.344Z ERROR operator.canary_controller wait/wait.go:155 error performing canary route check {"error": "error sending canary HTTP request: DNS error: Get \"http://canary-openshift-ingress-canary.apps.hongli-az47.qe.azure.devcluster.openshift.com\": dial tcp: lookup canary-openshift-ingress-canary.apps.hongli-az47.qe.azure.devcluster.openshift.com on 172.30.0.10:53: no such host"} 2020-12-18T04:40:07.462Z ERROR operator.ingress_controller controller/controller.go:235 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: LoadBalancerReady=False (SyncLoadBalancerFailed: The service-controller component is reporting SyncLoadBalancerFailed events like: Error syncing load balancer: failed to ensure load balancer: invalid ip config ID /subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/hongli-az47-b4tcb-rg/providers/Microsoft.Network/networkInterfaces/hongli-az47-b4tcb-bootstrap-nic/ipConfigurations/bootstrap-nic-ip-v4\nThe kube-controller-manager logs may contain more details.), CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)"} ------ Expected results: The installation should succeed in the Azure environment.
Created attachment 1740167 [details] Reference must-gather data from one of the failing clusters
*** This bug has been marked as a duplicate of bug 1908389 ***