Bug 1837564 - Generic error when installer fails to create resources using terraform
Summary: Generic error when installer fails to create resources using terraform
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: Abhinav Dahiya
QA Contact: Mike Gahagan
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-05-19 16:32 UTC by Abhinav Dahiya
Modified: 2020-07-13 17:40 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-13 17:40:02 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift installer pull 3535 0 None closed Bug 1837564: pkg/terraform: add diagnostics errors for terraform apply operations 2020-09-11 09:06:43 UTC
Red Hat Product Errata RHBA-2020:2409 0 None None None 2020-07-13 17:40:18 UTC

Description Abhinav Dahiya 2020-05-19 16:32:08 UTC
Description of problem:

Whenever the installer fails to create resources using terraform, it outputs the errors from terraform as-is, but the actual FATAL error is completely generic and doesn't provide any insight into what could have caused the error.

see https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-api-provider-azure/130/pull-ci-openshift-cluster-api-provider-azure-master-e2e-azure/435/artifacts/e2e-azure/container-logs/setup.log

```
level=fatal msg="failed to fetch Cluster: failed to generate asset \"Cluster\": failed to create cluster: failed to apply using Terraform"
```

and the terraform error:
```
level=error
level=error msg="Error: Error Creating/Updating Subnet \"ci-op-641yvqj0-64576-lx9tt-worker-subnet\" (Virtual Network \"ci-op-641yvqj0-64576-lx9tt-vnet\" / Resource Group \"ci-op-641yvqj0-64576-lx9tt-rg\"): network.SubnetsClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code=\"AnotherOperationInProgress\" Message=\"Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri: https://management.azure.com/subscriptions/d38f1e38-4bed-438e-b227-833f997adf6a/providers/Microsoft.Network/locations/eastus2/operations/ad2d125e-f7c7-4da3-97f5-8df128e7e8e5?api-version=2019-09-01.\" Details=[]"
level=error
level=error msg="  on ../tmp/openshift-install-786806570/vnet/vnet.tf line 22, in resource \"azurerm_subnet\" \"worker_subnet\":"
level=error msg="  22: resource \"azurerm_subnet\" \"worker_subnet\" {"
level=error
```

The installer should provide more context in the error reported to the user, for example by summarizing the actual error and giving the user an action item.


Steps to Reproduce:

An easy way is to create an Azure cluster with an invalid release image. This causes the bootstrap to fail, so the master nodes cannot get their ignition.

This manifests as a terraform error `OSProvisioningTimedOut`, which doesn't really explain anything.

Comment 3 Mike Gahagan 2020-06-02 14:41:45 UTC
Looks like error messages are better now. Here is a failure caused by referencing a vnet that is in a different region than the region that was specified for the cluster:

INFO Credentials loaded from file "/home/m/.azure/osServicePrincipal.json" 
INFO Consuming Install Config from target directory 
INFO Creating infrastructure resources...         
ERROR                                              
ERROR Error: network.InterfacesClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="InvalidResourceReference" Message="Resource /subscriptions/REDACTED/resourceGroups/aro-v4-eastus/providers/Microsoft.Network/virtualNetworks/aro-vnet/subnets/master-subnet referenced by resource /subscriptions/REDACTED/resourceGroups/mgahagan-100206-hjj7f-rg/providers/Microsoft.Network/networkInterfaces/mgahagan-100206-hjj7f-bootstrap-nic was not found. Please make sure that the referenced resource exists, and that both resources are in the same region." Details=[] 
ERROR                                              
ERROR   on ../../../../tmp/openshift-install-280793528/bootstrap/main.tf line 100, in resource "azurerm_network_interface" "bootstrap": 
ERROR  100: resource "azurerm_network_interface" "bootstrap" { 
ERROR                                              
ERROR                                              
ERROR                                              
ERROR Error: network.InterfacesClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="InvalidResourceReference" Message="Resource /subscriptions/REDACTED/resourceGroups/aro-v4-eastus/providers/Microsoft.Network/virtualNetworks/aro-vnet/subnets/master-subnet referenced by resource /subscriptions/REDACTED/resourceGroups/mgahagan-100206-hjj7f-rg/providers/Microsoft.Network/networkInterfaces/mgahagan-100206-hjj7f-master1-nic was not found. Please make sure that the referenced resource exists, and that both resources are in the same region." Details=[] 
ERROR                                              
ERROR   on ../../../../tmp/openshift-install-280793528/master/master.tf line 9, in resource "azurerm_network_interface" "master": 
ERROR    9: resource "azurerm_network_interface" "master" { 
ERROR                                              
ERROR                                              
ERROR                                              
ERROR Error: network.InterfacesClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="InvalidResourceReference" Message="Resource /subscriptions/REDACTED/resourceGroups/aro-v4-eastus/providers/Microsoft.Network/virtualNetworks/aro-vnet/subnets/master-subnet referenced by resource /subscriptions/REDACTED/resourceGroups/mgahagan-100206-hjj7f-rg/providers/Microsoft.Network/networkInterfaces/mgahagan-100206-hjj7f-master0-nic was not found. Please make sure that the referenced resource exists, and that both resources are in the same region." Details=[] 
ERROR                                              
ERROR   on ../../../../tmp/openshift-install-280793528/master/master.tf line 9, in resource "azurerm_network_interface" "master": 
ERROR    9: resource "azurerm_network_interface" "master" { 
ERROR                                              
ERROR                                              
ERROR                                              
ERROR Error: network.InterfacesClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="InvalidResourceReference" Message="Resource /subscriptions/REDACTED/resourceGroups/aro-v4-eastus/providers/Microsoft.Network/virtualNetworks/aro-vnet/subnets/master-subnet referenced by resource /subscriptions/REDACTED/resourceGroups/mgahagan-100206-hjj7f-rg/providers/Microsoft.Network/networkInterfaces/mgahagan-100206-hjj7f-master2-nic was not found. Please make sure that the referenced resource exists, and that both resources are in the same region." Details=[] 
ERROR                                              
ERROR   on ../../../../tmp/openshift-install-280793528/master/master.tf line 9, in resource "azurerm_network_interface" "master": 
ERROR    9: resource "azurerm_network_interface" "master" { 
ERROR                                              
ERROR                                              
ERROR                                              
ERROR Error: Error Creating/Updating Load Balancer "mgahagan-100206-hjj7f-internal" (Resource Group "mgahagan-100206-hjj7f-rg"): network.LoadBalancersClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="InvalidResourceReference" Message="Resource /subscriptions/REDACTED/resourceGroups/ARO-V4-EASTUS/providers/Microsoft.Network/virtualNetworks/ARO-VNET referenced by resource /subscriptions/REDACTED/resourceGroups/mgahagan-100206-hjj7f-rg/providers/Microsoft.Network/loadBalancers/mgahagan-100206-hjj7f-internal was not found. Please make sure that the referenced resource exists, and that both resources are in the same region." Details=[{"code":"NotFound","message":"Resource /subscriptions/REDACTED/resourceGroups/ARO-V4-EASTUS/providers/Microsoft.Network/virtualNetworks/ARO-VNET not found."}] 
ERROR                                              
ERROR   on ../../../../tmp/openshift-install-280793528/vnet/internal-lb.tf line 6, in resource "azurerm_lb" "internal": 
ERROR    6: resource "azurerm_lb" "internal" {     
ERROR                                              
ERROR                                              
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply Terraform: failed to complete the change

The "Please make sure that the referenced resource exists, and that both resources are in the same region." comes directly from Azure.
Verified with 4.5.0-0.nightly-2020-06-01-165039

Comment 4 errata-xmlrpc 2020-07-13 17:40:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

