When installing an Azure IPv6 cluster with the 4.3 installer, it errors out while destroying the bootstrap resources:

ERROR
ERROR Error: rpc error: code = Unavailable desc = transport is closing
ERROR
ERROR
ERROR
ERROR Error: rpc error: code = Unavailable desc = transport is closing
ERROR
ERROR
ERROR
ERROR Error: rpc error: code = Unavailable desc = transport is closing
ERROR
ERROR
ERROR
ERROR Error: rpc error: code = Unavailable desc = transport is closing
ERROR
ERROR
ERROR
ERROR Error: rpc error: code = Unavailable desc = transport is closing
ERROR
ERROR
ERROR
ERROR Error: rpc error: code = Unavailable desc = transport is closing
ERROR
ERROR
ERROR
ERROR Error: rpc error: code = Unavailable desc = transport is closing
ERROR
ERROR
ERROR
ERROR Error: rpc error: code = Unavailable desc = transport is closing
ERROR
ERROR
FATAL Terraform destroy: failed to destroy using Terraform

This does not seem to happen with IPv4 installs, and does not seem to happen with the 4.4 installer.
lol, ok, it happens in 4.4 too, I just hadn't noticed because apparently I had never gotten a 4.4 IPv6 install to the point where the installer tears down the bootstrap host before...
> Target Release: --- → 4.5.0

Note that if you're going to try to debug this against git master, you'll have to fix bug 1805251 first...
I can confirm that I receive the same errors when following the instructions.
I have found the problem and have a patch for it. I will write up the details tomorrow.
The problem seems to be with the address_prefix field when terraform is refreshing state for the subnets within the virtual network. The "address_prefix" field is null, while "address_prefixes" gets populated. The crash occurs in this section of the terraform code:

terraform-provider-azurerm/azurerm/internal/services/network/resource_arm_virtual_network.go:

    func resourceAzureSubnetHash(v interface{}) int {
        var buf bytes.Buffer

        if m, ok := v.(map[string]interface{}); ok {
            buf.WriteString(m["name"].(string))
            // This is causing the crash
            buf.WriteString(m["address_prefix"].(string))
            if v, ok := m["security_group"]; ok {
                buf.WriteString(v.(string))
            }
        }

        return hashcode.String(buf.String())
    }

The fix I have tested with is:

    func resourceAzureSubnetHash(v interface{}) int {
        var buf bytes.Buffer

        if m, ok := v.(map[string]interface{}); ok {
            buf.WriteString(m["name"].(string))
            if v, ok := m["address_prefix"]; ok {
                buf.WriteString(v.(string))
            }
            if v, ok := m["security_group"]; ok {
                buf.WriteString(v.(string))
            }
        }

        return hashcode.String(buf.String())
    }

There are still problems with this fix, though. I don't know if this is masking another problem. Was this known to work before this problem was found? I can submit a patch for this, but it looks like more work is required.
> Was this known to work before this problem was found?

It works in 4.4. Presumably this function changed in the rebase?

Probably it needs to look at both address_prefix and address_prefixes, and use whichever is set.
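For illustration, here is a rough sketch of what "use whichever is set" could look like. It is not the provider's actual code: subnetHash and writeString are hypothetical names, crc32 stands in for hashcode.String so the example runs on its own, and it assumes "address_prefixes" arrives as a []interface{} of strings, which has not been verified against the vendored schema.

```go
package main

import (
	"bytes"
	"fmt"
	"hash/crc32"
)

// writeString appends v to buf only when v is a non-empty string.
// A nil or non-string value is skipped instead of panicking, which is
// what the unchecked m["address_prefix"].(string) assertion did.
func writeString(buf *bytes.Buffer, v interface{}) bool {
	s, ok := v.(string)
	if !ok || s == "" {
		return false
	}
	buf.WriteString(s)
	return true
}

// subnetHash is a hypothetical variant of resourceAzureSubnetHash.
func subnetHash(v interface{}) int {
	var buf bytes.Buffer

	if m, ok := v.(map[string]interface{}); ok {
		writeString(&buf, m["name"])
		// Prefer address_prefix; fall back to address_prefixes when it is
		// null, empty, or absent (assumed shape: []interface{} of strings).
		if !writeString(&buf, m["address_prefix"]) {
			if prefixes, ok := m["address_prefixes"].([]interface{}); ok {
				for _, p := range prefixes {
					writeString(&buf, p)
				}
			}
		}
		writeString(&buf, m["security_group"])
	}

	// Stand-in for hashcode.String, used only to keep the sketch self-contained.
	return int(crc32.ChecksumIEEE(buf.Bytes()))
}

func main() {
	v4 := map[string]interface{}{"name": "master", "address_prefix": "10.0.0.0/24"}
	v6 := map[string]interface{}{
		"name":             "master",
		"address_prefix":   nil, // what the refresh reportedly returns for IPv6
		"address_prefixes": []interface{}{"fd00::/64"},
	}
	fmt.Println(subnetHash(v4), subnetHash(v6))
}
```

With this shape a null "address_prefix" no longer panics, and two subnets that differ only in "address_prefixes" still hash differently, which the tested fix (skipping the field) would not guarantee.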
I was not able to get a 4.4 IPv6 install to work. I am trying with image release:4.3.0-0.nightly-2020-02-17-205936-ipv6.1d3. Is there a 4.4 image known to work? I was following the instructions as per the document in this ticket.

For 4.3, with the following diff I get an almost complete install:

diff --git a/pkg/terraform/exec/plugins/vendor/github.com/terraform-providers/terraform-provider-azurerm/azurerm/resource_arm_loadbalancer.go b/pkg/terraform/exec/plugins/ven
index 4950399..ad92244 100644
--- a/pkg/terraform/exec/plugins/vendor/github.com/terraform-providers/terraform-provider-azurerm/azurerm/resource_arm_loadbalancer.go
+++ b/pkg/terraform/exec/plugins/vendor/github.com/terraform-providers/terraform-provider-azurerm/azurerm/resource_arm_loadbalancer.go
@@ -68,10 +68,10 @@ func resourceArmLoadBalancer() *schema.Resource {
             },
 
             "private_ip_address": {
-                Type:         schema.TypeString,
-                Optional:     true,
-                Computed:     true,
-                ValidateFunc: validate.IPv4AddressOrEmpty,
+                Type:     schema.TypeString,
+                Optional: true,
+                //Computed:     true,
+                //ValidateFunc: validate.IPv4AddressOrEmpty,
             },
 
             "private_ip_address_version": {
diff --git a/pkg/terraform/exec/plugins/vendor/github.com/terraform-providers/terraform-provider-azurerm/azurerm/resource_arm_virtual_network.go b/pkg/terraform/exec/plugins/
index 968fc70..762449b 100644
--- a/pkg/terraform/exec/plugins/vendor/github.com/terraform-providers/terraform-provider-azurerm/azurerm/resource_arm_virtual_network.go
+++ b/pkg/terraform/exec/plugins/vendor/github.com/terraform-providers/terraform-provider-azurerm/azurerm/resource_arm_virtual_network.go
@@ -416,7 +416,10 @@ func resourceAzureSubnetHash(v interface{}) int {
 
     if m, ok := v.(map[string]interface{}); ok {
         buf.WriteString(m["name"].(string))
-        buf.WriteString(m["address_prefix"].(string))
+        //buf.WriteString(m["address_prefix"].(string))
+        if a, ok := m["address_prefix"]; ok {
+            buf.WriteString(a.(string))
+        }
         if v, ok := m["security_group"]; ok {
             buf.WriteString(v.(string))

The install does bomb out waiting for operators to become stable. With image release:4.3.0-0.nightly-2020-02-17-205936-ipv6.1d3, there are several. With image release:4.3.0-0.nightly-2020-02-21-091838-ipv6.2d9 it is just authentication and console.

As for the address_prefix, nothing changed in 4.4, except for some dependency clauses in the terraform and removal of an unused route table:

diff --git a/dns/dns.tf b/dns/dns.tf
index 69c0431..5816448 100644
--- a/dns/dns.tf
+++ b/dns/dns.tf
@@ -6,6 +6,8 @@ locals {
 resource "azureprivatedns_zone" "private" {
   name                = var.cluster_domain
   resource_group_name = var.resource_group_name
+
+  depends_on = [azurerm_dns_cname_record.api_external_v4, azurerm_dns_cname_record.api_external_v6]
 }
 
 resource "azureprivatedns_zone_virtual_network_link" "network" {
diff --git a/vnet/vnet.tf b/vnet/vnet.tf
index 7328fba..ddbd632 100644
--- a/vnet/vnet.tf
+++ b/vnet/vnet.tf
@@ -7,12 +7,6 @@ resource "azurerm_virtual_network" "cluster_vnet" {
   address_space       = concat(var.vnet_v4_cidrs, var.vnet_v6_cidrs)
 }
 
-resource "azurerm_route_table" "route_table" {
-  name                = "${var.cluster_id}-node-routetable"
-  location            = var.region
-  resource_group_name = var.resource_group_name
-}
-
 resource "azurerm_subnet" "master_subnet" {
   count = var.preexisting_network ? 0 : 1

Is anything here making any sense? Should a pull request be opened with the proposed fixes? Any pointers?
(cc: Clayton since it relates to your original Azure IPv6 modifications and your pending external PR. cc: Dan Mace because this may be part of the cause of bug 1806067.)

(In reply to John Hixson from comment #14)
> I was not able to get a 4.4 IPv6 install to work. I am trying with image
> release:4.3.0-0.nightly-2020-02-17-205936-ipv6.1d3. Is there a 4.4 image
> known to work?

There is not yet any image that will get you a complete successful install on Azure, but I think at this point the latest official 4.3 and 4.4 images will get you to the point where the installer destroys the bootstrap resources.

> For 4.3, with the following diff I get an almost complete install:
>              "private_ip_address": {
> -                Type:         schema.TypeString,
> -                Optional:     true,
> -                Computed:     true,
> -                ValidateFunc: validate.IPv4AddressOrEmpty,
> +                Type:     schema.TypeString,
> +                Optional: true,
> +                //Computed:     true,
> +                //ValidateFunc: validate.IPv4AddressOrEmpty,

This is one of the files that Clayton modified when adding Azure IPv6 support (https://github.com/openshift/installer/commit/3d00b6c#diff-f5330fc6f6380bb4171ec763f15d803a / https://github.com/terraform-providers/terraform-provider-azurerm/pull/5590). Presumably it's failing because "private_ip_address" actually contains an IPv6 address here and you need to add an appropriate validator for that to pkg/terraform/exec/plugins/vendor/github.com/terraform-providers/terraform-provider-azurerm/azurerm/helpers/validate/network.go.

> -        buf.WriteString(m["address_prefix"].(string))
> +        //buf.WriteString(m["address_prefix"].(string))
> +        if a, ok := m["address_prefix"]; ok {
> +            buf.WriteString(a.(string))
> +        }

I think you need to use "address_prefixes" here if "address_prefix" isn't set.

> The install does bomb out waiting for operators to become stable.

Yup. Expected. Ingress isn't working yet. (Although maybe that's partly because of the problems here?)

> As for the address_prefix, nothing changed in 4.4, except for some
> dependency clauses in the terraform and removal of an unused route table:

Yeah, it seems like I was just confused about it being 4.3-only.
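As a rough sketch of the kind of validator meant above: the function below accepts an empty string or any IPv4/IPv6 address, using the (value, key) -> (warnings, errors) shape that Terraform schema validators take. The name ipAddressOrEmpty is made up for this example and is not an existing helper in validate/network.go; whether the provider would want a combined or an IPv6-only validator is an open question.

```go
package main

import (
	"fmt"
	"net"
)

// ipAddressOrEmpty is a hypothetical validator: it accepts "" or anything
// net.ParseIP understands, which covers both IPv4 and IPv6 literals.
func ipAddressOrEmpty(i interface{}, k string) (warnings []string, errs []error) {
	v, ok := i.(string)
	if !ok {
		errs = append(errs, fmt.Errorf("expected %q to be a string", k))
		return warnings, errs
	}
	if v == "" {
		return warnings, errs
	}
	if net.ParseIP(v) == nil {
		errs = append(errs, fmt.Errorf("expected %q to be a valid IPv4 or IPv6 address, got %q", k, v))
	}
	return warnings, errs
}

func main() {
	for _, addr := range []string{"", "10.0.0.4", "fd00::5", "not-an-ip"} {
		_, errs := ipAddressOrEmpty(addr, "private_ip_address")
		fmt.Printf("%q: %d error(s)\n", addr, len(errs))
	}
}
```

Swapping something like this in for validate.IPv4AddressOrEmpty would keep validation on the field instead of commenting it out, as the diff above does.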
(In reply to Dan Winship from comment #8)
> lol, ok, it happens in 4.4 too, I just hadn't noticed because apparently I
> had never gotten a 4.4 IPv6 install to the point where the installer tears
> down the bootstrap host before...

I haven't had any luck with 4.4. It always fails with an OSProvisioningTimedOut. For now, I think I will just work on the 4.3 problem and open up a PR for it. I will handle 4.4 separately.
I just tried 4.4.0-0.nightly-2020-03-03-110909 (with an appropriate install-config and OPENSHIFT_INSTALL_AZURE_EMULATE_SINGLESTACK_IPV6=true) and got the terraform errors.
Can this be synthesized without a complete and successful IPv6 install? i.e.:

1) `openshift-install create cluster`, wait for failure
2) `openshift-install destroy cluster`, confirm bootstrap host destroy failure
3) fix `openshift-install`
4) `openshift-install destroy cluster`, successful removal of bootstrap host
*** This bug has been marked as a duplicate of bug 1805251 ***