Description:
GCP UPI installation fails because the instance group name exceeds the GCP limit of at most 63 characters. The UPI template defines the instance group name as:

  'name': context.properties['infra_id'] + '-master-' + zone + '-instance-group'

In IPI it is:

  name = "${var.cluster_id}-master-${var.zones[count.index]}"

It would be good to shorten the instance group name in UPI and align it with IPI.

gcloud deployment-manager deployments create ${INFRA_ID}-infra --config 02_infra.yaml
ERROR: (gcloud.deployment-manager.deployments.create) Error in Operation [operation-1611632607807-5b9c57518c84f-43398d57-d16056e6]: errors:
- code: RESOURCE_ERROR
  location: /deployments/storage-47h3p-5z6c9-lb/resources/storage-47h3p-5z6c9-master-northamerica-northeast1-c-instance-group
  message: "{\"ResourceType\":\"compute.v1.instanceGroup\",\"ResourceErrorCode\":\"\
    400\",\"ResourceErrorMessage\":{\"code\":400,\"errors\":[{\"domain\":\"global\"\
    ,\"message\":\"Invalid value for field 'resource.name': 'storage-47h3p-5z6c9-master-northamerica-northeast1-c-instance-group'.\
    \ Must be a match of regex '(?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?)'\"

Version: 4.7.0-0.nightly-2021-01-22-134922
Platform: GCP
Please specify:
* UPI (semi-manual installation on customized infrastructure)

Actual result: GCP UPI failed
Expected result: GCP UPI passed
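For reference, a minimal sketch of checking a candidate name against the regex quoted in the error message. The helper name `is_valid_gcp_name` is hypothetical, and the sketch assumes GCP applies the pattern to the whole name (a full match), which is what caps the length at 63 characters.

```python
import re

# Pattern copied from the error message in this report.
GCP_NAME_RE = re.compile(r"(?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?)")


def is_valid_gcp_name(name: str) -> bool:
    """Return True if the whole name matches the pattern (hypothetical helper)."""
    return GCP_NAME_RE.fullmatch(name) is not None


# Instance group name from the failing UPI deployment in this report.
failing = "storage-47h3p-5z6c9-master-northamerica-northeast1-c-instance-group"

# The same name without the '-instance-group' suffix, i.e. the IPI format.
ipi_style = "storage-47h3p-5z6c9-master-northamerica-northeast1-c"

print(len(failing), is_valid_gcp_name(failing))      # 67 False
print(len(ipi_style), is_valid_gcp_name(ipi_style))  # 52 True
```

The pattern allows one leading letter, up to 61 middle characters, and one trailing letter or digit, so the 67-character UPI name is rejected while the 52-character IPI-style name passes.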
The problem was originally found through our automation using:
  cluster name: storage-47h3p
  region: northamerica-northeast1
With:
  name: tsze-longname-1234567890-1234567890-1234567890-1a
  region: us-central1

we get a slightly different error:

$ gcloud deployment-manager deployments create ${INFRA_ID}-infra --config 02_infra.yaml
The fingerprint of the deployment is b'7OO8Dixsswx3FLSVKnstoQ=='
Waiting for create [operation-1611860224697-5b9fa741ee455-05c0b600-5f0735e7]...failed.
ERROR: (gcloud.deployment-manager.deployments.create) Error in Operation [operation-1611860224697-5b9fa741ee455-05c0b600-5f0735e7]: errors:
- code: RESOURCE_ERROR
  location: /deployments/tsze-longname-1234567-82bbw-infra/resources/tsze-longname-1234567-82bbw-cluster-ip
  message: "{\"ResourceType\":\"compute.v1.address\",\"ResourceErrorCode\":\"400\"\
    ,\"ResourceErrorMessage\":{\"code\":400,\"errors\":[{\"domain\":\"global\",\"\
    message\":\"Invalid value for field 'resource.subnetwork': 'https://www.googleapis.com/compute/v1/projects/openshift-qe/regions/northamerica-northeast1/subnetworks/tsze-upi-long-1234567-4mfs6-master-subnet'.\
    \ Subnetwork must be in the same region.\",\"reason\":\"invalid\"}],\"message\"\
    :\"Invalid value for field 'resource.subnetwork': 'https://www.googleapis.com/compute/v1/projects/openshift-qe/regions/northamerica-northeast1/subnetworks/tsze-upi-long-1234567-4mfs6-master-subnet'.\
    \ Subnetwork must be in the same region.\",\"statusMessage\":\"Bad Request\",\"\
    requestPath\":\"https://compute.googleapis.com/compute/v1/projects/openshift-qe/regions/us-central1/addresses\"\
    ,\"httpMethod\":\"POST\"}}"
Personally, I don't think "storage-47h3p" is an unacceptably long cluster name. An IPI install with the same cluster name and region uses a different format to name these instance groups.
Please ignore my last comment. Looks like I made a mistake. I tried again with a UPI install using a long name in us-central1. The install finished and the nodes are:

$ ./oc get nodes
NAME                                                      STATUS   ROLES    AGE    VERSION
tsze-alongname-123456-65kln-m-0.c.openshift-qe.internal   Ready    master   120m   v1.20.0+4b40bb4
tsze-alongname-123456-65kln-m-1.c.openshift-qe.internal   Ready    master   120m   v1.20.0+4b40bb4
tsze-alongname-123456-65kln-m-2.c.openshift-qe.internal   Ready    master   120m   v1.20.0+4b40bb4
tsze-alongname-123456-65kln-worker-a-gdbr6                Ready    worker   111m   v1.20.0+4b40bb4
tsze-alongname-123456-65kln-worker-b-pnf7d                Ready    worker   111m   v1.20.0+4b40bb4

Sorry.
We're lowering the severity since the templates are mostly for reference.
The differences between IPI and UPI can be seen here:
IPI: https://github.com/openshift/installer/blob/3bc71bb64699e5f96b0e9716609bf8ed0560fcfe/data/data/gcp/master/main.tf#L74
UPI: https://github.com/openshift/installer/blob/3bc71bb64699e5f96b0e9716609bf8ed0560fcfe/upi/gcp/02_lb_int.py#L54

Basically, the UPI template appends `-instance-group` to the end of the name used by IPI. This is because GCP Deployment Manager objects must have unique names, so we appended the resource type to the end of all names. We could shorten the suffix here to `-ig`, which would reduce the overall character count.
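To make the effect of the suffix concrete, here is a rough sketch that builds the instance group name the way the UPI template does and compares suffix choices against the 63-character limit. The function `instance_group_name` and the constant `GCP_NAME_MAX` are hypothetical names for illustration, not part of the installer; the infra ID and zone are taken from the failing deployment in the description.

```python
GCP_NAME_MAX = 63  # limit implied by the regex in the reported error


def instance_group_name(infra_id: str, zone: str, suffix: str = "") -> str:
    """Build '<infra_id>-master-<zone><suffix>' (hypothetical helper)."""
    return infra_id + "-master-" + zone + suffix


infra_id = "storage-47h3p-5z6c9"
zone = "northamerica-northeast1-c"

# Current UPI suffix, the proposed shorter suffix, and no suffix (IPI format).
for suffix in ("-instance-group", "-ig", ""):
    name = instance_group_name(infra_id, zone, suffix)
    verdict = "ok" if len(name) <= GCP_NAME_MAX else "too long"
    print(repr(suffix), len(name), verdict)
```

For this infra ID and zone the lengths come out to 67, 55, and 52 characters respectively, so only the current `-instance-group` suffix goes over the limit.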
Note that even with an IPI install, a cluster name over 22 characters will fail due to the instance group name being too long when the region is northamerica-northeast1. With unadulterated UPI scripts, the cluster name can only be 8 characters long in northamerica-northeast1.
(In reply to Jeremiah Stuever from comment #6)
> We could shorten the name here to be `-ig`, which would reduce the overall
> count.

I like the idea of reducing the suffix from "-instance-group" to something smaller, such as "-ig".
Moving this out of the 4.8.0 release. There are two PRs that still need to be merged for this BZ. This BZ is low enough severity to justify skipping it entirely for 4.8.
Needs PR review.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056