Description of problem:

Creating multiple clusters with the same Cluster Name and base domain should not be allowed and should raise an error. The installer should catch the conflict on `openshift-install create` (and handle it on `openshift-install destroy`) and not create the resources.

Version-Release number of the following components:
Provider: Azure
OpenShift Version: 4.2.0-0.okd-2019-07-21-162146

How reproducible:
Easily

Steps to Reproduce:

1. Create a cluster:

$ bin/openshift-install create cluster --dir=./cluster_a
? SSH Public Key /path_to/id_rsa.pub
? Platform azure
? Region centralus
? Base Domain qe.azure.devcluster.openshift.com
? Cluster Name qe-esimardtest
...
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.qe-esimardtest.qe.azure.devcluster.openshift.com
...

2. Log in and retrieve the cluster information:

Cluster ID: 099bcaff-3e87-4f11-9aea-e3cbc891682d
Provider: Azure
OpenShift Version: 4.2.0-0.okd-2019-07-21-162146

3. Create a cluster with the same Cluster Name and base domain:

$ bin/openshift-install create cluster --dir=./cluster_b
? SSH Public Key /path_to/id_rsa.pub
? Platform azure
? Region centralus
? Base Domain qe.azure.devcluster.openshift.com
? Cluster Name qe-esimardtest
...

Actual results:

1. The second cluster is created without any errors from the installer.
2. The shared wildcard DNS entry is overwritten (*.apps.qe-esimardtest.qe.azure.devcluster.openshift.com.).
3. You can log in to the new cluster, but you can't log in to the old one because of the DNS mix-up:

Cluster ID: d247c7cd-1c57-40c1-9d02-92cf362b694c
Provider: Azure
OpenShift Version: 4.2.0-0.okd-2019-07-21-162146

4. Destroying cluster_a deletes the resource group and also the shared DNS entries, rendering cluster_b unusable:

$ bin/openshift-install destroy cluster --dir=./cluster_a
INFO deleted record=api.qe-esimardtest
INFO deleted record="*.apps.qe-esimardtest"
INFO deleted resource group=qe-esimardtest-bttln-rg

Expected results:

The installer should raise an error when it checks whether the DNS entry for the new cluster already exists, and it should not proceed with any resource creation unless a different cluster name that isn't already in use is provided.

Example from an AWS installer test case, where the installer reports that a cluster with the same name already exists:

# ./openshift-install create cluster --dir demo2
WARNING Found override for ReleaseImage. Please be warned, this is not advised
INFO Consuming "Install Config" from target directory
INFO Creating infrastructure resources...
ERROR
ERROR Error: Error applying plan:
ERROR
ERROR 1 error occurred:
ERROR * module.dns.aws_route53_record.api_external: 1 error occurred:
ERROR * aws_route53_record.api_external: [ERR]: Error building changeset: InvalidChangeBatch: [Tried to create resource record set [name='api.qe-jialiu.qe.devcluster.openshift.com.', type='A'] but it already exists]
ERROR   status code: 400, request id: 7e522cc8-5442-11e9-89ab-d78b444a875d
ERROR
ERROR
ERROR
ERROR
ERROR
ERROR Terraform does not automatically rollback in the face of errors.
ERROR Instead, your Terraform state file has been partially updated with
ERROR any resources that successfully completed. Please address the error
ERROR above and apply again to incrementally change your infrastructure.
ERROR
ERROR
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply using Terraform
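A pre-flight check of this shape would address the report: before any resources are created, query the target DNS zone for the record the installer is about to claim, and abort on a hit. Below is a minimal sketch assuming the current Azure SDK for Go (armdns/azidentity); the recordExists function and the subscription/resource-group placeholders are illustrative, not the installer's actual code from the fix:

package main

import (
	"context"
	"errors"
	"fmt"
	"log"
	"net/http"

	"github.com/Azure/azure-sdk-for-go/sdk/azcore"
	"github.com/Azure/azure-sdk-for-go/sdk/azidentity"
	"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/dns/armdns"
)

// recordExists reports whether a record set named relativeName of the given
// type already exists in the DNS zone. A 404 from the service means the name
// is free; any other error is surfaced to the caller.
func recordExists(ctx context.Context, client *armdns.RecordSetsClient,
	resourceGroup, zone, relativeName string, recordType armdns.RecordType) (bool, error) {
	_, err := client.Get(ctx, resourceGroup, zone, relativeName, recordType, nil)
	if err == nil {
		return true, nil
	}
	var respErr *azcore.ResponseError
	if errors.As(err, &respErr) && respErr.StatusCode == http.StatusNotFound {
		return false, nil
	}
	return false, err
}

func main() {
	// Hypothetical values mirroring the reproduction steps above.
	const (
		subscriptionID = "<subscription-id>"
		resourceGroup  = "<dns-zone-resource-group>"
		zone           = "qe.azure.devcluster.openshift.com"
		clusterName    = "qe-esimardtest"
	)

	cred, err := azidentity.NewDefaultAzureCredential(nil)
	if err != nil {
		log.Fatal(err)
	}
	client, err := armdns.NewRecordSetsClient(subscriptionID, cred, nil)
	if err != nil {
		log.Fatal(err)
	}

	// Abort before creating anything if the API record is already taken.
	exists, err := recordExists(context.Background(), client,
		resourceGroup, zone, "api."+clusterName, armdns.RecordTypeCNAME)
	if err != nil {
		log.Fatal(err)
	}
	if exists {
		log.Fatalf("api.%s.%s already exists and might be in use by another cluster", clusterName, zone)
	}
	fmt.Println("cluster name is free, safe to proceed")
}

Checking the api record alone is enough to catch the collision in this report, since the api and *.apps records are created together for a given cluster name; note the sketch deliberately treats only a clean 404 as "free", so authorization or network failures don't get mistaken for an available name.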
PR: https://github.com/openshift/installer/pull/3120
Verified with:
./openshift-install 4.5.0-0.nightly-2020-05-08-200452
built from commit 94f6539c438c876cf43f87c576692e7213d62a91
release image registry.svc.ci.openshift.org/ocp/release@sha256:a01ce6c188065715bcb805064df1713e1e63c08970ebbd8a1d5151a9ee3967e4

`openshift-install` now fails gracefully shortly after you try to create a second cluster with the exact same name in the same DNS zone:

FATAL failed to fetch Cluster: failed to fetch dependency of "Cluster": failed to generate asset "Platform Provisioning Check": api.qe.azure.devcluster.openshift.com CNAME record already exists in cluster01 and might be in use by another cluster, please remove it to continue
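For a quick manual sanity check before kicking off an install, resolving the would-be API hostname catches the same collision from the outside. A small sketch using only the Go standard library; the hostname mirrors the reproduction above and is illustrative:

package main

import (
	"fmt"
	"net"
)

func main() {
	// The API hostname the installer would create for this cluster name
	// and base domain (values mirror the reproduction steps above).
	host := "api.qe-esimardtest.qe.azure.devcluster.openshift.com"

	// If the name already resolves, some cluster is probably using it.
	if cname, err := net.LookupCNAME(host); err == nil {
		fmt.Printf("%s already resolves (CNAME %s); pick another cluster name\n", host, cname)
		return
	}
	fmt.Printf("%s does not resolve; the name appears to be free\n", host)
}

This is best-effort, since public resolution can lag behind the zone's actual contents; the installer's "Platform Provisioning Check" queries the zone's records directly, which is authoritative.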
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409