Bug 1732124 - Creating multiple clusters with the same Cluster Name and base domain should not be allowed in Azure
Summary: Creating multiple clusters with the same Cluster Name and base domain should not be allowed in Azure
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: John Hixson
QA Contact: Etienne Simard
URL:
Whiteboard:
Depends On:
Blocks: 1804856
 
Reported: 2019-07-22 18:17 UTC by Etienne Simard
Modified: 2020-07-13 17:11 UTC (History)
CC List: 0 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Cloned to: 1804856 (view as bug list)
Environment:
Last Closed: 2020-07-13 17:11:03 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift installer pull 3120 0 None closed Bug 1732124: Azure: don't allow installing with the same cluster name as an existing install 2021-02-10 10:10:47 UTC
Red Hat Product Errata RHBA-2020:2409 0 None None None 2020-07-13 17:11:19 UTC

Description Etienne Simard 2019-07-22 18:17:49 UTC
Description of problem:

Creating multiple clusters with the same Cluster Name and base domain should not be allowed; the second attempt should fail with an error.

The installer should catch this condition on `openshift-install create` and `openshift-install destroy` and should not create the resources.
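
A pre-flight check along these lines would be enough to enforce that. Below is a minimal sketch, not the installer's actual code: checkClusterDNSRecords and its parameters are hypothetical, and it uses the legacy github.com/Azure/azure-sdk-for-go DNS client to refuse to proceed when an api.<cluster-name> record already exists in the shared base-domain zone.

// Hypothetical pre-flight check; checkClusterDNSRecords and its
// parameters are illustrative, not the installer's actual code.
package preflight

import (
	"context"
	"fmt"

	"github.com/Azure/azure-sdk-for-go/services/dns/mgmt/2018-05-01/dns"
	"github.com/Azure/go-autorest/autorest/azure/auth"
)

// checkClusterDNSRecords returns an error when an "api.<clusterName>"
// record already exists in the shared base-domain zone, which would
// mean another live cluster is using the same name.
func checkClusterDNSRecords(subscriptionID, zoneResourceGroup, zone, clusterName string) error {
	authorizer, err := auth.NewAuthorizerFromEnvironment()
	if err != nil {
		return err
	}
	client := dns.NewRecordSetsClient(subscriptionID)
	client.Authorizer = authorizer

	relativeName := "api." + clusterName
	for _, recordType := range []dns.RecordType{dns.A, dns.CNAME} {
		// Get fails (404) when the record set does not exist; for a
		// fresh cluster name that is the good case, so we move on.
		rs, err := client.Get(context.TODO(), zoneResourceGroup, zone, relativeName, recordType)
		if err != nil {
			continue
		}
		return fmt.Errorf("%s.%s %s record already exists (%s) and might be in use by another cluster",
			relativeName, zone, recordType, *rs.ID)
	}
	return nil
}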

Version-Release number of the following components:

Provider: Azure
OpenShift Version: 4.2.0-0.okd-2019-07-21-162146


How reproducible: 
Easily

Steps to Reproduce:
1. Create a cluster

$  bin/openshift-install create cluster --dir=./cluster_a
? SSH Public Key /path_to/id_rsa.pub
? Platform azure
? Region centralus
? Base Domain qe.azure.devcluster.openshift.com
? Cluster Name qe-esimardtest
...
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.qe-esimardtest.qe.azure.devcluster.openshift.com 
...


2. Log in and retrieve the cluster information:

Cluster ID: 099bcaff-3e87-4f11-9aea-e3cbc891682d
Provider: Azure
OpenShift Version: 4.2.0-0.okd-2019-07-21-162146

3. Create a Cluster with the same Cluster Name and base domain:


$  bin/openshift-install create cluster --dir=./cluster_b
? SSH Public Key /path_to/id_rsa.pub
? Platform azure
? Region centralus
? Base Domain qe.azure.devcluster.openshift.com
? Cluster Name qe-esimardtest
...

Actual results:

1. The second cluster is created without any errors from the installer
2. The shared DNS entry is overwritten ( *.apps.qe-esimardtest.qe.azure.devcluster.openshift.com. )
3. You can log into the new cluster, but you can't log into the old one because of the DNS mix-up

Cluster ID: d247c7cd-1c57-40c1-9d02-92cf362b694c
Provider: Azure
OpenShift Version: 4.2.0-0.okd-2019-07-21-162146

4. Destroying cluster_a will delete the Resource Group and also the shared DNS entries, rendering cluster_b unusable (a sketch for enumerating what remains in the shared zone follows the transcript):

$  bin/openshift-install destroy cluster --dir=./cluster_a
INFO deleted                                       record=api.qe-esimardtest
INFO deleted                                       record="*.apps.qe-esimardtest"
INFO deleted                                       resource group=qe-esimardtest-bttln-rg
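
To confirm which records survive in the shared zone after a destroy, the zone's record sets can be enumerated. A minimal throwaway checker, again sketched against the legacy Azure SDK; the subscription ID and resource-group values are placeholders to fill in for your environment:

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/Azure/azure-sdk-for-go/services/dns/mgmt/2018-05-01/dns"
	"github.com/Azure/go-autorest/autorest/azure/auth"
)

func main() {
	subscriptionID := "<subscription-id>"        // placeholder
	zoneResourceGroup := "<zone-resource-group>" // placeholder: group holding the DNS zone
	zone := "qe.azure.devcluster.openshift.com"

	authorizer, err := auth.NewAuthorizerFromEnvironment()
	if err != nil {
		log.Fatal(err)
	}
	client := dns.NewRecordSetsClient(subscriptionID)
	client.Authorizer = authorizer

	// Page through every record set still present in the shared zone.
	page, err := client.ListByDNSZone(context.TODO(), zoneResourceGroup, zone, nil, "")
	if err != nil {
		log.Fatal(err)
	}
	for page.NotDone() {
		for _, rs := range page.Values() {
			fmt.Printf("%s\t%s\n", *rs.Name, *rs.Type)
		}
		if err := page.NextWithContext(context.TODO()); err != nil {
			log.Fatal(err)
		}
	}
}

Run after step 4, this would show the api.qe-esimardtest and *.apps.qe-esimardtest entries gone even though cluster_b is still running.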

Expected results:

The installer should check whether the DNS entry for the new cluster already exists and should not move forward with any resource creation unless we provide a different cluster name that isn't already in use.

Example from AWS installer test case:

The installer should report that a cluster with the same cluster name already exists:
# ./openshift-install create cluster --dir demo2
WARNING Found override for ReleaseImage. Please be warned, this is not advised
INFO Consuming "Install Config" from target directory
INFO Creating infrastructure resources...         
ERROR                                              
ERROR Error: Error applying plan:                  
ERROR                                              
ERROR 1 error occurred:                            
ERROR     * module.dns.aws_route53_record.api_external: 1 error occurred:
ERROR     * aws_route53_record.api_external: [ERR]: Error building changeset: InvalidChangeBatch: [Tried to create resource record set [name='api.qe-jialiu.qe.devcluster.openshift.com.', type='A'] but it already exists]
ERROR     status code: 400, request id: 7e522cc8-5442-11e9-89ab-d78b444a875d
ERROR                                              
ERROR                                              
ERROR                                              
ERROR                                              
ERROR                                              
ERROR Terraform does not automatically rollback in the face of errors.
ERROR Instead, your Terraform state file has been partially updated with
ERROR any resources that successfully completed. Please address the error
ERROR above and apply again to incrementally change your infrastructure.
ERROR                                              
ERROR                                              
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply using Terraform

Comment 3 John Hixson 2020-02-18 02:05:05 UTC
PR: https://github.com/openshift/installer/pull/3120

Comment 7 Etienne Simard 2020-05-11 21:49:26 UTC
Verified with:

./openshift-install 4.5.0-0.nightly-2020-05-08-200452
built from commit 94f6539c438c876cf43f87c576692e7213d62a91
release image registry.svc.ci.openshift.org/ocp/release@sha256:a01ce6c188065715bcb805064df1713e1e63c08970ebbd8a1d5151a9ee3967e4


`openshift-install` fails gracefully shortly after you try to create a second cluster with the exact same name in the same DNS zone:

FATAL failed to fetch Cluster: failed to fetch dependency of "Cluster": failed to generate asset "Platform Provisioning Check": api.qe.azure.devcluster.openshift.com CNAME record already exists in cluster01 and might be in use by another cluster, please remove it to continue

Comment 9 errata-xmlrpc 2020-07-13 17:11:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

