
Bug 1732124

Summary: Creating multiple clusters with the same Cluster Name and base domain should not be allowed in Azure
Product: OpenShift Container Platform
Component: Installer
Installer sub component: openshift-installer
Reporter: Etienne Simard <esimard>
Assignee: John Hixson <jhixson>
QA Contact: Etienne Simard <esimard>
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Version: 4.2.0
Target Release: 4.5.0
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2020-07-13 17:11:03 UTC
Bug Blocks: 1804856

Description Etienne Simard 2019-07-22 18:17:49 UTC
Description of problem:

Creating multiple clusters with the same Cluster Name and base domain should not be allowed and should produce an error.

The installer should catch the conflict during `openshift-install create` and `openshift-install destroy` and should not create the resources.

Version-Release number of the following components:

Provider
Azure
OpenShift Version
4.2.0-0.okd-2019-07-21-162146


How reproducible: 
Easily

Steps to Reproduce:
1. Create a cluster

$  bin/openshift-install create cluster --dir=./cluster_a
? SSH Public Key /path_to/id_rsa.pub
? Platform azure
? Region centralus
? Base Domain qe.azure.devcluster.openshift.com
? Cluster Name qe-esimardtest
...
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.qe-esimardtest.qe.azure.devcluster.openshift.com 
...


2. Log in and retrieve cluster information:

Cluster ID
099bcaff-3e87-4f11-9aea-e3cbc891682d
Provider
Azure
OpenShift Version
4.2.0-0.okd-2019-07-21-162146

3. Create a second cluster with the same Cluster Name and base domain:


$  bin/openshift-install create cluster --dir=./cluster_b
? SSH Public Key /path_to/id_rsa.pub
? Platform azure
? Region centralus
? Base Domain qe.azure.devcluster.openshift.com
? Cluster Name qe-esimardtest
...

Actual results:

1. The second cluster is created without any errors from the installer.
2. The shared domain DNS entry is overwritten ( *.apps.qe-esimardtest.qe.azure.devcluster.openshift.com. ).
3. You can log in to the new cluster, but you cannot log in to the old one because of the DNS mix-up.

Cluster ID
d247c7cd-1c57-40c1-9d02-92cf362b694c
Provider
Azure
OpenShift Version
4.2.0-0.okd-2019-07-21-162146

4. Destroying cluster_a deletes its Resource Group and also the shared DNS entries, rendering cluster_b unusable:

$  bin/openshift-install destroy cluster --dir=./cluster_a
INFO deleted                                       record=api.qe-esimardtest
INFO deleted                                       record="*.apps.qe-esimardtest"
INFO deleted                                       resource group=qe-esimardtest-bttln-rg
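
The behaviour above follows from how the names are derived. Judging from the log output, the resource group name carries a random suffix (`bttln`), so two installs with the same cluster name get distinct resource groups, while the DNS record names depend only on the cluster name and base domain and therefore collide. A minimal Python sketch of that reasoning follows; the helper names and the five-character suffix format are assumptions inferred from the output, not the installer's actual code:

```python
import random
import string


def dns_record_names(cluster_name, base_domain):
    """Record names depend only on cluster name and base domain."""
    return {
        f"api.{cluster_name}.{base_domain}",
        f"*.apps.{cluster_name}.{base_domain}",
    }


def resource_group_name(cluster_name, rng):
    """Assumed format: <cluster name>-<5 random chars>-rg, as in the log above."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(rng.choice(alphabet) for _ in range(5))
    return f"{cluster_name}-{suffix}-rg"


base = "qe.azure.devcluster.openshift.com"
rng = random.Random()

# Two installs that were given the same answers at the prompts.
records_a = dns_record_names("qe-esimardtest", base)
records_b = dns_record_names("qe-esimardtest", base)
group_a = resource_group_name("qe-esimardtest", rng)
group_b = resource_group_name("qe-esimardtest", rng)

print("DNS records shared:", sorted(records_a & records_b))
print("Resource groups:", group_a, group_b)
```

Because the record sets are identical, whichever cluster writes last owns the records, and destroying either cluster removes them for both.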

Expected results:

The installer should raise an error when it checks whether the DNS entry for the new cluster already exists, and it should not proceed with any resource creation unless a different cluster name that is not already in use is provided.
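
The requested pre-flight check could look roughly like the following Python sketch. It is illustrative only: `check_dns_conflict`, `DNSConflictError`, and the in-memory `existing_records` set are hypothetical stand-ins, and a real check would query the DNS zone through the cloud provider rather than a local set.

```python
class DNSConflictError(Exception):
    """Raised when a record the new cluster needs already exists."""


def check_dns_conflict(existing_records, cluster_name, base_domain):
    """Fail before any resource is created if the cluster's records exist.

    `existing_records` stands in for a lookup against the DNS zone
    (assumption); names may carry a trailing dot.
    """
    wanted = [
        f"api.{cluster_name}.{base_domain}",
        f"*.apps.{cluster_name}.{base_domain}",
    ]
    # DNS names are case-insensitive and may end with a trailing dot.
    present = {name.rstrip(".").lower() for name in existing_records}
    conflicts = [name for name in wanted if name.lower() in present]
    if conflicts:
        raise DNSConflictError(
            "DNS records already exist and may belong to another cluster: "
            + ", ".join(conflicts)
        )


# Records left behind by the first cluster in this report.
zone = {
    "api.qe-esimardtest.qe.azure.devcluster.openshift.com.",
    "*.apps.qe-esimardtest.qe.azure.devcluster.openshift.com.",
}

try:
    check_dns_conflict(zone, "qe-esimardtest", "qe.azure.devcluster.openshift.com")
except DNSConflictError as err:
    print(f"FATAL {err}")
```

Running the check before provisioning means a conflicting install stops with nothing to clean up, instead of failing partway through as in the AWS example below.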

Example from AWS installer test case:

The installer should report that a cluster with the same cluster name already exists:
# ./openshift-install create cluster --dir demo2
WARNING Found override for ReleaseImage. Please be warned, this is not advised
INFO Consuming "Install Config" from target directory
INFO Creating infrastructure resources...         
ERROR                                              
ERROR Error: Error applying plan:                  
ERROR                                              
ERROR 1 error occurred:                            
ERROR     * module.dns.aws_route53_record.api_external: 1 error occurred:
ERROR     * aws_route53_record.api_external: [ERR]: Error building changeset: InvalidChangeBatch: [Tried to create resource record set [name='api.qe-jialiu.qe.devcluster.openshift.com.', type='A'] but it already exists]
ERROR     status code: 400, request id: 7e522cc8-5442-11e9-89ab-d78b444a875d
ERROR                                              
ERROR                                              
ERROR                                              
ERROR                                              
ERROR                                              
ERROR Terraform does not automatically rollback in the face of errors.
ERROR Instead, your Terraform state file has been partially updated with
ERROR any resources that successfully completed. Please address the error
ERROR above and apply again to incrementally change your infrastructure.
ERROR                                              
ERROR                                              
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply using Terraform

Comment 3 John Hixson 2020-02-18 02:05:05 UTC
PR: https://github.com/openshift/installer/pull/3120

Comment 7 Etienne Simard 2020-05-11 21:49:26 UTC
Verified with:

./openshift-install 4.5.0-0.nightly-2020-05-08-200452
built from commit 94f6539c438c876cf43f87c576692e7213d62a91
release image registry.svc.ci.openshift.org/ocp/release@sha256:a01ce6c188065715bcb805064df1713e1e63c08970ebbd8a1d5151a9ee3967e4


`openshift-install` fails gracefully with the following error shortly after you try to create a second cluster with the exact same name in the same DNS zone: "FATAL failed to fetch Cluster: failed to fetch dependency of "Cluster": failed to generate asset "Platform Provisioning Check": api.qe.azure.devcluster.openshift.com CNAME record already exists in cluster01 and might be in use by another cluster, please remove it to continue"

Comment 9 errata-xmlrpc 2020-07-13 17:11:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409