Bug 1756457

Summary: Need health check for openshift_master_cluster_hostname while installing multi master OCP environment
Product: OpenShift Container Platform
Component: Installer
Installer sub component: openshift-ansible
Version: 3.11.0
Target Release: 3.11.z
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Reporter: Aditya Deshpande <adeshpan>
Assignee: Russell Teague <rteague>
QA Contact: Johnny Liu <jialiu>
CC: gblomqui, rteague, vjaypurk
Type: Bug
Last Closed: 2019-12-16 11:57:10 UTC

Description Aditya Deshpande 2019-09-27 17:07:20 UTC
Description of problem:

When the inventory contains multiple masters and openshift_master_cluster_hostname is not configured, the installation does not fail.

The person installing the cluster believes they have an HA cluster, but they do not. When the first master node fails or is shut down, the whole cluster stops working.

This is because when openshift_master_cluster_hostname has no value in the inventory, it defaults to the first master's hostname. That is fine for a single-master installation but fatal for a multi-master one.
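
For illustration, a minimal multi-master inventory fragment with the variable set correctly (hostnames and the lb group are hypothetical examples, not taken from this bug):

# Hypothetical inventory fragment for a 3-master OCP 3.11 install.
[OSEv3:children]
masters
etcd
nodes
lb

[OSEv3:vars]
openshift_deployment_type=openshift-enterprise
# Without the next two lines the installer silently falls back to
# master1's hostname and the cluster is not actually HA:
openshift_master_cluster_hostname=lb.example.com
openshift_master_cluster_public_hostname=lb.example.com

[masters]
master1.example.com
master2.example.com
master3.example.com

[lb]
lb.example.com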


Version-Release number of the following components:
OCP 3.11 

How reproducible:
Always

Steps to Reproduce:
Install a multi-master cluster without setting openshift_master_cluster_hostname.


Actual results:
The masterURL is set to the first master's hostname rather than a load balancer address.
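
For illustration, a fragment of what the rendered /etc/origin/master/master-config.yaml could then contain (a sketch only; the hostname and port are assumptions):

# Hypothetical fragment; master1.example.com stands in for the first
# master's hostname, where a load balancer address should appear instead.
oauthConfig:
  masterURL: https://master1.example.com:8443
  masterPublicURL: https://master1.example.com:8443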

Expected results:
- Report an error when multiple master nodes are defined in the inventory and openshift_master_cluster_hostname is empty.
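
A minimal sketch of what such a check could look like as an Ansible task; the group name oo_masters_to_config is an assumption, and the check actually shipped in the openshift_sanitize_inventory role may differ:

# Sketch only; mirrors the task name and message seen in comment 4 below.
- name: Ensure openshift_master_cluster_hostname is set when deploying multiple masters
  fail:
    msg: >-
      openshift_master_cluster_hostname must be set when deploying multiple
      masters to ensure the loadbalancer name is used for accessing the
      cluster API
  when:
    - groups['oo_masters_to_config'] | default([]) | length > 1
    - openshift_master_cluster_hostname is not defined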

Comment 2 Russell Teague 2019-11-21 20:51:12 UTC
PR opened to address the issue.

Comment 4 Johnny Liu 2019-12-02 03:31:59 UTC
Verified this bug with openshift-ansible-3.11.156-1.git.0.5f2cfa4.el7.noarch; it passes.


Defined 3 masters in the inventory without defining openshift_master_cluster_hostname; the following message is thrown:

TASK [openshift_sanitize_inventory : Ensure openshift_master_cluster_hostname is set when deploying multiple masters] ***
Monday 02 December 2019  11:25:25 +0800 (0:00:00.302)       0:00:09.878 ******* 
fatal: [ci-vm-10-0-151-173.hosted.upshift.rdu2.redhat.com]: FAILED! => {"changed": false, "msg": "openshift_master_cluster_hostname must be set when deploying multiple masters to ensure the loadbalancer name is used for accessing the cluster API\n"}
fatal: [ci-vm-10-0-150-0.hosted.upshift.rdu2.redhat.com]: FAILED! => {"changed": false, "msg": "openshift_master_cluster_hostname must be set when deploying multiple masters to ensure the loadbalancer name is used for accessing the cluster API\n"}
fatal: [ci-vm-10-0-151-121.hosted.upshift.rdu2.redhat.com]: FAILED! => {"changed": false, "msg": "openshift_master_cluster_hostname must be set when deploying multiple masters to ensure the loadbalancer name is used for accessing the cluster API\n"}

fatal: [ci-vm-10-0-150-142.hosted.upshift.rdu2.redhat.com]: FAILED! => {"changed": false, "msg": "openshift_master_cluster_hostname must be set when deploying multiple masters to ensure the loadbalancer name is used for accessing the cluster API\n"}
fatal: [ci-vm-10-0-151-143.hosted.upshift.rdu2.redhat.com]: FAILED! => {"changed": false, "msg": "openshift_master_cluster_hostname must be set when deploying multiple masters to ensure the loadbalancer name is used for accessing the cluster API\n"}
fatal: [ci-vm-10-0-151-135.hosted.upshift.rdu2.redhat.com]: FAILED! => {"changed": false, "msg": "openshift_master_cluster_hostname must be set when deploying multiple masters to ensure the loadbalancer name is used for accessing the cluster API\n"}
	to retry, use: --limit @/home/slave2/workspace/Launch Environment Flexy/private-openshift-ansible/playbooks/prerequisites.retry

Comment 6 errata-xmlrpc 2019-12-16 11:57:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:4050