Bug 1569621

Summary: openshift-ansible aws provisioning masters need to be detached from scale group
Product: OpenShift Container Platform Reporter: Matt Woodson <mwoodson>
Component: Installer Assignee: Chris Callegari <ccallega>
Status: CLOSED ERRATA QA Contact: sheng.lao <shlao>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.9.0 CC: aos-bugs, bleanhar, ccallega, jokerman, mmccomas, sspeiche
Target Milestone: --- Keywords: OpsBlocker
Target Release: 3.11.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Feature: Master nodes are now deployed as standalone EC2 instances. Reason: When deploying to AWS, OCP master instances cannot run as EC2 instances under an Auto Scaling group, because unexpected redeployments damage static etcd membership. Result: OCP on AWS has stable control plane instances.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-10-11 07:19:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Matt Woodson 2018-04-19 15:24:05 UTC
Description of problem:

Currently the openshift-ansible cluster provisioning code provisions the masters inside a scalegroup. Masters must not be in an AWS scalegroup while running.

Currently masters are too much of a pet to be added to a scalegroup.

Running in the scalegroup has too many potential issues:

- A node can never be shut down or stopped. If it is, the scalegroup will terminate it. Even with termination protection enabled, the AWS instance will be terminated.

- A master node can never be resized. In order to resize a node, it must be shut down. When this happens, the scalegroup will delete it.

- If a scalegroup spans multiple AZs and an AZ goes down, the instance will be terminated and recreated in a new AZ.


Version-Release number of selected component (if applicable):

openshift-ansible 3.9, 3.10


Additional info:

One solution would be to provision the nodes with a scalegroup, but once done, could detach the instances from
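
The detach idea above could be sketched as follows. This is only an illustration, not what openshift-ansible does: the helper name `build_detach_request`, the group name, and the instance ID are hypothetical. The helper only assembles the parameters that the Auto Scaling `DetachInstances` operation expects; it does not call AWS.

```python
def build_detach_request(group_name, instance_ids, decrement_capacity=True):
    """Assemble parameters for an Auto Scaling DetachInstances call.

    decrement_capacity=True lowers the group's desired capacity so the
    group does not launch replacements for the detached masters.
    """
    if not instance_ids:
        raise ValueError("at least one instance ID is required")
    return {
        "AutoScalingGroupName": group_name,
        "InstanceIds": list(instance_ids),
        "ShouldDecrementDesiredCapacity": decrement_capacity,
    }


# Hypothetical group name and instance ID, for illustration only.
request = build_detach_request("example-master-group", ["i-0master1"])
```

With boto3 installed and credentials configured, the dictionary could be passed on as `boto3.client("autoscaling").detach_instances(**request)`.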

Comment 1 Steve Speicher 2018-05-17 20:13:56 UTC
Moving to high, since this is an ops blocker.

Comment 5 Chris Callegari 2018-08-27 13:42:52 UTC
pull/9736 has been accepted and is ready for QE.

Comment 6 Matt Woodson 2018-08-27 14:36:35 UTC
Is this change multi-az aware?

Comment 7 Chris Callegari 2018-08-27 15:16:08 UTC
Matt, yes! Masters can also be resized without redeployment.

Comment 8 sheng.lao 2018-08-31 06:07:05 UTC
Fixed at: openshift-ansible-3.11.0-0.25.0-34-g04f8519

1. Check that the master instances are standalone:
# aws autoscaling describe-auto-scaling-groups |grep "master group name"
#
Empty output means the masters are standalone.
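
The grep above depends on knowing the master group name. A stricter variant of the same check could look up the master instance IDs directly in the JSON that `aws autoscaling describe-auto-scaling-groups` returns. This is a minimal sketch, not part of the QE procedure: the function name `masters_in_scale_groups`, the sample group name, and the instance IDs are hypothetical.

```python
import json


def masters_in_scale_groups(describe_output, master_ids):
    """Return {instance_id: group_name} for masters still in a scale group.

    describe_output is the parsed JSON from
    `aws autoscaling describe-auto-scaling-groups`.
    An empty result means every given master is standalone.
    """
    wanted = set(master_ids)
    found = {}
    for group in describe_output.get("AutoScalingGroups", []):
        for instance in group.get("Instances", []):
            instance_id = instance.get("InstanceId")
            if instance_id in wanted:
                found[instance_id] = group["AutoScalingGroupName"]
    return found


# Hypothetical sample output: one node group, no masters in it.
sample = json.loads("""
{
  "AutoScalingGroups": [
    {
      "AutoScalingGroupName": "example-node-group",
      "Instances": [
        {"InstanceId": "i-0node1"},
        {"InstanceId": "i-0node2"}
      ]
    }
  ]
}
""")

standalone_check = masters_in_scale_groups(sample, ["i-0master1"])
```

An empty dictionary corresponds to the empty grep output above. Note that the real command paginates on large accounts, so all pages would need to be collected before running the check.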

Comment 11 errata-xmlrpc 2018-10-11 07:19:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652