Bug 1721619

Summary: AWS Installer chooses incorrect availability zones
Product: OpenShift Container Platform Reporter: Abhinav Dahiya <adahiya>
Component: Installer Assignee: Abhinav Dahiya <adahiya>
Installer sub component: openshift-installer QA Contact: sheng.lao <shlao>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: unspecified CC: jialiu, veer
Version: 4.1.z   
Target Milestone: ---   
Target Release: 4.1.z   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1716548 Environment:
Last Closed: 2019-07-04 09:01:40 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1716548    
Bug Blocks:    

Description Abhinav Dahiya 2019-06-18 17:55:04 UTC
+++ This bug was initially created as a clone of Bug #1716548 +++

Description of problem:
IPI installation on AWS. If you change the default configuration to increase the number of workers, the installer chooses an AZ that doesn't support the instance type.

Version-Release number of the following components:
rpm -q openshift-ansible
rpm -q ansible
ansible --version

How reproducible:
can be reproduced

Steps to Reproduce:

1. Create an install config
Example: ./openshift-install create install-config --dir=second
select us-west-2 as the region.

2. Edit install-config.yaml and change the number of workers from 3 to 4
compute:
- hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 4

3. Run the installation
./openshift-install create cluster --dir=second

4. It only creates 3 workers and not 4.

Actual results:

$ oc get machinesets -n openshift-machine-api
NAME                             DESIRED   CURRENT   READY   AVAILABLE   AGE
second-hb7vq-worker-us-west-2a   1         1         1       1           87m
second-hb7vq-worker-us-west-2b   1         1         1       1           87m
second-hb7vq-worker-us-west-2c   1         1         1       1           87m
second-hb7vq-worker-us-west-2d   1         1                             87m


$ oc get machines -n openshift-machine-api
NAME                                   INSTANCE              STATE     TYPE        REGION      ZONE         AGE
second-hb7vq-master-0                  i-0ba4c5c0a949af07b   running   m4.xlarge   us-west-2   us-west-2a   99m
second-hb7vq-master-1                  i-0a1ed6818c4091dda   running   m4.xlarge   us-west-2   us-west-2b   99m
second-hb7vq-master-2                  i-0c100e6e76d64c8fa   running   m4.xlarge   us-west-2   us-west-2c   99m
second-hb7vq-worker-us-west-2a-5qwgp   i-01a26741af6e0dac5   running   m4.large    us-west-2   us-west-2a   97m
second-hb7vq-worker-us-west-2b-qt7mx   i-09bdde677ace6cfd8   running   m4.large    us-west-2   us-west-2b   97m
second-hb7vq-worker-us-west-2c-fb9vh   i-03dfe21426a3a029f   running   m4.large    us-west-2   us-west-2c   97m
second-hb7vq-worker-us-west-2d-xm6dh                                   m4.large    us-west-2   us-west-2d   97m


The fourth worker machine (us-west-2d) is not running. Here is its status:

      Message:               error launching instance: error creating EC2 instance: Unsupported: Your requested instance type (m4.large) is not supported in your requested Availability Zone (us-west-2d). Please retry your request by not specifying an Availability Zone or choosing us-west-2c, us-west-2b, us-west-2a.
                             status code: 400, request id: 7809acda-2740-460b-a807-d1a9a6db12be
      Reason:                MachineCreationFailed
      Status:                True
      Type:                  MachineCreation
    Kind:                    AWSMachineProviderStatus
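
For triage, the failing zone and the zones EC2 suggests instead can be pulled out of that message. The following is only a sketch: the regular expression and the function name parse_unsupported_zone are hypothetical, written against this one message, and the wording of the EC2 error is not a stable interface.

```python
import re

# Message text copied from the machine status above.
MESSAGE = (
    "error launching instance: error creating EC2 instance: Unsupported: "
    "Your requested instance type (m4.large) is not supported in your "
    "requested Availability Zone (us-west-2d). Please retry your request by "
    "not specifying an Availability Zone or choosing us-west-2c, us-west-2b, "
    "us-west-2a."
)

# Hypothetical pattern: it only knows the wording of the message above.
PATTERN = re.compile(
    r"instance type \((?P<itype>[^)]+)\) is not supported in your requested "
    r"Availability Zone \((?P<zone>[^)]+)\)\. .* or choosing (?P<alts>.+?)\.?$"
)


def parse_unsupported_zone(message):
    """Return (instance_type, rejected_zone, suggested_zones) or None."""
    match = PATTERN.search(message)
    if match is None:
        return None
    suggested = [zone.strip() for zone in match.group("alts").split(",")]
    return match.group("itype"), match.group("zone"), suggested


print(parse_unsupported_zone(MESSAGE))
```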


So the installer ends up selecting availability zones where the instance type is not available.
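
A possible explanation for why the default of 3 workers hides the problem: assuming one machine set per zone and replicas spread evenly across the zone list, with earlier zones taking any remainder, the fourth zone only receives a machine once replicas reaches 4. This is an assumption that matches the machine sets listed above; it is not taken from the installer source, and spread_replicas is a hypothetical name.

```python
def spread_replicas(total, zones):
    """Return {zone: replica_count}; earlier zones absorb any remainder."""
    if not zones:
        raise ValueError("at least one zone is required")
    base, extra = divmod(total, len(zones))
    return {
        zone: base + (1 if index < extra else 0)
        for index, zone in enumerate(zones)
    }


# Zone list as it appears in the machine sets above.
zones = ["us-west-2a", "us-west-2b", "us-west-2c", "us-west-2d"]

print(spread_replicas(3, zones))  # us-west-2d gets 0 replicas
print(spread_replicas(4, zones))  # every zone gets 1 replica
```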


Expected results:

The installer should select only AZs that support the machine types it plans to use.
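
A minimal sketch of the expected behaviour, assuming the installer has some way to learn which instance types each zone offers. The offerings table and the usable_zones helper below are hypothetical; the table only encodes what this report shows (m4.large offered in us-west-2a/b/c but not us-west-2d, m5.large running in all four zones). It is not the change made in the linked pull request.

```python
def usable_zones(zones, instance_types, offerings):
    """Keep zones that offer every requested instance type, in input order."""
    needed = set(instance_types)
    return [zone for zone in zones if needed <= offerings.get(zone, set())]


# Hypothetical offerings table, limited to facts visible in this report.
offerings = {
    "us-west-2a": {"m4.large", "m5.large"},
    "us-west-2b": {"m4.large", "m5.large"},
    "us-west-2c": {"m4.large", "m5.large"},
    "us-west-2d": {"m5.large"},
}
zones = ["us-west-2a", "us-west-2b", "us-west-2c", "us-west-2d"]

print(usable_zones(zones, ["m4.large"], offerings))
print(usable_zones(zones, ["m5.large"], offerings))
```

With a filter like this in front of machine set generation, 4 m4.large workers would be spread over the three remaining zones instead of leaving one machine unschedulable.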

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

Comment 1 Abhinav Dahiya 2019-06-24 20:25:03 UTC
https://github.com/openshift/installer/pull/1787

Comment 2 sheng.lao 2019-06-26 02:41:15 UTC
Tested with version 4.1.0-0.nightly-2019-06-20-015058

# oc get machinesets -n openshift-machine-api
NAME                                      DESIRED   CURRENT   READY   AVAILABLE   AGE
shlao-bz1721619-fh979-worker-us-west-2a   1         1         1       1           18m
shlao-bz1721619-fh979-worker-us-west-2b   1         1         1       1           18m
shlao-bz1721619-fh979-worker-us-west-2c   1         1         1       1           18m
shlao-bz1721619-fh979-worker-us-west-2d   1         1         1       1           18m

# oc get machines -n openshift-machine-api
NAME                                            INSTANCE              STATE     TYPE        REGION      ZONE         AGE
shlao-bz1721619-fh979-master-0                  i-04cc19ac3bf390bad   running   m5.xlarge   us-west-2   us-west-2a   18m
shlao-bz1721619-fh979-master-1                  i-04ced8be92e70d194   running   m5.xlarge   us-west-2   us-west-2b   18m
shlao-bz1721619-fh979-master-2                  i-00f1543c111891836   running   m5.xlarge   us-west-2   us-west-2c   18m
shlao-bz1721619-fh979-worker-us-west-2a-4tf99   i-047ea402746b3bacb   running   m5.large    us-west-2   us-west-2a   17m
shlao-bz1721619-fh979-worker-us-west-2b-qrd76   i-0188f7a7dedad07f0   running   m5.large    us-west-2   us-west-2b   17m
shlao-bz1721619-fh979-worker-us-west-2c-lgfdd   i-09abc707e0036f333   running   m5.large    us-west-2   us-west-2c   17m
shlao-bz1721619-fh979-worker-us-west-2d-flv9x   i-0c7d28fc89c922bd8   running   m5.large    us-west-2   us-west-2d   17m

Comment 4 errata-xmlrpc 2019-07-04 09:01:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1635