Bug 1725524

Summary: m4.xlarge is not supported in ap-northeast-2b
Product: OpenShift Container Platform
Component: Installer
Installer sub component: openshift-installer
Reporter: sheng.lao <shlao>
Assignee: W. Trevor King <wking>
QA Contact: sheng.lao <shlao>
Status: CLOSED ERRATA
Severity: high
Priority: high
CC: eparis, gpei, jialiu, sponnaga, wking, wmeng
Version: 4.1.z
Keywords: Regression, Reopened
Target Release: 4.1.z
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2019-10-16 18:07:59 UTC
Bug Depends On: 1725526

Description sheng.lao 2019-07-01 03:28:17 UTC
Description of problem:
m4.xlarge is not supported in ap-northeast-2b

Version-Release number of the following components:
./openshift-install version 
openshift-install v4.1.4-201906301409-dirty
built from commit 01cca177d947833dce230ed4bb46d6d052dc289f
release image registry.svc.ci.openshift.org/ocp/release@sha256:92af5740adcd566a994c76d4f562b0e96b5ecc3a0f35b58a1177ead1d168c215

How reproducible:
Always. Creating a cluster in ap-northeast-2 fails with this error every time.

Steps to Reproduce:
1. ./openshift-install create cluster --dir test1
? Platform aws
? Region ap-northeast-2


Actual results:
INFO Creating infrastructure resources...         
ERROR                                              
ERROR Error: Error applying plan:                  
ERROR                                              
ERROR 1 error occurred:                            
ERROR   * module.masters.aws_instance.master[1]: 1 error occurred: 
ERROR   * aws_instance.master.1: Error launching source instance: Unsupported: Your requested instance type (m4.xlarge) is not supported in your requested Availability Zone (ap-northeast-2b). Please retry your request by not specifying an Availability Zone or choosing ap-northeast-2a, ap-northeast-2c. 


Expected results:
The cluster is created successfully in region ap-northeast-2.


Comment 3 Eric Paris 2019-07-01 14:04:23 UTC
Looks a lot like https://github.com/openshift/installer/pull/1786, but in a different region.

Comment 7 Abhinav Dahiya 2019-07-01 23:42:14 UTC
```
aws --region ap-northeast-2 ec2 describe-reserved-instances-offerings --instance-tenancy default --instance-type m4.xlarge --product-description 'Linux/UNIX' --filters Name=scope,Values='Availability Zone' | jq -r '[.ReservedInstancesOfferings[].AvailabilityZone] | sort | unique[]'
ap-northeast-2a
ap-northeast-2b
ap-northeast-2c

```

Run locally, this query shows AWS reporting `m4.xlarge` instances as available in all AZs in ap-northeast-2.

Comment 9 sheng.lao 2019-07-02 01:59:53 UTC
https://docs.aws.amazon.com/autoscaling/ec2/userguide/ts-as-instancelaunchfailure.html#ts-as-instancelaunchfailure-6

That page says:
Cause: The instance type associated with your launch configuration might not be currently available in the Availability Zones specified in your Auto Scaling group.
Solution: Update your Auto Scaling group with the recommendations in the error message.

I think we should check the Auto Scaling group configuration that the installer creates.

Comment 11 Eric Paris 2019-07-02 14:15:50 UTC
Abhinav, you seem to have looked at `ReservedInstancesOfferings`, not On-Demand instances.

My assumption is that you can get a reserved instance in 2b, but you can't get On-Demand. And we use On-Demand.

I obviously have no idea if this is something AWS will correct, or if we need to switch to m5 in that region, or what....
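
As a rough sketch of how the reserved-versus-launchable distinction above could be checked: the EC2 `DescribeInstanceTypeOfferings` API reports, per Availability Zone, which instance types are offered, independently of reserved-instance offerings. The helper below only processes a response of that shape. The function name `zones_offering` is hypothetical, and the sample response is made up to mirror the zones suggested in the error message; it is not real output from ap-northeast-2.

```python
def zones_offering(response, instance_type):
    """Return the sorted, de-duplicated zones that offer instance_type.

    `response` is assumed to have the shape of an EC2
    DescribeInstanceTypeOfferings result queried with
    LocationType="availability-zone".
    """
    return sorted({
        offering["Location"]
        for offering in response.get("InstanceTypeOfferings", [])
        if offering.get("InstanceType") == instance_type
        and offering.get("LocationType") == "availability-zone"
    })


# Hypothetical response for illustration only. With boto3, a real one would
# come from something like:
#   boto3.client("ec2", region_name="ap-northeast-2").describe_instance_type_offerings(
#       LocationType="availability-zone",
#       Filters=[{"Name": "instance-type", "Values": ["m4.xlarge"]}])
sample_response = {
    "InstanceTypeOfferings": [
        {"InstanceType": "m4.xlarge", "LocationType": "availability-zone",
         "Location": "ap-northeast-2c"},
        {"InstanceType": "m4.xlarge", "LocationType": "availability-zone",
         "Location": "ap-northeast-2a"},
    ]
}

print(zones_offering(sample_response, "m4.xlarge"))
```

A zone missing from the returned list (ap-northeast-2b in this made-up sample) would be one to avoid for that instance type.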

Comment 12 Abhinav Dahiya 2019-07-10 22:19:34 UTC
https://github.com/openshift/installer/pull/1935

Comment 14 Johnny Liu 2019-07-11 08:36:25 UTC
This bug tracks the 4.1.z stream. QE opened a separate bug, 1725526, for 4.2, so I am moving this bug to 4.1.z.

Per QE's testing with 4.1.0-0.nightly-2019-07-10-210957, this bug still reproduces.

Comment 15 W. Trevor King 2019-07-26 22:45:26 UTC
https://github.com/openshift/installer/pull/2108 is backporting a prereq before we can backport https://github.com/openshift/installer/pull/1935 to the release-4.1 branch.

Comment 16 W. Trevor King 2019-07-29 17:40:49 UTC
I've filed bug 1734136 with "do we want to support ap-east-1 in 4.1.z?".  That will determine whether we want to manually backport 1935 to release-4.1 (if we decide not to support ap-east-1) or whether we want to wait until pull 2108 lands to get a clean 1935 backport (if we decide to support ap-east-1).

Comment 17 Abhinav Dahiya 2019-08-13 20:11:41 UTC
It looks like we are not going to fix this issue for 4.1.z

Comment 18 Johnny Liu 2019-08-14 04:45:25 UTC
Per our official documentation [1], 'ap-northeast-2' is supported.

If this region is NOT supported, we need to remove it from the installer and update our official documentation. Am I right?


[1]: https://docs.openshift.com/container-platform/4.1/installing/installing_aws/installing-aws-account.html#installation-aws-regions_installing-aws-account

Comment 19 W. Trevor King 2019-08-14 05:05:04 UTC
I've filed https://github.com/openshift/installer/pull/2213 with the dirty backport, because the conclusion of bug 1734136 was "we don't want the ap-east-1 stuff we'd need for a clean backport in 4.1.z".

Comment 20 Abhinav Dahiya 2019-08-14 16:38:29 UTC
(In reply to Johnny Liu from comment #18)
> Per our official documentation [1], 'ap-northeast-2' is supported.
> 
> If this region is NOT supported, we need to remove it from the installer
> and update our official documentation. Am I right?
> 
> [1]:
> https://docs.openshift.com/container-platform/4.1/installing/installing_aws/installing-aws-account.html#installation-aws-regions_installing-aws-account

At some point users will be asked to use 4.2 instead of 4.1; the 4.1 z-streams are for critical patches. There is already a workaround even for 4.1: set the instance type manually in install-config.yaml, or keep the default instance types but restrict the install to one zone.

The region is supported, just the defaults are no longer correct.
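
A minimal sketch of the two workarounds mentioned above, assuming the install-config.yaml machine pool fields `platform.aws.type` and `platform.aws.zones` under `controlPlane` and `compute`; check the field names against the installer documentation for your version. The helper `machine_pool` is hypothetical, `m5.xlarge` is only an example replacement type, and `ap-northeast-2a` is one of the zones suggested in the error message. The output is JSON, which can be merged by hand into the YAML file.

```python
import json


def machine_pool(name, replicas, instance_type=None, zones=None):
    """Build one machine pool entry, adding AWS overrides only when given."""
    pool = {"name": name, "replicas": replicas}
    aws = {}
    if instance_type:
        aws["type"] = instance_type
    if zones:
        aws["zones"] = list(zones)
    if aws:
        pool["platform"] = {"aws": aws}
    return pool


# Workaround A: keep the default instance type, but pin masters to one zone.
control_plane = machine_pool("master", 3, zones=["ap-northeast-2a"])

# Workaround B: set the instance type explicitly (example value only).
workers = machine_pool("worker", 3, instance_type="m5.xlarge")

overrides = {"controlPlane": control_plane, "compute": [workers]}
print(json.dumps(overrides, indent=2))
```

Either override alone should be enough to avoid the failing default combination; the two are shown on different pools only to illustrate both forms.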

Comment 21 Scott Dodson 2019-08-16 13:28:24 UTC
(In reply to Abhinav Dahiya from comment #20)
> At some point users will be asked to use 4.2 instead of 4.1; the 4.1
> z-streams are for critical patches. There is already a workaround even
> for 4.1: set the instance type manually in install-config.yaml, or keep
> the default instance types but restrict the install to one zone.
> 
> The region is supported, just the defaults are no longer correct.

If the `openshift-install create cluster` workflow offers ap-northeast-2 and the installation cannot complete with that choice, we should either remove that region or address the machine type problems. We should not leave 4.1 in a state where the defaults fail.

Comment 23 sheng.lao 2019-08-23 03:31:12 UTC
Verified with version: 4.1.0-0.nightly-2019-08-22-165647

Steps:
# openshift-install create cluster --dir test1
? Platform  [Use arrows to move, type to filter]
> aws
? Platform aws
? Region  [Use arrows to move, type to filter, ? for more help]
  ap-northeast-1 (Tokyo)
> ap-northeast-2 (Seoul)
... ...

# sh check_cluster_health.sh
Passed

Comment 27 errata-xmlrpc 2019-10-16 18:07:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3004