Bug 1710981
| Summary: | m4 instances are old (2015), OpenShift should default to m5 instances for IPI and UPI installs on AWS | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Mike Fiedler <mifiedle> |
| Component: | Installer | Assignee: | Russell Teague <rteague> |
| Installer sub component: | openshift-installer | QA Contact: | Yunfei Jiang <yunjiang> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | ||
| Priority: | medium | CC: | aos-bugs, apitt, bbreard, bleanhar, dbaker, dornelas, dustymabe, eparis, erich, imcleod, jeder, jligon, jokerman, kalexand, mmccomas, nstielau, pragshar, sdodson, tsze, walters, wking, yunjiang |
| Version: | 4.1.0 | Keywords: | Reopened |
| Target Milestone: | --- | ||
| Target Release: | 4.6.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Enhancement | |
| Doc Text: |
Feature: Switched the default preferred AWS instance class from m4 to m5.
Reason: Preference for newer hardware
Result: New clusters deployed on AWS will use m5 AWS instance class by default. If m5 is not available, the installer will fall back to m4.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-10-27 15:54:19 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1769322 | ||
| Bug Blocks: | |||
|
Comment 6
W. Trevor King
2019-05-23 19:40:19 UTC
There were 2 issues with m5 that made us (consciously) choose to stick with m4.
1. They have a lower EBS attach limit per node (m5 is 28, m4 is ~40)
2. Kube originally at least had trouble always getting the name of the device inside the instance correct.
While (1) seems like it can easily be overcome by getting more nodes, it actually costs more, since you have to pay for CPU/memory for the infrastructure pieces (kernel, kubelet, crio, sdn, etc) per node. If (2) was solved we probably can switch the default after the issues in comment #6 are resolved...
Deferring this until M5 is more widely supported.
With regard to availability in us-east-1e or ap-southeast-2 nothing seems to have changed there.
We should revisit item #2 from comment 7 and the concerns from comment 6.
> missing us-east-1e.
One thing to remember is that in AWS the zones are randomized per-account
https://docs.aws.amazon.com/ram/latest/userguide/working-with-az-ids.html
I suspect that "your" us-east-1e is one of the older AWS zones, possibly the very first, with old hardware. Some discussion also in https://www.reddit.com/r/aws/comments/9oy2iy/your_requested_instance_type_m5large_is_not/
I think we could try to carry whatever hacks are necessary to identify that AZ and avoid it?
> I think we could try to carry whatever hacks are necessary to identify that AZ and avoid it?
Vs. just sticking with an instance type that is supported across all zones in a region? I guess that's not future-proof against "AWS adds a new zone with only new hardware". But as far as I know, there is no API for "what on-demand instance types are available in $ZONE?", which is why we're leaning on reserved-instance queries above.
> One thing to remember is that in AWS the zones are randomized per-account...
$ AWS_PROFILE=ci aws --region us-east-1 ec2 describe-availability-zones --zone-names us-east-1e | jq -r '.AvailabilityZones[].ZoneId'
use1-az3
So if we wanted to hard-code choices by zone ID, we could do something like that. I'm not wildly excited about that, though ;).
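As a rough illustration of the "hard-code choices by zone ID" idea, here is a minimal Python sketch. It parses JSON in the shape returned by `aws ec2 describe-availability-zones` and filters out zone IDs on a deny-list. The deny-list, the function names, and every sample zone ID other than `use1-az3` are assumptions made for illustration. This is not what the installer does.

```python
import json
from typing import Dict, List, Set

# Hypothetical deny-list: zone IDs assumed to lack the preferred instance class.
# use1-az3 is the ID reported for us-east-1e in the output above; treating it as
# "missing m5" is an assumption for this sketch.
AVOID_ZONE_IDS: Set[str] = {"use1-az3"}


def zone_ids_by_name(describe_output: str) -> Dict[str, str]:
    """Map account-local zone names (us-east-1e) to stable zone IDs (use1-az3)."""
    data = json.loads(describe_output)
    return {az["ZoneName"]: az["ZoneId"] for az in data["AvailabilityZones"]}


def usable_zones(describe_output: str, avoid: Set[str] = AVOID_ZONE_IDS) -> List[str]:
    """Return zone names whose zone ID is not on the deny-list, sorted by name."""
    mapping = zone_ids_by_name(describe_output)
    return sorted(name for name, zone_id in mapping.items() if zone_id not in avoid)


# Sample input; only the us-east-1e -> use1-az3 pair comes from the thread,
# the other two pairs are invented.
SAMPLE = json.dumps({
    "AvailabilityZones": [
        {"ZoneName": "us-east-1a", "ZoneId": "use1-az1"},
        {"ZoneName": "us-east-1b", "ZoneId": "use1-az2"},
        {"ZoneName": "us-east-1e", "ZoneId": "use1-az3"},
    ]
})

if __name__ == "__main__":
    print(usable_zones(SAMPLE))
```

Because zone names are shuffled per account while zone IDs are not, a deny-list keyed on zone ID would apply to every account. It would still need manual upkeep whenever AWS changes hardware in a zone.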
The Canadian region (ca-central-1) *finally* added a 3rd AZ a few weeks ago (ca-central-1d). This AZ does *not* have "m4" instances. Anyone who attempts a stock IPI install in ca-central-1 gets the following error:
ERROR Error: Error launching source instance: Unsupported: Your requested instance type (m4.xlarge) is not supported in your requested Availability Zone (ca-central-1d). Please retry your request by not specifying an Availability Zone or choosing ca-central-1a, ca-central-1b.
Of course, the workaround is to edit install-config.yaml and specify m5 instead, but for anyone wanting to install OpenShift 4 with IPI in the Canada region, the error above is likely to be their first installer experience.
Still waiting for the depend_on bz to be fixed.
(In reply to Abhinav Dahiya from comment #15)
> Still waiting for the depend_on bz to be fixed.
Still same.
verified. PASS
build: 4.6.0-0.nightly-2020-07-14-035247
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:4196 |
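The behaviour described in the Doc Text (prefer m5, fall back to m4 when m5 is not available) can be sketched as a small selection function. This is only an illustration, not the installer's implementation. The function name, the shape of the `offerings` mapping, and the sample data are all assumptions. The thread states only that ca-central-1d lacks m4. That the other listed zones offer both classes is assumed here.

```python
from typing import Dict, List, Sequence, Set

# Preference order taken from the Doc Text: m5 first, then m4.
PREFERRED_CLASSES: Sequence[str] = ("m5", "m4")


def pick_instance_type(
    offerings: Dict[str, Set[str]],
    zones: List[str],
    size: str = "xlarge",
    preferred: Sequence[str] = PREFERRED_CLASSES,
) -> str:
    """Return the first preferred instance type offered in every requested zone."""
    for instance_class in preferred:
        instance_type = f"{instance_class}.{size}"
        if all(instance_type in offerings.get(zone, set()) for zone in zones):
            return instance_type
    raise ValueError(f"no preferred instance class is offered in all of {zones}")


# Assumed sample data modelled on the ca-central-1 report above.
CA_CENTRAL_1 = {
    "ca-central-1a": {"m4.xlarge", "m5.xlarge"},
    "ca-central-1b": {"m4.xlarge", "m5.xlarge"},
    "ca-central-1d": {"m5.xlarge"},
}

# Hypothetical region in which one zone has only older hardware.
OLDER_REGION = {
    "zone-a": {"m4.xlarge", "m5.xlarge"},
    "zone-b": {"m4.xlarge"},
}

if __name__ == "__main__":
    print(pick_instance_type(CA_CENTRAL_1, list(CA_CENTRAL_1)))
    print(pick_instance_type(OLDER_REGION, list(OLDER_REGION)))
```

Requiring the type in every requested zone mirrors the earlier suggestion in the thread to stick with a type that is supported across all zones in a region. A per-zone choice would be an alternative design, at the cost of mixed hardware within one cluster.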