Bug 1859153 - [AWS] An IAM error occurred occasionally during the installation phase: Invalid IAM Instance Profile name
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: Rafael Fonseca
QA Contact: Yunfei Jiang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-07-21 10:50 UTC by Yunfei Jiang
Modified: 2023-01-03 10:14 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 10:35:34 UTC
Target Upstream Version:
Embargoed:



Links
System ID | Private | Priority | Status | Summary | Last Updated
Github openshift installer pull 5982 | 0 | None | open | Bug 1859153: IAM instance profile race condition | 2022-06-08 19:34:16 UTC
Red Hat Product Errata RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 10:36:06 UTC

Description Yunfei Jiang 2020-07-21 10:50:56 UTC
Description of problem:

Error "Invalid IAM Instance Profile name" occurred when installing OCP 4.4.0-0.nightly-2020-07-18-033102

install log:
~~~
level=debug msg="module.dns.aws_route53_record.api_internal: Creation complete after 1m18s [id=Z07284573HERY5FDLQM1G_api-int.cam-tgt-6871a.qe.devcluster.openshift.com_A]"
level=error
level=error msg="Error: Error launching source instance: InvalidParameterValue: Value (cam-tgt-6871a-p8n7t-bootstrap-profile) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name"
level=error msg="\tstatus code: 400, request id: 7ed118dc-2b87-4e7d-94cc-3c2b5e18c990"
level=error
level=error msg="  on ../../../../../tmp/openshift-install-437511450/bootstrap/main.tf line 116, in resource \"aws_instance\" \"bootstrap\":"
level=error msg=" 116: resource \"aws_instance\" \"bootstrap\" {"
level=error
level=error
level=error
level=error msg="Error: Error launching source instance: InvalidParameterValue: Value (cam-tgt-6871a-p8n7t-master-profile) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name"
level=error msg="\tstatus code: 400, request id: fe55a1ca-14c2-42dd-aedf-2cb7bed9dc36"
level=error
level=error msg="  on ../../../../../tmp/openshift-install-437511450/master/main.tf line 93, in resource \"aws_instance\" \"master\":"
level=error msg="  93: resource \"aws_instance\" \"master\" {"
level=error
level=error
level=error
level=error msg="Error: Error launching source instance: InvalidParameterValue: Value (cam-tgt-6871a-p8n7t-master-profile) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name"
level=error msg="\tstatus code: 400, request id: 29a4c4a1-c74e-46dc-ac55-05d8d41de4a8"
level=error
level=error msg="  on ../../../../../tmp/openshift-install-437511450/master/main.tf line 93, in resource \"aws_instance\" \"master\":"
level=error msg="  93: resource \"aws_instance\" \"master\" {"
level=error
level=error
level=error
level=error msg="Error: Error launching source instance: InvalidParameterValue: Value (cam-tgt-6871a-p8n7t-master-profile) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name"
level=error msg="\tstatus code: 400, request id: 9f64a775-503f-49c3-94e9-d742c52b18a5"
level=error
level=error msg="  on ../../../../../tmp/openshift-install-437511450/master/main.tf line 93, in resource \"aws_instance\" \"master\":"
level=error msg="  93: resource \"aws_instance\" \"master\" {"
level=error
level=error
level=fatal msg="failed to fetch Cluster: failed to generate asset \"Cluster\": failed to create cluster: failed to apply using Terraform"
~~~

Version-Release number of selected component (if applicable):
4.4.0-0.nightly-2020-07-18-033102


How reproducible:
always

Steps to Reproduce:
1. Trigger an IPI install on AWS

Actual results:
Cluster creation fails.

Expected results:
Cluster creation succeeds.

Comment 1 Yunfei Jiang 2020-07-21 10:56:41 UTC
This bug blocks all 4.4 IPI testing on AWS.

Comment 2 Eric Paris 2020-07-21 13:07:13 UTC
I believe this was an AWS outage this morning. Does it still reproduce?

Comment 3 Scott Dodson 2020-07-21 13:24:16 UTC
Please re-open if this reproduces, but this is believed to have been an AWS outage. AWS release jobs started failing at 03:24:30 EDT and have been green since 05:49:22 EDT.

https://status.aws.amazon.com/

Between 12:02 AM and 2:35 AM PDT AWS customers experienced increased error rates while calling the IAM assume role, get session token and other APIs with the long term credentials. As of 2:35 AM PDT, we are fully recovered and the issue is resolved now. Other AWS services such as AWS CloudFormation whose features require these actions experienced similar impact.

Comment 4 Yunfei Jiang 2020-07-22 02:43:57 UTC
This looks like it was an AWS outage: the install succeeded when rerun on 4.4.0-0.nightly-2020-07-18-033102.

Thanks.

Comment 6 Rafael Fonseca 2022-06-08 13:46:29 UTC
This could be the result of a race condition: the IAM instance profile is referenced before it has been fully created on the AWS side [1] [2].

[1] https://github.com/hashicorp/terraform/issues/15341
[2] https://github.com/hashicorp/terraform-provider-aws/issues/838
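For illustration only, a race of this kind is commonly worked around by retrying the instance launch while the profile is not yet visible. The following is a minimal sketch under that assumption and is not taken from the installer or from the linked pull request. `LaunchError` and `launch_with_retry` are hypothetical names, and the only detail borrowed from this report is the error text in the install log.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Error text copied from the install log in this report.
INVALID_PROFILE_MSG = "Invalid IAM Instance Profile name"


class LaunchError(Exception):
    """Hypothetical stand-in for the error raised by the instance launch call."""


def launch_with_retry(
    launch: Callable[[], T],
    attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call launch(), retrying only while the instance profile is not yet visible."""
    for attempt in range(1, attempts + 1):
        try:
            return launch()
        except LaunchError as err:
            # Unrelated errors, and the final failed attempt, are re-raised unchanged.
            if INVALID_PROFILE_MSG not in str(err) or attempt == attempts:
                raise
            # Exponential backoff: delay, 2 * delay, 4 * delay, ...
            sleep(delay_seconds * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. Matching on the message text is a simplification; a real client would more likely inspect a structured error code.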

Comment 10 Yunfei Jiang 2022-06-23 07:57:12 UTC
The error was not found in recent CI logs.

Comment 11 errata-xmlrpc 2022-08-10 10:35:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

