Description of problem:
Error "Invalid IAM Instance Profile name" occurred when installing OCP 4.4.0-0.nightly-2020-07-18-033102.

Install log:
~~~
level=debug msg="module.dns.aws_route53_record.api_internal: Creation complete after 1m18s [id=Z07284573HERY5FDLQM1G_api-int.cam-tgt-6871a.qe.devcluster.openshift.com_A]"
level=error
level=error msg="Error: Error launching source instance: InvalidParameterValue: Value (cam-tgt-6871a-p8n7t-bootstrap-profile) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name"
level=error msg="\tstatus code: 400, request id: 7ed118dc-2b87-4e7d-94cc-3c2b5e18c990"
level=error
level=error msg=" on ../../../../../tmp/openshift-install-437511450/bootstrap/main.tf line 116, in resource \"aws_instance\" \"bootstrap\":"
level=error msg=" 116: resource \"aws_instance\" \"bootstrap\" {"
level=error
level=error
level=error
level=error msg="Error: Error launching source instance: InvalidParameterValue: Value (cam-tgt-6871a-p8n7t-master-profile) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name"
level=error msg="\tstatus code: 400, request id: fe55a1ca-14c2-42dd-aedf-2cb7bed9dc36"
level=error
level=error msg=" on ../../../../../tmp/openshift-install-437511450/master/main.tf line 93, in resource \"aws_instance\" \"master\":"
level=error msg=" 93: resource \"aws_instance\" \"master\" {"
level=error
level=error
level=error
level=error msg="Error: Error launching source instance: InvalidParameterValue: Value (cam-tgt-6871a-p8n7t-master-profile) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name"
level=error msg="\tstatus code: 400, request id: 29a4c4a1-c74e-46dc-ac55-05d8d41de4a8"
level=error
level=error msg=" on ../../../../../tmp/openshift-install-437511450/master/main.tf line 93, in resource \"aws_instance\" \"master\":"
level=error msg=" 93: resource \"aws_instance\" \"master\" {"
level=error
level=error
level=error
level=error msg="Error: Error launching source instance: InvalidParameterValue: Value (cam-tgt-6871a-p8n7t-master-profile) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name"
level=error msg="\tstatus code: 400, request id: 9f64a775-503f-49c3-94e9-d742c52b18a5"
level=error
level=error msg=" on ../../../../../tmp/openshift-install-437511450/master/main.tf line 93, in resource \"aws_instance\" \"master\":"
level=error msg=" 93: resource \"aws_instance\" \"master\" {"
level=error
level=error
level=fatal msg="failed to fetch Cluster: failed to generate asset \"Cluster\": failed to create cluster: failed to apply using Terraform"
~~~

Version-Release number of selected component (if applicable):
4.4.0-0.nightly-2020-07-18-033102

How reproducible:
Always

Steps to Reproduce:
1. Trigger an IPI install on AWS

Actual results:
Cluster creation failed

Expected results:
Cluster creation succeeds
This bug blocks all 4.4 IPI testing on AWS.
I believe this was an AWS outage this morning. Does it still reproduce?
This is believed to have been an AWS outage: AWS release jobs started failing at 03:24:30 EDT and have been green since 05:49:22 EDT. Please re-open if this reproduces.

From https://status.aws.amazon.com/:
"Between 12:02 AM and 2:35 AM PDT AWS customers experienced increased error rates while calling the IAM assume role, get session token and other APIs with the long term credentials. As of 2:35 AM PDT, we are fully recovered and the issue is resolved now. Other AWS services such as AWS CloudFormation whose features require these actions experienced similar impact."
This appears to have been an AWS outage. The rebuild succeeded on 4.4.0-0.nightly-2020-07-18-033102. Thanks.
Found the same issue in recent Prow CI logs [1].

[1] https://search.ci.openshift.org/?search=for+parameter+iamInstanceProfile.name+is+invalid.+Invalid+IAM+Instance+Profile+name&maxAge=168h&context=5&type=build-log&name=.*.*aws.*&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job
This could be a result of a race condition when using the resource before it has been created on the AWS side [1] [2]. [1] https://github.com/hashicorp/terraform/issues/15341 [2] https://github.com/hashicorp/terraform-provider-aws/issues/838
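If the race-condition theory holds, one common mitigation is to retry the launch with backoff while the instance profile is not yet visible. The following is only a minimal sketch of that idea, not what the installer or the Terraform AWS provider actually does. `LaunchError`, `is_profile_not_ready`, and `launch_with_retry` are hypothetical names, and `launch` stands in for whatever call starts the instance.

```python
import time


class LaunchError(Exception):
    """Hypothetical stand-in for the error raised by a failed instance launch."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def is_profile_not_ready(err):
    """Return True for the specific error quoted in this report."""
    return (
        err.code == "InvalidParameterValue"
        and "Invalid IAM Instance Profile name" in err.message
    )


def launch_with_retry(launch, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call launch(), retrying with exponential backoff on the profile error.

    Any other error, or the profile error on the final attempt, is re-raised.
    The attempt count and delays are assumptions chosen for illustration.
    """
    for attempt in range(attempts):
        try:
            return launch()
        except LaunchError as err:
            if not is_profile_not_ready(err) or attempt == attempts - 1:
                raise
            # Wait 1x, 2x, 4x, ... base_delay before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

Matching on both the error code and the message keeps the retry narrow, so unrelated `InvalidParameterValue` failures still surface immediately.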
The error was not found in recent CI logs.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069