Bug 1669274 - Error creating IAM instance profile xxxx: EntityAlreadyExists:
Summary: Error creating IAM instance profile xxxx: EntityAlreadyExists:
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Alex Crawford
QA Contact: Johnny Liu
URL:
Whiteboard:
Duplicates: 1669602
Depends On:
Blocks: 1664187
 
Reported: 2019-01-24 19:47 UTC by jooho lee
Modified: 2019-03-12 14:24 UTC (History)
3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-02-13 22:24:17 UTC
Target Upstream Version:
Embargoed:


Attachments
create cluster log (375.27 KB, text/plain)
2019-01-29 14:16 UTC, jooho lee
destroy log (5.97 MB, text/plain)
2019-01-31 20:39 UTC, jooho lee
log for destroy cluster with 0.11.0 (2.55 KB, text/plain)
2019-01-31 21:36 UTC, jooho lee
Create cluster log with 0.11.0 after destroy (393.95 KB, text/plain)
2019-01-31 21:36 UTC, jooho lee
Destroy cluster log with 0.11.0 after create fail (36.82 KB, text/plain)
2019-01-31 21:37 UTC, jooho lee

Description jooho lee 2019-01-24 19:47:26 UTC
Description of problem:

After "openshift-install destroy cluster", I can see the profiles are deleted from log file


~~~
time="2019-01-24T14:16:06-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-bootstrap-role" id=ocp4-bootstrap-role name=ocp4-bootstrap-role policy=ocp4-bootstrap-policy
time="2019-01-24T14:16:06-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-bootstrap-role" id=ocp4-bootstrap-role name=ocp4-bootstrap-role
time="2019-01-24T14:16:07-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-master-role" id=ocp4-master-role name=ocp4-master-role policy=ocp4_master_policy
time="2019-01-24T14:16:07-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-master-role" id=ocp4-master-role name=ocp4-master-role
time="2019-01-24T14:16:07-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-role policy=ocp4_worker_policy
time="2019-01-24T14:16:07-05:00" level=info msg=Disassociated IAM instance profile="arn:aws:iam::694280550618:instance-profile/ocp4-worker-profile" arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-profile role=ocp4-worker-role
time="2019-01-24T14:16:07-05:00" level=info msg=Deleted IAM instance profile="arn:aws:iam::694280550618:instance-profile/ocp4-worker-profile" arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-profile
time="2019-01-24T14:16:07-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-role
~~~

However, when I retry "openshift-install create cluster", it fails because the profiles already exist:

~~~
ERROR 	* module.masters.aws_iam_instance_profile.master: 1 error occurred: 
ERROR 	* aws_iam_instance_profile.master: Error creating IAM instance profile ocp4-master-profile: EntityAlreadyExists: Instance Profile ocp4-master-profile already exists. 
ERROR 	status code: 409, request id: b0d3f25b-200b-11e9-a981-0bae4ac18883 
ERROR                                              
ERROR                                              
ERROR 	* module.bootstrap.aws_iam_instance_profile.bootstrap: 1 error occurred: 
ERROR 	* aws_iam_instance_profile.bootstrap: Error creating IAM instance profile ocp4-bootstrap-profile: EntityAlreadyExists: Instance Profile ocp4-bootstrap-profile already exists. 
ERROR 	status code: 409, request id: b0d2e1be-200b-11e9-bbba-27de87cb8ca9 
~~~

I am also not sure how to delete these objects manually from the AWS console.

Version-Release number of the following components:


How reproducible:

Steps to Reproduce:
1. ./openshift-install-0.10.1 create cluster --log-level=debug --dir=aws

2. ./openshift-install-0.10.1  destroy cluster --log-level=debug --dir=aws

3.

Actual results:

Failed.

Comment 2 Alex Crawford 2019-01-25 20:24:23 UTC
*** Bug 1669602 has been marked as a duplicate of this bug. ***

Comment 3 Alex Crawford 2019-01-25 20:26:10 UTC
I'm failing to see the bug here. You have conflicting resources in your account. Those need to be removed before the installer can succeed. How do you think the behavior should differ?

Comment 4 jooho lee 2019-01-25 21:27:58 UTC
 You have conflicting resources in your account.
==> I thought "openshift-install destroy cluster" cleaned up all resources.

 Those need to be removed before the installer can succeed.
==> The question is how I can delete them from the AWS console. This issue happened because I retried the installation many times, and I deleted the folder (where the config files are stored) before running "openshift-install destroy cluster".
    If I delete the folder, the destroy command cannot be executed, so I have to run the "create cluster" command even though I know it will fail. After it fails, I can try to destroy.
    However, it looks like it did not delete all objects.
 

How do you think the behavior should differ?

==> I think this case is different from Bugzilla ticket 1669602.
    The reproduction steps are also different.

For this case, this is what I did.

1. ./openshift-install-0.10.1 create cluster --log-level=debug --dir=aws

Cluster Name: ocp40

2.rm -rf ./aws

3. ./openshift-install-0.10.1 create cluster --log-level=debug --dir=aws

4. ./openshift-install-0.10.1 destroy cluster --log-level=debug --dir=aws

5. rm -rf ./aws/terraforms.*

6. Delete the roles related to this cluster in the AWS console (because the OpenShift installer didn't delete them).

7. ./openshift-install-0.10.1 create cluster --log-level=debug --dir=aws

-Jooho

Comment 5 jooho lee 2019-01-25 21:31:10 UTC
One more thing:

If I don't do step 6, the error message is as follows:

```
ERROR 	* module.bootstrap.aws_iam_role.bootstrap: 1 error occurred: 
ERROR 	* aws_iam_role.bootstrap: Error creating IAM Role ocp40-bootstrap-role: EntityAlreadyExists: Role with name ocp40-bootstrap-role already exists. 
ERROR 	status code: 409, request id: d1df157a-20dd-11e9-a84d-21bcc48b71d8 
```

If I do step 6, the error is:
```
ERROR 	* module.iam.aws_iam_instance_profile.worker: 1 error occurred: 
ERROR 	* aws_iam_instance_profile.worker: Error creating IAM instance profile ocp40-worker-profile: EntityAlreadyExists: Instance Profile ocp40-worker-profile already exists. 
ERROR 	status code: 409, request id: bac6acb9-20e6-11e9-854e-352154b78abb 

```

I don't know where to delete the profile in the AWS console, so I could not test further.

-Jooho

Comment 6 Alex Crawford 2019-01-25 23:13:29 UTC
At step 2, you lost the ability to clean up your resources. Attempting to create a new cluster doesn't help because every cluster is assigned a random, unique identifier. You must keep that directory if you want to eventually destroy the cluster. You can follow https://access.redhat.com/solutions/3826921 to recover the information needed to destroy the cluster.

Comment 7 jooho lee 2019-01-27 16:11:00 UTC
I tried to recover the information, but I could not get any output from this command.

The CLUSTER_NAME I used for the installation is ocp40.
~~~
 AWS_REGION="us-east-1"; CLUSTER_NAME="ocp40"; aws ec2 describe-instances --filter "Name=tag:clusterid,Values=${ocp40}" --query 'Reservations[*].Instances[*].[Tags[?Key==`openshiftClusterID`].Value]' | jq '.[0][0][0][0]'

aws ec2 describe-instances --filter "Name=tag:clusterid,Values=${ocp40}"
~~~

By the way, the OCP installation had failed because of the maximum load balancer limit, so I could not create any EC2 instances. Hence, the above command should not show anything.

Comment 8 W. Trevor King 2019-01-28 21:02:11 UTC
(In reply to jooho lee from comment #7)
> By the way, the OCP installation had failed because of the maximum load
> balancer limit, so I could not create any EC2 instances. Hence, the above
> command should not show anything.

The VPC is closer to the root of the dependency graph, so it may be more reliable.  Try:

  $ aws ec2 --region us-east-1 describe-vpcs --output json | jq '.Vpcs[] | {"name": ([.Tags[] | select(.Key == "Name") | .Value][0]), "openshiftClusterID": ([.Tags[] | select(.Key == "openshiftClusterID") | .Value][0])} | select(.name | contains("ocp4"))'

Also, based on your topic post, your cluster name is "ocp4" not "ocp40".  For example, as seen in
arn:aws:iam::694280550618:role/ocp4-bootstrap-role.  You might want to double-check your region too; you'll need to adjust if you weren't using us-east-1.

Comment 9 jooho lee 2019-01-28 21:45:34 UTC
Thanks Trevor,

I tested 3 times with ocp4, ocp40 and jlee-ocp4.

That's why there were 2 params.

In my testing, I deleted the objects with a new metadata.json using the 'ocp4' cluster UUID obtained from the command you provided, then retried the installation with the same cluster name (ocp4).

I saw in the log that the profiles were deleted:
```
INFO Disassociated                                 IAM instance profile="arn:aws:iam::694280550618:instance-profile/ocp4-worker-profile" arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-profile role=ocp4-worker-role
INFO Deleted                                       IAM instance profile="arn:aws:iam::694280550618:instance-profile/ocp4-worker-profile" arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-profile
INFO Deleted  
```


However, I still encounter the same issues:
~~~
ERROR 2 errors occurred:                           
ERROR 	* module.bootstrap.aws_iam_instance_profile.bootstrap: 1 error occurred: 
ERROR 	* aws_iam_instance_profile.bootstrap: Error creating IAM instance profile ocp4-bootstrap-profile: EntityAlreadyExists: Instance Profile ocp4-bootstrap-profile already exists. 
ERROR 	status code: 409, request id: 76746af6-2344-11e9-b795-e1661da7aeec 
ERROR                                              
ERROR                                              
ERROR 	* module.masters.aws_iam_instance_profile.master: 1 error occurred: 
ERROR 	* aws_iam_instance_profile.master: Error creating IAM instance profile ocp4-master-profile: EntityAlreadyExists: Instance Profile ocp4-master-profile already exists. 
ERROR 	status code: 409, request id: 766a0a98-2344-11e9-b795-e1661da7aeec 
~~~



This is what I did:
~~~
$ export CLUSTER_NAME=$(aws ec2 --region us-east-1 describe-vpcs --output json | jq '.Vpcs[] | {"name": ([.Tags[] | select(.Key == "Name") | .Value][0]), "openshiftClusterID": ([.Tags[] | select(.Key == "openshiftClusterID") | .Value][0])} | select(.name | contains("ocp4"))')

$ echo "{\"clusterName\":\"${CLUSTER_NAME}\",\"clusterID\":\"${CLUSTER_UUID}\",\"aws\":{\"region\":\"${AWS_REGION}\",\"identifier\":[{\"openshiftClusterID\":\"${CLUSTER_UUID}\"},{\"kubernetes.io/cluster/${CLUSTER_NAME}\":\"owned\"}]}}" > metadata.json


$ ./openshift-install-0.10.1  destroy cluster --log-level=debug

$ ./openshift-install-0.10.1  create cluster --log-level=debug
~~~

Comment 10 W. Trevor King 2019-01-28 22:17:13 UTC
> $ export CLUSTER_NAME=$(aws ec2 --region us-east-1 describe-vpcs --output json | jq '.Vpcs[] | {"name": ([.Tags[] | select(.Key == "Name") | .Value][0]), "openshiftClusterID": ([.Tags[] | select(.Key == "openshiftClusterID") | .Value][0])} | select(.name | contains("ocp4"))')
>
> $ echo "{\"clusterName\":\"${CLUSTER_NAME}\",\"clusterID\":\"${CLUSTER_UUID}\",\"aws\":{\"region\":\"${AWS_REGION}\",\"identifier\":[{\"openshiftClusterID\":\"${CLUSTER_UUID}\"},{\"kubernetes.io/cluster/${CLUSTER_NAME}\":\"owned\"}]}}" > metadata.json

That describe-vpcs call should dump out JSON with cluster-name, cluster-ID pairs.  And you already know the cluster name (it's the ocp4 you feed into that describe-vpcs command).  The describe-vpcs command lets you find cluster ID(s) for a given cluster name.  So you want:

  $ CLUSTER_NAME=ocp4
  $ CLUSTER_UUID=something-you-got-from-the-describe-vpcs-call
  $ echo ...
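
For reference, here is a minimal sketch of the full sequence under those assumptions (cluster name ocp4, region us-east-1; CLUSTER_UUID is whatever openshiftClusterID the describe-vpcs call returned, and the echo line is the same one quoted above):

  $ AWS_REGION=us-east-1
  $ CLUSTER_NAME=ocp4
  $ CLUSTER_UUID=<openshiftClusterID-from-describe-vpcs>
  $ echo "{\"clusterName\":\"${CLUSTER_NAME}\",\"clusterID\":\"${CLUSTER_UUID}\",\"aws\":{\"region\":\"${AWS_REGION}\",\"identifier\":[{\"openshiftClusterID\":\"${CLUSTER_UUID}\"},{\"kubernetes.io/cluster/${CLUSTER_NAME}\":\"owned\"}]}}" > metadata.json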

Comment 11 jooho lee 2019-01-29 14:14:32 UTC
@Trevor,

Yes, I got the UUID and successfully finished `openshift-install destroy cluster`.

However, when I try to create the cluster again with the same cluster name, I still get these errors (I will attach the full log):
~~~
ERROR 	* module.bootstrap.aws_iam_instance_profile.bootstrap: 1 error occurred: 
ERROR 	* aws_iam_instance_profile.bootstrap: Error creating IAM instance profile ocp4-bootstrap-profile: EntityAlreadyExists: Instance Profile ocp4-bootstrap-profile already exists. 
ERROR 	status code: 409, request id: 154dc012-23cf-11e9-a975-d96e17ae8e64 
ERROR                                              
ERROR                                              
ERROR 	* module.masters.aws_iam_instance_profile.master: 1 error occurred: 
ERROR 	* aws_iam_instance_profile.master: Error creating IAM instance profile ocp4-master-profile: EntityAlreadyExists: Instance Profile ocp4-master-profile already exists. 
ERROR 	status code: 409, request id: 15c6e71d-23cf-11e9-a975-d96e17ae8e64 
~~~

FYI, here is the log from executing `openshift-install destroy cluster` before executing `openshift-install create cluster`:

```
$ ./openshift-install-0.10.1  destroy cluster --log-level=debug  
DEBUG search for and delete matching resources by tag in us-east-1 matching aws.Filter{"openshiftClusterID":"e5130fc1-ad94-49ad-ac0c-ec051942e0be"} 
DEBUG search for and delete matching resources by tag in us-east-1 matching aws.Filter{"kubernetes.io/cluster/ocp4":"owned"} 
DEBUG no deletions from us-east-1, removing client 
DEBUG search for IAM roles                         
DEBUG search for IAM users     
```

-Jooho

Comment 12 jooho lee 2019-01-29 14:16:38 UTC
Created attachment 1524630 [details]
create cluster log

Comment 13 W. Trevor King 2019-01-31 05:35:06 UTC
> ERROR 	* aws_iam_instance_profile.bootstrap: Error creating IAM instance profile ocp4-bootstrap-profile: EntityAlreadyExists: Instance Profile ocp4-bootstrap-profile already exists
> ...
> FYI, here is the log from executing `openshift-install destroy cluster` before executing `openshift-install create cluster`....

The delete logs say nothing about removing instance profiles, so I think they're still left over from previous runs.  Can you check the tags on ocp4-bootstrap-profile?  And paste the metadata.json you used?

Comment 14 Johnny Liu 2019-01-31 11:00:21 UTC
I cannot reproduce this issue; https://access.redhat.com/solutions/3826921 works well in my testing.

# ./openshift-install version
./openshift-install v0.11.0

Here is my steps:
1. run `openshift-install create install-config` to generate an install-config file
2. mkdir test1 test2 cleanup
3. cp ./testing-install-config.yaml ./test1; ./openshift-install create cluster --dir ./test1; installation is completed
4. Following https://access.redhat.com/solutions/3826921 to generate metadata.json file into cleanup folder
5. ./openshift-install destroy cluster --dir ./cleanup; destroy is completed
6. cp ./testing-install-config.yaml ./test2; ./openshift-install create cluster --dir ./test2; installation is completed, cluster is running well.

Comment 15 jooho lee 2019-01-31 16:02:23 UTC
Thanks Johnny,

Actually, I updated the KCS significantly and also checked that those steps work fine.

There are some differences from my reproduction steps:

~~~
3. cp ./testing-install-config.yaml ./test1; ./openshift-install create cluster --dir ./test1; installation is completed  <====

==> I have never completed an installation because of the load balancer/VPC limit, so the create process always failed.
    If I could install OCP4 successfully, I could say whether this is caused by that failure or not, but the limit issues have been around since the 23rd (installation party).

4. Following https://access.redhat.com/solutions/3826921 to generate metadata.json file into cleanup folder

5. ./openshift-install destroy cluster --dir ./cleanup; destroy is completed
==> I can see the "destroy" completes as well, but when I try to install again with the same cluster name, it shows the error messages. (I didn't use the metadata.json file that was generated before.)


6. cp ./testing-install-config.yaml ./test2; ./openshift-install create cluster --dir ./test2; installation is completed, cluster is running well.   <=== ( I didn't use metadata.json file that is generated before)

Comment 16 W. Trevor King 2019-01-31 16:34:20 UTC
> ... I didn't use metadata.json file that is generated before...

This still leaves me suspicious that your issue is an incorrectly reconstituted metadata.json.  Can you provide the information requested in comment 13?

Comment 17 jooho lee 2019-01-31 17:02:05 UTC
Sorry Trevor, I missed the comment.


 Can you check the tags on ocp4-bootstrap-profile?  
==> Where can I check the tags? (I am not familiar with AWS yet.)

And paste the metadata.json you used?

~~~
{"clusterName":"ocp4","clusterID":"b78c01d7-9a47-479c-8476-f43e60ebd704","aws":{"region":"us-east-1","identifier":[{"openshiftClusterID":"b78c01d7-9a47-479c-8476-f43e60ebd704"},{"kubernetes.io/cluster/ocp4":"owned"}]}}
~~~

-Jooho

Comment 18 W. Trevor King 2019-01-31 17:53:17 UTC
> >  Can you check the tags on ocp4-bootstrap-profile? 
> ==> where can I check the tags? (I am not familiar with aws yet)

Ah, sorry, looks like AWS still doesn't support tagging instance profiles.  We currently delete these when we delete the associated instance and/or IAM role (which *are* tagged).  But if you somehow get those deleted, you may be left with dangling instance profiles.  We try hard to not remove the instances or roles before we have the instance profiles removed, but you can get into that state if something else is removing resources without being as careful, or maybe we have bugs in our guards.  Can you check to see if you have instances or roles left with your name?  Try:

  $ aws --region us-east-1 resourcegroupstaggingapi get-resources --query "ResourceTagMappingList[?Tags[? Key == 'kubernetes.io/cluster/ocp4' && Value == 'owned']].ResourceARN" --output text
  $ aws --region us-east-1 iam list-role-tags --role-name ocp4-bootstrap-role
  $ aws --region us-east-1 iam list-role-tags --role-name ocp4-master-role

Comment 19 jooho lee 2019-01-31 17:58:23 UTC
$ aws --region us-east-1 resourcegroupstaggingapi get-resources --query "ResourceTagMappingList[?Tags[? Key == 'kubernetes.io/cluster/ocp4' && Value == 'owned']].ResourceARN" --output text

~~~
  aws --region us-east-1 resourcegroupstaggingapi get-resources --query "ResourceTagMappingList[?Tags[? Key == 'kubernetes.io/cluster/ocp4' && Value == 'owned']].ResourceARN" --output text
arn:aws:ec2:us-east-1:694280550618:dhcp-options/dopt-09408a14d9446c1a1	arn:aws:ec2:us-east-1:694280550618:elastic-ip/eipalloc-01c38c3b50ca34a17	arn:aws:ec2:us-east-1:694280550618:elastic-ip/eipalloc-01f0fdd0812952b45	arn:aws:ec2:us-east-1:694280550618:elastic-ip/eipalloc-02bac15763c765196
arn:aws:ec2:us-east-1:694280550618:elastic-ip/eipalloc-050c1babb311635af	arn:aws:ec2:us-east-1:694280550618:elastic-ip/eipalloc-05ba35bd608f15dcb
arn:aws:ec2:us-east-1:694280550618:elastic-ip/eipalloc-0dd103bf30df580ff
arn:aws:ec2:us-east-1:694280550618:internet-gateway/igw-013a7893d031db052
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-005b489e99acbb87c	arn:aws:ec2:us-east-1:694280550618:natgateway/nat-00eaceeb95e59c724	arn:aws:ec2:us-east-1:694280550618:natgateway/nat-013d364ef9b18b5afarn:aws:ec2:us-east-1:694280550618:natgateway/nat-015c4ee6e231f8623	arn:aws:ec2:us-east-1:694280550618:natgateway/nat-01c60c08fcc7ff4e6	arn:aws:ec2:us-east-1:694280550618:natgateway/nat-02aab6f8dde7adb12arn:aws:ec2:us-east-1:694280550618:natgateway/nat-02f0ba9daab849665	arn:aws:ec2:us-east-1:694280550618:natgateway/nat-035252b1e4816cc79
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-039bb3a9488994772	arn:aws:ec2:us-east-1:694280550618:natgateway/nat-03cb6d7507bb36801	arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0490a47137ffcb499arn:aws:ec2:us-east-1:694280550618:natgateway/nat-062978066eeea6fc4	arn:aws:ec2:us-east-1:694280550618:natgateway/nat-080a251ef00287f1e
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-089f3d0b0724fc167	arn:aws:ec2:us-east-1:694280550618:natgateway/nat-094a3410c70099f3b	arn:aws:ec2:us-east-1:694280550618:natgateway/nat-098a03cdce2e62781arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0af9fec9abebeeda1	arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0b5ce15ebf9914806	arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0bbdbc1eb8de8807d
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0d7c0ea7a73addc42	arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0d925a7abc3ea5b5c	arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0e4dde24f1156a2dfarn:aws:ec2:us-east-1:694280550618:natgateway/nat-0efc16e17b00ec201	arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0f460db10b6d1c915	arn:aws:ec2:us-east-1:694280550618:route-table/rtb-000e64c23fac4f32b
arn:aws:ec2:us-east-1:694280550618:route-table/rtb-0229d14b5196a264a	arn:aws:ec2:us-east-1:694280550618:route-table/rtb-02754d2148aee8f36	arn:aws:ec2:us-east-1:694280550618:route-table/rtb-03b274238c896061d
arn:aws:ec2:us-east-1:694280550618:route-table/rtb-0410f2d605870d1f5	arn:aws:ec2:us-east-1:694280550618:route-table/rtb-05ae27866ffa2990b	arn:aws:ec2:us-east-1:694280550618:route-table/rtb-06ebbf11cfc9fcfcc
arn:aws:ec2:us-east-1:694280550618:route-table/rtb-0755be223fa4a0a58
arn:aws:ec2:us-east-1:694280550618:route-table/rtb-0ca2ed23df356274d	arn:aws:ec2:us-east-1:694280550618:route-table/rtb-0d98eb10bcb98a0c7
arn:aws:ec2:us-east-1:694280550618:security-group/sg-003d08fa9cc4aa967
arn:aws:ec2:us-east-1:694280550618:security-group/sg-04c8855918950681f
arn:aws:ec2:us-east-1:694280550618:security-group/sg-09b782112b56814ec
arn:aws:ec2:us-east-1:694280550618:security-group/sg-0ddfbef29db01a641	arn:aws:ec2:us-east-1:694280550618:security-group/sg-0e9de1887f12bfca1
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-019318f9a8185c6f3	arn:aws:ec2:us-east-1:694280550618:subnet/subnet-0338973fc756753a1
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-03c7e4f09165b14ab
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-06436774ab1f10da4	arn:aws:ec2:us-east-1:694280550618:subnet/subnet-065b141b4e19064ce
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-08084b58c4543844e
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-09ca3b5898017e509	arn:aws:ec2:us-east-1:694280550618:subnet/subnet-0ac20b8dd00f9674c	arn:aws:ec2:us-east-1:694280550618:subnet/subnet-0b010217d5b058b50
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-0c63ddb163f861fa2
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-0d87bb6624753f1f5
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-0fed3ca6c6a603f3e
arn:aws:ec2:us-east-1:694280550618:vpc/vpc-04aa480d1bf9ef648
arn:aws:elasticloadbalancing:us-east-1:694280550618:loadbalancer/net/ocp4-ext/1126b1bd05fd3c6b
arn:aws:elasticloadbalancing:us-east-1:694280550618:targetgroup/ocp4-api-ext/22a8ef37515a86c7	arn:aws:elasticloadbalancing:us-east-1:694280550618:targetgroup/ocp4-api-int/00381ff98fe0f1a7	arn:aws:elasticloadbalancing:us-east-1:694280550618:targetgroup/ocp4-services/16ea19be3886a5c0
arn:aws:route53:::hostedzone/ZHUTW6DPO4S38
~~~


aws --region us-east-1 iam list-role-tags --role-name ocp4-bootstrap-role
~~~
Invalid choice: 'list-role-tags', maybe you meant:

  * list-roles
~~~

It seems the command is wrong.

Comment 20 jooho lee 2019-01-31 18:02:20 UTC
Oh, it is because of the awscli version.

I will download the newer version and try again.

Comment 21 W. Trevor King 2019-01-31 18:02:41 UTC
> Invalid choice: 'list-role-tags', maybe you meant...

It's newish functionality, but it's there [1].  Maybe install a more-recent aws command?  Or use the AWS web console (although I'm not sure quite where you'd look).

> arn:aws:ec2:us-east-1:694280550618:vpc/vpc-04aa480d1bf9ef648

But this should be sufficient for discovering your tags.  Try:

  $ aws --region us-east-1 ec2 describe-vpcs --vpc-ids vpc-04aa480d1bf9ef648

[1]: https://docs.aws.amazon.com/cli/latest/reference/iam/list-role-tags.html

Comment 22 jooho lee 2019-01-31 18:06:35 UTC
Here we go..

$ aws --region us-east-1 iam list-role-tags --role-name ocp4-bootstrap-role
False
TAGS	openshiftClusterID	b78c01d7-9a47-479c-8476-f43e60ebd704
TAGS	Name	ocp4-bootstrap


$ aws --region us-east-1 iam list-role-tags --role-name ocp4-master-role

False
TAGS	openshiftClusterID	b78c01d7-9a47-479c-8476-f43e60ebd704
TAGS	kubernetes.io/cluster/ocp4	owned

Comment 23 W. Trevor King 2019-01-31 18:11:30 UTC
> Here we go..
>
> $ aws --region us-east-1 iam list-role-tags --role-name ocp4-bootstrap-role
> False
> TAGS	openshiftClusterID	b78c01d7-9a47-479c-8476-f43e60ebd704
> TAGS	Name	ocp4-bootstrap

That matches your metadata.json from comment 17.  You fed that into 'openshift-install destroy cluster' and it discovered no resources (per comment 11)?  Can you run that deletion again and attach .openshift_install.log?

Comment 24 jooho lee 2019-01-31 20:39:08 UTC
Created attachment 1525588 [details]
destroy log

Comment 25 jooho lee 2019-01-31 20:40:54 UTC
I attached the log, but the destroy command never finishes, so I stopped it and then uploaded the log.

Comment 26 W. Trevor King 2019-01-31 21:05:48 UTC
> ... but the destroy command never finish...

This means "the installer is failing to destroy resources associated with the cluster, and is repeatedly trying in hopes that it will eventually succeed".  From the tail of your logs:

time="2019-01-31T15:35:41-05:00" level=debug msg="search for and delete matching resources by tag in us-east-1 matching aws.Filter{\"openshiftClusterID\":\"b78c01d7-9a47-479c-8476-f43e60ebd704\"}"
time="2019-01-31T15:35:41-05:00" level=debug msg="deleting arn:aws:ec2:us-east-1:694280550618:dhcp-options/dopt-09408a14d9446c1a1: unrecognized EC2 resource type dhcp-options"
time="2019-01-31T15:35:41-05:00" level=debug msg="search for and delete matching resources by tag in us-east-1 matching aws.Filter{\"kubernetes.io/cluster/ocp4\":\"owned\"}"

which means the installer you used to create the cluster included [1] (released with 0.11.0), but the installer you're using to destroy the cluster does not.  Try using 0.11.0 to destroy the cluster.

$ grep 'info.*\(instance-profile\|role\)' /tmp/openshift_install.log 
time="2019-01-31T14:15:28-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-bootstrap-role" id=ocp4-bootstrap-role name=ocp4-bootstrap-role policy=ocp4-bootstrap-policy
time="2019-01-31T14:15:29-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-bootstrap-role" id=ocp4-bootstrap-role name=ocp4-bootstrap-role
time="2019-01-31T14:15:29-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-master-role" id=ocp4-master-role name=ocp4-master-role policy=ocp4_master_policy
time="2019-01-31T14:15:29-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-master-role" id=ocp4-master-role name=ocp4-master-role
time="2019-01-31T14:15:29-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-role policy=ocp4_worker_policy
time="2019-01-31T14:15:29-05:00" level=info msg=Disassociated IAM instance profile="arn:aws:iam::694280550618:instance-profile/ocp4-worker-profile" arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-profile role=ocp4-worker-role
time="2019-01-31T14:15:29-05:00" level=info msg=Deleted IAM instance profile="arn:aws:iam::694280550618:instance-profile/ocp4-worker-profile" arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-profile
time="2019-01-31T14:15:29-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-role

This has all three roles being deleted, but only the worker instance profile being deleted.  But it looks like you're using your metadata.json from a recent (failed) creation, which will not have the right cluster ID for your orphaned resources.  Try again with your reconstructed metadata.json from comment 17.

Comparing with a recent CI run:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/1166/pull-ci-openshift-installer-master-e2e-aws/3314/artifacts/e2e-aws/installer/.openshift_install.log | grep 'info.*\(instance-profile\|role\)'
time="2019-01-31T19:44:43Z" level=info msg=Disassociated IAM instance profile="arn:aws:iam::460538899914:instance-profile/ci-op-cm66vt6r-1d3f3-master-profile" arn="arn:aws:ec2:us-east-1:460538899914:instance/i-096e3aab076389774" id=i-096e3aab076389774 name=ci-op-cm66vt6r-1d3f3-master-profile role=ci-op-cm66vt6r-1d3f3-master-role
time="2019-01-31T19:44:43Z" level=info msg=Deleted IAM instance profile="arn:aws:iam::460538899914:instance-profile/ci-op-cm66vt6r-1d3f3-master-profile" arn="arn:aws:ec2:us-east-1:460538899914:instance/i-096e3aab076389774" id=i-096e3aab076389774 name=ci-op-cm66vt6r-1d3f3-master-profile
time="2019-01-31T19:44:47Z" level=info msg=Disassociated IAM instance profile="arn:aws:iam::460538899914:instance-profile/ci-op-cm66vt6r-1d3f3-worker-profile" arn="arn:aws:ec2:us-east-1:460538899914:instance/i-03d0b17820c3bd5a0" id=i-03d0b17820c3bd5a0 name=ci-op-cm66vt6r-1d3f3-worker-profile role=ci-op-cm66vt6r-1d3f3-worker-role
time="2019-01-31T19:44:47Z" level=info msg=Deleted IAM instance profile="arn:aws:iam::460538899914:instance-profile/ci-op-cm66vt6r-1d3f3-worker-profile" arn="arn:aws:ec2:us-east-1:460538899914:instance/i-03d0b17820c3bd5a0" id=i-03d0b17820c3bd5a0 name=ci-op-cm66vt6r-1d3f3-worker-profile
time="2019-01-31T19:46:50Z" level=info msg=Deleted arn="arn:aws:iam::460538899914:role/ci-op-cm66vt6r-1d3f3-master-role" id=ci-op-cm66vt6r-1d3f3-master-role name=ci-op-cm66vt6r-1d3f3-master-role policy=ci-op-cm66vt6r-1d3f3_master_policy
time="2019-01-31T19:46:51Z" level=info msg=Deleted arn="arn:aws:iam::460538899914:role/ci-op-cm66vt6r-1d3f3-master-role" id=ci-op-cm66vt6r-1d3f3-master-role name=ci-op-cm66vt6r-1d3f3-master-role
time="2019-01-31T19:46:51Z" level=info msg=Deleted arn="arn:aws:iam::460538899914:role/ci-op-cm66vt6r-1d3f3-worker-role" id=ci-op-cm66vt6r-1d3f3-worker-role name=ci-op-cm66vt6r-1d3f3-worker-role policy=ci-op-cm66vt6r-1d3f3_worker_policy
time="2019-01-31T19:46:51Z" level=info msg=Deleted arn="arn:aws:iam::460538899914:role/ci-op-cm66vt6r-1d3f3-worker-role" id=ci-op-cm66vt6r-1d3f3-worker-role name=ci-op-cm66vt6r-1d3f3-worker-role

which is deleting all three instance profiles.

[1]: https://github.com/openshift/installer/pull/1116

Comment 27 jooho lee 2019-01-31 21:35:11 UTC
OK. Here is the result.

1. I destroyed with the 0.11.0 binary (log: openshift_install.log.destroy_before_create).
   It looks like it deleted the profile object:
   (aws --region us-east-1 iam list-role-tags --role-name ocp4-bootstrap-role)
    An error occurred (NoSuchEntity) when calling the ListRoleTags operation: The role with name ocp4-bootstrap-role cannot be found.

2. I tried to create a cluster with the same cluster name "ocp4" (log: openshift_install.log_create_after_destroy).
   Still, the error messages show up:

ERROR 	* module.bootstrap.aws_iam_instance_profile.bootstrap: 1 error occurred: 
ERROR 	* aws_iam_instance_profile.bootstrap: Error creating IAM instance profile ocp4-bootstrap-profile: EntityAlreadyExists: Instance Profile ocp4-bootstrap-profile already exists. 
ERROR 	status code: 409, request id: 0284d1cd-259e-11e9-9ec7-7778711760e6 
ERROR                                              
ERROR                                              
ERROR 	* module.masters.aws_iam_instance_profile.master: 1 error occurred: 
ERROR 	* aws_iam_instance_profile.master: Error creating IAM instance profile ocp4-master-profile: EntityAlreadyExists: Instance Profile ocp4-master-profile already exists. 
ERROR 	status code: 409, request id: 02837224-259e-11e9-9385-61fa1054d42b 

  Checked the tag:
  ( aws --region us-east-1 iam list-role-tags --role-name ocp4-master-role
False
TAGS	openshiftClusterID	782fec19-527a-4c9d-a949-34bfe2b2ea6a
TAGS	kubernetes.io/cluster/ocp4	owned)

  cat metadata.json
  ~~~
{"clusterName":"ocp4","clusterID":"782fec19-527a-4c9d-a949-34bfe2b2ea6a","aws":{"region":"us-east-1","identifier":[{"openshiftClusterID":"782fec19-527a-4c9d-a949-34bfe2b2ea6a"},{"kubernetes.io/cluster/ocp4":"owned"}]}}
~~~


3. I tried to destroy the cluster (log: openshift_install.log_destroy_after_create).
   It looks like it deleted the profile:
~~~
aws --region us-east-1 iam list-role-tags --role-name ocp4-master-role

An error occurred (NoSuchEntity) when calling the ListRoleTags operation: The role with name ocp4-master-role cannot be found.

~~~


I am attaching the 3 logs.

Comment 28 jooho lee 2019-01-31 21:36:00 UTC
Created attachment 1525599 [details]
log for destroy cluster with 0.11.0

Comment 29 jooho lee 2019-01-31 21:36:44 UTC
Created attachment 1525600 [details]
Create cluster log with 0.11.0 after destroy

Comment 30 jooho lee 2019-01-31 21:37:20 UTC
Created attachment 1525601 [details]
Destroy cluster log with 0.11.0 after create fail

Comment 31 W. Trevor King 2019-01-31 21:46:57 UTC
(In reply to jooho lee from comment #27)
> 1. I destroy with 0.11.0 binary  (log: 
> openshift_install.log.destroy_before_create)
>    Looks it deletes the profile object
>    (aws --region us-east-1 iam list-role-tags --role-name
> ocp4-bootstrap-role)
>     An error occurred (NoSuchEntity) when calling the ListRoleTags
> operation: The role with name ocp4-bootstrap-role cannot be found.

I don't know what deleted the bootstrap role, but the logs you attached only look like they delete the DHCP options (they also claim to delete NAT gateways, but AWS keeps those around for a while, so you can get multiple "successful" delete calls for a single NAT gateway).

And a missing role doesn't necessarily mean successful instance-profile cleanup.

> 2. Try to create a cluster with the same clustername "ocp4"  (log: 
> openshift_install.log_create_after_destroy  )
>    Still the error messages show the 
> 
> ERROR 	* module.bootstrap.aws_iam_instance_profile.bootstrap: 1 error
> occurred: 
> ERROR 	* aws_iam_instance_profile.bootstrap: Error creating IAM instance
> profile ocp4-bootstrap-profile: EntityAlreadyExists: Instance Profile
> ocp4-bootstrap-profile already exists.

Instead of running new creates, you should just check for your dangling instance profiles.  Something like:

  $ aws --region us-east-1 iam list-instance-profiles --output json | jq '[.InstanceProfiles[] | select(.InstanceProfileName | startswith("ocp4-"))

should do it.  Then probably just remove them with:

  $ aws --region us-east-1 iam delete-instance-profile --instance-profile-name ocp4-bootstrap-profile
  $ aws --region us-east-1 iam delete-instance-profile --instance-profile-name ocp4-master-profile

and then keep track of your deletion logs going forward, so you can paste the last deletion before you hit this next time.

Comment 32 W. Trevor King 2019-01-31 21:48:31 UTC
(In reply to W. Trevor King from comment #31)
>   $ aws --region us-east-1 iam list-instance-profiles --output json | jq '[.InstanceProfiles[] | select(.InstanceProfileName | startswith("ocp4-"))

Oops, should have been:

  $ aws --region us-east-1 iam list-instance-profiles --output json | jq '[.InstanceProfiles[] | select(.InstanceProfileName | startswith("ocp4-"))]'

Comment 33 jooho lee 2019-01-31 22:08:35 UTC

The profiles are still there. Actually, there are more than I expected.
~~~
aws --region us-east-1 iam list-instance-profiles --output json | jq '[.InstanceProfiles[] | select(.InstanceProfileName | startswith("ocp4-"))]'

[
  {
    "Path": "/",
    "InstanceProfileName": "ocp4-bootstrap-profile",
    "InstanceProfileId": "AIPAJHAD5ZBOR4OBCAE2O",
    "Arn": "arn:aws:iam::694280550618:instance-profile/ocp4-bootstrap-profile",
    "CreateDate": "2019-01-24T01:38:23Z",
    "Roles": []
  },
  {
    "Path": "/",
    "InstanceProfileName": "ocp4-master-profile",
    "InstanceProfileId": "AIPAIAQ6ZSIDNVT3XBSEI",
    "Arn": "arn:aws:iam::694280550618:instance-profile/ocp4-master-profile",
    "CreateDate": "2019-01-24T01:38:23Z",
    "Roles": []
  },
  {
    "Path": "/",
    "InstanceProfileName": "ocp4-on-aws-master-profile",
    "InstanceProfileId": "AIPAIIF6ST2CL6L3U724A",
    "Arn": "arn:aws:iam::694280550618:instance-profile/ocp4-on-aws-master-profile",
    "CreateDate": "2019-01-18T10:11:57Z",
    "Roles": [
      {
        "Path": "/",
        "RoleName": "ocp4-on-aws-master-role",
        "RoleId": "AROAIEEWXSKPWJQLPVTGU",
        "Arn": "arn:aws:iam::694280550618:role/ocp4-on-aws-master-role",
        "CreateDate": "2019-01-18T10:11:55Z",
        "AssumeRolePolicyDocument": {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Sid": "",
              "Effect": "Allow",
              "Principal": {
                "Service": "ec2.amazonaws.com"
              },
              "Action": "sts:AssumeRole"
            }
          ]
        }
      }
    ]
  },
  {
    "Path": "/",
    "InstanceProfileName": "ocp4-on-aws-worker-profile",
    "InstanceProfileId": "AIPAJG24K437BY6IETETE",
    "Arn": "arn:aws:iam::694280550618:instance-profile/ocp4-on-aws-worker-profile",
    "CreateDate": "2019-01-18T10:11:57Z",
    "Roles": [
      {
        "Path": "/",
        "RoleName": "ocp4-on-aws-worker-role",
        "RoleId": "AROAI5W232Z7R5S4DIOMM",
        "Arn": "arn:aws:iam::694280550618:role/ocp4-on-aws-worker-role",
        "CreateDate": "2019-01-18T10:11:55Z",
        "AssumeRolePolicyDocument": {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Sid": "",
              "Effect": "Allow",
              "Principal": {
                "Service": "ec2.amazonaws.com"
              },
              "Action": "sts:AssumeRole"
            }
          ]
        }
      }
    ]
  }
]

~~~

I tried to delete them all, but some deletions failed:
~~~
 aws --region us-east-1 iam delete-instance-profile --instance-profile-name ocp4-bootstrap-profile
 aws --region us-east-1 iam delete-instance-profile --instance-profile-name ocp4-master-profile
 aws --region us-east-1 iam delete-instance-profile --instance-profile-name ocp4-on-aws-master-profile
  An error occurred (DeleteConflict) when calling the DeleteInstanceProfile operation: Cannot delete entity, must remove roles from instance profile first.

 aws --region us-east-1 iam delete-instance-profile --instance-profile-name ocp4-on-aws-worker-profile
  An error occurred (DeleteConflict) when calling the DeleteInstanceProfile operation: Cannot delete entity, must remove roles from instance profile first
~~~

As you can see, the last 2 profiles cannot be deleted. Should I delete them? If so, how can I delete them?

Comment 34 W. Trevor King 2019-01-31 22:37:13 UTC
>  aws --region us-east-1 iam delete-instance-profile --instance-profile-name ocp4-on-aws-master-profile
>   An error occurred (DeleteConflict) when calling the DeleteInstanceProfile operation: Cannot delete entity, must remove roles from instance profile first.

  $ aws --region us-east-1 iam remove-role-from-instance-profile --instance-profile-name ocp4-on-aws-master-profile --role-name ocp4-on-aws-master-role

and similar for your other instance-profile / role pairs.
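
If there are several dangling profiles, a small loop can handle them. This is just a minimal sketch (bash + jq), assuming the "ocp4-" name prefix used in this thread; review the list-instance-profiles output first, since that prefix also matches profiles from other deployments (e.g. ocp4-on-aws-*):

  prefix="ocp4-"
  aws iam list-instance-profiles --output json |
    jq -r --arg p "$prefix" '.InstanceProfiles[] | select(.InstanceProfileName | startswith($p)) | .InstanceProfileName' |
    while read -r profile; do
      # Detach any attached roles first; otherwise delete-instance-profile fails with DeleteConflict.
      aws iam get-instance-profile --instance-profile-name "$profile" --output json |
        jq -r '.InstanceProfile.Roles[].RoleName' |
        while read -r role; do
          aws iam remove-role-from-instance-profile --instance-profile-name "$profile" --role-name "$role"
        done
      aws iam delete-instance-profile --instance-profile-name "$profile"
    done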

Comment 35 jooho lee 2019-02-01 01:56:23 UTC
OK, I deleted them all.

Now, should I try to create a cluster for test?

Comment 36 W. Trevor King 2019-02-01 02:00:31 UTC
(In reply to jooho lee from comment #35)
> Now, should I try to create a cluster for test?

Yes, please :)

Comment 37 jooho lee 2019-02-01 14:36:39 UTC
Finally, I installed OCP4 on AWS.

I am going to create KCS articles regarding this Bugzilla ticket and will let you know.

Moreover, I will retest it and confirm.

Thanks a lot, Trevor!

