Description of problem:

After "openshift-install destroy cluster", the log file shows that the roles and profiles are deleted:

~~~
time="2019-01-24T14:16:06-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-bootstrap-role" id=ocp4-bootstrap-role name=ocp4-bootstrap-role policy=ocp4-bootstrap-policy
time="2019-01-24T14:16:06-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-bootstrap-role" id=ocp4-bootstrap-role name=ocp4-bootstrap-role
time="2019-01-24T14:16:07-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-master-role" id=ocp4-master-role name=ocp4-master-role policy=ocp4_master_policy
time="2019-01-24T14:16:07-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-master-role" id=ocp4-master-role name=ocp4-master-role
time="2019-01-24T14:16:07-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-role policy=ocp4_worker_policy
time="2019-01-24T14:16:07-05:00" level=info msg=Disassociated IAM instance profile="arn:aws:iam::694280550618:instance-profile/ocp4-worker-profile" arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-profile role=ocp4-worker-role
time="2019-01-24T14:16:07-05:00" level=info msg=Deleted IAM instance profile="arn:aws:iam::694280550618:instance-profile/ocp4-worker-profile" arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-profile
time="2019-01-24T14:16:07-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-role
~~~

However, when I retry "openshift-install create cluster", it fails because the instance profiles still exist:

~~~
ERROR * module.masters.aws_iam_instance_profile.master: 1 error occurred:
ERROR * aws_iam_instance_profile.master: Error creating IAM instance profile ocp4-master-profile: EntityAlreadyExists: Instance Profile ocp4-master-profile already exists.
ERROR   status code: 409, request id: b0d3f25b-200b-11e9-a981-0bae4ac18883
ERROR
ERROR
ERROR * module.bootstrap.aws_iam_instance_profile.bootstrap: 1 error occurred:
ERROR * aws_iam_instance_profile.bootstrap: Error creating IAM instance profile ocp4-bootstrap-profile: EntityAlreadyExists: Instance Profile ocp4-bootstrap-profile already exists.
ERROR   status code: 409, request id: b0d2e1be-200b-11e9-bbba-27de87cb8ca9
~~~

I am also not sure how I can delete these objects manually from the AWS console.

Version-Release number of the following components:

How reproducible:

Steps to Reproduce:
1. ./openshift-install-0.10.1 create cluster --log-level=debug --dir=aws
2. ./openshift-install-0.10.1 destroy cluster --log-level=debug --dir=aws
3.

Actual results:
Failed.
*** Bug 1669602 has been marked as a duplicate of this bug. ***
I'm failing to see the bug here. You have conflicting resources in your account. Those need to be removed before the installer can succeed. How do you think the behavior should differ?
You have conflicting resources in your account.
==> I thought "openshift-install destroy cluster" cleaned up all resources.

Those need to be removed before the installer can succeed.
==> The question is how to delete them in the AWS console.

This issue happened because I retried the installation many times. I deleted the folder (where the config files are stored) before running "openshift-install destroy cluster". Once the folder is deleted, the destroy command cannot be executed, so I had to run the "create cluster" command again even though I knew it would fail. After it failed, I could try to destroy. However, it looks like it didn't delete all objects.

How do you think the behavior should differ?
==> I think this case is different from bug 1669602; the way to reproduce it is also different. For this case, this is what I did:

1. ./openshift-install-0.10.1 create cluster --log-level=debug --dir=aws   (Cluster Name: ocp40)
2. rm -rf ./aws
3. ./openshift-install-0.10.1 create cluster --log-level=debug --dir=aws
4. ./openshift-install-0.10.1 destroy cluster --log-level=debug --dir=aws
5. rm -rf ./aws/terraforms.*
6. Delete the roles related to this cluster in the AWS console (because the openshift installer didn't delete them).
7. ./openshift-install-0.10.1 create cluster --log-level=debug --dir=aws

-Jooho
One more thing: if I don't do step 6, the error message is like the following:

```
ERROR * module.bootstrap.aws_iam_role.bootstrap: 1 error occurred:
ERROR * aws_iam_role.bootstrap: Error creating IAM Role ocp40-bootstrap-role: EntityAlreadyExists: Role with name ocp40-bootstrap-role already exists.
ERROR   status code: 409, request id: d1df157a-20dd-11e9-a84d-21bcc48b71d8
```

If I do step 6, I get:

```
ERROR * module.iam.aws_iam_instance_profile.worker: 1 error occurred:
ERROR * aws_iam_instance_profile.worker: Error creating IAM instance profile ocp40-worker-profile: EntityAlreadyExists: Instance Profile ocp40-worker-profile already exists.
ERROR   status code: 409, request id: bac6acb9-20e6-11e9-854e-352154b78abb
```

I don't know where I can delete the profile in the AWS console, so I could not test further.

-Jooho
At step 2, you lost the ability to clean up your resources. Attempting to create a new cluster doesn't help because every cluster is assigned a random, unique identifier. You must keep that directory if you want to eventually destroy the cluster. You can follow https://access.redhat.com/solutions/3826921 to recover the information needed to destroy the cluster.
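For reference, the destroy command only needs a small metadata.json. A rough sketch of reconstructing one by hand (the cluster name, region, and cluster ID below are placeholders; the real values have to be recovered from the tags on the surviving AWS resources, as the linked article describes):

~~~
# Sketch only: all three values are placeholders and must be replaced with
# values recovered from the tags on your surviving AWS resources.
CLUSTER_NAME=ocp4
AWS_REGION=us-east-1
CLUSTER_UUID=00000000-0000-0000-0000-000000000000

cat > metadata.json <<EOF
{"clusterName":"${CLUSTER_NAME}","clusterID":"${CLUSTER_UUID}","aws":{"region":"${AWS_REGION}","identifier":[{"openshiftClusterID":"${CLUSTER_UUID}"},{"kubernetes.io/cluster/${CLUSTER_NAME}":"owned"}]}}
EOF
~~~

With that metadata.json in an otherwise empty directory, 'openshift-install destroy cluster --dir <that-directory>' can find and delete the tagged resources.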
I tried to recover the information, but I could not get any output from this command. The CLUSTER_NAME is ocp40, which I used for the installation.

~~~
AWS_REGION="us-east-1"; CLUSTER_NAME="ocp40";

aws ec2 describe-instances --filter "Name=tag:clusterid,Values=${ocp40}" --query 'Reservations[*].Instances[*].[Tags[?Key==`openshiftClusterID`].Value]' | jq '.[0][0][0][0]'

aws ec2 describe-instances --filter "Name=tag:clusterid,Values=${ocp40}"
~~~

By the way, the OCP installation had failed because of the maximum load balancer limit, so no EC2 instances were ever created. Hence, the above command would not show anything.
(In reply to jooho lee from comment #7)
> By the way, the OCP installation had failed because of the maximum load
> balancer limit, so no EC2 instances were ever created. Hence, the above
> command would not show anything.

The VPC is closer to the root of the dependency graph, so it may be more reliable. Try:

$ aws ec2 --region us-east-1 describe-vpcs --output json | jq '.Vpcs[] | {"name": ([.Tags[] | select(.Key == "Name") | .Value][0]), "openshiftClusterID": ([.Tags[] | select(.Key == "openshiftClusterID") | .Value][0])} | select(.name | contains("ocp4"))'

Also, based on your topic post, your cluster name is "ocp4", not "ocp40"; for example, as seen in arn:aws:iam::694280550618:role/ocp4-bootstrap-role. You might want to double-check your region too; you'll need to adjust if you weren't using us-east-1.
Thanks Trevor,

I tested 3 times, with ocp4, ocp40, and jlee-ocp4. That's why there were 2 params.

From my testing, I deleted the objects with a new metadata.json using the 'ocp4' cluster uuid that came from the command you provided, then retried the installation with the same cluster name (ocp4).

I saw the profiles being deleted in the log:

```
INFO Disassociated IAM instance profile="arn:aws:iam::694280550618:instance-profile/ocp4-worker-profile" arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-profile role=ocp4-worker-role
INFO Deleted IAM instance profile="arn:aws:iam::694280550618:instance-profile/ocp4-worker-profile" arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-profile
INFO Deleted
```

However, I still encounter the same issue:

~~~
ERROR 2 errors occurred:
ERROR * module.bootstrap.aws_iam_instance_profile.bootstrap: 1 error occurred:
ERROR * aws_iam_instance_profile.bootstrap: Error creating IAM instance profile ocp4-bootstrap-profile: EntityAlreadyExists: Instance Profile ocp4-bootstrap-profile already exists.
ERROR   status code: 409, request id: 76746af6-2344-11e9-b795-e1661da7aeec
ERROR
ERROR
ERROR * module.masters.aws_iam_instance_profile.master: 1 error occurred:
ERROR * aws_iam_instance_profile.master: Error creating IAM instance profile ocp4-master-profile: EntityAlreadyExists: Instance Profile ocp4-master-profile already exists.
ERROR   status code: 409, request id: 766a0a98-2344-11e9-b795-e1661da7aeec
~~~

This is what I did:

~~~
$ export CLUSTER_NAME=$(aws ec2 --region us-east-1 describe-vpcs --output json | jq '.Vpcs[] | {"name": ([.Tags[] | select(.Key == "Name") | .Value][0]), "openshiftClusterID": ([.Tags[] | select(.Key == "openshiftClusterID") | .Value][0])} | select(.name | contains("ocp4"))')

$ echo "{\"clusterName\":\"${CLUSTER_NAME}\",\"clusterID\":\"${CLUSTER_UUID}\",\"aws\":{\"region\":\"${AWS_REGION}\",\"identifier\":[{\"openshiftClusterID\":\"${CLUSTER_UUID}\"},{\"kubernetes.io/cluster/${CLUSTER_NAME}\":\"owned\"}]}}" > metadata.json

$ ./openshift-install-0.10.1 destroy cluster --log-level=debug

$ ./openshift-install-0.10.1 create cluster --log-level=debug
~~~
> $ export CLUSTER_NAME=$(aws ec2 --region us-east-1 describe-vpcs --output json | jq '.Vpcs[] | {"name": ([.Tags[] | select(.Key == "Name") | .Value][0]), "openshiftClusterID": ([.Tags[] | select(.Key == "openshiftClusterID") | .Value][0])} | select(.name | contains("ocp4"))')
>
> $ echo "{\"clusterName\":\"${CLUSTER_NAME}\",\"clusterID\":\"${CLUSTER_UUID}\",\"aws\":{\"region\":\"${AWS_REGION}\",\"identifier\":[{\"openshiftClusterID\":\"${CLUSTER_UUID}\"},{\"kubernetes.io/cluster/${CLUSTER_NAME}\":\"owned\"}]}}" > metadata.json

That describe-vpcs call should dump out JSON with cluster-name, cluster-ID pairs. And you already know the cluster name (it's the "ocp4" you feed into that describe-vpcs command). The describe-vpcs command lets you find cluster ID(s) for a given cluster name. So you want:

$ CLUSTER_NAME=ocp4
$ CLUSTER_UUID=something-you-got-from-the-describe-vpcs-call
$ echo ...
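For illustration, a scripted version of that lookup might look like the sketch below (the region and cluster name are assumptions; the jq filter is the one from comment 8, just reduced to the ID, and the result should be checked by hand before destroying anything):

~~~
# Sketch: pull just the openshiftClusterID out of the VPC tags.
# Assumes region us-east-1 and cluster name "ocp4".
CLUSTER_NAME=ocp4
AWS_REGION=us-east-1

CLUSTER_UUID=$(aws ec2 --region "${AWS_REGION}" describe-vpcs --output json \
  | jq -r --arg name "${CLUSTER_NAME}" '.Vpcs[]
      | {"name": ([.Tags[] | select(.Key == "Name") | .Value][0]),
         "id": ([.Tags[] | select(.Key == "openshiftClusterID") | .Value][0])}
      | select(.name | contains($name))
      | .id')

echo "${CLUSTER_UUID}"   # feed this into the echo that writes metadata.json
~~~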
@Trevor,

Yes, I got the UUID and the "openshift-install destroy cluster" finished successfully. However, when I try to create a cluster again with the same cluster name, I still get these errors (I will attach the full log):

~~~
ERROR * module.bootstrap.aws_iam_instance_profile.bootstrap: 1 error occurred:
ERROR * aws_iam_instance_profile.bootstrap: Error creating IAM instance profile ocp4-bootstrap-profile: EntityAlreadyExists: Instance Profile ocp4-bootstrap-profile already exists.
ERROR   status code: 409, request id: 154dc012-23cf-11e9-a975-d96e17ae8e64
ERROR
ERROR
ERROR * module.masters.aws_iam_instance_profile.master: 1 error occurred:
ERROR * aws_iam_instance_profile.master: Error creating IAM instance profile ocp4-master-profile: EntityAlreadyExists: Instance Profile ocp4-master-profile already exists.
ERROR   status code: 409, request id: 15c6e71d-23cf-11e9-a975-d96e17ae8e64
~~~

FYI, I pasted this log from `openshift-install destroy cluster` before executing `openshift-install create cluster`:

```
$ ./openshift-install-0.10.1 destroy cluster --log-level=debug
DEBUG search for and delete matching resources by tag in us-east-1 matching aws.Filter{"openshiftClusterID":"e5130fc1-ad94-49ad-ac0c-ec051942e0be"}
DEBUG search for and delete matching resources by tag in us-east-1 matching aws.Filter{"kubernetes.io/cluster/ocp4":"owned"}
DEBUG no deletions from us-east-1, removing client
DEBUG search for IAM roles
DEBUG search for IAM users
```

-Jooho
Created attachment 1524630 [details] create cluster log
> ERROR * aws_iam_instance_profile.bootstrap: Error creating IAM instance profile ocp4-bootstrap-profile: EntityAlreadyExists: Instance Profile ocp4-bootstrap-profile already exists
> ...
> FYI, I pasted this log from `openshift-install destroy cluster` before executing `openshift-install create cluster`...

The delete logs say nothing about removing instance profiles, so I think they're still left over from previous runs. Can you check the tags on ocp4-bootstrap-profile? And paste the metadata.json you used?
I cannot reproduce this issue; https://access.redhat.com/solutions/3826921 works well in my testing.

# ./openshift-install version
./openshift-install v0.11.0

Here are my steps:
1. run `openshift-install create install-config` to generate a install-config file
2. mkdir test1 test2 cleanup
3. cp ./testing-install-config.yaml ./test1; ./openshift-install create cluster --dir ./test1; installation is completed
4. Following https://access.redhat.com/solutions/3826921 to generate metadata.json file into cleanup folder
5. ./openshift-install destroy cluster --dir ./cleanup; destroy is completed
6. cp ./testing-install-config.yaml ./test2; ./openshift-install create cluster --dir ./test2; installation is completed, cluster is running well.
Thanks Johnny,

Actually, I updated the KCS significantly and also checked that its steps work fine. There are some differences from my reproduction steps:

~~~
3. cp ./testing-install-config.yaml ./test1; ./openshift-install create cluster --dir ./test1; installation is completed <====
==> I have never completed an installation: the create process failed because of the load balancer/VPC limit. If I could install ocp4 successfully, I could say whether this is caused by that failure or not, but the limit issues have been around since the 23rd (installation party).

4. Following https://access.redhat.com/solutions/3826921 to generate metadata.json file into cleanup folder
5. ./openshift-install destroy cluster --dir ./cleanup; destroy is completed
==> I can see the "destroy" is completed as well, but when I try to install again with the same cluster name, it shows the error messages. (I didn't use the metadata.json file that was generated before.)

6. cp ./testing-install-config.yaml ./test2; ./openshift-install create cluster --dir ./test2; installation is completed, cluster is running well. <===
==> (I didn't use the metadata.json file that was generated before.)
~~~
> ... I didn't use the metadata.json file that was generated before...

This still leaves me suspicious that your issue is an incorrectly reconstituted metadata.json. Can you provide the information requested in comment 13?
Sorry Trevor, I missed the comment.

> Can you check the tags on ocp4-bootstrap-profile?
==> where can I check the tags? (I am not familiar with aws yet)

> And paste the metadata.json you used?
~~~
{"clusterName":"ocp4","clusterID":"b78c01d7-9a47-479c-8476-f43e60ebd704","aws":{"region":"us-east-1","identifier":[{"openshiftClusterID":"b78c01d7-9a47-479c-8476-f43e60ebd704"},{"kubernetes.io/cluster/ocp4":"owned"}]}}
~~~

-Jooho
> > Can you check the tags on ocp4-bootstrap-profile?
> ==> where can I check the tags? (I am not familiar with aws yet)

Ah, sorry, looks like AWS still doesn't support tagging instance profiles. We currently delete these when we delete the associated instance and/or IAM role (which *are* tagged). But if you somehow get those deleted, you may be left with dangling instance profiles. We try hard to not remove the instances or roles before we have the instance profiles removed, but you can get into that state if something else is removing resources without being as careful, or maybe we have bugs in our guards.

Can you check to see if you have instances or roles left with your name? Try:

$ aws --region us-east-1 resourcegroupstaggingapi get-resources --query "ResourceTagMappingList[?Tags[? Key == 'kubernetes.io/cluster/ocp4' && Value == 'owned']].ResourceARN" --output text
$ aws --region us-east-1 iam list-role-tags --role-name ocp4-bootstrap-role
$ aws --region us-east-1 iam list-role-tags --role-name ocp4-master-role
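Since the instance profiles themselves can't be tagged, another way to see whether they are still hanging around is to ask for them by name. A small sketch (the profile names assume the "ocp4" cluster name; adjust to match yours):

~~~
# Sketch: instance profiles cannot be tagged, so look them up by name instead.
aws --region us-east-1 iam get-instance-profile --instance-profile-name ocp4-bootstrap-profile
aws --region us-east-1 iam get-instance-profile --instance-profile-name ocp4-master-profile
aws --region us-east-1 iam get-instance-profile --instance-profile-name ocp4-worker-profile
~~~

A NoSuchEntity error means the profile is gone; any other output means it is still dangling.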
$ aws --region us-east-1 resourcegroupstaggingapi get-resources --query "ResourceTagMappingList[?Tags[? Key == 'kubernetes.io/cluster/ocp4' && Value == 'owned']].ResourceARN" --output text

~~~
aws --region us-east-1 resourcegroupstaggingapi get-resources --query "ResourceTagMappingList[?Tags[? Key == 'kubernetes.io/cluster/ocp4' && Value == 'owned']].ResourceARN" --output text
arn:aws:ec2:us-east-1:694280550618:dhcp-options/dopt-09408a14d9446c1a1
arn:aws:ec2:us-east-1:694280550618:elastic-ip/eipalloc-01c38c3b50ca34a17
arn:aws:ec2:us-east-1:694280550618:elastic-ip/eipalloc-01f0fdd0812952b45
arn:aws:ec2:us-east-1:694280550618:elastic-ip/eipalloc-02bac15763c765196
arn:aws:ec2:us-east-1:694280550618:elastic-ip/eipalloc-050c1babb311635af
arn:aws:ec2:us-east-1:694280550618:elastic-ip/eipalloc-05ba35bd608f15dcb
arn:aws:ec2:us-east-1:694280550618:elastic-ip/eipalloc-0dd103bf30df580ff
arn:aws:ec2:us-east-1:694280550618:internet-gateway/igw-013a7893d031db052
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-005b489e99acbb87c
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-00eaceeb95e59c724
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-013d364ef9b18b5af
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-015c4ee6e231f8623
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-01c60c08fcc7ff4e6
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-02aab6f8dde7adb12
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-02f0ba9daab849665
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-035252b1e4816cc79
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-039bb3a9488994772
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-03cb6d7507bb36801
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0490a47137ffcb499
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-062978066eeea6fc4
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-080a251ef00287f1e
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-089f3d0b0724fc167
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-094a3410c70099f3b
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-098a03cdce2e62781
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0af9fec9abebeeda1
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0b5ce15ebf9914806
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0bbdbc1eb8de8807d
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0d7c0ea7a73addc42
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0d925a7abc3ea5b5c
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0e4dde24f1156a2df
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0efc16e17b00ec201
arn:aws:ec2:us-east-1:694280550618:natgateway/nat-0f460db10b6d1c915
arn:aws:ec2:us-east-1:694280550618:route-table/rtb-000e64c23fac4f32b
arn:aws:ec2:us-east-1:694280550618:route-table/rtb-0229d14b5196a264a
arn:aws:ec2:us-east-1:694280550618:route-table/rtb-02754d2148aee8f36
arn:aws:ec2:us-east-1:694280550618:route-table/rtb-03b274238c896061d
arn:aws:ec2:us-east-1:694280550618:route-table/rtb-0410f2d605870d1f5
arn:aws:ec2:us-east-1:694280550618:route-table/rtb-05ae27866ffa2990b
arn:aws:ec2:us-east-1:694280550618:route-table/rtb-06ebbf11cfc9fcfcc
arn:aws:ec2:us-east-1:694280550618:route-table/rtb-0755be223fa4a0a58
arn:aws:ec2:us-east-1:694280550618:route-table/rtb-0ca2ed23df356274d
arn:aws:ec2:us-east-1:694280550618:route-table/rtb-0d98eb10bcb98a0c7
arn:aws:ec2:us-east-1:694280550618:security-group/sg-003d08fa9cc4aa967
arn:aws:ec2:us-east-1:694280550618:security-group/sg-04c8855918950681f
arn:aws:ec2:us-east-1:694280550618:security-group/sg-09b782112b56814ec
arn:aws:ec2:us-east-1:694280550618:security-group/sg-0ddfbef29db01a641
arn:aws:ec2:us-east-1:694280550618:security-group/sg-0e9de1887f12bfca1
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-019318f9a8185c6f3
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-0338973fc756753a1
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-03c7e4f09165b14ab
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-06436774ab1f10da4
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-065b141b4e19064ce
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-08084b58c4543844e
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-09ca3b5898017e509
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-0ac20b8dd00f9674c
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-0b010217d5b058b50
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-0c63ddb163f861fa2
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-0d87bb6624753f1f5
arn:aws:ec2:us-east-1:694280550618:subnet/subnet-0fed3ca6c6a603f3e
arn:aws:ec2:us-east-1:694280550618:vpc/vpc-04aa480d1bf9ef648
arn:aws:elasticloadbalancing:us-east-1:694280550618:loadbalancer/net/ocp4-ext/1126b1bd05fd3c6b
arn:aws:elasticloadbalancing:us-east-1:694280550618:targetgroup/ocp4-api-ext/22a8ef37515a86c7
arn:aws:elasticloadbalancing:us-east-1:694280550618:targetgroup/ocp4-api-int/00381ff98fe0f1a7
arn:aws:elasticloadbalancing:us-east-1:694280550618:targetgroup/ocp4-services/16ea19be3886a5c0
arn:aws:route53:::hostedzone/ZHUTW6DPO4S38
~~~

$ aws --region us-east-1 iam list-role-tags --role-name ocp4-bootstrap-role

~~~
Invalid choice: 'list-role-tags', maybe you meant:

  * list-roles
~~~

Seems the command is wrong..
Oh, it is because of the awscli version. I will download a newer version and try again.
> Invalid choice: 'list-role-tags', maybe you meant...

It's newish functionality, but it's there [1]. Maybe install a more-recent aws command? Or use the AWS web console (although I'm not sure quite where you'd look).

> arn:aws:ec2:us-east-1:694280550618:vpc/vpc-04aa480d1bf9ef648

But this should be sufficient for discovering your tags. Try:

$ aws --region us-east-1 ec2 describe-vpcs --vpc-ids vpc-04aa480d1bf9ef648

[1]: https://docs.aws.amazon.com/cli/latest/reference/iam/list-role-tags.html
Here we go..

$ aws --region us-east-1 iam list-role-tags --role-name ocp4-bootstrap-role
False
TAGS openshiftClusterID b78c01d7-9a47-479c-8476-f43e60ebd704
TAGS Name ocp4-bootstrap

$ aws --region us-east-1 iam list-role-tags --role-name ocp4-master-role
False
TAGS openshiftClusterID b78c01d7-9a47-479c-8476-f43e60ebd704
TAGS kubernetes.io/cluster/ocp4 owned
> Here we go..
>
> $ aws --region us-east-1 iam list-role-tags --role-name ocp4-bootstrap-role
> False
> TAGS openshiftClusterID b78c01d7-9a47-479c-8476-f43e60ebd704
> TAGS Name ocp4-bootstrap

That matches your metadata.json from comment 17. You fed that into 'openshift-install destroy cluster' and it discovered no resources (per comment 11)? Can you run that deletion again and attach .openshift_install.log?
Created attachment 1525588 [details] destroy log
I attached the log, but the destroy command never finishes, so I stopped it and then uploaded the log.
> ... but the destroy command never finishes...

This means "the installer is failing to destroy resources associated with the cluster, and is repeatedly trying in hopes that it will eventually succeed". From the tail of your logs:

time="2019-01-31T15:35:41-05:00" level=debug msg="search for and delete matching resources by tag in us-east-1 matching aws.Filter{\"openshiftClusterID\":\"b78c01d7-9a47-479c-8476-f43e60ebd704\"}"
time="2019-01-31T15:35:41-05:00" level=debug msg="deleting arn:aws:ec2:us-east-1:694280550618:dhcp-options/dopt-09408a14d9446c1a1: unrecognized EC2 resource type dhcp-options"
time="2019-01-31T15:35:41-05:00" level=debug msg="search for and delete matching resources by tag in us-east-1 matching aws.Filter{\"kubernetes.io/cluster/ocp4\":\"owned\"}"

which means the installer you used to create the cluster included [1] (released with 0.11.0), but the installer you're using to destroy the cluster does not. Try using 0.11.0 to destroy the cluster.

$ grep 'info.*\(instance-profile\|role\)' /tmp/openshift_install.log
time="2019-01-31T14:15:28-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-bootstrap-role" id=ocp4-bootstrap-role name=ocp4-bootstrap-role policy=ocp4-bootstrap-policy
time="2019-01-31T14:15:29-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-bootstrap-role" id=ocp4-bootstrap-role name=ocp4-bootstrap-role
time="2019-01-31T14:15:29-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-master-role" id=ocp4-master-role name=ocp4-master-role policy=ocp4_master_policy
time="2019-01-31T14:15:29-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-master-role" id=ocp4-master-role name=ocp4-master-role
time="2019-01-31T14:15:29-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-role policy=ocp4_worker_policy
time="2019-01-31T14:15:29-05:00" level=info msg=Disassociated IAM instance profile="arn:aws:iam::694280550618:instance-profile/ocp4-worker-profile" arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-profile role=ocp4-worker-role
time="2019-01-31T14:15:29-05:00" level=info msg=Deleted IAM instance profile="arn:aws:iam::694280550618:instance-profile/ocp4-worker-profile" arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-profile
time="2019-01-31T14:15:29-05:00" level=info msg=Deleted arn="arn:aws:iam::694280550618:role/ocp4-worker-role" id=ocp4-worker-role name=ocp4-worker-role

This has all three roles being deleted, but only the worker instance profile being deleted. But it looks like you're using your metadata.json from a recent (failed) creation, which will not have the right cluster ID for your orphaned resources. Try again with your reconstructed metadata.json from comment 17.
Comparing with a recent CI run:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/1166/pull-ci-openshift-installer-master-e2e-aws/3314/artifacts/e2e-aws/installer/.openshift_install.log | grep 'info.*\(instance-profile\|role\)'
time="2019-01-31T19:44:43Z" level=info msg=Disassociated IAM instance profile="arn:aws:iam::460538899914:instance-profile/ci-op-cm66vt6r-1d3f3-master-profile" arn="arn:aws:ec2:us-east-1:460538899914:instance/i-096e3aab076389774" id=i-096e3aab076389774 name=ci-op-cm66vt6r-1d3f3-master-profile role=ci-op-cm66vt6r-1d3f3-master-role
time="2019-01-31T19:44:43Z" level=info msg=Deleted IAM instance profile="arn:aws:iam::460538899914:instance-profile/ci-op-cm66vt6r-1d3f3-master-profile" arn="arn:aws:ec2:us-east-1:460538899914:instance/i-096e3aab076389774" id=i-096e3aab076389774 name=ci-op-cm66vt6r-1d3f3-master-profile
time="2019-01-31T19:44:47Z" level=info msg=Disassociated IAM instance profile="arn:aws:iam::460538899914:instance-profile/ci-op-cm66vt6r-1d3f3-worker-profile" arn="arn:aws:ec2:us-east-1:460538899914:instance/i-03d0b17820c3bd5a0" id=i-03d0b17820c3bd5a0 name=ci-op-cm66vt6r-1d3f3-worker-profile role=ci-op-cm66vt6r-1d3f3-worker-role
time="2019-01-31T19:44:47Z" level=info msg=Deleted IAM instance profile="arn:aws:iam::460538899914:instance-profile/ci-op-cm66vt6r-1d3f3-worker-profile" arn="arn:aws:ec2:us-east-1:460538899914:instance/i-03d0b17820c3bd5a0" id=i-03d0b17820c3bd5a0 name=ci-op-cm66vt6r-1d3f3-worker-profile
time="2019-01-31T19:46:50Z" level=info msg=Deleted arn="arn:aws:iam::460538899914:role/ci-op-cm66vt6r-1d3f3-master-role" id=ci-op-cm66vt6r-1d3f3-master-role name=ci-op-cm66vt6r-1d3f3-master-role policy=ci-op-cm66vt6r-1d3f3_master_policy
time="2019-01-31T19:46:51Z" level=info msg=Deleted arn="arn:aws:iam::460538899914:role/ci-op-cm66vt6r-1d3f3-master-role" id=ci-op-cm66vt6r-1d3f3-master-role name=ci-op-cm66vt6r-1d3f3-master-role
time="2019-01-31T19:46:51Z" level=info msg=Deleted arn="arn:aws:iam::460538899914:role/ci-op-cm66vt6r-1d3f3-worker-role" id=ci-op-cm66vt6r-1d3f3-worker-role name=ci-op-cm66vt6r-1d3f3-worker-role policy=ci-op-cm66vt6r-1d3f3_worker_policy
time="2019-01-31T19:46:51Z" level=info msg=Deleted arn="arn:aws:iam::460538899914:role/ci-op-cm66vt6r-1d3f3-worker-role" id=ci-op-cm66vt6r-1d3f3-worker-role name=ci-op-cm66vt6r-1d3f3-worker-role

which is deleting all three instance profiles.

[1]: https://github.com/openshift/installer/pull/1116
Ok. Here are the results.

1. I destroyed with the 0.11.0 binary (log: openshift_install.log.destroy_before_create).
It looks like it deleted the profile object:
(aws --region us-east-1 iam list-role-tags --role-name ocp4-bootstrap-role)
An error occurred (NoSuchEntity) when calling the ListRoleTags operation: The role with name ocp4-bootstrap-role cannot be found.

2. I tried to create a cluster with the same cluster name "ocp4" (log: openshift_install.log_create_after_destroy).
The error messages still show up:

ERROR * module.bootstrap.aws_iam_instance_profile.bootstrap: 1 error occurred:
ERROR * aws_iam_instance_profile.bootstrap: Error creating IAM instance profile ocp4-bootstrap-profile: EntityAlreadyExists: Instance Profile ocp4-bootstrap-profile already exists.
ERROR   status code: 409, request id: 0284d1cd-259e-11e9-9ec7-7778711760e6
ERROR
ERROR
ERROR * module.masters.aws_iam_instance_profile.master: 1 error occurred:
ERROR * aws_iam_instance_profile.master: Error creating IAM instance profile ocp4-master-profile: EntityAlreadyExists: Instance Profile ocp4-master-profile already exists.
ERROR   status code: 409, request id: 02837224-259e-11e9-9385-61fa1054d42b

Checked the tags:
(aws --region us-east-1 iam list-role-tags --role-name ocp4-master-role
False
TAGS openshiftClusterID 782fec19-527a-4c9d-a949-34bfe2b2ea6a
TAGS kubernetes.io/cluster/ocp4 owned)

cat metadata.json
~~~
{"clusterName":"ocp4","clusterID":"782fec19-527a-4c9d-a949-34bfe2b2ea6a","aws":{"region":"us-east-1","identifier":[{"openshiftClusterID":"782fec19-527a-4c9d-a949-34bfe2b2ea6a"},{"kubernetes.io/cluster/ocp4":"owned"}]}}
~~~

3. I tried to destroy the cluster (log: openshift_install.log_destroy_after_create).
It looks like it deleted the profile:
~~~
aws --region us-east-1 iam list-role-tags --role-name ocp4-master-role

An error occurred (NoSuchEntity) when calling the ListRoleTags operation: The role with name ocp4-master-role cannot be found.
~~~

I attach the 3 logs.
Created attachment 1525599 [details] log for destroy cluster with 0.11.0
Created attachment 1525600 [details] Create cluster log with 0.11.0 after destroy
Created attachment 1525601 [details] Destroy cluster log with 0.11.0 after create fail
(In reply to jooho lee from comment #27)
> 1. I destroyed with the 0.11.0 binary (log: openshift_install.log.destroy_before_create).
> It looks like it deleted the profile object:
> (aws --region us-east-1 iam list-role-tags --role-name ocp4-bootstrap-role)
> An error occurred (NoSuchEntity) when calling the ListRoleTags operation: The role with name ocp4-bootstrap-role cannot be found.

I don't know what deleted the bootstrap role, but the logs you attached only look like they delete the DHCP options (they also claim to delete NAT gateways, but AWS keeps those around for a while, so you can get multiple "successful" delete calls for a single NAT gateway). And a missing role doesn't necessarily mean successful instance-profile cleanup.

> 2. I tried to create a cluster with the same cluster name "ocp4" (log: openshift_install.log_create_after_destroy).
> The error messages still show up:
>
> ERROR * module.bootstrap.aws_iam_instance_profile.bootstrap: 1 error occurred:
> ERROR * aws_iam_instance_profile.bootstrap: Error creating IAM instance profile ocp4-bootstrap-profile: EntityAlreadyExists: Instance Profile ocp4-bootstrap-profile already exists.

Instead of running new creates, you should just check for your dangling instance profiles. Something like:

$ aws --region us-east-1 iam list-instance-profiles --output json | jq '[.InstanceProfiles[] | select(.InstanceProfileName | startswith("ocp4-"))

should do it. Then probably just remove them with:

$ aws --region us-east-1 iam delete-instance-profile --instance-profile-name ocp4-bootstrap-profile
$ aws --region us-east-1 iam delete-instance-profile --instance-profile-name ocp4-master-profile

and then keep track of your deletion logs going forward, so you can paste the last deletion before you hit this next time.
(In reply to W. Trevor King from comment #31)
> $ aws --region us-east-1 iam list-instance-profiles --output json | jq '[.InstanceProfiles[] | select(.InstanceProfileName | startswith("ocp4-"))

Oops, should have been:

$ aws --region us-east-1 iam list-instance-profiles --output json | jq '[.InstanceProfiles[] | select(.InstanceProfileName | startswith("ocp4-"))]'
There are still profiles left. Actually, more than I expected.

~~~
aws --region us-east-1 iam list-instance-profiles --output json | jq '[.InstanceProfiles[] | select(.InstanceProfileName | startswith("ocp4-"))]'
[
  {
    "Path": "/",
    "InstanceProfileName": "ocp4-bootstrap-profile",
    "InstanceProfileId": "AIPAJHAD5ZBOR4OBCAE2O",
    "Arn": "arn:aws:iam::694280550618:instance-profile/ocp4-bootstrap-profile",
    "CreateDate": "2019-01-24T01:38:23Z",
    "Roles": []
  },
  {
    "Path": "/",
    "InstanceProfileName": "ocp4-master-profile",
    "InstanceProfileId": "AIPAIAQ6ZSIDNVT3XBSEI",
    "Arn": "arn:aws:iam::694280550618:instance-profile/ocp4-master-profile",
    "CreateDate": "2019-01-24T01:38:23Z",
    "Roles": []
  },
  {
    "Path": "/",
    "InstanceProfileName": "ocp4-on-aws-master-profile",
    "InstanceProfileId": "AIPAIIF6ST2CL6L3U724A",
    "Arn": "arn:aws:iam::694280550618:instance-profile/ocp4-on-aws-master-profile",
    "CreateDate": "2019-01-18T10:11:57Z",
    "Roles": [
      {
        "Path": "/",
        "RoleName": "ocp4-on-aws-master-role",
        "RoleId": "AROAIEEWXSKPWJQLPVTGU",
        "Arn": "arn:aws:iam::694280550618:role/ocp4-on-aws-master-role",
        "CreateDate": "2019-01-18T10:11:55Z",
        "AssumeRolePolicyDocument": {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Sid": "",
              "Effect": "Allow",
              "Principal": {
                "Service": "ec2.amazonaws.com"
              },
              "Action": "sts:AssumeRole"
            }
          ]
        }
      }
    ]
  },
  {
    "Path": "/",
    "InstanceProfileName": "ocp4-on-aws-worker-profile",
    "InstanceProfileId": "AIPAJG24K437BY6IETETE",
    "Arn": "arn:aws:iam::694280550618:instance-profile/ocp4-on-aws-worker-profile",
    "CreateDate": "2019-01-18T10:11:57Z",
    "Roles": [
      {
        "Path": "/",
        "RoleName": "ocp4-on-aws-worker-role",
        "RoleId": "AROAI5W232Z7R5S4DIOMM",
        "Arn": "arn:aws:iam::694280550618:role/ocp4-on-aws-worker-role",
        "CreateDate": "2019-01-18T10:11:55Z",
        "AssumeRolePolicyDocument": {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Sid": "",
              "Effect": "Allow",
              "Principal": {
                "Service": "ec2.amazonaws.com"
              },
              "Action": "sts:AssumeRole"
            }
          ]
        }
      }
    ]
  }
]
~~~

I tried to delete them all but failed:

~~~
aws --region us-east-1 iam delete-instance-profile --instance-profile-name ocp4-bootstrap-profile
aws --region us-east-1 iam delete-instance-profile --instance-profile-name ocp4-master-profile

aws --region us-east-1 iam delete-instance-profile --instance-profile-name ocp4-on-aws-master-profile
An error occurred (DeleteConflict) when calling the DeleteInstanceProfile operation: Cannot delete entity, must remove roles from instance profile first.

aws --region us-east-1 iam delete-instance-profile --instance-profile-name ocp4-on-aws-worker-profile
An error occurred (DeleteConflict) when calling the DeleteInstanceProfile operation: Cannot delete entity, must remove roles from instance profile first.
~~~

As you can see, the last 2 profiles cannot be deleted. Should I delete them? If so, how can I delete them?
> aws --region us-east-1 iam delete-instance-profile --instance-profile-name ocp4-on-aws-master-profile
> An error occurred (DeleteConflict) when calling the DeleteInstanceProfile operation: Cannot delete entity, must remove roles from instance profile first.

$ aws --region us-east-1 iam remove-role-from-instance-profile --instance-profile-name ocp4-on-aws-master-profile --role-name ocp4-on-aws-master-role

and similar for your other instance-profile / role pairs.
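If there are several leftover profiles, a rough cleanup loop along these lines detaches any remaining roles and then deletes each profile. The prefix and region are assumptions; review the list-instance-profiles output first, since this deletes everything whose name matches the prefix:

~~~
# Sketch: detach roles from leftover instance profiles, then delete the profiles.
# PREFIX and REGION are assumptions -- verify the list before deleting anything.
REGION=us-east-1
PREFIX=ocp4-

for profile in $(aws --region "${REGION}" iam list-instance-profiles --output json \
    | jq -r --arg p "${PREFIX}" '.InstanceProfiles[] | select(.InstanceProfileName | startswith($p)) | .InstanceProfileName'); do
  # Remove any roles still associated with this instance profile.
  for role in $(aws --region "${REGION}" iam get-instance-profile --instance-profile-name "${profile}" --output json \
      | jq -r '.InstanceProfile.Roles[].RoleName'); do
    aws --region "${REGION}" iam remove-role-from-instance-profile --instance-profile-name "${profile}" --role-name "${role}"
  done
  aws --region "${REGION}" iam delete-instance-profile --instance-profile-name "${profile}"
done
~~~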
OK, I deleted them all. Now, should I try to create a cluster for test?
(In reply to jooho lee from comment #35) > Now, should I try to create a cluster for test? Yes, please :)
Finally, I installed OCP4 on AWS. I am going to create KCS articles regarding this bugzilla ticket and will let you know. I will also retest it and confirm. Thanks a lot, Trevor!