+++ This bug was initially created as a clone of Bug #1945907 +++

This is a clone of https://issues.redhat.com/browse/CORS-1653.

1. Proposed title of this feature request

Permit using existing IAM roles for bootstrap, worker and control plane nodes in the installer. The implementation should support AWS, Azure, and GCP.

2. What is the nature and description of the request?

Enhance the installer to allow the customer to pre-create the IAM roles used by the bootstrap, worker and control plane nodes and supply those roles to the installer in IPI mode.

3. Why does the customer need this? (List the business requirements here)

It is currently impossible to perform an IPI-mode installation of OCP in the public cloud when additional restrictions apply to IAM roles. For instance, some customers require that all roles match a specific naming scheme and/or include a predefined permissions boundary in the role creation process.

4. List any affected packages or components.

Installer
Verified: FAILED.

OCP version: 4.8.0-0.nightly-2021-04-05-174735

The installer ignored the user-provided IAM role in `platform.aws.iamRole` and created its own IAM roles (just like a normal IPI install) to finish the installation process. Per [1], it should use the existing IAM role specified in `platform.aws.iamRole` in install-config.yaml.

[1] https://github.com/smrowley/installer/blob/7c54988f0be7cb44822a14cf2d4708adcf72abcb/data/data/install.openshift.io_installconfigs.yaml#L919-L923

Steps to Reproduce:
1. Create install-config.yaml:

apiVersion: v1
baseDomain: qe.devcluster.openshift.com
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform:
    aws: {}
  replicas: 3
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform:
    aws: {}
  replicas: 3
metadata:
  creationTimestamp: null
  name: yunjiang-eplat
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: us-east-2
    iamRole: existing_iam_master
publish: External
pullSecret: <HIDDEN>
sshKey: <HIDDEN>

2. Create the cluster.

Actual results:
1. After checking the AWS web console and the install log, policies and roles were created for bootstrap, master and worker:

time="2021-04-06T04:43:07-04:00" level=debug msg="module.masters.aws_iam_role_policy.master_policy[0]: Creation complete after 1s [id=yunjiang-eplat-cgdm2-master-role:yunjiang-eplat-cgdm2-master-policy]"
time="2021-04-06T04:43:07-04:00" level=debug msg="module.masters.aws_iam_role.master_role[0]: Creation complete after 0s [id=yunjiang-eplat-cgdm2-master-role]"
time="2021-04-06T04:43:07-04:00" level=debug msg="module.masters.aws_iam_role_policy.master_policy[0]: Creation complete after 1s [id=yunjiang-eplat-cgdm2-master-role:yunjiang-eplat-cgdm2-master-policy]"
time="2021-04-06T04:43:07-04:00" level=debug msg="module.iam.aws_iam_role_policy.worker_policy[0]: Creation complete after 1s [id=yunjiang-eplat-cgdm2-worker-role:yunjiang-eplat-cgdm2-worker-policy]"
time="2021-04-06T04:43:07-04:00" level=debug msg="module.iam.aws_iam_role.worker_role[0]: Creation complete after 0s [id=yunjiang-eplat-cgdm2-worker-role]"
time="2021-04-06T04:43:07-04:00" level=debug msg="module.iam.aws_iam_role_policy.worker_policy[0]: Creation complete after 1s [id=yunjiang-eplat-cgdm2-worker-role:yunjiang-eplat-cgdm2-worker-policy]"
time="2021-04-06T04:43:07-04:00" level=debug msg="module.bootstrap.aws_iam_role_policy.bootstrap[0]: Creation complete after 1s [id=yunjiang-eplat-cgdm2-bootstrap-role:yunjiang-eplat-cgdm2-bootstrap-policy]"

2. The instance profiles attached to the cluster instances contain the newly created roles instead of the existing role existing_iam_master.

Expected results:
* No new roles are created.
* The instance profiles attached to the cluster instances contain the existing role existing_iam_master.

Additional info:
The following config works as expected:

apiVersion: v1
baseDomain: qe.devcluster.openshift.com
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform:
    aws:
      iamRole: existing_iam_worker
  replicas: 3
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform:
    aws:
      iamRole: existing_iam_master
  replicas: 3
metadata:
  creationTimestamp: null
  name: yunjiang-e1
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: us-east-2
publish: External
pullSecret: <HIDDEN>
sshKey: <HIDDEN>
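For reference, a rough way to check which IAM role backs the instance profiles attached to the cluster instances from the command line; this is only a sketch, and the instance profile name below is assumed from the usual <infra-id>-master-profile naming, not taken from this cluster:

# List the instance profile ARNs used by instances tagged with the cluster's
# infra ID (infra ID yunjiang-eplat-cgdm2 taken from the install log above).
aws ec2 describe-instances \
  --region us-east-2 \
  --filters "Name=tag-key,Values=kubernetes.io/cluster/yunjiang-eplat-cgdm2" \
  --query 'Reservations[].Instances[].IamInstanceProfile.Arn' \
  --output text

# Show which IAM role a given instance profile contains; for this bug the
# expectation is existing_iam_master rather than an installer-created role.
aws iam get-instance-profile \
  --instance-profile-name yunjiang-eplat-cgdm2-master-profile \
  --query 'InstanceProfile.Roles[].RoleName'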
@yunjiang You are setting the incorrect field. You need to set the `iamRole` field on the machine pool, so `.controlPlane.platform.aws.iamRole` and `.compute[0].platform.aws.iamRole`.

$ openshift-install explain installconfig.controlPlane.platform.aws.iamRole
KIND:     InstallConfig
VERSION:  v1

RESOURCE: <string>
  IAMRole is the name of the IAM Role to use for the instance profile of the machine. Leave unset to have the installer create the IAM Role on your behalf.

$ openshift-install explain installconfig.compute.platform.aws.iamRole
KIND:     InstallConfig
VERSION:  v1

RESOURCE: <string>
  IAMRole is the name of the IAM Role to use for the instance profile of the machine. Leave unset to have the installer create the IAM Role on your behalf.
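To make the field placement concrete, a minimal sketch of the relevant portion of install-config.yaml with the roles set on the machine pools; the role names are reused from this report and all other fields are elided:

compute:
- name: worker
  platform:
    aws:
      iamRole: existing_iam_worker
  replicas: 3
controlPlane:
  name: master
  platform:
    aws:
      iamRole: existing_iam_master
  replicas: 3
platform:
  aws:
    region: us-east-2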
If you want to use the same role for both the control plane and compute nodes, you can set the `.platform.aws.defaultMachinePlatform.iamRole` field.

$ openshift-install explain installconfig.platform.aws.defaultMachinePlatform.iamRole
KIND:     InstallConfig
VERSION:  v1

RESOURCE: <string>
  IAMRole is the name of the IAM Role to use for the instance profile of the machine. Leave unset to have the installer create the IAM Role on your behalf.
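A minimal sketch of that variant, again with the role name reused from this report and all other install-config fields elided:

compute:
- name: worker
  platform: {}
  replicas: 3
controlPlane:
  name: master
  platform: {}
  replicas: 3
platform:
  aws:
    region: us-east-2
    defaultMachinePlatform:
      iamRole: existing_iam_master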
Thanks Matthew. Yes, `.controlPlane.platform.aws.iamRole` and `.compute[0].platform.aws.iamRole` work as expected; I will try installconfig.platform.aws.defaultMachinePlatform.iamRole as well.

I also have a question about letting the user provide one role for both master and worker. In that case the worker has many permissions it doesn't require, since the master role contains all the permissions the worker needs. Should we add a warning message to let the user know that the worker instances are using the same role as the master?
I, personally, don't think that we need warnings. The user is in charge of giving the permissions to the roles. The installer is not checking the permissions given to the roles. So if the user gives more permissions to a role than are required, we will not warn the user. Along those same lines, I don't see why we would warn the user if they used the same role for the masters and workers, whether by giving the same name to the individual machine pools or by using the default machine platform.

I would not recommend that users set the IAM role in the default machine platform, and I do not think that it is something we should go out of our way to document. But we need to support it, since any field in the default machine platform must be honored. We don't have a way to exclude fields that are in the machine pool platforms from also being in the default machine platform.
(In reply to Matthew Staebler from comment #6)
> The installer is not checking the permissions given to the roles.

The installer does check the permissions of the IAM user used by `openshift-install create cluster`. If some required permissions are missing from the user-provided roles, the installation fails. I think the user experience would be better if we checked for missing required permissions before installing the cluster, like we do for the IAM user permissions.

Another issue: after the cluster is destroyed, the shared tag on the IAM role is not removed. It looks like we hit the same issue as in bug 1926547; will it be fixed by bug 1926547?
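For illustration only, one way such a pre-flight check could be approximated today is the IAM policy simulator; this is a sketch, and the account ID, role name, and action list below are placeholders rather than the installer's actual required-permission set:

# Simulate a set of actions against the pre-created role and print only
# the actions that are not allowed.
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::123456789012:role/existing_iam_worker \
  --action-names ec2:DescribeInstances ec2:CreateTags \
  --query 'EvaluationResults[?EvalDecision!=`allowed`].[EvalActionName,EvalDecision]' \
  --output table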
(In reply to Yunfei Jiang from comment #7)
> The installer does check the permissions of the IAM user used by
> `openshift-install create cluster`. If some required permissions are
> missing from the user-provided roles, the installation fails. I think the
> user experience would be better if we checked for missing required
> permissions before installing the cluster, like we do for the IAM user
> permissions.

Presumably, if the user is supplying their own IAM roles for instances, they are also going to be using manual credentials mode. In that case, the installer will not perform any permissions checking, even of the IAM entity used by the installer.

> Another issue: after the cluster is destroyed, the shared tag on the IAM
> role is not removed. It looks like we hit the same issue as in bug 1926547;
> will it be fixed by bug 1926547?

No, it will not be addressed by that BZ. Please open a new BZ, or fail this BZ. The BZ that you linked will address the fact that the destroyer erroneously completes successfully despite not being able to remove the shared tag. But the underlying issue of not being able to remove the shared tag is not addressed.

Out of curiosity, does the IAM user that you are using have the `iam:UntagRole` permission? That permission is needed to remove the shared tag from the IAM role.
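As a rough illustration of that last point, the leftover shared tag can be inspected and removed by hand, provided the caller has the iam:ListRoleTags and iam:UntagRole permissions. The role name and infra ID below are placeholders based on the naming used earlier in this bug:

# Show the tags still attached to the pre-existing role after destroy.
aws iam list-role-tags --role-name existing_iam_master

# Remove the cluster's shared ownership tag (assumed key form:
# kubernetes.io/cluster/<infra-id>).
aws iam untag-role \
  --role-name existing_iam_master \
  --tag-keys kubernetes.io/cluster/yunjiang-eplat-cgdm2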
Thanks Matthew. Tested against the following four configurations; all clusters installed successfully with the correct IAM roles. For the issue mentioned in comment 8, I created bug 1948359 to track it. Marking this bug as VERIFIED.

OCP version: 4.8.0-0.nightly-2021-04-08-043959

> config 1 - master and worker:

compute:
- architecture: amd64
  name: worker
  platform:
    aws:
      iamRole: existing_iam_worker2
controlPlane:
  name: master
  platform:
    aws:
      iamRole: existing_iam_master2
platform:
  aws:
    region: us-east-2

> config 2 - master only:

compute:
- architecture: amd64
  name: worker
  platform: {}
controlPlane:
  name: master
  platform:
    aws:
      iamRole: existing_iam_master2
platform:
  aws:
    region: us-east-2

> config 3 - worker only:

compute:
- architecture: amd64
  name: worker
  platform:
    aws:
      iamRole: existing_iam_worker2
controlPlane:
  name: master
  platform: {}
platform:
  aws:
    region: us-east-2

> config 4 - master and worker (with defaultMachinePlatform):

compute:
- architecture: amd64
  name: worker
  platform: {}
controlPlane:
  name: master
  platform: {}
platform:
  aws:
    region: us-east-2
    defaultMachinePlatform:
      iamRole: existing_iam_master2
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438