Bug 2079844 - EFS cluster csi driver status stuck in AWSEFSDriverCredentialsRequestControllerProgressing with sts installation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Hemant Kumar
QA Contact: Rohit Patil
URL:
Whiteboard:
Depends On:
Blocks: 2095253
Reported: 2022-04-28 11:16 UTC by Meng Bo
Modified: 2022-08-10 11:09 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2095253
Environment:
Last Closed: 2022-08-10 11:09:21 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift aws-efs-csi-driver-operator pull 43 | 0 | None | Merged | Bug 2079844: Do not sync credentials in manual mode | 2022-05-27 14:26:46 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 11:09:32 UTC

Description Meng Bo 2022-04-28 11:16:57 UTC
Description of problem:
The efs.csi.aws.com clustercsidriver status is stuck in AWSEFSDriverCredentialsRequestControllerProgressing on the STS cluster.

Version-Release number of selected component (if applicable):
ClusterID: 7689ea60-fdff-4624-8fe0-74a1ced59c8b
ClusterVersion: Stable at "4.10.9"
ClusterOperators:
	clusteroperator/cloud-credential is not upgradeable because Upgradeable annotation cloudcredential.openshift.io/upgradeable-to on cloudcredential.operator.openshift.io/cluster object needs updating before upgrade. See Manually Creating IAM documentation for instructions on preparing a cluster for upgrade.

How reproducible:
2 of my 2 tries

Steps to Reproduce:
Generally by following the steps in https://docs.openshift.com/container-platform/4.10/storage/container_storage_interface/persistent-storage-csi-aws-efs.html

1. Install cluster with STS method
2. Install the efs-csi-operator from the webconsole
3. Generate the STS role and policy for efs access
$ cat credrequest/cr.yaml
apiVersion: cloudcredential.openshift.io/v1
kind: CredentialsRequest
metadata:
  name: openshift-aws-efs-csi-driver
  namespace: openshift-cloud-credential-operator
spec:
  providerSpec:
    apiVersion: cloudcredential.openshift.io/v1
    kind: AWSProviderSpec
    statementEntries:
    - action:
      - elasticfilesystem:*
      effect: Allow
      resource: '*'
  secretRef:
    name: aws-efs-cloud-credentials
    namespace: openshift-cluster-csi-drivers
  serviceAccountNames:
  - aws-efs-csi-driver-operator
  - aws-efs-csi-driver-controller-sa

$ ./ccoctl aws create-iam-roles --credentials-requests-dir ./credrequest/ --identity-provider-arn <arn for the oidc idp> --name efs --region us-east-1

4. Create the secret generated above
$ oc create -f manifests/openshift-cluster-csi-drivers-aws-efs-cloud-credentials-credentials.yaml

5. Create the credential request manually
$ oc create -f ./credrequest/cr.yaml

6. Create the efs csi driver
$ cat csi.yaml
apiVersion: operator.openshift.io/v1
kind: ClusterCSIDriver
metadata:
  name: efs.csi.aws.com
spec:
  managementState: Managed

$ oc create -f csi.yaml

7. Check the ClusterCSIDriver status


Actual results:
The status is stuck in AWSEFSDriverCredentialsRequestControllerProgressing:

  - lastTransitionTime: "2022-04-28T08:50:33Z"
    message: Credentials not yet provisioned by cloud-credential-operator
    reason: CredentialsNotProvisionedYet
    status: "False"
    type: AWSEFSDriverCredentialsRequestControllerAvailable
  - lastTransitionTime: "2022-04-28T08:50:33Z"
    message: Waiting for cloud-credential-operator to provision the credentials
    reason: CredentialsNotProvisionedYet
    status: "True"
    type: AWSEFSDriverCredentialsRequestControllerProgressing
  - lastTransitionTime: "2022-04-28T08:50:33Z"
    reason: AsExpected
    status: "False"
    type: AWSEFSDriverCredentialsRequestControllerDegraded
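
The three conditions above can be read mechanically. The following is a minimal sketch, not part of the operator: the function name credentials_controller_state and the returned labels are hypothetical, and the input is assumed to be the ClusterCSIDriver status conditions already loaded as a list of dicts.

```python
from typing import Iterable, Mapping

# Common prefix of the three condition types shown in the report.
PREFIX = "AWSEFSDriverCredentialsRequestController"


def credentials_controller_state(conditions: Iterable[Mapping[str, str]]) -> str:
    """Classify the credentials-request controller from its conditions.

    Hypothetical helper: returns "degraded", "progressing", "available",
    or "unknown" (when none of the expected conditions is "True").
    """
    # Map the condition suffix (Available/Progressing/Degraded) to its status.
    status = {
        c["type"][len(PREFIX):]: c.get("status", "Unknown")
        for c in conditions
        if c.get("type", "").startswith(PREFIX)
    }
    if status.get("Degraded") == "True":
        return "degraded"
    if status.get("Progressing") == "True":
        return "progressing"
    if status.get("Available") == "True":
        return "available"
    return "unknown"


# Conditions copied from the report (messages and timestamps omitted).
reported = [
    {"type": PREFIX + "Available", "status": "False",
     "reason": "CredentialsNotProvisionedYet"},
    {"type": PREFIX + "Progressing", "status": "True",
     "reason": "CredentialsNotProvisionedYet"},
    {"type": PREFIX + "Degraded", "status": "False", "reason": "AsExpected"},
]
print(credentials_controller_state(reported))
```

With the reported conditions the helper classifies the controller as progressing, which matches the symptom: the controller is not degraded, but it never becomes available.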


Expected results:
The EFS csi driver should be available.


Additional info:
A must-gather will be attached.

Comment 4 Hemant Kumar 2022-05-05 18:28:54 UTC
The EFS operator should not create or sync a CredentialsRequest at all when CloudCredential is in Manual mode (which it is on an STS cluster), so I am fixing this via https://github.com/openshift/library-go/pull/1363 

Once that merges, I will backport it to the EFS operator. Are you saying that two different CredentialsRequests are created in your case (one subtly different from the other)? Again, that looks like the same bug: the CredentialsRequest should not be synced in manual mode.
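
For illustration only, the intended behavior can be sketched as below. This is not the code from the linked library-go pull request (the operator itself is written in Go); the function names and the apply_request callback are hypothetical, and the only assumption taken from the cluster API is that the CloudCredential credentialsMode field holds the string "Manual" on STS clusters.

```python
from typing import Callable, Optional


def should_sync_credentials_request(credentials_mode: Optional[str]) -> bool:
    """Return False when the cluster-wide credentialsMode is "Manual".

    In Manual mode (which includes STS installs) the user provisions the
    secret, so the operator should not create or sync a CredentialsRequest.
    """
    return credentials_mode != "Manual"


def sync(credentials_mode: Optional[str], apply_request: Callable[[], None]) -> str:
    """Hypothetical sync step: apply the CredentialsRequest only when allowed."""
    if not should_sync_credentials_request(credentials_mode):
        # Nothing to wait for, so the controller should not report Progressing.
        return "skipped"
    apply_request()
    return "synced"


applied = []
print(sync("Manual", lambda: applied.append("cr")))  # Manual/STS cluster
print(sync("Mint", lambda: applied.append("cr")))    # CCO-managed credentials
```

In this sketch the Manual case returns early without touching the CredentialsRequest, which is the condition under which the Progressing status should no longer get stuck.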


> Why the policy with https://github.com/openshift/aws-efs-csi-driver/blob/master/docs/iam-policy-example.json could work, but create it with below does not
    - action:
      - elasticfilesystem:*
      effect: Allow
      resource: '*'


Not sure. Let me verify.

Comment 8 Hemant Kumar 2022-05-31 16:06:16 UTC
In STS mode the user is responsible for creating the role and granting the necessary permissions, so I do expect the results to vary from profile to profile. It looks like one profile did not have enough permissions while the other did.

So I am not sure why this BZ is marked "FailedQA". This bug was never about fixing the permissions associated with the driver or operator. The thing to verify is that AWSEFSDriverCredentialsRequestControllerProgressing does not get stuck.

Comment 12 errata-xmlrpc 2022-08-10 11:09:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

