Description of problem:
The AWS credentials request of the cluster storage operator does not include KMS statements. As a result, PVs fail to provision because the KMS encryption key is inaccessible.

Version-Release number of selected component (if applicable):
Tested on 4.9.12.

How reproducible:
Always

Steps to Reproduce:
1. Install to a restricted AWS environment without KMS privileges by default
2. Create IAM roles with ccoctl from the credentials requests
3. Create a PVC in the gp2-csi storage class (same problem, different error in the gp2 SC)

cat <<EOF | oc create -n $PROJ -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-csi
spec:
  storageClassName: "gp2-csi"
  resources:
    requests:
      storage: 512Mi
  accessModes:
    - ReadWriteOnce
EOF

oc set volume -n $PROJ deployment/demo --add -m /opt/app-root/src/data \
  --name=data -t persistentVolumeClaim --claim-name=data-csi

oc describe pvc data-csi -n $PROJ

Actual results:
PVC in the gp2-csi class results in:

Events:
  Type     Reason                Age                From                                                                      Message
  ----     ------                ----               ----                                                                      -------
  Normal   WaitForFirstConsumer  27s (x3 over 57s)  persistentvolume-controller                                               waiting for first consumer to be created before binding
  Warning  ProvisioningFailed    15s                ebs.csi.aws.com_ip-100-127-136-183_16213ddf-eedf-480a-84ed-116b1df1caa6  failed to provision volume with StorageClass "gp2-csi": rpc error: code = Internal desc = Could not create volume "pvc-eeb59df1-f744-46e9-b0c4-288f3c8d1bc1": failed to get an available volume in EC2: InvalidVolume.NotFound: The volume 'vol-0c94abf5d9bfe6680' does not exist.
           status code: 400, request id: 3c39958c-353a-48c2-bc75-37bd04673718
  Normal   ExternalProvisioning  12s (x2 over 19s)  persistentvolume-controller                                               waiting for a volume to be created, either by external provisioner "ebs.csi.aws.com" or manually created by system administrator
  Normal   Provisioning          8s (x4 over 19s)   ebs.csi.aws.com_ip-100-127-136-183_16213ddf-eedf-480a-84ed-116b1df1caa6  External provisioner is provisioning volume for claim "dale/data-1150"
  Warning  ProvisioningFailed    7s (x3 over 14s)   ebs.csi.aws.com_ip-100-127-136-183_16213ddf-eedf-480a-84ed-116b1df1caa6  failed to provision volume with StorageClass "gp2-csi": rpc error: code = AlreadyExists desc = Could not create volume "pvc-eeb59df1-f744-46e9-b0c4-288f3c8d1bc1": Parameters on this idempotent request are inconsistent with parameters used in previous request(s)

PVC in the gp2 class (using the in-tree driver) results in an error that gives a clue to the cause of the failure above:

Events:
  Type     Reason                Age                From                         Message
  ----     ------                ----               ----                         -------
  Normal   WaitForFirstConsumer  13s (x3 over 41s)  persistentvolume-controller  waiting for first consumer to be created before binding
  Warning  ProvisioningFailed    6s                 persistentvolume-controller  Failed to provision volume with StorageClass "gp2": failed to create encrypted volume: the volume disappeared after creation, most likely due to inaccessible KMS encryption key
  Normal   WaitForPodScheduled   6s                 persistentvolume-controller  waiting for pod demo-5c75b5598f-gvpvv to be scheduled

Expected results:
PV created and PVC bound.

After adding the following policy statement to the IAM role used by the CSI operator pod:

{
    "Sid": "AddKMS0",
    "Effect": "Allow",
    "Action": [
        "kms:Decrypt",
        "kms:Encrypt",
        "kms:GenerateDataKey",
        "kms:GenerateDataKeyWithoutPlainText",
        "kms:DescribeKey"
    ],
    "Resource": "*"
}

PVC binds properly:

Events:
  Type    Reason                 Age                From                                                                      Message
  ----    ------                 ----               ----                                                                      -------
  Normal  WaitForFirstConsumer   31s (x4 over 65s)  persistentvolume-controller                                               waiting for first consumer to be created before binding
  Normal  Provisioning           17s                ebs.csi.aws.com_ip-100-127-136-183_16213ddf-eedf-480a-84ed-116b1df1caa6  External provisioner is provisioning volume for claim "dale/data-1147"
  Normal  ExternalProvisioning   16s (x3 over 17s)  persistentvolume-controller                                               waiting for a volume to be created, either by external provisioner "ebs.csi.aws.com" or manually created by system administrator
  Normal  ProvisioningSucceeded  14s                ebs.csi.aws.com_ip-100-127-136-183_16213ddf-eedf-480a-84ed-116b1df1caa6  Successfully provisioned volume pvc-81ea5b8e-bbe9-4d10-9fc4-452a58e66d79

Additional info:
The credentials request lacks any KMS actions.
https://github.com/openshift/cluster-storage-operator/blob/master/manifests/03_credentials_request_aws.yaml#L20

Contrast this with the machine API credentials request, which includes KMS actions for boot disk encryption.
https://github.com/openshift/machine-api-operator/blob/master/install/0000_30_machine-api-operator_00_credentials-request.yaml#L45-L58
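For anyone applying the same workaround, the statement above can be wrapped in a complete IAM policy document. The following is only a sketch: the file path and the "Version" wrapper are assumptions added for illustration, and the statement itself is copied from this report.

```shell
#!/bin/sh
# Sketch: wrap the KMS statement from this report in a standalone IAM policy document.
# POLICY_FILE is a hypothetical path chosen for illustration.
POLICY_FILE="${POLICY_FILE:-./ebs-csi-kms-policy.json}"

cat > "$POLICY_FILE" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AddKMS0",
      "Effect": "Allow",
      "Action": [
        "kms:Decrypt",
        "kms:Encrypt",
        "kms:GenerateDataKey",
        "kms:GenerateDataKeyWithoutPlainText",
        "kms:DescribeKey"
      ],
      "Resource": "*"
    }
  ]
}
EOF

echo "wrote $POLICY_FILE"

# Attaching it is left commented out; ROLE_NAME and the policy name are placeholders,
# not values taken from this report:
# aws iam put-role-policy --role-name "$ROLE_NAME" \
#   --policy-name ebs-csi-kms --policy-document "file://$POLICY_FILE"
```

"Resource": "*" mirrors what was tested here; a narrower resource (a specific key ARN) would presumably be preferable where the key is known.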
Reproduced in 4.10.2 without fix:

$ oc get pvc
NAME    STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mypvc   Pending                                      gp2-csi-enc    56m

Events:
  Type     Reason                Age                        From                                                                    Message
  ----     ------                ----                       ----                                                                    -------
  Normal   WaitForPodScheduled   52m                        persistentvolume-controller                                             waiting for pod mypod to be scheduled
  Warning  ProvisioningFailed    52m                        ebs.csi.aws.com_ip-10-0-189-237_f9750b96-cdf4-4298-b399-f67e49be6119  failed to provision volume with StorageClass "gp2-csi-enc": rpc error: code = Internal desc = Could not create volume "pvc-047ae64e-8d1c-41d1-8140-7fc91ce1541c": failed to get an available volume in EC2: InvalidVolume.NotFound: The volume 'vol-05326553803a304c6' does not exist.
           status code: 400, request id: a642efc0-7074-4acb-b0f4-869ec913cd00
  Warning  ProvisioningFailed    26m (x14 over 52m)         ebs.csi.aws.com_ip-10-0-189-237_f9750b96-cdf4-4298-b399-f67e49be6119  failed to provision volume with StorageClass "gp2-csi-enc": rpc error: code = AlreadyExists desc = Could not create volume "pvc-047ae64e-8d1c-41d1-8140-7fc91ce1541c": Parameters on this idempotent request are inconsistent with parameters used in previous request(s)
  Normal   ExternalProvisioning  <invalid> (x228 over 52m)  persistentvolume-controller                                             waiting for a volume to be created, either by external provisioner "ebs.csi.aws.com" or manually created by system administrator
  Normal   Provisioning          <invalid> (x23 over 52m)   ebs.csi.aws.com_ip-10-0-189-237_f9750b96-cdf4-4298-b399-f67e49be6119  External provisioner is provisioning volume for claim "wduan/mypvc"

[wduan@preserve-wduan-ws ~]$ oc get sc
NAME            PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
gp2 (default)   kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   true                   3h59m
gp2-csi         ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   3h58m
gp2-csi-enc     ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   58m
gp3-csi         ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   3h58m
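The definition of the gp2-csi-enc storage class is not included in this comment. A minimal sketch of an encrypted EBS CSI storage class that is consistent with the `oc get sc` output above might look like the following. The `parameters` section is an assumption (the actual class may also have set a customer-managed key through `kmsKeyId`), and the manifest path is hypothetical.

```shell
#!/bin/sh
# Sketch: write a StorageClass manifest for an encrypted gp2 volume class.
# SC_FILE is a hypothetical path; the parameters are assumed, not taken from this bug.
SC_FILE="${SC_FILE:-./gp2-csi-enc-sc.yaml}"

cat > "$SC_FILE" <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-csi-enc
provisioner: ebs.csi.aws.com
parameters:
  type: gp2
  encrypted: "true"
  # kmsKeyId: <key ARN>   (optional; the value used in the reproduction is unknown)
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
EOF

echo "wrote $SC_FILE"

# Creating the class is left commented out so the sketch has no cluster side effects:
# oc create -f "$SC_FILE"
```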
Verified pass with

$ oc get pvc
NAME    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mypvc   Bound    pvc-097f3046-495d-4b01-90a1-f21bd001bccf   2Gi        RWO            gp2-csi-enc    8m8s

$ oc get pod
NAME    READY   STATUS    RESTARTS   AGE
mypod   1/1     Running   0          7m39s

Marked as Verified.
*** Bug 2066813 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069