Bug 1932152

Summary: When “iopsPerGB” parameter is set to a wrong number, events for AWS EBS CSI Driver provisioning are not clear
Product: OpenShift Container Platform
Reporter: Qin Ping <piqin>
Component: Storage
Assignee: Jan Safranek <jsafrane>
Storage sub component: Kubernetes External Components
QA Contact: Qin Ping <piqin>
Status: CLOSED ERRATA
Severity: low    
Priority: unspecified
CC: aos-bugs, jsafrane
Version: 4.7   
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Doc Type: No Doc Update
Last Closed: 2021-07-27 22:48:06 UTC
Type: Bug

Description Qin Ping 2021-02-24 05:13:31 UTC
Description of problem:
When the “iopsPerGB” parameter is set to an unsupported value, the provisioning events from the AWS EBS CSI Driver do not make clear what is wrong or how to fix it.


Version-Release number of selected component (if applicable):
4.7.0-rc.3

How reproducible:
Always

Steps to Reproduce:
1. Create a StorageClass from the following YAML file:
$ cat sc-csi.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: io1-csi
parameters:
  type: io1
  iopsPerGB: "10"
provisioner: ebs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
2. Create a PVC that uses this StorageClass.


Actual results:
The PV is not provisioned. The PVC events show:
Warning  ProvisioningFailed    1s    ebs.csi.aws.com_ip-10-0-158-46_2aa28a47-d1e8-4570-9a4d-c4b35ea9f5cf  failed to provision volume with StorageClass "io1-csi": rpc error: code = Internal desc = Could not create volume "pvc-fd6be5e2-32db-487f-b540-c55f0d686e85": could not create volume in EC2: InvalidParameterValue: Volume iops of 50 is too low; minimum is 100.

In this case, “iopsPerGB” is 10, but the event reports a volume IOPS of 50 and a minimum of 100, without mentioning “iopsPerGB” at all. After updating the parameter to 200, the following event appeared:
 Warning  ProvisioningFailed  <invalid>               ebs.csi.aws.com_ip-10-0-158-46_2aa28a47-d1e8-4570-9a4d-c4b35ea9f5cf  failed to provision volume with StorageClass "io1-csi": rpc error: code = Internal desc = Could not create volume "pvc-353cca7c-645e-4f11-b852-e96a40b19472": could not create volume in EC2: InvalidParameterValue: Iops to volume size ratio of 200.000000 is too high; maximum is 50

With the parameter set to “50”, the PV is created successfully.
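
The two EC2 errors above are consistent with the requested volume IOPS being iopsPerGB multiplied by the volume size, with AWS enforcing a floor on that product and a ceiling on the ratio. The following is a minimal Python sketch of that arithmetic, not the driver's code. It uses only the limits quoted in the events (minimum 100 IOPS, maximum ratio 50). The helper name is hypothetical, and the 5 GiB size is an assumption: the report does not state the PVC size, and 5 GiB is chosen only because 10 * 5 matches the "iops of 50" in the first event.

```python
# Limits exactly as quoted in the EC2 error messages in this report (io1).
MIN_IOPS = 100          # "minimum is 100"
MAX_IOPS_PER_GIB = 50   # "maximum is 50"


def explain_io1_request(size_gib: int, iops_per_gb: int) -> str:
    """Hypothetical helper: say which quoted limit a request would hit."""
    if iops_per_gb > MAX_IOPS_PER_GIB:
        return f"ratio {iops_per_gb} is too high; maximum is {MAX_IOPS_PER_GIB}"
    total_iops = size_gib * iops_per_gb
    if total_iops < MIN_IOPS:
        return f"volume IOPS of {total_iops} is too low; minimum is {MIN_IOPS}"
    return f"ok: {total_iops} IOPS"


# Assumed PVC size of 5 GiB (not stated in the report).
for ratio in (10, 200, 50):
    print(f"iopsPerGB={ratio}: {explain_io1_request(5, ratio)}")
```

Under that assumed size, the three values tried in this report (10, 200, 50) fall into the "too low", "too high", and "ok" branches respectively, which is the mapping the original events failed to make obvious.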


Expected results:
The events should clearly indicate how the “iopsPerGB” parameter needs to be set.


Comment 1 Jan Safranek 2021-03-23 08:12:22 UTC
Posted PR upstream: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/pull/809

Comment 2 Jan Safranek 2021-04-14 13:59:16 UTC
A large part (all?) of this is fixed by PR https://github.com/openshift/aws-ebs-csi-driver/pull/184 - you should get much nicer messages now.

There is one "known bug": the in-tree volume plugin increases the volume IOPS when <iopsPerGB> * <volume size> is lower than the minimum value supported by AWS, and provisions the volume with the higher IOPS. The CSI driver does *not* increase IOPS; it returns an error instead. Increasing IOPS also increases costs, and even if the difference is negligible, they decided to protect the user. We need to fix the CSI translation library + external-provisioner so that the CSI driver can provision a migrated in-tree volume the same way as the in-tree volume plugin does, with higher IOPS. Feel free to open a BZ for it; we will get it fixed in the 4.9 rebase, or, if we really want, a backport to 4.8 is possible. See https://github.com/kubernetes/kubernetes/pull/101082.
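
To make the difference between the two behaviours concrete, here is a small Python sketch of the two policies as described in this comment. It is only an illustration: the function names are hypothetical, the real plugins are not written like this, and the minimum of 100 IOPS is taken from the EC2 error quoted in the description.

```python
MIN_IOPS = 100  # minimum quoted in the EC2 error in this report


def in_tree_iops(size_gib: int, iops_per_gb: int) -> int:
    """In-tree behaviour as described above: raise low values to the minimum."""
    return max(size_gib * iops_per_gb, MIN_IOPS)


def csi_iops(size_gib: int, iops_per_gb: int) -> int:
    """CSI behaviour as described above: refuse instead of raising IOPS."""
    iops = size_gib * iops_per_gb
    if iops < MIN_IOPS:
        raise ValueError(
            f"resulting IOPS {iops} is too low, it must be at least {MIN_IOPS}"
        )
    return iops


# Same request, two outcomes (4 GiB is an arbitrary example size).
print(in_tree_iops(4, 10))   # provisioned with more IOPS than requested
try:
    csi_iops(4, 10)
except ValueError as err:
    print(f"CSI: {err}")     # provisioning fails with an explicit reason
```

The trade-off is the one named above: raising IOPS makes provisioning succeed but can cost more than the user asked for, while returning an error keeps the cost predictable.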

Comment 3 Jan Safranek 2021-04-14 14:01:37 UTC
> you should get much nicer messages now.

Forgot to mention: the CSI driver will silently *lower* the IOPS to the highest supported value. Instead of "Iops to volume size ratio of 200.000000 is too high; maximum is 50", you will get no error and a volume with 50 IOPS.
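
A rough Python sketch of that capping follows. It assumes that "highest supported value" means the maximum ratio from the EC2 error (50) applied to the requested size; that reading, the function name, and the example size are assumptions, and any absolute per-volume limit is not modelled.

```python
MAX_IOPS_PER_GIB = 50  # maximum ratio quoted in the EC2 error


def capped_iops(size_gib: int, iops_per_gb: int) -> int:
    """Hypothetical helper: lower the request to the assumed ceiling."""
    requested = size_gib * iops_per_gb
    ceiling = size_gib * MAX_IOPS_PER_GIB
    # No error is raised; the lower of the two values is used silently.
    return min(requested, ceiling)


size_gib = 4  # arbitrary example size
iops = capped_iops(size_gib, 200)
print(f"provisioned IOPS: {iops}, effective ratio: {iops // size_gib}")
```

Because the cap is silent, a StorageClass that asks for a ratio above the limit still produces a volume, just with fewer IOPS than the parameter suggests.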

Comment 5 Qin Ping 2021-04-25 03:37:24 UTC
 Warning  ProvisioningFailed    <invalid> (x3 over <invalid>)  ebs.csi.aws.com_ip-10-0-207-45_dfcc0b06-63db-4c2a-a9d9-b8e65ad91e3e  failed to provision volume with StorageClass "io1-csi": rpc error: code = Internal desc = Could not create volume "pvc-708985ae-f207-4ae8-99a2-c1058581c684": invalid combination of volume size 1 GB and iopsPerGB 10: the resulting IOPS 10 is too low for AWS, it must be at least 100

  Warning  ProvisioningFailed    <invalid>                      ebs.csi.aws.com_ip-10-0-207-45_dfcc0b06-63db-4c2a-a9d9-b8e65ad91e3e  failed to provision volume with StorageClass "io1-csi": rpc error: code = Internal desc = Could not create volume "pvc-c8241d8f-be1c-4cb6-8e53-72d0741bc093": could not create volume in EC2: InvalidParameterValue: The volume size is invalid for io1 volumes: 1 GiB. io1 volumes must be at least 4 GiB in size. Please specify a volume size above the minimum limit.

Verified with: 4.8.0-0.nightly-2021-04-24-234710
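
Taken together, the two messages above suggest a simple way to pick a PVC size for a given “iopsPerGB”: the size must reach the io1 minimum of 4 GiB, and size times iopsPerGB must reach 100 IOPS. The Python sketch below works that out. The helper name is hypothetical, GB and GiB are treated as the same unit for simplicity, and the upper limits are deliberately left out.

```python
MIN_IOPS = 100     # "it must be at least 100" (driver message above)
MIN_SIZE_GIB = 4   # "io1 volumes must be at least 4 GiB" (EC2 message above)


def smallest_valid_size_gib(iops_per_gb: int) -> int:
    """Hypothetical helper: smallest size meeting both minimums above."""
    if iops_per_gb < 1:
        raise ValueError("iopsPerGB must be a positive integer")
    # Integer ceiling of MIN_IOPS / iops_per_gb, avoiding floats.
    size_for_iops = (MIN_IOPS + iops_per_gb - 1) // iops_per_gb
    return max(MIN_SIZE_GIB, size_for_iops)


for ratio in (10, 30, 50):
    print(f"iopsPerGB={ratio}: request at least {smallest_valid_size_gib(ratio)} GiB")
```

For the StorageClass in this report (iopsPerGB "10"), this gives 10 GiB, which is why the 1 GB request in the first event was rejected.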

Comment 8 errata-xmlrpc 2021-07-27 22:48:06 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438