Bug 2057495 - Alibaba Disk CSI driver does not provision small PVCs
Summary: Alibaba Disk CSI driver does not provision small PVCs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Jan Safranek
QA Contact: Rohit Patil
URL:
Whiteboard:
Depends On:
Blocks: 2076671 2098655
Reported: 2022-02-23 14:03 UTC by Jan Safranek
Modified: 2022-08-10 10:51 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Alibaba Cloud supports only volumes of 20 GiB or larger. The Alibaba CSI driver, shipped as part of OpenShift, returned an error with the message 'The specified parameter "Size" is not valid' when a user created a PersistentVolumeClaim (PVC) smaller than 20 GiB. We updated the Alibaba CSI driver to automatically increase all volume sizes to at least 20 GiB, so smaller PVCs are now dynamically provisioned. For example, a PVC requesting 1 byte results in a new dynamically provisioned 20 GiB volume. This could result in increased costs. Cluster admins should consider using a quota on PVC count for each namespace in restricted environments.
Clone Of:
: 2076671 (view as bug list)
Environment:
Last Closed: 2022-08-10 10:50:55 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift alibaba-cloud-csi-driver pull 12 | 0 | None | Merged | Bug 2057495: Add a parameter to the schema to automatically increase volume sizes … | 2022-04-20 12:48:22 UTC
Github | openshift alibaba-disk-csi-driver-operator pull 25 | 0 | None | Merged | Bug 2057495: Auto increase volume size to 20 GiB in the default storage class | 2022-05-04 08:02:38 UTC
Github | openshift library-go pull 1348 | 0 | None | Merged | Bug 2057495: Re-create StorageClass to change immutable parameters | 2022-04-20 12:48:23 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 10:51:12 UTC

Description Jan Safranek 2022-02-23 14:03:53 UTC
Description of problem:

Alibaba supports only volumes of 20 GiB or larger. Generic e2e tests (openshift-tests run openshift/conformance/parallel) create PVCs that are too small and fail, see:

https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_installer/5604/pull-ci-openshift-installer-master-e2e-alibaba/1496424012073406464


At least these tests need to be fixed somehow:

[sig-apps] StatefulSet Basic StatefulSet functionality [StatefulSetBasic] should not deadlock when a pod's predecessor fails 
[sig-apps] StatefulSet Basic StatefulSet functionality [StatefulSetBasic] should perform rolling updates and roll backs of template modifications with PVCs 
[sig-storage] PVC Protection Verify that PVC in active use by a pod is not removed immediately
[sig-apps] StatefulSet Basic StatefulSet functionality [StatefulSetBasic] should provide basic identity
[sig-storage] PVC Protection Verify that scheduling of a pod that uses PVC that is being deleted fails and the pod becomes Unschedulable 
[sig-storage] PVC Protection Verify "immediate" deletion of a PVC that is not in active use by a pod
[sig-apps] StatefulSet Basic StatefulSet functionality [StatefulSetBasic] should adopt matching orphans and release non-matching pods

Comment 2 Jan Safranek 2022-03-01 16:02:25 UTC
Assigning to Alibaba in case they want to implement an automatic increase of volume sizes to 20 GiB. As described above, we run Kubernetes e2e tests with the Alibaba Disk CSI driver in the default storage class, and the tests create really small PVCs (1 byte!). The tests expect the CSI driver to provision the smallest volume it supports for this 1-byte request, which is 20 GiB in the Alibaba Disk case.

The CSI driver gets:

> time="2022-02-23T10:59:08Z" level=info msg="CreateVolume: Starting CreateVolume, pvc-53021639-69e6-4751-b36e-891273a7b3b6,
> name:\"pvc-53021639-69e6-4751-b36e-891273a7b3b6\"
> capacity_range:<required_bytes:1 >
> volume_capabilities:<mount:<fs_type:\"ext4\" > access_mode:<mode:SINGLE_NODE_WRITER > >
> parameters: ...

(edited for readability)

limit_bytes is zero, i.e. unspecified. Strictly from the CSI protocol perspective, the CSI driver can provision as large a volume as it wants.

I understand that this may bring some additional costs to the customers, ordering 1 byte and paying for 20 GiB is quite a difference. Still, it's better than ordering 1 byte and getting nothing. What do you think? All the other CSI drivers we ship round the volume size to the smallest size they support, which is typically 1 GiB.
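
To illustrate the rounding behaviour proposed here, a minimal sketch in Python follows. It is not the driver's code (the driver is written in Go); the function name `provisioned_size` and the constant `MIN_DISK_BYTES` are hypothetical. It assumes only what is stated above: the minimum Alibaba Disk size is 20 GiB, and a `limit_bytes` of zero means the upper bound is unspecified.

```python
GIB = 1024 ** 3
MIN_DISK_BYTES = 20 * GIB  # Alibaba Disk minimum, as stated in this report


def provisioned_size(required_bytes: int, limit_bytes: int = 0,
                     minimum: int = MIN_DISK_BYTES) -> int:
    """Return the size in bytes to provision for a CSI capacity_range.

    limit_bytes == 0 is treated as "unspecified". A ValueError is raised
    when the smallest supported volume would exceed an explicit limit.
    """
    if required_bytes < 0 or limit_bytes < 0:
        raise ValueError("capacity values must be non-negative")
    # Round the request up to the smallest volume the backend supports.
    size = max(required_bytes, minimum)
    if limit_bytes and size > limit_bytes:
        raise ValueError(
            f"cannot provision {size} bytes within a limit of {limit_bytes} bytes"
        )
    return size


if __name__ == "__main__":
    # The 1-byte request from the log above becomes a 20 GiB volume.
    print(provisioned_size(1) // GIB)  # prints 20
```

The explicit limit check matters only when a caller sets `limit_bytes`; in the request logged above it is unset, so the request is simply raised to the minimum.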

Comment 3 Jan Safranek 2022-03-01 16:03:18 UTC
This blocks our CI: all StatefulSet tests fail. It's not blocking 4.10 in any way, but we should decide what to do about it soon.

Comment 4 Jan Safranek 2022-03-15 09:10:47 UTC
In addition, from a user's perspective it looks odd if every other CSI driver increases the volume size to the smallest size it can provision and this one does not. For most of them the minimum is 1 GiB, but IBM, for example, has a 10 GiB minimum and it increases the volume size too.

Comment 5 Jan Safranek 2022-03-29 09:17:04 UTC
https://github.com/kubernetes-sigs/alibaba-cloud-csi-driver/pull/628 got merged upstream, cherry-picking into OCP now.

Comment 9 Rohit Patil 2022-05-04 11:48:12 UTC
Payload: 4.11.0-0.nightly-2022-04-26-181148
Flexy template: ipi-on-alicloud/versioned-installer-ci 
Verifications: PASS
#1 With default sc, pvc with 20Gi file system, dep, write data => Pass
#2 With default sc, pvc with 1Gi file system, dep, write data => Pass 
#3 With default sc, pvc with 20Gi block, dep, write data => Pass
#4 With default sc, pvc with 1Gi block, dep, write data => Pass
#5 With new sc, pvc with 20Gi fs, dep, write data => Pass
#6 With new sc, pvc with 1Gi fs, dep, write data => Pass
#7 With new sc, pvc with 20Gi bl, dep, write data => Pass
#8 With new sc, pvc with 1Gi bl, dep, write data => Pass 
#9 With new sc volumeSizeAutoAvailable: "false", pvc with 1Gi fs, dep, write data => Pass
#10 47918 ali_csi.go => for fstypes(ext4,ext3,xfs) Golang Automation
   File: https://github.com/openshift/openshift-tests-private/blob/master/test/extended/storage/ali_csi.go#L30

upgrade Payload: 4.10.0-0.nightly-2022-05-03-165256 => 4.11.0-0.nightly-2022-04-26-181148
#11 to check the sc parameters volumeSizeAutoAvailable: "true" => 
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-upgrade/job/upgrade-runner/9975/console

Earlier, PR 25 was shown with status Open, so I did not test immediately. After checking the PR, I found that it had been merged, so I synced and then tested.
https://github.com/openshift/alibaba-disk-csi-driver-operator/pull/25 => merged
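
For readers who want to reproduce checks like #5 to #9, the sketch below builds a StorageClass and a small PVC as plain Python dictionaries and prints them as JSON, which `oc apply -f -` or `kubectl apply -f -` should accept. Only the `volumeSizeAutoAvailable` parameter comes from this report. The object names are hypothetical, the provisioner name `diskplugin.csi.alibabacloud.com` is my understanding of the Alibaba disk driver's name and should be checked against the cluster, and any other StorageClass parameters a real cluster may need (such as a disk type) are omitted.

```python
import json


def storage_class(name: str, auto_size: bool) -> dict:
    """StorageClass manifest toggling volumeSizeAutoAvailable."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # Assumed provisioner name; verify with `oc get csidriver`.
        "provisioner": "diskplugin.csi.alibabacloud.com",
        # StorageClass parameter values must be strings.
        "parameters": {
            "volumeSizeAutoAvailable": "true" if auto_size else "false",
        },
    }


def pvc(name: str, sc_name: str, size: str = "1Gi",
        volume_mode: str = "Filesystem") -> dict:
    """PVC manifest requesting `size` from the given StorageClass."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": sc_name,
            "accessModes": ["ReadWriteOnce"],
            "volumeMode": volume_mode,  # "Filesystem" or "Block"
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical names, used for illustration only.
    manifests = {
        "apiVersion": "v1",
        "kind": "List",
        "items": [
            storage_class("alicloud-disk-auto", auto_size=True),
            pvc("small-pvc", "alicloud-disk-auto", size="1Gi"),
        ],
    }
    print(json.dumps(manifests, indent=2))
```

With `volumeSizeAutoAvailable: "true"`, the 1Gi claim is expected to bind to a 20 GiB volume, as described in the Doc Text; passing `volume_mode="Block"` corresponds to the block-mode checks.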

Comment 11 errata-xmlrpc 2022-08-10 10:50:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

