Bug 2057495
Summary: | Alibaba Disk CSI driver does not provision small PVCs | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jan Safranek <jsafrane> | |
Component: | Storage | Assignee: | Jan Safranek <jsafrane> | |
Storage sub component: | Kubernetes | QA Contact: | Rohit Patil <ropatil> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | high | |||
Priority: | unspecified | CC: | aos-bugs | |
Version: | 4.10 | |||
Target Milestone: | --- | |||
Target Release: | 4.11.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: |
Alibaba Cloud supports only volumes of 20 GiB or larger. The Alibaba CSI driver, shipped as part of OpenShift, returned an error when a user created a PersistentVolumeClaim (PVC) smaller than 20 GiB, with the message 'The specified parameter "Size" is not valid'.
We updated the Alibaba CSI driver to automatically increase all volume sizes to at least 20 GiB, so smaller PVCs are now dynamically provisioned. For example, a PVC requesting 1 byte results in a new dynamically provisioned 20 GiB volume.
This could result in increased costs. Cluster admins should consider using a quota on PVC count for each namespace in restricted environments.
|
Story Points: | --- | |
Clone Of: | ||||
: | 2076671 (view as bug list) | Environment: | ||
Last Closed: | 2022-08-10 10:50:55 UTC | Type: | Bug | |
Bug Blocks: | 2076671, 2098655 |
Description
Jan Safranek
2022-02-23 14:03:53 UTC
StatefulSets use 1 byte volumes: https://github.com/kubernetes/kubernetes/blob/296bf4f01668374ade252a751d4c3567917b9890/test/e2e/framework/statefulset/fixtures.go#L107

PVC protection uses 1 GiB: https://github.com/kubernetes/kubernetes/blob/296bf4f01668374ade252a751d4c3567917b9890/test/e2e/storage/pvc_protection.go#L82 (here it could be possible to use HostPath PVs, as PV protection does)

Assigning to Alibaba if they want to implement automatic increase of volumes to 20 GiB.

See above, we run Kubernetes e2e tests with the Alibaba Disk CSI driver in the default storage class and the tests create really small PVCs (1 byte!). The tests expect the CSI driver to provision the smallest volume it supports for this 1 byte request, which is 20 GiB in the Alibaba Disk case.
The CSI driver gets:
> time="2022-02-23T10:59:08Z" level=info msg="CreateVolume: Starting CreateVolume, pvc-53021639-69e6-4751-b36e-891273a7b3b6,
> name:\"pvc-53021639-69e6-4751-b36e-891273a7b3b6\"
> capacity_range:<required_bytes:1 >
> volume_capabilities:<mount:<fs_type:\"ext4\" > access_mode:<mode:SINGLE_NODE_WRITER > >
> parameters: ...
(edited for readability)
limit_bytes is zero, i.e. unspecified. Strictly from the CSI protocol perspective, the CSI driver can provision as large a volume as it wants.
I understand that this may bring some additional costs to the customers, ordering 1 byte and paying for 20 GiB is quite a difference. Still, it's better than ordering 1 byte and getting nothing. What do you think? All the other CSI drivers we ship round the volume size to the smallest size they support, which is typically 1 GiB.
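The rounding behavior discussed above can be sketched in Go roughly as follows. This is only an illustration, not the code from the Alibaba CSI driver: the function name provisionSize and the constant minDiskBytes are hypothetical, and the sketch assumes that a limit_bytes of zero means "unspecified", as described above.

```go
package main

import (
	"errors"
	"fmt"
)

// GiB is the number of bytes in one gibibyte.
const GiB int64 = 1 << 30

// minDiskBytes is the 20 GiB minimum volume size mentioned in this bug.
// The name is hypothetical and used only for this sketch.
const minDiskBytes = 20 * GiB

// provisionSize returns the size to provision for a CSI capacity range.
// requiredBytes is the requested minimum; limitBytes is the allowed
// maximum, where 0 is treated as "unspecified".
func provisionSize(requiredBytes, limitBytes int64) (int64, error) {
	if requiredBytes < 0 || limitBytes < 0 {
		return 0, errors.New("capacity range values must not be negative")
	}
	if limitBytes != 0 && limitBytes < requiredBytes {
		return 0, errors.New("limit_bytes is smaller than required_bytes")
	}

	// Round small requests up to the smallest size that can be provisioned.
	size := requiredBytes
	if size < minDiskBytes {
		size = minDiskBytes
	}

	// Refuse if rounding up would exceed an explicit limit.
	if limitBytes != 0 && size > limitBytes {
		return 0, fmt.Errorf("minimum size %d exceeds limit_bytes %d", size, limitBytes)
	}
	return size, nil
}

func main() {
	size, err := provisionSize(1, 0)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// prints "requested 1 byte, provisioning 20 GiB"
	fmt.Printf("requested 1 byte, provisioning %d GiB\n", size/GiB)
}
```

With this approach a 1 byte request and no limit yields a 20 GiB volume, while a request that sets an explicit limit below 20 GiB is rejected instead of being silently exceeded.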
This blocks our CI; all StatefulSet tests fail. It's not blocking 4.10 in any way, but we should decide what to do about it soon. In addition, from the user's perspective it looks weird if all other CSI drivers increase the volume size to the smallest size they can provision. For most of them it's 1 GiB, but for example IBM has a 10 GiB minimum and it does increase the volume size too.

https://github.com/kubernetes-sigs/alibaba-cloud-csi-driver/pull/628 got merged upstream, cherry-picking into OCP now.

Payload: 4.11.0-0.nightly-2022-04-26-181148
Flexy template: ipi-on-alicloud/versioned-installer-ci
Verifications: PASS
#1 With default sc, pvc with 20Gi file system, dep, write data => Pass
#2 With default sc, pvc with 1Gi file system, dep, write data => Pass
#3 With default sc, pvc with 20Gi block, dep, write data => Pass
#4 With default sc, pvc with 1Gi block, dep, write data => Pass
#5 With new sc, pvc with 20Gi fs, dep, write data => Pass
#6 With new sc, pvc with 1Gi fs, dep, write data => Pass
#7 With new sc, pvc with 20Gi bl, dep, write data => Pass
#8 With new sc, pvc with 1Gi bl, dep, write data => Pass
#9 With new sc volumeSizeAutoAvailable: "false", pvc with 1Gi fs, dep, write data => Pass
#10 47918 ali_csi.go => for fstypes(ext4,ext3,xfs)
Golang Automation File: https://github.com/openshift/openshift-tests-private/blob/master/test/extended/storage/ali_csi.go#L30
Upgrade payload: 4.10.0-0.nightly-2022-05-03-165256 => 4.11.0-0.nightly-2022-04-26-181148
#11 to check the sc parameters volumeSizeAutoAvailable: "true" => https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-upgrade/job/upgrade-runner/9975/console

Earlier, PR 25 was showing as open, so I did not test immediately. After checking the PR, I found that it was merged, so I synced and then tested.
https://github.com/openshift/alibaba-disk-csi-driver-operator/pull/25 => merged

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069