Description of problem: After taking a volume snapshot of the original PVC (3G), I restored the snapshot to create a larger PVC (6G). The restore succeeded, but the restored PVC's filesystem size is only 3G.

Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-09-07-224533
AWSEBSCSIDriverOperator version: 4.6.0-0.nightly-2020-09-07-224533

How reproducible:
Always

Steps to Reproduce:
1. Create the original pod and PVC (mypvc-ori)
2. Create a snapshot (mysnapshot) from the original PVC (mypvc-ori)
3. Create a PVC (mypvc-res) from the snapshot (mysnapshot)
4. Create a pod (mypod-res) to consume the PVC (mypvc-res)
5. Check the results:

A. Original PVC:
[wduan@MINT snapshot]$ oc get pvc mypvc-ori
NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mypvc-ori   Bound    pvc-f2a71a77-09f2-4062-8b2a-f6193b842c4a   3Gi        RWO            gp2-csi        42m

B. Snapshot:
[wduan@MINT snapshot]$ oc get volumesnapshot mysnapshot
NAME         READYTOUSE   SOURCEPVC   SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS           SNAPSHOTCONTENT                                    CREATIONTIME   AGE
mysnapshot   true         mypvc-ori                           3Gi           csi-aws-ebs-snapclass   snapcontent-84d57185-0350-4e70-bb4f-f21a7b0a5832   39m            39m

C. Restored PVC:
[wduan@MINT snapshot]$ oc get pvc mypvc-res
NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mypvc-res   Bound    pvc-68450514-b466-404c-be73-ae8a273ac8fa   6Gi        RWO            gp2-csi        39m

spec:
  accessModes:
  - ReadWriteOnce
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: mysnapshot
  resources:
    requests:
      storage: 6Gi
  storageClassName: gp2-csi
  volumeMode: Filesystem
  volumeName: pvc-68450514-b466-404c-be73-ae8a273ac8fa
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 6Gi
  phase: Bound

D. The mount point in the container is 3G:
[wduan@MINT snapshot]$ oc exec mypod-res -- /bin/sh -c 'df -h | grep mnt'
/dev/nvme2n1    2.9G  9.0M  2.9G   1% /mnt/local

E. On the node, the disk is 6G but the mount point is 3G:
sh-4.4# lsblk | grep pvc-68450514-b466-404c-be73-ae8a273ac8fa
nvme2n1 259:6 0 6G 0 disk /var/lib/kubelet/pods/ebb6fda3-f788-4742-b949-261db2ea0e3d/volumes/kubernetes.io~csi/pvc-68450514-b466-404c-be73-ae8a273ac8fa/mount
sh-4.4# df -h | grep pvc-68450514-b466-404c-be73-ae8a273ac8fa
/dev/nvme2n1    2.9G  9.0M  2.9G   1% /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-68450514-b466-404c-be73-ae8a273ac8fa/globalmount

Actual results:
mypvc-res reports a capacity of 6Gi, but the mounted filesystem is only 3G.

Expected results:
The mounted filesystem size is 6G.

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:
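For reference, steps 2-3 can be expressed as manifests like the following (a sketch assembled from the names and sizes in this report; note that on OCP 4.6 the snapshot API may still be served as v1beta1, so the apiVersion may need adjusting):

```yaml
# Sketch of steps 2-3, using the names from this report.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mysnapshot
spec:
  volumeSnapshotClassName: csi-aws-ebs-snapclass
  source:
    persistentVolumeClaimName: mypvc-ori
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc-res
spec:
  storageClassName: gp2-csi
  accessModes:
  - ReadWriteOnce
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: mysnapshot
  resources:
    requests:
      storage: 6Gi   # larger than the 3Gi source volume
```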
I set the target release to 4.7 as this is an RFE bug.
The snapshot is restored as a byte-for-byte copy, including the filesystem size. The question is what the right behavior should be. If the PVC's volumeMode is "Filesystem", it makes some sense to restore the snapshot *and* resize the filesystem too. On the other hand, the result is then not a 1:1 copy of the original volume.
The Ceph CSI driver has chosen not to allow restoring a snapshot to a PVC of a bigger size:
https://github.com/ceph/ceph-csi/pull/258
https://github.com/ceph/ceph-csi/pull/1244
Perhaps other CSI drivers, or the external-provisioner itself, should block such snapshot restores. We need to talk with upstream.
Kubernetes should resize the filesystem using the standard resize workflow.
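For illustration only (this is not actual kubelet or driver code): after a restore, the needed check amounts to comparing the filesystem size with the size of the provisioned block device and growing the filesystem when the device is larger, which is what the standard expansion path (NodeExpandVolume calling resize2fs/xfs_growfs) already does. A minimal model of that decision, with hypothetical names:

```python
GIB = 1 << 30

def needs_fs_resize(fs_size_bytes: int, device_size_bytes: int,
                    slack_bytes: int = 64 << 20) -> bool:
    """Return True when the filesystem is smaller than the backing block
    device by more than `slack_bytes`. Filesystems never occupy the device
    exactly, so a small slack avoids pointless resize attempts."""
    return device_size_bytes - fs_size_bytes > slack_bytes

# The case in this report: a 3Gi filesystem restored onto a 6Gi device.
print(needs_fs_resize(3 * GIB, 6 * GIB))                # True: must grow
print(needs_fs_resize(6 * GIB - (32 << 20), 6 * GIB))   # False: already sized
```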
This affects volume snapshot GA, bumping severity a bit.
Filed an issue upstream; several components need to be changed: https://github.com/kubernetes/kubernetes/issues/94929
@Jan, I hit a similar snapshot size issue in the AWS EBS CSI driver:
1. Expand an AWS EBS CSI PVC from 1Gi to 2Gi.
2. Before the filesystem is resized, create a snapshot of the PVC.
3. The VolumeSnapshot size is 2Gi.
4. Restore the VolumeSnapshot; the filesystem size is 1Gi.
From the snapshot's point of view it is a correct snapshot, but the restored PVC reports 2Gi while the container can only use 1Gi. I don't know whether we need to consider this issue.
This may take a long time to fix upstream; we may need changes in the CSI spec: https://github.com/container-storage-interface/spec/pull/452
We had a meeting upstream and agreed on a solution. It will require changes in the CSI spec and in all CSI drivers, so this bug probably won't be fixed in the 4.7 timeframe.
*** Bug 2060926 has been marked as a duplicate of this bug. ***
The issue is seen on the Alibaba Cloud platform as well with respect to snapshot restore requesting a larger volume size, while the resize feature works fine.

Observations:
Payload: 4.11.0-0.nightly-2022-05-11-054135
Platform: Alicloud

####################################################################################################
Resize (ext4): working, PASS.

rohitpatil@ropatil-mac Downloads % oc get pvc,pod -n testali -o wide
NAME                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE   VOLUMEMODE
persistentvolumeclaim/mypvc-ext4   Bound    pvc-b1901132-e321-4b0b-b203-240c266882de   20Gi       RWO            csi-ext4       35s   Filesystem

NAME                              READY   STATUS    RESTARTS   AGE   IP            NODE                                           NOMINATED NODE   READINESS GATES
pod/mydep-ext4-65dd7d87bb-rtjgv   1/1     Running   0          31s   10.131.0.17   kewang-1611al1-xsjvf-worker-us-east-1b-4s4bn   <none>           <none>

oc exec -it pod/mydep-ext4-65dd7d87bb-rtjgv -n testali /bin/bash
bash-5.1$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  8.4G  112G   7% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/passwd
/dev/vdb         20G   45M   20G   1% /mnt/storage

bash-5.1$ /bin/dd if=/dev/zero of=/mnt/storage/testfile1.txt bs=1G count=3
/bin/dd: error writing '/mnt/storage/testfile1.txt': No space left on device
20+0 records in
19+0 records out
20940640256 bytes (21 GB, 20 GiB) copied, 155.555 s, 135 MB/s

bash-5.1$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  8.4G  112G   7% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/passwd
/dev/vdb         20G   20G     0 100% /mnt/storage

oc patch pvc mypvc-ext4 -n testali -p '{"spec":{"resources":{"requests":{"storage":"24Gi"}}}}' --type=merge
persistentvolumeclaim/mypvc-ext4 patched

rohitpatil@ropatil-mac Downloads % oc get pvc,pod -n testali -o wide
NAME                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE     VOLUMEMODE
persistentvolumeclaim/mypvc-ext4   Bound    pvc-b1901132-e321-4b0b-b203-240c266882de   24Gi       RWO            csi-ext4       7m11s   Filesystem

NAME                              READY   STATUS    RESTARTS   AGE    IP            NODE                                           NOMINATED NODE   READINESS GATES
pod/mydep-ext4-65dd7d87bb-rtjgv   1/1     Running   0          7m6s   10.131.0.17   kewang-1611al1-xsjvf-worker-us-east-1b-4s4bn   <none>           <none>

rohitpatil@ropatil-mac Downloads % oc exec -it pod/mydep-ext4-65dd7d87bb-rtjgv -n testali /bin/bash
bash-5.1$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  8.4G  112G   7% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/passwd
/dev/vdb         24G   20G  4.0G  84% /mnt/storage

# Write data
bash-5.1$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  8.4G  112G   8% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/passwd
/dev/vdb         24G   24G  451M  99% /mnt/storage

bash-5.1$ /bin/dd if=/dev/zero of=/mnt/storage/testfile2.txt bs=1G count=2
/bin/dd: error writing '/mnt/storage/testfile2.txt': No space left on device
1+0 records in
0+0 records out
472170496 bytes (472 MB, 450 MiB) copied, 3.04103 s, 155 MB/s

bash-5.1$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  8.4G  112G   8% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/passwd
/dev/vdb         24G   24G     0 100% /mnt/storage

####################################################################################################
Snapshot: not working as expected.
Restore PVC requesting more size: 24Gi

rohitpatil@ropatil-mac Downloads % oc get sc
NAME       PROVISIONER                       RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
csi-ext4   diskplugin.csi.alibabacloud.com   Delete          WaitForFirstConsumer   true                   5s

rohitpatil@ropatil-mac Downloads % oc get pvc,pod -n testali
NAME                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/mypvc-ext4   Bound    pvc-5c104aeb-3900-4d21-b721-171a7e6845dd   20Gi       RWO            csi-ext4       62s

NAME                              READY   STATUS    RESTARTS   AGE
pod/mydep-ext4-65dd7d87bb-gz997   1/1     Running   0          18s

rohitpatil@ropatil-mac Downloads % oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                STORAGECLASS   REASON   AGE
pvc-5c104aeb-3900-4d21-b721-171a7e6845dd   20Gi       RWO            Delete           Bound    testali/mypvc-ext4   csi-ext4                36s

oc exec -it pod/mydep-ext4-65dd7d87bb-gz997 -n testali /bin/bash
bash-5.1$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  8.9G  111G   8% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/passwd
/dev/vdb         20G   45M   20G   1% /mnt/storage

oc exec pod/mydep-ext4-65dd7d87bb-gz997 -n testali -i -- sh -c "/bin/dd if=/dev/zero of=/mnt/storage/testfile1.txt bs=1G count=21"
bash-5.1$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  8.9G  111G   8% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/passwd
/dev/vdb         20G   20G     0 100% /mnt/storage

rohitpatil@ropatil-mac Downloads % oc get volumesnapshotcontent
NAME                                               READYTOUSE   RESTORESIZE   DELETIONPOLICY   DRIVER                            VOLUMESNAPSHOTCLASS   VOLUMESNAPSHOT   VOLUMESNAPSHOTNAMESPACE   AGE
snapcontent-731aefd0-8d2f-47eb-83bb-585ad840c3e9   true         21474836480   Delete           diskplugin.csi.alibabacloud.com   alicloud-disk         my-snapshot      testali                   22s

rohitpatil@ropatil-mac Downloads % oc get pvc,pod -n testali -o wide
NAME                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE   VOLUMEMODE
persistentvolumeclaim/mypvc-ext4   Bound    pvc-5c104aeb-3900-4d21-b721-171a7e6845dd   20Gi       RWO            csi-ext4       13m   Filesystem
persistentvolumeclaim/res-pvc      Bound    pvc-9afef05b-ce32-45cf-8a68-dec5b577c410   24Gi       RWO            csi-ext4       25s   Filesystem

NAME                              READY   STATUS    RESTARTS   AGE   IP            NODE                                           NOMINATED NODE   READINESS GATES
pod/mydep-ext4-65dd7d87bb-gz997   1/1     Running   0          13m   10.131.0.20   kewang-1611al1-xsjvf-worker-us-east-1b-4s4bn   <none>           <none>
pod/res-pod                       1/1     Running   0          20s   10.128.2.38   kewang-1611al1-xsjvf-worker-us-east-1a-tpk6c   <none>           <none>

rohitpatil@ropatil-mac Downloads % oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                STORAGECLASS   REASON   AGE
pvc-5c104aeb-3900-4d21-b721-171a7e6845dd   20Gi       RWO            Delete           Bound    testali/mypvc-ext4   csi-ext4                14m
pvc-9afef05b-ce32-45cf-8a68-dec5b577c410   24Gi       RWO            Delete           Bound    testali/res-pvc      csi-ext4                90s

oc exec -it res-pod -n testali /bin/bash
[root@res-pod /]# df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  9.2G  111G   8% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/hostname
/dev/vdb         20G   20G     0 100% /mnt/storage

rohitpatil@ropatil-mac Downloads % oc exec pod/res-pod -n testali -i -- sh -c "/bin/dd if=/dev/zero of=/mnt/storage/testfile2.txt bs=1G count=1"
/bin/dd: error writing '/mnt/storage/testfile2.txt': No space left on device
1+0 records in
0+0 records out
0 bytes copied, 0.456851 s, 0.0 kB/s
command terminated with exit code 1
The last driver that needs fixing before we can finally close this bug is IBM VPC. I've opened a PR and we're discussing the implementation: https://github.com/kubernetes-sigs/ibm-vpc-block-csi-driver/pull/100
Waiting for rebase.
passed: (3m41s) 2023-02-15T07:35:18 "[sig-storage] STORAGE ROSA-OSD_CCS-ARO-Author:chaoyang-Medium-48913-[CSI Driver] [Snapshot] [Filesystem ext4] provisioning should provision storage with snapshot data source larger than original volume"
Passed on Azure.
Passed for the GCP PD CSI driver.
Passed on IBM with 4.13.0-0.ci.test-2023-02-20-013242-ci-ln-1382kf2-latest
Passed on alicloud with 4.13.0-0.nightly-2023-02-17-090603
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.13.0 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:1326