Bug 1877261 - [RFE] Mounted volume size issue when restore a larger size pvc than snapshot
Summary: [RFE] Mounted volume size issue when restore a larger size pvc than snapshot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Target Release: 4.13.0
Assignee: Roman Bednář
QA Contact: Chao Yang
URL:
Whiteboard:
Duplicates: 2060926
Depends On:
Blocks:
Reported: 2020-09-09 08:45 UTC by Wei Duan
Modified: 2023-05-17 22:46 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-05-17 22:46:32 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift alibaba-cloud-csi-driver pull 23 0 None open Bug 1877261: UPSTREAM: 673: Feature/support disk waiting during mount 2023-02-13 13:54:44 UTC
Github openshift gcp-pd-csi-driver pull 32 0 None open Bug 1877261: UPSTREAM: 973: filesystem is not resized when restoring 2023-02-13 13:55:53 UTC
Red Hat Product Errata RHSA-2023:1326 0 None None None 2023-05-17 22:46:44 UTC

Description Wei Duan 2020-09-09 08:45:04 UTC
Description of problem:
After taking a volume snapshot of the original PVC (3Gi), I restored the snapshot into a larger PVC (6Gi). The restore succeeded, but the filesystem on the restored PVC is still only 3G.

Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-09-07-224533
AWSEBSCSIDriverOperator version: 4.6.0-0.nightly-2020-09-07-224533

How reproducible:
Always

Steps to Reproduce:
1. Create the original pod and PVC (mypvc-ori).
2. Create a snapshot (mysnapshot) from the original PVC (mypvc-ori).
3. Create a PVC (mypvc-res) from the snapshot (mysnapshot).
4. Create a pod (mypod-res) that consumes the PVC (mypvc-res).
5. Check the sizes of the PVCs, the snapshot, and the mounted filesystem:

A. original pvc:
[wduan@MINT snapshot]$ oc get pvc mypvc-ori
NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mypvc-ori   Bound    pvc-f2a71a77-09f2-4062-8b2a-f6193b842c4a   3Gi        RWO            gp2-csi        42m

B. Snapshot:
[wduan@MINT snapshot]$ oc get volumesnapshot mysnapshot
NAME         READYTOUSE   SOURCEPVC   SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS           SNAPSHOTCONTENT                                    CREATIONTIME   AGE
mysnapshot   true         mypvc-ori                           3Gi           csi-aws-ebs-snapclass   snapcontent-84d57185-0350-4e70-bb4f-f21a7b0a5832   39m            39m

C. restored pvc:
[wduan@MINT snapshot]$ oc get pvc mypvc-res
NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mypvc-res   Bound    pvc-68450514-b466-404c-be73-ae8a273ac8fa   6Gi        RWO            gp2-csi        39m

spec:
  accessModes:
  - ReadWriteOnce
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: mysnapshot
  resources:
    requests:
      storage: 6Gi
  storageClassName: gp2-csi
  volumeMode: Filesystem
  volumeName: pvc-68450514-b466-404c-be73-ae8a273ac8fa
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 6Gi
  phase: Bound

D. The mount point in the container is 3G:
[wduan@MINT snapshot]$ oc exec mypod-res -- /bin/sh -c 'df -h | grep mnt'
/dev/nvme2n1                          2.9G  9.0M  2.9G   1% /mnt/local

E. On the node, the disk is 6G and the mount point is 3G:
sh-4.4# lsblk | grep pvc-68450514-b466-404c-be73-ae8a273ac8fa      
nvme2n1                      259:6    0     6G  0 disk /var/lib/kubelet/pods/ebb6fda3-f788-4742-b949-261db2ea0e3d/volumes/kubernetes.io~csi/pvc-68450514-b466-404c-be73-ae8a273ac8fa/mount
sh-4.4# df -h | grep pvc-68450514-b466-404c-be73-ae8a273ac8fa
/dev/nvme2n1                          2.9G  9.0M  2.9G   1% /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-68450514-b466-404c-be73-ae8a273ac8fa/globalmount


Actual results:
mypvc-res is 6Gi, but the mounted filesystem is only 3G.

Expected results:
The mounted filesystem is 6G.
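
The mismatch between the block device and the filesystem can be checked mechanically. The following is a minimal sketch, assuming the sizes are copied by hand from the lsblk and df output above; parse_size and fs_is_undersized are hypothetical helper names, and the 10% tolerance for filesystem overhead is an assumption.

```python
# Sketch only: compare a block device size with the filesystem size reported by df.
# parse_size and fs_is_undersized are hypothetical helpers, not part of any CLI or driver.

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text: str) -> int:
    """Convert a human-readable size such as '2.9G' or '6G' to bytes (powers of 1024)."""
    text = text.strip()
    if text and text[-1] in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1]])
    return int(text)  # plain byte count


def fs_is_undersized(device: str, filesystem: str, tolerance: float = 0.1) -> bool:
    """Return True when the filesystem is smaller than the device by more than `tolerance`.

    The tolerance is an assumed allowance for filesystem metadata overhead.
    """
    device_bytes = parse_size(device)
    fs_bytes = parse_size(filesystem)
    return fs_bytes < device_bytes * (1 - tolerance)


# Values taken from the lsblk and df output above.
print(fs_is_undersized("6G", "2.9G"))  # restored volume: device 6G, filesystem 2.9G
print(fs_is_undersized("3G", "2.9G"))  # a 3G device carrying the same filesystem
```

With the values from this report, the restored volume is flagged, while a 3G device with the same filesystem is not.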


Master Log:

Node Log (of failed PODs):

PV Dump:
  
PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

Comment 1 Wei Duan 2020-09-09 08:49:55 UTC
I set the target release to 4.7 because this is an RFE bug.

Comment 2 Jan Safranek 2020-09-11 11:41:24 UTC
A snapshot is restored as a byte-for-byte copy, including the filesystem size. The question is what the right behavior should be. If the PVC's volumeMode is "Filesystem", it makes some sense to restore the snapshot *and* resize the filesystem too. On the other hand, the result is then no longer a 1:1 copy of the original volume.

Comment 3 Jan Safranek 2020-09-15 14:42:01 UTC
The Ceph CSI driver has chosen not to allow restoring a snapshot to a PVC of a larger size:
https://github.com/ceph/ceph-csi/pull/258
https://github.com/ceph/ceph-csi/pull/1244

Perhaps other CSI drivers, or external-provisioner itself, should block such snapshot restores. We need to talk with upstream.
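
The "block the restore" option can be illustrated with a small size check. This is a minimal sketch under stated assumptions: parse_quantity and check_restore_request are hypothetical names, the parser handles only plain integers and the Ki/Mi/Gi/Ti suffixes, and none of this is taken from Ceph CSI or external-provisioner code.

```python
# Sketch only: refuse a restore whose requested size exceeds the snapshot's restore size.
# parse_quantity and check_restore_request are hypothetical names, not Ceph CSI or
# external-provisioner code.

BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def parse_quantity(quantity: str) -> int:
    """Parse a Kubernetes-style binary quantity such as '6Gi' into bytes.

    Only plain integers and the Ki/Mi/Gi/Ti suffixes are handled here.
    """
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def check_restore_request(requested: str, restore_size: str) -> None:
    """Raise ValueError when the PVC asks for more space than the snapshot holds."""
    if parse_quantity(requested) > parse_quantity(restore_size):
        raise ValueError(
            f"requested size {requested} is larger than snapshot restore size {restore_size}"
        )


check_restore_request("3Gi", "3Gi")  # same size: accepted
try:
    check_restore_request("6Gi", "3Gi")  # the case from this bug: rejected
except ValueError as err:
    print(err)
```

Rejecting the request keeps the restored volume an exact copy, at the cost of forcing users to restore first and expand afterwards.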

Comment 4 Jan Safranek 2020-09-17 14:33:35 UTC
Kubernetes should resize the filesystem using the standard resize workflow.
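
The restore-then-resize idea can be sketched with a toy model. This is an illustration only, assuming a volume can be described by two numbers (device size and filesystem size); Volume, restore_snapshot, and expand_filesystem_if_needed are hypothetical names, not Kubernetes or CSI APIs.

```python
# Sketch only: a toy model of restore-then-resize. Volume, restore_snapshot and
# expand_filesystem_if_needed are hypothetical names, not Kubernetes or CSI APIs.
from dataclasses import dataclass

GIB = 1024**3


@dataclass
class Volume:
    device_bytes: int      # size of the block device
    filesystem_bytes: int  # size of the filesystem stored on it


def restore_snapshot(source: Volume, requested_bytes: int) -> Volume:
    """Provision a device of the requested size and copy the snapshot onto it.

    The copy carries the source filesystem, so its size is unchanged.
    """
    return Volume(device_bytes=requested_bytes, filesystem_bytes=source.filesystem_bytes)


def expand_filesystem_if_needed(volume: Volume) -> bool:
    """Grow the filesystem to fill the device; return True if a resize happened."""
    if volume.filesystem_bytes < volume.device_bytes:
        volume.filesystem_bytes = volume.device_bytes
        return True
    return False


original = Volume(device_bytes=3 * GIB, filesystem_bytes=3 * GIB)
restored = restore_snapshot(original, requested_bytes=6 * GIB)
print(restored.filesystem_bytes // GIB)  # filesystem size right after restore
resized = expand_filesystem_if_needed(restored)
print(resized, restored.filesystem_bytes // GIB)  # state after the resize step
```

In this model the resize step is idempotent: calling it again on a volume whose filesystem already fills the device does nothing.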

Comment 5 Jan Safranek 2020-09-18 07:46:44 UTC
This affects volume snapshot GA, bumping severity a bit.

Comment 6 Jan Safranek 2020-09-21 12:50:19 UTC
Filed an issue upstream; several components need to be changed: https://github.com/kubernetes/kubernetes/issues/94929

Comment 7 Qin Ping 2020-09-29 05:42:16 UTC
@Jan, I hit a similar snapshot size issue with the AWS EBS CSI driver:
1. Expand a PVC backed by AWS EBS CSI from 1Gi to 2Gi.
2. Before the filesystem has been resized, create a snapshot of the PVC.
3. The volumesnapshot size is 2Gi.
4. Restore the volumesnapshot; the filesystem size is 1Gi.

From the snapshot's point of view it is a correct snapshot, but the restored PVC size is 2Gi while the container can only use 1Gi.

I don't know whether we need to consider this issue.

Comment 8 Jan Safranek 2020-10-23 15:30:00 UTC
This may take a long time to fix upstream; we may need changes in CSI: https://github.com/container-storage-interface/spec/pull/452

Comment 9 Jan Safranek 2020-12-04 14:58:41 UTC
We had a meeting upstream and agreed on a solution. It will require changes in the CSI spec and in all CSI drivers, so this bug probably won't be fixed in the 4.7 timeframe.

Comment 12 Roman Bednář 2022-03-04 15:45:38 UTC
*** Bug 2060926 has been marked as a duplicate of this bug. ***

Comment 13 Rohit Patil 2022-05-16 06:41:17 UTC
The issue is also seen on the Alicloud platform when a snapshot is restored to a PVC that requests a larger size, while the resize feature works fine.

Observations.
Payload: 4.11.0-0.nightly-2022-05-11-054135
Platform: Alicloud

####################################################################################################################################################
Resize of ext4 works: PASS.

rohitpatil@ropatil-mac Downloads % oc get pvc,pod -n testali -o wide
NAME                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE   VOLUMEMODE
persistentvolumeclaim/mypvc-ext4   Bound    pvc-b1901132-e321-4b0b-b203-240c266882de   20Gi       RWO            csi-ext4       35s   Filesystem

NAME                              READY   STATUS    RESTARTS   AGE   IP            NODE                                           NOMINATED NODE   READINESS GATES
pod/mydep-ext4-65dd7d87bb-rtjgv   1/1     Running   0          31s   10.131.0.17   kewang-1611al1-xsjvf-worker-us-east-1b-4s4bn   <none>           <none>

oc exec -it pod/mydep-ext4-65dd7d87bb-rtjgv -n testali /bin/bash
bash-5.1$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  8.4G  112G   7% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/passwd
/dev/vdb         20G   45M   20G   1% /mnt/storage

bash-5.1$ /bin/dd if=dev/zero of=/mnt/storage/testfile1.txt bs=1G count=3
/bin/dd: error writing '/mnt/storage/testfile1.txt': No space left on device
20+0 records in
19+0 records out
20940640256 bytes (21 GB, 20 GiB) copied, 155.555 s, 135 MB/s

bash-5.1$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  8.4G  112G   7% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/passwd
/dev/vdb         20G   20G     0 100% /mnt/storage

oc patch pvc mypvc-ext4 -n testali -p '{"spec":{"resources":{"requests":{"storage":"24Gi"}}}}' --type=merge 
persistentvolumeclaim/mypvc-ext4 patched

rohitpatil@ropatil-mac Downloads % oc get pvc,pod -n testali -o wide
NAME                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE     VOLUMEMODE
persistentvolumeclaim/mypvc-ext4   Bound    pvc-b1901132-e321-4b0b-b203-240c266882de   24Gi       RWO            csi-ext4       7m11s   Filesystem

NAME                              READY   STATUS    RESTARTS   AGE    IP            NODE                                           NOMINATED NODE   READINESS GATES
pod/mydep-ext4-65dd7d87bb-rtjgv   1/1     Running   0          7m6s   10.131.0.17   kewang-1611al1-xsjvf-worker-us-east-1b-4s4bn   <none>           <none>

rohitpatil@ropatil-mac Downloads % oc exec -it pod/mydep-ext4-65dd7d87bb-rtjgv -n testali /bin/bash
bash-5.1$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  8.4G  112G   7% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/passwd
/dev/vdb         24G   20G  4.0G  84% /mnt/storage

#Write data
bash-5.1$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  8.4G  112G   8% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/passwd
/dev/vdb         24G   24G  451M  99% /mnt/storage

bash-5.1$ /bin/dd if=dev/zero of=/mnt/storage/testfile2.txt bs=1G count=2 
/bin/dd: error writing '/mnt/storage/testfile2.txt': No space left on device
1+0 records in
0+0 records out
472170496 bytes (472 MB, 450 MiB) copied, 3.04103 s, 155 MB/s

bash-5.1$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  8.4G  112G   8% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/passwd
/dev/vdb         24G   24G     0 100% /mnt/storage

####################################################################################################################################################
Snapshot restore does not work as expected.
The restored PVC requests a larger size: 24Gi

rohitpatil@ropatil-mac Downloads % oc get sc
NAME                      PROVISIONER                       RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
csi-ext4                  diskplugin.csi.alibabacloud.com   Delete          WaitForFirstConsumer   true                   5s

rohitpatil@ropatil-mac Downloads % oc get pvc,pod -n testali
NAME                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/mypvc-ext4   Bound    pvc-5c104aeb-3900-4d21-b721-171a7e6845dd   20Gi       RWO            csi-ext4       62s

NAME                              READY   STATUS    RESTARTS   AGE
pod/mydep-ext4-65dd7d87bb-gz997   1/1     Running   0          18s

rohitpatil@ropatil-mac Downloads % oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                STORAGECLASS   REASON   AGE
pvc-5c104aeb-3900-4d21-b721-171a7e6845dd   20Gi       RWO            Delete           Bound    testali/mypvc-ext4   csi-ext4                36s

oc exec -it pod/mydep-ext4-65dd7d87bb-gz997 -n testali /bin/bash
bash-5.1$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  8.9G  111G   8% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/passwd
/dev/vdb         20G   45M   20G   1% /mnt/storage

oc exec pod/mydep-ext4-65dd7d87bb-gz997 -n testali -i -- sh -c "/bin/dd if=/dev/zero of=/mnt/storage/testfile1.txt bs=1G count=21"
bash-5.1$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  8.9G  111G   8% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/passwd
/dev/vdb         20G   20G     0 100% /mnt/storage

rohitpatil@ropatil-mac Downloads % oc get volumesnapshotcontent 
NAME                                               READYTOUSE   RESTORESIZE   DELETIONPOLICY   DRIVER                            VOLUMESNAPSHOTCLASS   VOLUMESNAPSHOT   VOLUMESNAPSHOTNAMESPACE   AGE
snapcontent-731aefd0-8d2f-47eb-83bb-585ad840c3e9   true         21474836480   Delete           diskplugin.csi.alibabacloud.com   alicloud-disk         my-snapshot      testali                   22s

rohitpatil@ropatil-mac Downloads % oc get pvc,pod -n testali -o wide
NAME                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE   VOLUMEMODE
persistentvolumeclaim/mypvc-ext4   Bound    pvc-5c104aeb-3900-4d21-b721-171a7e6845dd   20Gi       RWO            csi-ext4       13m   Filesystem
persistentvolumeclaim/res-pvc      Bound    pvc-9afef05b-ce32-45cf-8a68-dec5b577c410   24Gi       RWO            csi-ext4       25s   Filesystem

NAME                              READY   STATUS    RESTARTS   AGE   IP            NODE                                           NOMINATED NODE   READINESS GATES
pod/mydep-ext4-65dd7d87bb-gz997   1/1     Running   0          13m   10.131.0.20   kewang-1611al1-xsjvf-worker-us-east-1b-4s4bn   <none>           <none>
pod/res-pod                       1/1     Running   0          20s   10.128.2.38   kewang-1611al1-xsjvf-worker-us-east-1a-tpk6c   <none>           <none>

rohitpatil@ropatil-mac Downloads % oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                STORAGECLASS   REASON   AGE
pvc-5c104aeb-3900-4d21-b721-171a7e6845dd   20Gi       RWO            Delete           Bound    testali/mypvc-ext4   csi-ext4                14m
pvc-9afef05b-ce32-45cf-8a68-dec5b577c410   24Gi       RWO            Delete           Bound    testali/res-pvc      csi-ext4                90s

oc exec -it res-pod -n testali /bin/bash
[root@res-pod /]# df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         120G  9.2G  111G   8% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   52M  3.8G   2% /etc/hostname
/dev/vdb         20G   20G     0 100% /mnt/storage

rohitpatil@ropatil-mac Downloads % oc exec pod/res-pod -n testali -i -- sh -c "/bin/dd if=/dev/zero of=/mnt/storage/testfile2.txt bs=1G count=1"
/bin/dd: error writing '/mnt/storage/testfile2.txt': No space left on device
1+0 records in
0+0 records out
0 bytes copied, 0.456851 s, 0.0 kB/s
command terminated with exit code 1
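
The numbers in this transcript can be cross-checked with a little arithmetic. This is a minimal sketch, assuming the RESTORESIZE value is a byte count and the PVC capacity is in GiB; bytes_to_gib is a hypothetical helper.

```python
# Sketch only: convert the RESTORESIZE byte count to GiB and compare it with the PVC request.
# bytes_to_gib is a hypothetical helper.

GIB = 1024**3


def bytes_to_gib(size_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return size_bytes / GIB


restore_size_bytes = 21474836480  # RESTORESIZE of the volumesnapshotcontent above
requested_gib = 24                # capacity of res-pvc

snapshot_gib = bytes_to_gib(restore_size_bytes)
unusable_gib = requested_gib - snapshot_gib
print(snapshot_gib, unusable_gib)
```

The snapshot's restore size works out to 20 GiB, which matches the 20G filesystem seen in res-pod and leaves 4 GiB of the 24Gi PVC unreachable until the filesystem is grown.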

Comment 14 Roman Bednář 2022-11-14 09:01:31 UTC
The last driver that needs fixing before this bug can be closed is IBM VPC. I've opened a PR and we're discussing the implementation: https://github.com/kubernetes-sigs/ibm-vpc-block-csi-driver/pull/100

Comment 15 Roman Bednář 2023-01-10 08:43:43 UTC
Waiting for rebase.

Comment 20 Chao Yang 2023-02-15 07:36:22 UTC
passed: (3m41s) 2023-02-15T07:35:18 "[sig-storage] STORAGE ROSA-OSD_CCS-ARO-Author:chaoyang-Medium-48913-[CSI Driver] [Snapshot] [Filesystem ext4] provisioning should provision storage with snapshot data source larger than original volume"
Passed on Azure.

Comment 21 Chao Yang 2023-02-16 02:04:00 UTC
Passed for the GCP PD CSI driver.

Comment 22 Chao Yang 2023-02-20 02:52:33 UTC
Passed on IBM with 4.13.0-0.ci.test-2023-02-20-013242-ci-ln-1382kf2-latest

Comment 23 Chao Yang 2023-02-20 10:32:35 UTC
Passed on Alicloud with 4.13.0-0.nightly-2023-02-17-090603

Comment 26 errata-xmlrpc 2023-05-17 22:46:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.13.0 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:1326

