Bug 1997384

Summary: [RBD] Thin to thick and thick to thin snapshot restore is not supported
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Jilju Joy <jijoy>
Component: csi-driver Assignee: Niels de Vos <ndevos>
Status: CLOSED WONTFIX QA Contact: Elad <ebenahar>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.9 CC: asriram, etamir, gshanmug, madam, muagarwa, ndevos, nthomas, ocs-bugs, odf-bz-bot
Target Milestone: --- Keywords: Triaged
Target Release: --- Flags: etamir: needinfo? (asriram)
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-11-15 05:47:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jilju Joy 2021-08-25 06:24:11 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
The scenarios below are not supported:
Thin provision enabled storage class to create PVC --> Thick provision enabled storage class to restore snapshot.

Thick provision enabled storage class to create PVC --> Thin provision enabled storage class to restore snapshot.

This was initially reported in bug 1959793#c2, which is now used to track "thick PVC to thick PVC snapshot restore" only.

Using a thick provision enabled storage class to restore a snapshot of a thin provisioned PVC, and vice versa, is currently blocked.

=====================================================================

Version of all relevant components (if applicable):
OCP 4.9.0-0.nightly-2021-08-23-224104
odf-operator.v4.9.0-105.ci

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes, snapshot restore is failing

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?
Yes

Can this issue reproduce from the UI?
Yes

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
Scenario 1:
1. Create PVC using thin provision enabled storage class
2. Create snapshot
3. Restore the snapshot using thick provision enabled storage class

Scenario 2:
1. Create PVC using thick provision enabled storage class
2. Create snapshot
3. Restore the snapshot using thin provision enabled storage class
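
The two scenarios can be sketched as Kubernetes manifests built in Python. This is a minimal sketch, not the ocs-ci test code: the storage class names are taken from the provisioning events quoted in this report, while the object names, the 1Gi size, the access mode, and the snapshot class name `ocs-storagecluster-rbdplugin-snapclass` are assumptions for illustration.

```python
import json

# Storage class names as they appear in the provisioning events of this report.
THIN_SC = "ocs-storagecluster-ceph-rbd"
THICK_SC = "ocs-storagecluster-ceph-rbd-thick"
# Assumed snapshot class name; adjust to whatever exists on the cluster.
SNAPSHOT_CLASS = "ocs-storagecluster-rbdplugin-snapclass"


def pvc(name, storage_class, size="1Gi", snapshot=None):
    """Build a PVC manifest; with `snapshot` set, it restores that VolumeSnapshot."""
    spec = {
        "accessModes": ["ReadWriteOnce"],
        "storageClassName": storage_class,
        "resources": {"requests": {"storage": size}},
    }
    if snapshot is not None:
        spec["dataSource"] = {
            "apiGroup": "snapshot.storage.k8s.io",
            "kind": "VolumeSnapshot",
            "name": snapshot,
        }
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


def volume_snapshot(name, source_pvc):
    """Build a VolumeSnapshot manifest that snapshots the given PVC."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": SNAPSHOT_CLASS,
            "source": {"persistentVolumeClaimName": source_pvc},
        },
    }


def scenario(source_sc, restore_sc):
    """Return the three manifests for one scenario, in creation order."""
    return [
        pvc("source-pvc", source_sc),
        volume_snapshot("source-snap", "source-pvc"),
        pvc("restore-pvc", restore_sc, snapshot="source-snap"),
    ]


if __name__ == "__main__":
    # Scenario 1: thin -> thick. Swap the arguments for scenario 2.
    print(json.dumps(scenario(THIN_SC, THICK_SC), indent=2))
```

The printed JSON list can be split into individual objects and applied in order; the restore PVC is the one expected to stay Pending with the ProvisioningFailed event.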


Actual results:

Thin to thick:
E             Type     Reason                Age                   From                                                                                                                Message
E             ----     ------                ----                  ----                                                                                                                -------
E             Normal   Provisioning          84s (x9 over 3m31s)   openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-84ccc64b48-2f7ln_b11d5c3b-d233-400e-81c4-105125c2181a  External provisioner is provisioning volume for claim "namespace-test-4c7c3417b2874d7d9b6fb24fc/restore-pvc-test-2aa7d50c5cfc43-8573ba1d"
E             Warning  ProvisioningFailed    83s (x9 over 3m30s)   openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-84ccc64b48-2f7ln_b11d5c3b-d233-400e-81c4-105125c2181a  failed to provision volume with StorageClass "ocs-storagecluster-ceph-rbd-thick": rpc error: code = InvalidArgument desc = cannot restore from snapshot ocs-storagecluster-cephblockpool/csi-vol-419ea92d-050d-11ec-9671-0a580a83000f@csi-snap-e3944c71-050d-11ec-9671-0a580a83000f: cannot create thick volume from thin volume "ocs-storagecluster-cephblockpool/csi-snap-e3944c71-050d-11ec-9671-0a580a83000f"
E             Normal   ExternalProvisioning  13s (x15 over 3m30s)  persistentvolume-controller                                                                                         waiting for a volume to be created, either by external provisioner "openshift-storage.rbd.csi.ceph.com" or manually created by system administrator

ocs_ci/helpers/helpers.py:121: ResourceWrongStatusException


Thick to thin:

E             Type     Reason                Age                  From                                                                                                                Message
E             ----     ------                ----                 ----                                                                                                                -------
E             Normal   Provisioning          73s (x9 over 3m21s)  openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-84ccc64b48-2f7ln_b11d5c3b-d233-400e-81c4-105125c2181a  External provisioner is provisioning volume for claim "namespace-test-2f32d6f446b245a6b600600aa/restore-pvc-test-e9811d8469234b-1d433068"
E             Warning  ProvisioningFailed    73s (x9 over 3m21s)  openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-84ccc64b48-2f7ln_b11d5c3b-d233-400e-81c4-105125c2181a  failed to provision volume with StorageClass "ocs-storagecluster-ceph-rbd": rpc error: code = InvalidArgument desc = cannot restore from snapshot ocs-storagecluster-cephblockpool/csi-vol-2c7680e9-050f-11ec-9671-0a580a83000f@csi-snap-eec831a8-050f-11ec-9671-0a580a83000f: cannot create thin volume from thick volume "ocs-storagecluster-cephblockpool/csi-snap-eec831a8-050f-11ec-9671-0a580a83000f"
E             Normal   ExternalProvisioning  9s (x16 over 3m21s)  persistentvolume-controller                                                                                         waiting for a volume to be created, either by external provisioner "openshift-storage.rbd.csi.ceph.com" or manually created by system administrator

ocs_ci/helpers/helpers.py:121: ResourceWrongStatusException

================================================================
Expected results:
Snapshot restore should succeed.

Additional info:

Comment 2 Niels de Vos 2021-08-25 14:59:07 UTC
There are two issues reported in this BZ:

1. restoring thin provisioned volume to a thick provisioned volume
2. restoring thick provisioned volume to a thin provisioned volume

Both are currently not supported by Ceph-CSI, and that is intentional. Thick-provisioning is only available as Technology Preview, with limited functionality. The main (only?) advantage of thick-provisioning is that accounting of consumed storage is easier than with thin-provisioning.
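
The intentional rejection can be illustrated with a small Python sketch. This is not Ceph-CSI's actual implementation (Ceph-CSI is written in Go); the function name `check_restore` and its parameters are hypothetical, and only the error wording is modelled on the ProvisioningFailed events quoted in the description.

```python
def check_restore(source_is_thick: bool, target_is_thick: bool, source_name: str) -> None:
    """Raise ValueError when source and target provisioning modes differ."""
    if source_is_thick == target_is_thick:
        return  # thin -> thin and thick -> thick restores are allowed
    kind = {True: "thick", False: "thin"}
    raise ValueError(
        f"cannot create {kind[target_is_thick]} volume from "
        f'{kind[source_is_thick]} volume "{source_name}"'
    )
```

Calling `check_restore(False, True, ...)` models scenario 1 (thin to thick) and `check_restore(True, False, ...)` models scenario 2 (thick to thin); both raise, while matching modes pass silently.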

Technically it is possible to implement both features:

1. do not use efficient rbd cloning functionality, but read/write data from snapshot to the new volume
2. run `rbd sparsify` after restoring the snapshot

Both solutions are inefficient and will take additional time when restoring snapshots. However, if there is a strong business case for these, we can look into this.
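
Why both options cost extra time can be seen in a toy in-memory model. This is only a sketch under stated assumptions: an image is modelled as a dict mapping block index to block contents, where a missing key means the block is unallocated. The helper names `full_copy` and `sparsify` and the 4-byte block size are hypothetical; none of this is the librbd API or the implementation of `rbd sparsify`.

```python
BLOCK_SIZE = 4  # tiny block size, for illustration only


def full_copy(source: dict, num_blocks: int) -> dict:
    """Thin -> thick: write every block, so unallocated ones become explicit zeros."""
    zero = bytes(BLOCK_SIZE)
    return {i: source.get(i, zero) for i in range(num_blocks)}


def sparsify(image: dict) -> dict:
    """Thick -> thin: deallocate blocks that contain only zeros."""
    return {i: data for i, data in image.items() if any(data)}


thin = {1: b"data"}                     # only block 1 is allocated
thick = full_copy(thin, num_blocks=4)   # all 4 blocks are now allocated
assert len(thick) == 4
assert sparsify(thick) == thin          # zero-only blocks are released again
```

In this model both operations visit every block of the volume, so their cost grows with the volume size rather than with the amount of data actually stored, unlike a copy-on-write clone.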

At the moment, I am considering closing this as WONTFIX.

Comment 7 Niels de Vos 2021-11-12 16:40:53 UTC
Thick-provisioning will not move out of TechPreview. Performance while creating volumes is not deemed usable. Cluster-wide quotas can protect users from over-allocating better than thick-provisioning can, as thick-provisioning only works for RBD volumes.