Bug 2101874
| Field | Value |
|---|---|
| Summary | CephFS PVC Clone with a lot of files takes hours to Bound |
| Product | [Red Hat Storage] Red Hat OpenShift Data Foundation |
| Component | ceph |
| Ceph sub component | CephFS |
| Reporter | Avi Liani <alayani> |
| Assignee | Venky Shankar <vshankar> |
| QA Contact | Elad <ebenahar> |
| Docs Contact | |
| Status | CLOSED DEFERRED |
| Severity | unspecified |
| Priority | unspecified |
| CC | bniver, hyelloji, jeanchlopez, madam, muagarwa, ocs-bugs, odf-bz-bot, rperiyas |
| Version | 4.11 |
| Keywords | Automation, Performance |
| Target Milestone | --- |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Unspecified |
| Whiteboard | |
| Fixed In Version | |
| Doc Type | If docs needed, set a value |
| Doc Text | |
| Story Points | --- |
| Clone Of | |
| Environment | |
| Last Closed | 2022-10-26 03:36:50 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Attachments | Small file test results graphed; Big files test results graphed |
I ran the same test on the same hardware (VMware-LSO) with OCP 4.11 & ODF 4.10 and got the same results. The test was also run on AWS with the same results.

Hi,

I ran a few tests to evaluate the impact of the volume and the number of files when restoring a snapshot to a new PVC.

Test environment:
- 3 worker nodes
- Storage backend is an external array
- Platform: VMware
- OCP 4.8.24
- ODF 4.8.14 (6 * 2TB OSDs with default configuration)

Results are expressed this way:
- Files (#) column is the number of files created
- Restore column is the time it took to restore the snapshot to a new PVC, in seconds
- Create column is the time it took to create the original dataset

Here is what I got, if this can help evaluate the time it takes for people.

Small 1KB files:

| Files (#) | Restore (s) | Create (s) |
|---|---|---|
| 1000 | 14 | 4 |
| 5000 | 63 | 21 |
| 10000 | 128 | 39 |
| 50000 | 511 | 201 |
| 100000 | 927 | 428 |
| 200000 | 1684 | 852 |
| 400000 | 3512 | 1723 |
| 800000 | 7950 | 4117 |

Large 12MB files:

| Files (#) | Restore (s) | Create (s) |
|---|---|---|
| 1000 | 247 | 253 |
| 5000 | 698 | 1227 |
| 10000 | 1713 | 2454 |
| 50000 | 7759 | 12653 |
| 100000 | 50422 | 27218 |

Best regards
JC

Created attachment 1905055 [details]
Small file test results graphed

Created attachment 1905056 [details]
Big files test results graphed
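For anyone who wants to repeat the measurement above, here is a minimal sketch of how the "Restore" time can be captured from the CLI. The namespace, snapshot name, and PVC name are hypothetical placeholders, not the ones used in the tests above; adjust the storage class and size to your own setup.

```bash
# Hypothetical names: adjust the namespace, snapshot, size, and storage class.
NS=my-namespace
SNAP=my-cephfs-snapshot

START=$(date +%s)

# Restore the snapshot to a new PVC...
cat <<EOF | oc -n "$NS" apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-from-snap
spec:
  storageClassName: ocs-storagecluster-cephfs
  accessModes: [ReadWriteMany]
  resources:
    requests:
      storage: 100Gi
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: $SNAP
EOF

# ...and poll until it is Bound, then report the elapsed restore time in seconds.
until [ "$(oc -n "$NS" get pvc restored-from-snap -o jsonpath='{.status.phase}')" = "Bound" ]; do
  sleep 10
done
echo "Restore took $(( $(date +%s) - START )) seconds"
```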
(In reply to JC Lopez from comment #4)

> [test environment and results from comment #4, quoted above]

And are these results acceptable? @jeanchlopez

Hi Avi,

No, it is not. Spectrum Protect Plus times out mounting the volume. We adjusted the timeout, but we are already at 45 minutes, which restricts the frequency of the backup to an outside object store for DR purposes.

Best regards
JC

Hi Avi,
CephFS PVC clones are slow for huge data sets because the current implementation performs a full copy of the entire data. This constraint is highlighted in the upstream doc here:
https://docs.ceph.com/en/quincy/cephfs/fs-volumes/#cloning-snapshots
and I presume also in the downstream documentation. Lots and lots of small files just make this even worse.
There are plans for the future to improve/mitigate this, but that is pretty much long term.
Cheers,
Venky
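As a follow-up to the explanation above: since the clone is a full data copy, its progress can be watched from Ceph itself while the PVC is still Pending. A minimal sketch, assuming the rook-ceph toolbox is deployed; the filesystem and subvolume group names are the usual ODF defaults, and the clone subvolume name is a placeholder you would derive from the pending PV/PVC's volume handle.

```bash
# Hypothetical names: on ODF the CephFS filesystem is typically
# "ocs-storagecluster-cephfilesystem" and the CSI subvolume group is "csi".
# Replace <clone-subvolume-name> with the clone's subvolume name.
oc -n openshift-storage rsh deploy/rook-ceph-tools \
  ceph fs clone status ocs-storagecluster-cephfilesystem <clone-subvolume-name> --group_name csi

# The reported state stays "in-progress" until every file of the source snapshot
# has been copied into the clone; only after it reaches "complete" does the CSI
# provisioner finish provisioning and the PVC become Bound.
```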
Description of problem (please be as detailed as possible and provide log snippets):

When creating a clone of a CephFS PVC with a lot of small files - more than 800 K files (a total of ~10 GiB) - it takes a very long time until the clone gets into Bound state: ~1.5 hours, ~1.8 MiB/sec.

Version of all relevant components (if applicable):

OCP versions:

    clientVersion:
      buildDate: "2022-01-11T00:24:19Z"
      compiler: gc
      gitCommit: 293b52e3eb2e15df6312ea3541c1f8f397cfb24f
      gitTreeState: clean
      gitVersion: 4.10.0-202201102355.p0.g293b52e.assembly.stream-293b52e
      goVersion: go1.17.2
      major: ""
      minor: ""
      platform: linux/amd64
    openshiftVersion: 4.11.0-0.nightly-2022-06-25-081133
    releaseClientVersion: 4.10.0-0.nightly-2022-01-11-065245
    serverVersion:
      buildDate: "2022-06-24T05:13:15Z"
      compiler: gc
      gitCommit: 0a57f1f59bda75ea2cf13d9f3b4ac5d202134f2d
      gitTreeState: clean
      gitVersion: v1.24.0+9ddc8b1
      goVersion: go1.18.1
      major: "1"
      minor: "24"
      platform: linux/amd64

Cluster version:

    NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.11.0-0.nightly-2022-06-25-081133   True        False         87m     Cluster version is 4.11.0-0.nightly-2022-06-25-081133

OCS versions:

    NAME                              DISPLAY                       VERSION   REPLACES   PHASE
    mcg-operator.v4.11.0              NooBaa Operator               4.11.0               Succeeded
    ocs-operator.v4.11.0              OpenShift Container Storage   4.11.0               Succeeded
    odf-csi-addons-operator.v4.11.0   CSI Addons                    4.11.0               Succeeded
    odf-operator.v4.11.0              OpenShift Data Foundation     4.11.0               Succeeded

ODF (OCS) build: full_version: 4.11.0-105

Rook versions: rook: v4.11.0-0.3cf4f59e15f5dbbaccede303940e26e06ef4aec1, go: go1.17.10

Ceph versions: ceph version 16.2.8-59.el8cp (4e10c5fa8a9bc0a421a4dd0833f951ab1cfdcfa7) pacific (stable)

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Volume clones are very, very slow.

Is there any workaround available to the best of your knowledge?
Not that I know of.

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
All the time.

Can this issue be reproduced from the UI?
Yes.

If this is a regression, please provide more details to justify this:
This is a new test; I ran it on 4.10 as well with the same results.

Steps to Reproduce (on an OCP & ODF cluster):
1. Create a new CephFS PVC (more than 10 GiB) and attach it to a pod.
2. On the pod, extract the Linux kernel 13 times (this will create > 800K files, ~9.5 GiB) - takes ~7 minutes. This can be done by:
   a. using the image quay.io/ocsci/perf:latest
   b. mounting the PVC in the pod on /mnt
   c. running inside the pod the command /opt/multiple_files.sh 13 /mnt
   d. waiting for the pod to get into Completed state
   (see the example manifests after this description)
3. Take a clone of the PVC - this can be done from the UI.

Actual results:
The clone takes a very long time - ~1.5 hours.

Expected results:
The clone should take no more than a few minutes - not more than the time it took to initially write the data (~7 minutes).
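The following is a minimal sketch of how steps 2 and 3 above could be driven from the CLI. The pod name pas-test-writer is a hypothetical placeholder; the image, script invocation, source PVC name (pvc-pas-test), and clone PVC name (clone-pas-test-1) are taken from this report, and the PVC from step 1 is assumed to already exist in the namespace.

```bash
NS=pas-test-namespace   # namespace used in this report

# Step 2: pod that fills the pre-existing PVC "pvc-pas-test" with >800K files.
# "pas-test-writer" is a hypothetical name chosen for this sketch.
cat <<EOF | oc -n "$NS" apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: pas-test-writer
spec:
  restartPolicy: Never
  containers:
  - name: writer
    image: quay.io/ocsci/perf:latest
    command: ["/opt/multiple_files.sh", "13", "/mnt"]
    volumeMounts:
    - name: data
      mountPath: /mnt
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: pvc-pas-test
EOF

# Step 2d: wait for the writer pod to complete (jsonpath waits need oc/kubectl >= 1.23).
oc -n "$NS" wait --for=jsonpath='{.status.phase}'=Succeeded pod/pas-test-writer --timeout=30m

# Step 3: clone the PVC (CLI equivalent of the UI clone action).
cat <<EOF | oc -n "$NS" apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: clone-pas-test-1
spec:
  storageClassName: ocs-storagecluster-cephfs
  accessModes: [ReadWriteMany]
  volumeMode: Filesystem
  resources:
    requests:
      storage: 100Gi
  dataSource:
    kind: PersistentVolumeClaim
    name: pvc-pas-test
EOF
```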
Additional info:

The output of `oc describe pvc` after the clone got into Bound state is:

    Name:          clone-pas-test-1
    Namespace:     pas-test-namespace
    StorageClass:  ocs-storagecluster-cephfs
    Status:        Bound
    Volume:        pvc-4f91e953-7d67-40ad-9d56-afe836170aaf
    Labels:        <none>
    Annotations:   pv.kubernetes.io/bind-completed: yes
                   pv.kubernetes.io/bound-by-controller: yes
                   volume.beta.kubernetes.io/storage-provisioner: openshift-storage.cephfs.csi.ceph.com
                   volume.kubernetes.io/storage-provisioner: openshift-storage.cephfs.csi.ceph.com
    Finalizers:    [kubernetes.io/pvc-protection]
    Capacity:      100Gi
    Access Modes:  RWX
    VolumeMode:    Filesystem
    DataSource:
      Kind:  PersistentVolumeClaim
      Name:  pvc-pas-test
    Used By:  <none>
    Events:
      Type     Reason                Age                    From                 Message
      ----     ------                ----                   ----                 -------
      Warning  ProvisioningFailed    88m                    openshift-storage.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-8549bdb7d6-g8gl5_9259b5ba-75ce-48d4-9505-a61efd468227  failed to provision volume with StorageClass "ocs-storagecluster-cephfs": rpc error: code = Aborted desc = clone from snapshot is pending
      Warning  ProvisioningFailed    70m (x13 over 88m)     openshift-storage.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-8549bdb7d6-g8gl5_9259b5ba-75ce-48d4-9505-a61efd468227  failed to provision volume with StorageClass "ocs-storagecluster-cephfs": rpc error: code = Aborted desc = clone from snapshot is already in progress
      Normal   ExternalProvisioning  3m37s (x351 over 88m)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "openshift-storage.cephfs.csi.ceph.com" or manually created by system administrator
      Normal   Provisioning          3s (x32 over 88m)      openshift-storage.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-8549bdb7d6-g8gl5_9259b5ba-75ce-48d4-9505-a61efd468227  External provisioner is provisioning volume for claim "pas-test-namespace/clone-pas-test-1"

The output of `oc -n openshift-storage logs csi-cephfsplugin-provisioner-8549bdb7d6-g8gl5 -c csi-provisioner --since-time=2022-06-28T13:01:02Z | grep clone-pas-test-1 | grep started` is:

    I0628 13:01:03.294650 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:01:03.900756 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:01:04.932317 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:01:06.958651 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:01:14.901555 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:01:22.923662 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:01:39.098629 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:02:11.120525 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:03:15.143955 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:05:23.166780 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:09:39.215652 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:14:39.259017 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:14:49.828064 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:19:39.308110 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:24:39.361502 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:29:39.404231 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:29:49.828838 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:34:39.445224 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:39:39.486306 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:44:39.527288 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:44:49.829368 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:49:39.566177 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:54:39.608320 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:59:39.647876 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 13:59:49.829705 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 14:04:39.691295 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 14:09:39.734306 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 14:14:39.774401 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 14:14:49.830236 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 14:19:39.824495 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 14:24:39.869367 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started
    I0628 14:29:39.909425 1 controller.go:1337] provision "pas-test-namespace/clone-pas-test-1" class "ocs-storagecluster-cephfs": started

As you can see, the "started" line appears 32 times!
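For reference, the attempt count quoted above can be verified directly by counting the "started" lines rather than eyeballing them; this is the same provisioner log command already shown, with the pod name taken from this report and a `grep -c` at the end.

```bash
# Count provisioning attempts for the clone; each "started" line is one retry.
oc -n openshift-storage logs csi-cephfsplugin-provisioner-8549bdb7d6-g8gl5 \
  -c csi-provisioner --since-time=2022-06-28T13:01:02Z \
  | grep clone-pas-test-1 | grep -c started
```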