Description of problem (please be as detailed as possible and provide log snippets):

After restoring from an OADP backup with CephFS CSI volumes, one PVC, "wmla-conda", is consistently stuck Pending. The PVC is around 13GB and 185K files.

```
[chudg@wml1x171 /home/chudg]$ oc get pvc
NAME               STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                AGE
wmla-conda         Pending                                                                        ocs-storagecluster-cephfs   33h
wmla-cws-share     Bound     pvc-3aa99b42-922b-435a-aa92-2b9ba181b489   15Gi       RWX            ocs-storagecluster-cephfs   33h
wmla-edi           Bound     pvc-00a8de84-9ad3-4523-add0-a2ed8ebb522a   10Gi       RWX            ocs-storagecluster-cephfs   33h
wmla-infoservice   Bound     pvc-09009aa2-8b6c-4be7-a24f-f9a3d0af6b2b   4Gi        RWX            ocs-storagecluster-cephfs   33h
wmla-logging       Bound     pvc-6e2399f8-2938-47a8-84fa-fcbcceed9934   10Gi       RWX            ocs-storagecluster-cephfs   33h
wmla-mygpfs        Bound     pvc-2d74707d-870e-4e9c-a74b-93beb77ec52a   20Gi       RWX            ocs-storagecluster-cephfs   33h

[chudg@wml1x171 /home/chudg]$ oc describe pvc wmla-conda
Name:          wmla-conda
Namespace:     chudg
StorageClass:  ocs-storagecluster-cephfs
Status:        Pending
Volume:
Labels:        app.kubernetes.io/instance=wmla-cr---wmla
               app.kubernetes.io/managed-by=Ansible
               app.kubernetes.io/name=wmla-dlpd
               app.kubernetes.io/version=1.0.0
               env=conda-share
               icpd-addon/status=1618990653296570
               icpdsupport/addOnId=wml-accelerator
               icpdsupport/addOnName=WML-Accelerator
               icpdsupport/addOnType=component
               icpdsupport/app=wml-accelerator
               icpdsupport/controlPlaneNamespace=zen-chudg
               icpdsupport/serviceInstanceId=1618990653296570
               release=wmla
               velero.io/backup-name=b8
               velero.io/restore-name=r8
               velero.io/volume-snapshot-name=velero-wmla-conda-rldfg
Annotations:   volume.beta.kubernetes.io/storage-provisioner: openshift-storage.cephfs.csi.ceph.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
DataSource:
  APIGroup:  snapshot.storage.k8s.io
  Kind:      VolumeSnapshot
  Name:      velero-wmla-conda-rldfg
Mounted By:  wmla-conda-849674cf4c-vk8q7
Events:
  Type    Reason                Age                   From                                                                                                                       Message
  ----    ------                ----                  ----                                                                                                                       -------
  Normal  Provisioning          82s (x170 over 13h)   openshift-storage.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-66c59d467f-7hvc6_8d07eb8b-3b62-4dd6-91ec-d80aa49c92d8  External provisioner is provisioning volume for claim "chudg/wmla-conda"
  Normal  ExternalProvisioning  25s (x4143 over 16h)  persistentvolume-controller                                                                                                waiting for a volume to be created, either by external provisioner "openshift-storage.cephfs.csi.ceph.com" or manually created by system administrator
```

```
oc logs csi-cephfsplugin-provisioner-66c59d467f-7hvc6 -c csi-provisioner
I0422 21:04:06.986781       1 connection.go:186] GRPC error: rpc error: code = Aborted desc = in progress
I0422 21:04:06.986848       1 controller.go:642] CreateVolume failed, supports topology = false, node selected false => may reschedule = false => state = Background: rpc error: code = Aborted desc = in progress
I0422 21:04:06.986926       1 controller.go:1091] Temporary error received, adding PVC 4b61e5ee-0ffc-41d5-b2a8-fa6ee5fd6ed4 to claims in progress
W0422 21:04:06.986952       1 controller.go:943] Retrying syncing claim "4b61e5ee-0ffc-41d5-b2a8-fa6ee5fd6ed4", failure 157
E0422 21:04:06.986976       1 controller.go:966] error syncing claim "4b61e5ee-0ffc-41d5-b2a8-fa6ee5fd6ed4": failed to provision volume with StorageClass "ocs-storagecluster-cephfs": rpc error: code = Aborted desc = in progress
I0422 21:04:06.987013       1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"chudg", Name:"wmla-conda", UID:"4b61e5ee-0ffc-41d5-b2a8-fa6ee5fd6ed4", APIVersion:"v1", ResourceVersion:"296904287", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "ocs-storagecluster-cephfs": rpc error: code = Aborted desc = in progress
I0422 21:09:06.987266       1 controller.go:1317] provision "chudg/wmla-conda" class "ocs-storagecluster-cephfs": started
I0422 21:09:06.987484       1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"chudg", Name:"wmla-conda", UID:"4b61e5ee-0ffc-41d5-b2a8-fa6ee5fd6ed4", APIVersion:"v1", ResourceVersion:"296904287", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "chudg/wmla-conda"
```

Version of all relevant components (if applicable):
OADP 0.2.1 with CSI plugin
OCP 4.6.8
OCS 4.6.4

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Using OADP as a backup and restore utility, users cannot restore certain CephFS CSI volumes. It's unknown what's special about the PVC.

Is there any workaround available to the best of your knowledge? No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)? 4

Is this issue reproducible? Yes

Can this issue be reproduced from the UI? No

If this is a regression, please provide more details to justify this: n/a

Steps to Reproduce:
I haven't been able to reproduce this with a standalone application, so it may be specific to the application being backed up. The general steps are:
1. Perform an OADP backup of the Cloud Pak for Data wml-accelerator application. The storage class is ocs-storagecluster-cephfs. The particular PVC having problems is around 13GB with 185000 files.
2. Delete the namespace.
3. Restore from the backup (a hedged sketch of the corresponding velero commands follows below).

Actual results:
The restore of PVC "wmla-conda" is stuck Pending; no PV is created. In the first occurrence, the wmla team waited over 2 days. The wmla team retried the steps on a different cluster and hit the same issue, with the same PVC now stuck for 33+ hours.

Expected results:
After an OADP restore, all PVCs should be Bound.
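For reference, a minimal sketch of the reproduction flow as driven by the velero CLI. This is an assumption-laden convenience sketch, not the exact commands the wmla team ran: the backup/restore names b8 and r8 are taken from the PVC labels above, and it assumes the velero CLI is installed and the OADP CSI plugin and a velero-labeled VolumeSnapshotClass are already configured.

```
# Hedged sketch of the backup/restore flow (names b8/r8 match the PVC labels).
velero backup create b8 --include-namespaces chudg

# Simulate the data-loss scenario by deleting the application namespace.
oc delete namespace chudg

# Restore everything captured by backup b8.
velero restore create r8 --from-backup b8

# All restored PVCs should reach Bound; wmla-conda stays Pending instead.
oc -n chudg get pvc -w
```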
Additional info: ceph status looks OK.

```
sh-4.4# ceph -s
  cluster:
    id:     7b132412-410e-4e70-bb91-e9c4d2bbdf11
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum b,d,e (age 2h)
    mgr: a(active, since 9h)
    mds: ocs-storagecluster-cephfilesystem:1 {0=ocs-storagecluster-cephfilesystem-b=up:active} 1 up:standby-replay
    osd: 3 osds: 3 up (since 2h), 3 in (since 5w)
    rgw: 2 daemons active (ocs.storagecluster.cephobjectstore.a, ocs.storagecluster.cephobjectstore.b)

  task status:
    scrub status:
        mds.ocs-storagecluster-cephfilesystem-a: idle
        mds.ocs-storagecluster-cephfilesystem-b: idle

  data:
    pools:   10 pools, 176 pgs
    objects: 1.74M objects, 129 GiB
    usage:   544 GiB used, 1.8 TiB / 2.3 TiB avail
    pgs:     176 active+clean

  io:
    client:   1.4 MiB/s rd, 1.5 MiB/s wr, 78 op/s rd, 363 op/s wr

sh-4.4# ceph df
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED       RAW USED     %RAW USED
    hdd       2.3 TiB     1.8 TiB     532 GiB     545 GiB         22.69
    TOTAL     2.3 TiB     1.8 TiB     532 GiB     545 GiB         22.69

POOLS:
    POOL                                                      ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
    ocs-storagecluster-cephblockpool                           1     3.1 GiB         877     9.4 GiB      0.63       498 GiB
    ocs-storagecluster-cephobjectstore.rgw.control             2         0 B           8         0 B         0       498 GiB
    ocs-storagecluster-cephfilesystem-metadata                 3     610 MiB     246.12k     1.1 GiB      0.07       498 GiB
    ocs-storagecluster-cephobjectstore.rgw.meta                4     3.2 KiB          12     1.9 MiB         0       498 GiB
    ocs-storagecluster-cephfilesystem-data0                    5     125 GiB       1.50M     522 GiB     25.88       498 GiB
    ocs-storagecluster-cephobjectstore.rgw.log                 6     1.2 MiB         212     7.6 MiB         0       498 GiB
    ocs-storagecluster-cephobjectstore.rgw.buckets.index       7       342 B          22       342 B         0       498 GiB
    ocs-storagecluster-cephobjectstore.rgw.buckets.non-ec      8         0 B           0         0 B         0       498 GiB
    .rgw.root                                                  9     4.7 KiB          16     2.8 MiB         0       498 GiB
    ocs-storagecluster-cephobjectstore.rgw.buckets.data       10       1 KiB           1     192 KiB         0       498 GiB
```
Created attachment 1774628 [details] csi-cephfsplugin-provisioner logs
Looks like the snapshot from which the restore was attempted is not proper:

```
DataSource:
  APIGroup:  snapshot.storage.k8s.io
  Kind:      VolumeSnapshot
  Name:      velero-wmla-conda-rldfg
```

```
I0422 08:31:10.363168       1 controller.go:964] VolumeContentSource_Snapshot {Snapshot:snapshot_id:"0001-0011-openshift-storage-0000000000000001-c1d7d4e1-a29a-11eb-b5d8-0a580a800542" }
I0422 08:31:10.363264       1 controller.go:972] Requested volume size is 21474836480 and snapshot size is 0 for the source snapshot velero-wmla-conda-rldfg
W0422 08:31:10.363276       1 controller.go:979] requested volume size 21474836480 is greater than the size 0 for the source snapshot velero-wmla-conda-rldfg. Volume plugin needs to handle volume expansion.
```

But if I interpret the logs correctly, this is not what is causing the issue. CreateVolume is always returning with this error:

```
I0422 08:31:54.987982       1 connection.go:185] GRPC response: {}
I0422 08:31:54.988093       1 connection.go:186] GRPC error: rpc error: code = Aborted desc = in progress
I0422 08:31:54.988126       1 controller.go:642] CreateVolume failed, supports topology = false, node selected false => may reschedule = false => state = Background: rpc error: code = Aborted desc = in progress
I0422 08:32:32.633750       1 controller.go:1091] Temporary error received, adding PVC 4b61e5ee-0ffc-41d5-b2a8-fa6ee5fd6ed4 to claims in progress
W0422 08:32:32.633771       1 controller.go:943] Retrying syncing claim "4b61e5ee-0ffc-41d5-b2a8-fa6ee5fd6ed4", failure 6
E0422 08:32:32.633797       1 controller.go:966] error syncing claim "4b61e5ee-0ffc-41d5-b2a8-fa6ee5fd6ed4": failed to provision volume with StorageClass "ocs-storagecluster-cephfs": rpc error: code = Aborted desc = in progress
```

This means that the request to create a clone is never completing on the Ceph side. If we have access to the Ceph cluster, we should check why the clone is not getting created there.
The wmla team still has the cluster, but probably not for much longer. Let me know what is needed.

```
$ oc get pvc
NAME               STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                AGE
wmla-conda         Pending                                                                        ocs-storagecluster-cephfs   5d12h
wmla-cws-share     Bound     pvc-3aa99b42-922b-435a-aa92-2b9ba181b489   15Gi       RWX            ocs-storagecluster-cephfs   5d12h
wmla-edi           Bound     pvc-00a8de84-9ad3-4523-add0-a2ed8ebb522a   10Gi       RWX            ocs-storagecluster-cephfs   5d12h
wmla-infoservice   Bound     pvc-09009aa2-8b6c-4be7-a24f-f9a3d0af6b2b   4Gi        RWX            ocs-storagecluster-cephfs   5d12h
wmla-logging       Bound     pvc-6e2399f8-2938-47a8-84fa-fcbcceed9934   10Gi       RWX            ocs-storagecluster-cephfs   5d12h
wmla-mygpfs        Bound     pvc-2d74707d-870e-4e9c-a74b-93beb77ec52a   20Gi       RWX            ocs-storagecluster-cephfs   5d12h
```
@henrychi, if you can provide access to the cluster, that would be helpful. Otherwise, can you get the output of the below command from the Ceph cluster?

```
ceph fs clone status ocs-storagecluster-cephfilesystem csi-vol-118b914c-a38d-11eb-9e8b-0a580a81000d --group_name csi
```
Hi @henrychi.com,

As requested in https://bugzilla.redhat.com/show_bug.cgi?id=1952708#c5, can you please share the clone status? It would also be great if we could get the must-gather.
On reproducing, this issue was encountered both in a local velero setup and when setting up velero via OADP. After restoring a backup, the PVC did not come back to the Bound state and stayed Pending with the following error:

```
Warning  ProvisioningFailed  148m (x15 over 176m)  openshift-storage.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-66c59d467f-2pjgw_b9d3b9a4-818c-42a1-b585-b9f7202694ca  failed to provision volume with StorageClass "ocs-storagecluster-cephfs": rpc error: code = Internal desc = key not found: no snap source in omap for "csi.snap.6702cdf8-a8c9-11eb-9bc2-0a580a81022a"
```

csi-cephfsplugin:

```
[ygupta@localhost ~]$ oc logs csi-cephfsplugin-provisioner-66c59d467f-2pjgw csi-cephfsplugin -n openshift-storage | grep pvc-ab71a168-21f9-430c-b89a-80a416de477f
I0429 09:07:07.290072       1 utils.go:159] ID: 50 Req-ID: pvc-ab71a168-21f9-430c-b89a-80a416de477f GRPC call: /csi.v1.Controller/CreateVolume
I0429 09:07:07.291693       1 utils.go:160] ID: 50 Req-ID: pvc-ab71a168-21f9-430c-b89a-80a416de477f GRPC request: {"capacity_range":{"required_bytes":1073741824},"name":"pvc-ab71a168-21f9-430c-b89a-80a416de477f","parameters":{"clusterID":"openshift-storage","fsName":"ocs-storagecluster-cephfilesystem"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{}},"access_mode":{"mode":1}}],"volume_content_source":{"Type":{"Snapshot":{"snapshot_id":"0001-0011-openshift-storage-0000000000000001-6702cdf8-a8c9-11eb-9bc2-0a580a81022a"}}}}
I0429 09:07:07.607927       1 cephcmds.go:59] ID: 50 Req-ID: pvc-ab71a168-21f9-430c-b89a-80a416de477f command succeeded: ceph [-m 172.30.33.171:6789,172.30.213.18:6789,172.30.255.196:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf fs get ocs-storagecluster-cephfilesystem --format=json]
I0429 09:07:07.919624       1 cephcmds.go:59] ID: 50 Req-ID: pvc-ab71a168-21f9-430c-b89a-80a416de477f command succeeded: ceph [-m 172.30.33.171:6789,172.30.213.18:6789,172.30.255.196:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf fs ls --format=json]
I0429 09:07:08.234498       1 cephcmds.go:59] ID: 50 Req-ID: pvc-ab71a168-21f9-430c-b89a-80a416de477f command succeeded: ceph [-m 172.30.33.171:6789,172.30.213.18:6789,172.30.255.196:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf fs dump --format=json]
I0429 09:07:08.547046       1 cephcmds.go:59] ID: 50 Req-ID: pvc-ab71a168-21f9-430c-b89a-80a416de477f command succeeded: ceph [-m 172.30.33.171:6789,172.30.213.18:6789,172.30.255.196:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf fs ls --format=json]
E0429 09:07:08.548774       1 omap.go:77] ID: 50 Req-ID: pvc-ab71a168-21f9-430c-b89a-80a416de477f omap not found (pool="ocs-storagecluster-cephfilesystem-metadata", namespace="csi", name="csi.snap.6702cdf8-a8c9-11eb-9bc2-0a580a81022a"): rados: ret=-2, No such file or directory
W0429 09:07:08.548805       1 voljournal.go:624] ID: 50 Req-ID: pvc-ab71a168-21f9-430c-b89a-80a416de477f unable to read omap keys: pool or key missing: key not found: rados: ret=-2, No such file or directory
E0429 09:07:08.548841       1 utils.go:163] ID: 50 Req-ID: pvc-ab71a168-21f9-430c-b89a-80a416de477f GRPC error: rpc error: code = Internal desc = key not found: no snap source in omap for "csi.snap.6702cdf8-a8c9-11eb-9bc2-0a580a81022a"
```

csi-cephfsplugin-provisioner:

```
[ygupta@localhost ~]$ oc logs csi-cephfsplugin-provisioner-66c59d467f-2pjgw csi-provisioner -n openshift-storage
I0429 12:36:35.831384       1 controller.go:964] VolumeContentSource_Snapshot {Snapshot:snapshot_id:"0001-0011-openshift-storage-0000000000000001-6702cdf8-a8c9-11eb-9bc2-0a580a81022a" }
I0429 12:36:35.831404       1 controller.go:972] Requested volume size is 1073741824 and snapshot size is 0 for the source snapshot velero-test-pvc-q2zw7
W0429 12:36:35.831423       1 controller.go:979] requested volume size 1073741824 is greater than the size 0 for the source snapshot velero-test-pvc-q2zw7. Volume plugin needs to handle volume expansion.
I0429 12:36:35.831431       1 controller.go:570] CreateVolumeRequest {Name:pvc-ab71a168-21f9-430c-b89a-80a416de477f CapacityRange:required_bytes:1073741824 VolumeCapabilities:[mount:<> access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[clusterID:openshift-storage csi.storage.k8s.io/controller-expand-secret-name:rook-csi-cephfs-provisioner csi.storage.k8s.io/controller-expand-secret-namespace:openshift-storage csi.storage.k8s.io/node-stage-secret-name:rook-csi-cephfs-node csi.storage.k8s.io/node-stage-secret-namespace:openshift-storage csi.storage.k8s.io/provisioner-secret-name:rook-csi-cephfs-provisioner csi.storage.k8s.io/provisioner-secret-namespace:openshift-storage fsName:ocs-storagecluster-cephfilesystem] Secrets:map[] VolumeContentSource:snapshot:<snapshot_id:"0001-0011-openshift-storage-0000000000000001-6702cdf8-a8c9-11eb-9bc2-0a580a81022a" > AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I0429 12:36:35.834643       1 connection.go:182] GRPC call: /csi.v1.Controller/CreateVolume
I0429 12:36:35.834656       1 connection.go:183] GRPC request: {"capacity_range":{"required_bytes":1073741824},"name":"pvc-ab71a168-21f9-430c-b89a-80a416de477f","parameters":{"clusterID":"openshift-storage","fsName":"ocs-storagecluster-cephfilesystem"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{}},"access_mode":{"mode":1}}],"volume_content_source":{"Type":{"Snapshot":{"snapshot_id":"0001-0011-openshift-storage-0000000000000001-6702cdf8-a8c9-11eb-9bc2-0a580a81022a"}}}}
I0429 12:36:37.140280       1 connection.go:185] GRPC response: {}
I0429 12:36:37.140321       1 connection.go:186] GRPC error: rpc error: code = Internal desc = key not found: no snap source in omap for "csi.snap.6702cdf8-a8c9-11eb-9bc2-0a580a81022a"
I0429 12:36:37.140348       1 controller.go:642] CreateVolume failed, supports topology = false, node selected false => may reschedule = false => state = Finished: rpc error: code = Internal desc = key not found: no snap source in omap for "csi.snap.6702cdf8-a8c9-11eb-9bc2-0a580a81022a"
I0429 12:36:37.140390       1 controller.go:1084] Final error received, removing PVC ab71a168-21f9-430c-b89a-80a416de477f from claims in progress
```

We need further analysis to identify the reason for this behavior.
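For what it's worth, the missing omap object named in that error can be inspected directly from the rook-ceph toolbox. This is a hedged diagnostic sketch: the pool and `csi` RADOS namespace are taken from the `omap not found` log line above, and `listomapvals` will simply return ENOENT if the object is really gone.

```
# Hedged diagnostic: check whether the snapshot's journal object still exists
# in the metadata pool (pool/namespace/name taken from the error above).
rados -p ocs-storagecluster-cephfilesystem-metadata --namespace csi \
    listomapvals csi.snap.6702cdf8-a8c9-11eb-9bc2-0a580a81022a

# Listing all csi.snap.* objects in the namespace shows whether any snapshot
# journal entries survived at all.
rados -p ocs-storagecluster-cephfilesystem-metadata --namespace csi ls | grep csi.snap
```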
That's great Yug, thanks for reproducing it. Can you also check the status of the source snapshot? Are the omap values proper for it?
I'm checking with the wmla team to see if the cluster is still available. Will get back.

Regarding the "no snap source in omap", that seems like a different issue than mine. Just FYI, my VolumeSnapshotClass deletionPolicy is set to Retain, so the volumesnapshotcontent isn't deleted when the volumesnapshot is deleted.

```
cat ocs-storagecluster-cephfsplugin-snapclass-velero.yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
deletionPolicy: Retain
driver: openshift-storage.cephfs.csi.ceph.com
kind: VolumeSnapshotClass
metadata:
  name: ocs-storagecluster-cephfsplugin-snapclass-velero
  labels:
    velero.io/csi-volumesnapshot-class: "true"
parameters:
  clusterID: openshift-storage
  csi.storage.k8s.io/snapshotter-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/snapshotter-secret-namespace: openshift-storage
```
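As a quick sanity check, the deletion policy of every snapshot class can be confirmed in one command. A hedged convenience one-liner; it relies on `deletionPolicy` being a top-level field on VolumeSnapshotClass, which it is in the v1beta1 API shown above.

```
# List every VolumeSnapshotClass with its driver and deletionPolicy so a
# stray "Delete" policy is easy to spot.
oc get volumesnapshotclass -o custom-columns='NAME:.metadata.name,DRIVER:.driver,POLICY:.deletionPolicy'
```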
We tried to reproduce the issue on OCS 4.6 with OADP, and after the restore operation the PVC reached the Bound state without any issue. As Mudit and Madhu suggested, it looks like the clone creation is stuck on the Ceph side.
Unfortunately I haven't been able to reach the wmla team and provide the clone status. I'll update if they get back to me, but their cluster may no longer be available.
I was able to get more info from the cluster. From comment #5, this is the clone status:

```
sh-4.4# ceph fs clone status ocs-storagecluster-cephfilesystem csi-vol-118b914c-a38d-11eb-9e8b-0a580a81000d --group_name csi
Error ENOENT: subvolume 'csi-vol-118b914c-a38d-11eb-9e8b-0a580a81000d' does not exist
```
From comment #7, the must-gather can be downloaded below. Please let me know if more information is needed. https://ibm.box.com/s/hg0aexs550q85qafl0mycqg55dxl8gtg
(In reply to henrychi from comment #14)
> From comment #7, the must-gather can be downloaded below. Please let me
> know if more information is needed.
>
> https://ibm.box.com/s/hg0aexs550q85qafl0mycqg55dxl8gtg

The shared must-gather doesn't seem to have any logs from pods in the openshift-storage namespace. Can you please share the logs so that we can perform a deeper analysis of the same?
That must have been an oversight, found the logs in the attachment.
Ok. Let me know if more info is needed.
Looks like the velero restore operation performs a clone from snapshots created during the backup operation. From the attached logs, it seems like `ceph fs clone status` keeps failing in the backend, due to which the restored PVC never gets to the Bound state. For debugging the issue further, we need to look into the logs of other pods like the ceph manager, ceph mons, etc. Can you provide the logs from all the pods in the `openshift-storage` namespace? Also, can you please provide the output of `ceph -s` so that we can verify that the ceph cluster is in a healthy state?

In the meantime, to verify that clone creation is working fine on the backend, we can try creating clones manually and monitor the status.

To create a new clone named "test":

```
ceph fs subvolume snapshot clone ocs-storagecluster-cephfilesystem csi-vol-54f89e48-b869-11eb-ac15-fe59eb9b76df csi-snap-c1d7d4e1-a29a-11eb-b5d8-0a580a800542 test --group_name csi --target_group_name csi
```

To check that the subvolume got created:

```
ceph fs subvolume ls ocs-storagecluster-cephfilesystem --group_name csi
```

During the clone operation, monitor the clone status with:

```
ceph fs clone status ocs-storagecluster-cephfilesystem test --group_name csi
```
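Since a clone of a large subvolume can take a long time, a small polling loop saves re-running the status command by hand. A hedged convenience sketch, run from the rook-ceph toolbox; the grep pattern matches the `"state": "in-progress"` field in the JSON output shown later in this thread.

```
# Poll the clone status every minute until it leaves "in-progress"
# (ends up "complete", "failed", or "canceled").
while true; do
  status=$(ceph fs clone status ocs-storagecluster-cephfilesystem test --group_name csi)
  echo "$status"
  echo "$status" | grep -q '"state": "in-progress"' || break
  sleep 60
done
```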
1) Logs from openshift-storage pods:

```
for i in $(kubectl get pods |cut -d" " -f1); do kubectl logs $i --all-containers=true > $i.log; done
```

https://ibm.box.com/s/m7153uaptx9n1qqeqxvvvhibtkd562qb

2) Output of ceph -s:

```
sh-4.4# ceph -s
  cluster:
    id:     7b132412-410e-4e70-bb91-e9c4d2bbdf11
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum g,i,j (age 5d)
    mgr: a(active, since 5d)
    mds: ocs-storagecluster-cephfilesystem:1 {0=ocs-storagecluster-cephfilesystem-a=up:active} 1 up:standby-replay
    osd: 3 osds: 3 up (since 5d), 3 in (since 9w)
    rgw: 2 daemons active (ocs.storagecluster.cephobjectstore.a, ocs.storagecluster.cephobjectstore.b)

  task status:
    scrub status:
        mds.ocs-storagecluster-cephfilesystem-a: idle
        mds.ocs-storagecluster-cephfilesystem-b: idle

  data:
    pools:   10 pools, 176 pgs
    objects: 2.36M objects, 180 GiB
    usage:   767 GiB used, 1.6 TiB / 2.3 TiB avail
    pgs:     176 active+clean

  io:
    client:   1.5 MiB/s rd, 1.9 MiB/s wr, 97 op/s rd, 477 op/s wr
```

3) Test clone creation:

3a) The command given in comment #18 results in "does not exist":

```
sh-4.4# ceph fs subvolume snapshot clone ocs-storagecluster-cephfilesystem csi-vol-54f89e48-b869-11eb-ac15-fe59eb9b76df csi-snap-c1d7d4e1-a29a-11eb-b5d8-0a580a800542 test --group_name csi --target_group_name csi
Error ENOENT: subvolume 'csi-vol-54f89e48-b869-11eb-ac15-fe59eb9b76df' does not exist
```

3b) I tried with another similar subvolume and snapshot:

```
sh-4.4# ceph fs subvolume snapshot clone ocs-storagecluster-cephfilesystem csi-vol-cf0ee0b9-a28e-11eb-b5d8-0a580a800542 csi-snap-0e600e2d-a292-11eb-b5d8-0a580a800542 test --group_name csi --target_group_name csi
```

3c) Subvolume "test" got created:

```
sh-4.4# ceph fs subvolume ls ocs-storagecluster-cephfilesystem --group_name csi | grep test
        "name": "test"
```

3d) The clone was in-progress for about an hour and eventually failed. I'm not sure if it's related to the original problem.

```
sh-4.4# ceph fs clone status ocs-storagecluster-cephfilesystem test --group_name csi
{
  "status": {
    "state": "failed",
    "source": {
      "volume": "ocs-storagecluster-cephfilesystem",
      "subvolume": "csi-vol-cf0ee0b9-a28e-11eb-b5d8-0a580a800542",
      "snapshot": "csi-snap-0e600e2d-a292-11eb-b5d8-0a580a800542",
      "group": "csi"
    }
  }
}
```

4) New logs after the clone test:

https://ibm.box.com/s/1a0ot0ika28is86vufinfsn68ze63znq
Thanks Henrychi for testing it out. As the manual clone is failing, this looks more like a CephFS issue than a CSI issue. @Patrick @Kotresh, can anyone from the CephFS team help here?
Hard to diagnose the problem without debugging turned up on the mgr. Please do:

```
ceph config set mgr debug_mgr 20
ceph config set mgr debug_ms 1
ceph config set mgr debug_client 20
```

and reproduce.

@Humble, is this a normal error for CSI?

> I0429 12:36:37.140321       1 connection.go:186] GRPC error: rpc error: code = Internal desc = key not found: no snap source in omap for "csi.snap.6702cdf8-a8c9-11eb-9bc2-0a580a81022a"
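Once the logs have been captured, the overrides can be dropped so the mgr returns to its default verbosity. A hedged housekeeping note: `ceph config rm` removes the override rather than setting an explicit level.

```
# Remove the debug overrides after log collection to avoid keeping the
# mgr log at high verbosity indefinitely.
ceph config rm mgr debug_mgr
ceph config rm mgr debug_ms
ceph config rm mgr debug_client
```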
(In reply to Patrick Donnelly from comment #21)
> Hard to diagnose the problem without debugging turned up on the mgr. Please do:
>
> > ceph config set mgr debug_mgr 20
> > ceph config set mgr debug_ms 1
> > ceph config set mgr debug_client 20
>
> and reproduce.
>
> @Humble, is this a normal error for CSI?
>
> > I0429 12:36:37.140321       1 connection.go:186] GRPC error: rpc error: code = Internal desc = key not found: no snap source in omap for "csi.snap.6702cdf8-a8c9-11eb-9bc2-0a580a81022a"

No Patrick, this is not a normal error. Rather, it clearly says the snap entry required for the clone operation is missing from the OMAP. This is a workflow error.
Henry, please rerun with debug logs as asked in https://bugzilla.redhat.com/show_bug.cgi?id=1952708#c21
I can contact the wmla team to see if I can rerun the clone test on their cluster, but I don't think I can rerun the backup and restore. The original PVC was around 13GB with 185000 files; the wmla team reduced the size and number of files as a workaround so they could continue work.

Regarding the no snap source in omap for "csi.snap.6702cdf8-a8c9-11eb-9bc2-0a580a81022a": I remember someone in comment #8 was able to reproduce it. Is that still relevant, or is it a different issue?
1) Here are the debug logs for clone test "test3". This time, after waiting two hours, the status is still in-progress.

https://ibm.box.com/s/1dw1yshvyol17vp29mfnuljmrkgvbp2k

```
ceph config set mgr debug_mgr 20
ceph config set mgr debug_ms 1
ceph config set mgr debug_client 20

ceph fs subvolume snapshot clone ocs-storagecluster-cephfilesystem csi-vol-cf0ee0b9-a28e-11eb-b5d8-0a580a800542 csi-snap-0e600e2d-a292-11eb-b5d8-0a580a800542 test3 --group_name csi --target_group_name csi

ceph fs subvolume ls ocs-storagecluster-cephfilesystem --group_name csi | grep test3
        "name": "test3"

ceph fs clone status ocs-storagecluster-cephfilesystem test3 --group_name csi
{
  "status": {
    "state": "in-progress",
    "source": {
      "volume": "ocs-storagecluster-cephfilesystem",
      "subvolume": "csi-vol-cf0ee0b9-a28e-11eb-b5d8-0a580a800542",
      "snapshot": "csi-snap-0e600e2d-a292-11eb-b5d8-0a580a800542",
      "group": "csi"
    }
  }
}
```

2) Also FYI, the original PVC from the problem description is still there in Pending state:

```
$ oc get pvc -n chudg
NAME               STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                AGE
wmla-conda         Pending                                                                        ocs-storagecluster-cephfs   33d
wmla-cws-share     Bound     pvc-3aa99b42-922b-435a-aa92-2b9ba181b489   15Gi       RWX            ocs-storagecluster-cephfs   33d
wmla-edi           Bound     pvc-00a8de84-9ad3-4523-add0-a2ed8ebb522a   10Gi       RWX            ocs-storagecluster-cephfs   33d
wmla-infoservice   Bound     pvc-09009aa2-8b6c-4be7-a24f-f9a3d0af6b2b   4Gi        RWX            ocs-storagecluster-cephfs   33d
wmla-logging       Bound     pvc-6e2399f8-2938-47a8-84fa-fcbcceed9934   10Gi       RWX            ocs-storagecluster-cephfs   33d
wmla-mygpfs        Bound     pvc-2d74707d-870e-4e9c-a74b-93beb77ec52a   20Gi       RWX            ocs-storagecluster-cephfs   33d
```
(In reply to henrychi from comment #24)
> Regarding the no snap source in omap for "csi.snap.6702cdf8-a8c9-11eb-9bc2-0a580a81022a",
> I remember someone in comment #8 was able to reproduce. Is that still relevant, or a different issue?

That was a different issue: there, the `deletionPolicy` in the snapshot class was incorrectly set to `Delete`. For a successful backup-restore operation by velero, the deletionPolicy should be `Retain`. Also, could you re-verify that this is not the case for you?
Correct, Cloud Pak for Data service teams are using volumesnapshotclasses with the Retain deletionPolicy. Thanks for checking.
Setting the NI back on Patrick, please check https://bugzilla.redhat.com/show_bug.cgi?id=1952708#c25
Patrick, did you get a chance to look into the logs?
I don't see any update in the mgr log related to the clone, or to any operation apart from a few config sets. Could you please verify?
comment #25 shows the steps and logs of the clone test "test3". The logs were taken after waiting two hours. The logs are from all pods in the openshift-storage namespace.
Script used to get logs:

```
for i in $(kubectl get pods |cut -d" " -f1); do kubectl logs $i --all-containers=true > $i.log; done
```
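As an aside, a slightly more robust variant of that loop, offered as a hedged sketch: `-o name` avoids capturing the `NAME` header row that the `cut`-based version also picks up.

```
# Collect logs from every pod in openshift-storage; -o name yields "pod/<name>",
# so strip the "pod/" prefix when building each log file name.
for p in $(oc -n openshift-storage get pods -o name); do
  oc -n openshift-storage logs "$p" --all-containers=true > "${p##*/}.log"
done
```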
Not sure if it helps, but there is also another set of logs in comment #19, with a similar clone test called "test".
Hi Henry,

Sorry for the delay in response. I was away on PTO and am still getting caught up.

The logs from rook-ceph-mgr-a-6ccf987f58-59fzg.log in https://bugzilla.redhat.com/show_bug.cgi?id=1952708#c25 do not look like normal ceph-mgr logs:

> better match: 1.5 > 1.5: config show-with-defaults <who>
> better match: 5.5 > 1.5: config set <who> <name> <value> {--force}
> bestcmds_sorted:
> [{'flags': 0,
>   'help': 'Set a configuration option for one or more entities',
>   'module': 'config',
>   'perm': 'rw',
>   'sig': [argdesc(<class 'ceph_argparse.CephPrefix'>, req=True, name=prefix, n=1, numseen=0, prefix=config),
>           argdesc(<class 'ceph_argparse.CephPrefix'>, req=True, name=prefix, n=1, numseen=0, prefix=set),
>           argdesc(<class 'ceph_argparse.CephString'>, req=True, name=who, n=1, numseen=0),
>           argdesc(<class 'ceph_argparse.CephString'>, req=True, name=name, n=1, numseen=0),
>           argdesc(<class 'ceph_argparse.CephString'>, req=True, name=value, n=1, numseen=0),
>           argdesc(<class 'ceph_argparse.CephBool'>, req=False, name=force, n=1, numseen=0)]}]
> Submitting command: {'prefix': 'config set', 'who': 'mgr.a', 'name': 'mgr/prometheus/a/server_addr', 'value': '10.129.2.16', 'force': True}
> failed to try resolving symlinks in path /var/log/pods/openshift-storage_rook-ceph-mgr-a-6ccf987f58-59fzg_e3964605-1f9b-4b54-b616-48934a539ee6/mgr/0.log: lstat /var/log/pods/openshift-storage_rook-ceph-mgr-a-6ccf987f58-59fzg_e3964605-1f9b-4b54-b616-48934a539ee6/mgr/0.log: no such file or directory

I'm unsure what's going on here. Any idea Humble/Madhu?
(In reply to Patrick Donnelly from comment #34)
> The logs from rook-ceph-mgr-a-6ccf987f58-59fzg.log in
> https://bugzilla.redhat.com/show_bug.cgi?id=1952708#c25 do not look like
> normal ceph-mgr logs:
>
> [...]
>
> I'm unsure what's going on here. Any idea Humble/Madhu?

AFAIC this is something related to the kubernetes logging. Just a guess: when collecting the logs, the mgr pod might have restarted or something like that happened. @Travis or @Seb, any idea about this?
(In reply to Madhu Rajanna from comment #35)
> (In reply to Patrick Donnelly from comment #34)
> > The logs from rook-ceph-mgr-a-6ccf987f58-59fzg.log in
> > https://bugzilla.redhat.com/show_bug.cgi?id=1952708#c25 do not look like
> > normal ceph-mgr logs:
> >
> > [...]
> >
> > I'm unsure what's going on here. Any idea Humble/Madhu?
>
> AFAIC this is something related to the kubernetes logging. Just a guess: when
> collecting the logs, the mgr pod might have restarted or something like that
> happened. @Travis or @Seb, any idea about this?

Is this coming from the container runtime? Not sure what happened here, but a broken symlink sounds plausible after a restart.
Moving it to 4.9 as we don't have an RCA yet, let's continue the discussion.
I'm not sure either what happened with the mgr log; I haven't seen that before. It's as if the log was replaced with the ceph command-line help.
I think to make progress on this we will need the ceph-mgr log. Can another attempt at reproducing this be made? Hopefully the logs are not corrupted this time.
Here are new debug logs for clone test "test4".

https://ibm.box.com/s/ngt0li196q118ro1rp9rhjknmlpodmim

```
sh-4.4# ceph config set mgr debug_mgr 20
sh-4.4# ceph config set mgr debug_ms 1
sh-4.4# ceph config set mgr debug_client 20
sh-4.4# ceph fs subvolume snapshot clone ocs-storagecluster-cephfilesystem csi-vol-cf0ee0b9-a28e-11eb-b5d8-0a580a800542 csi-snap-0e600e2d-a292-11eb-b5d8-0a580a800542 test4 --group_name csi --target_group_name csi
sh-4.4# ceph fs subvolume ls ocs-storagecluster-cephfilesystem --group_name csi | grep test4
        "name": "test4"
sh-4.4# ceph fs clone status ocs-storagecluster-cephfilesystem test4 --group_name csi
{
  "status": {
    "state": "in-progress",
    "source": {
      "volume": "ocs-storagecluster-cephfilesystem",
      "subvolume": "csi-vol-cf0ee0b9-a28e-11eb-b5d8-0a580a800542",
      "snapshot": "csi-snap-0e600e2d-a292-11eb-b5d8-0a580a800542",
      "group": "csi"
    }
  }
}
```
Thanks. It looks like the ceph-mgr is just chewing through the clone slowly and is not stuck. There are indeed a lot of files:

> The PVC is around 13GB and 185K files.

I'm not sure there is anything we can do about this within the ceph-mgr yet, but maybe +Venky will have some thoughts. Eventually we would like to solve this by having a container spun up to execute the copies, rather than doing them in the ceph-mgr. That's a larger project, though.
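For context, two knobs may be worth knowing about here, offered with the caveat that they are tunables and escape hatches, not fixes for the underlying per-file copy cost: the mgr volumes module caps how many clones run concurrently, and a clone that is no longer wanted can be cancelled.

```
# The volumes mgr module limits concurrent clone jobs (default 4); raising it
# may help when several restores queue up, though one large clone still
# copies file by file inside the mgr.
ceph config set mgr mgr/volumes/max_concurrent_clones 8

# A clone can be cancelled; it then reports state "canceled" and the partial
# target subvolume has to be removed with --force.
ceph fs clone cancel ocs-storagecluster-cephfilesystem test4 --group_name csi
```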
Created attachment 1800121 [details] rook-ceph-mgr-a-5f65476665-jdwwk.log.tgz
A bit more info: I checked this morning and the clone eventually failed. Attached rook-ceph-mgr-a-5f65476665-jdwwk.log.tgz.

```
sh-4.4# ceph fs clone status ocs-storagecluster-cephfilesystem test4 --group_name csi
{
  "status": {
    "state": "failed",
    "source": {
      "volume": "ocs-storagecluster-cephfilesystem",
      "subvolume": "csi-vol-cf0ee0b9-a28e-11eb-b5d8-0a580a800542",
      "snapshot": "csi-snap-0e600e2d-a292-11eb-b5d8-0a580a800542",
      "group": "csi"
    }
  }
}
```
(In reply to Patrick Donnelly from comment #41)
> It looks like the ceph-mgr is just chewing through the clone slowly and is not stuck.
> There are indeed a lot of files:
>
> > The PVC is around 13GB and 185K files.
>
> I'm not sure there is anything we can do about this within the ceph-mgr yet,
> but maybe +Venky will have some thoughts.

Nothing that I can think of.
Hi Patrick, Any further suggestions on how to proceed with this BZ? Should we document this limitation?
(In reply to Mudit Agarwal from comment #45)
> Hi Patrick,
>
> Any further suggestions on how to proceed with this BZ?
> Should we document this limitation?

I think document and move on. An ODF RFE BZ to improve this by pulling clones out of the ceph-mgr could be created (and tracked) so we can begin prioritizing the potential enhancement.
A few questions:
1. Why did the clone fail?
2. From a customer point of view, what conditions cause this issue to occur?
3. Is there a workaround? If the customer did a backup using snapshots, they need to be able to recover the volume.
As suggested by Patrick, we need to create an ODF RFE BZ to track the ceph-mgr enhancement. Keeping this BZ open for now till the discussions are complete.
I noticed the same issue while restoring a snapshot into a PVC on an ODF 4.10 development driver, while working with the OADP operator/velero for backup and restore:

```
Warning  ProvisioningFailed  24m (x14 over 42m)  openshift-storage.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-5cb6b9665c-b2pzx_4955c704-cb9a-44ea-95bb-b0850854cc9d  failed to provision volume with StorageClass "ocs-storagecluster-cephfs": rpc error: code = Internal desc = key not found: no snap source in omap for "csi.snap.b5e1106a-a524-11ec-8f27-0a580a81021b"
```
I am not able to reproduce it with manual tests with these YAML definitions:

RBD:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-test-4172f02505b142da854b0b4af64f564
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: ocs-storagecluster-ceph-rbd
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-test-rbd-85b3dd544c8f4b7396cc4fe19c1
spec:
  containers:
    - image: quay.io/ocsci/nginx:latest
      name: web-server
      volumeMounts:
        - mountPath: /var/lib/www/html
          name: mypvc
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: pvc-test-4172f02505b142da854b0b4af64f564
        readOnly: false
---
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: snapshot-test-1848ac13564a41f9ac4789f3e4
spec:
  source:
    persistentVolumeClaimName: pvc-test-4172f02505b142da854b0b4af64f564
  volumeSnapshotClassName: ocs-storagecluster-rbdplugin-snapclass
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restore-pvc-test-2c87a03576504f1dbbf5199
spec:
  accessModes:
    - ReadWriteOnce
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: snapshot-test-1848ac13564a41f9ac4789f3e4
  resources:
    requests:
      storage: 5Gi
  storageClassName: ocs-storagecluster-ceph-rbd
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-test-rbd-fee5ccb8700047c6b2c071cfa5e
spec:
  containers:
    - image: quay.io/ocsci/nginx:latest
      name: web-server
      volumeMounts:
        - mountPath: /var/lib/www/html
          name: mypvc
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: restore-pvc-test-2c87a03576504f1dbbf5199
        readOnly: false
```

CephFS:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-test-cee52b42858e4c949577e6879faa88c
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: ocs-storagecluster-cephfs
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-test-cephfs-ad6b9ee20c3047ed872e9143
spec:
  containers:
    - image: quay.io/ocsci/nginx:latest
      name: web-server
      volumeMounts:
        - mountPath: /var/lib/www/html
          name: mypvc
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: pvc-test-cee52b42858e4c949577e6879faa88c
        readOnly: false
---
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: snapshot-test-54f67af346c844d7b388bfb2ee
spec:
  source:
    persistentVolumeClaimName: pvc-test-cee52b42858e4c949577e6879faa88c
  volumeSnapshotClassName: ocs-storagecluster-cephfsplugin-snapclass
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restore-pvc-test-0f29660a8df34418b4b6d48
spec:
  accessModes:
    - ReadWriteOnce
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: snapshot-test-54f67af346c844d7b388bfb2ee
  resources:
    requests:
      storage: 5Gi
  storageClassName: ocs-storagecluster-cephfs
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-test-cephfs-eaf064d360124ca98352f632
spec:
  containers:
    - image: quay.io/ocsci/nginx:latest
      name: web-server
      volumeMounts:
        - mountPath: /var/lib/www/html
          name: mypvc
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: restore-pvc-test-0f29660a8df34418b4b6d48
        readOnly: false
```
The above problem goes away when I change the deletion policy on the volumesnapshotclass to 'Retain'. FYI.
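For anyone hitting the same thing, one hedged way to flip the policy on an existing class; the class name is taken from the manifests above, and `deletionPolicy` is a top-level, patchable field on VolumeSnapshotClass.

```
# Switch the CephFS snapshot class to Retain so the VolumeSnapshotContent
# (and the backing CephFS snapshot) survives deletion of the VolumeSnapshot.
oc patch volumesnapshotclass ocs-storagecluster-cephfsplugin-snapclass \
  --type merge -p '{"deletionPolicy": "Retain"}'
```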