Description of problem (please be as detailed as possible and provide log snippets):

Expanding a PVC backed by SC/ocs-storagecluster-ceph-rbd in Filesystem mode fails with the following error:

"Ignoring the PVC: didn't find a plugin capable of expanding the volume; waiting for an external controller to process this PVC."

[ANALYSIS]

- The Ceph backend is healthy:

$ ceph -s
  cluster:
    id:     3c97ec89-a4a9-4619-b4d8-59bd0c13dc33
    health: HEALTH_OK

- All PVCs in the openshift-storage namespace are active and bound:

[bmcmurra@supportshell-1 03417581]$ omg get pvc
NAME                                                STATUS   VOLUME                                      CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
db-noobaa-db-pg-0                                   Bound    pvc-297e4d1f-a352-452f-aadf-18efbc212728    50Gi       RWO            ocs-storagecluster-ceph-rbd   64d
ocs-deviceset-ocs-local-volume-set-0-data-0b76rl    Bound    local-pv-c076fe97                           3576Gi     RWO            ocs-local-volume-set          64d
ocs-deviceset-ocs-local-volume-set-0-data-10qcz6k   Bound    local-pv-4b6d6f0d                           3576Gi     RWO            ocs-local-volume-set          64d
ocs-deviceset-ocs-local-volume-set-0-data-11fkr7z   Bound    local-pv-8c57f7c                            3576Gi     RWO            ocs-local-volume-set          64d
ocs-deviceset-ocs-local-volume-set-0-data-1256lgh   Bound    local-pv-722d547c                           3576Gi     RWO            ocs-local-volume-set          64d
ocs-deviceset-ocs-local-volume-set-0-data-13rd4fg   Bound    local-pv-6dddb05f                           3576Gi     RWO            ocs-local-volume-set          64d
ocs-deviceset-ocs-local-volume-set-0-data-14pddrm   Bound    local-pv-19eeaea6                           3576Gi     RWO            ocs-local-volume-set          64d
ocs-deviceset-ocs-local-volume-set-0-data-1gp2s4    Bound    local-pv-f613332c                           3576Gi     RWO            ocs-local-volume-set          64d
ocs-deviceset-ocs-local-volume-set-0-data-22jv78    Bound    local-pv-17352a70                           3576Gi     RWO            ocs-local-volume-set          64d
ocs-deviceset-ocs-local-volume-set-0-data-3mg2pt    Bound    local-pv-8487cbe0                           3576Gi     RWO            ocs-local-volume-set          64d
ocs-deviceset-ocs-local-volume-set-0-data-4wcsqm    Bound    local-pv-e9d291b2                           3576Gi     RWO            ocs-local-volume-set          64d
ocs-deviceset-ocs-local-volume-set-0-data-5xcjgn    Bound    local-pv-3c087778                           3576Gi     RWO            ocs-local-volume-set          64d
ocs-deviceset-ocs-local-volume-set-0-data-6mn4kr    Bound    local-pv-56140414                           3576Gi     RWO            ocs-local-volume-set          64d
ocs-deviceset-ocs-local-volume-set-0-data-7njrz7    Bound    local-pv-f972ff25                           3576Gi     RWO            ocs-local-volume-set          64d
ocs-deviceset-ocs-local-volume-set-0-data-8srxmp    Bound    local-pv-81f1396c                           3576Gi     RWO            ocs-local-volume-set          64d
ocs-deviceset-ocs-local-volume-set-0-data-962lx2    Bound    local-pv-d2c211a0                           3576Gi     RWO            ocs-local-volume-set          64d

- All ODF pods/operators are up and healthy

Version of all relevant components (if applicable):
OCP 4.10.39
ODF v4.10.9

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Yes

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
3

Can this issue be reproduced?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:
Possibly, but I don't have a cluster prior to 4.10 to test with.

Steps to Reproduce:
1. Create a PVC with the ocs-storagecluster-ceph-rbd storage class in Filesystem mode.
2. Try to expand the PVC using the ODF console or by editing its YAML.
3. The PVC fails to expand, and the event stream of the resource shows the following error:
"Ignoring the PVC: didn't find a plugin capable of expanding the volume; waiting for an external controller to process this PVC."

Actual results:
PVC fails to expand.

Expected results:
PVC expands successfully.

Additional info:
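For reference, a minimal reproduction sketch of the steps above (the namespace test-ns, PVC name test-pvc, and the 1Gi/2Gi sizes are illustrative, not taken from the affected cluster):

$ cat <<EOF | oc apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
  namespace: test-ns
spec:
  storageClassName: ocs-storagecluster-ceph-rbd
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF

# Request the expansion by patching the PVC (equivalent to editing its YAML):
$ oc -n test-ns patch pvc test-pvc --type merge -p '{"spec":{"resources":{"requests":{"storage":"2Gi"}}}}'

# Watch the event stream for the "didn't find a plugin capable of expanding the volume" event:
$ oc -n test-ns describe pvc test-pvc
$ oc -n test-ns get events --field-selector involvedObject.name=test-pvc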
Thanks Hemant,

That makes sense now regarding the behavior on my test cluster. It seemed strange at first, as I was able to expand PVCs on the cephfs SC and on rbd in Block mode without issue, using the same "test-X" PVCs unattached to any pod workload.
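For anyone comparing the two modes, one way to see where a Filesystem-mode RBD PVC gets stuck is to check its resize conditions after requesting the expansion (a sketch reusing the illustrative test-ns/test-pvc names from above; the condition name is a standard Kubernetes PVC status field, not output captured from this cluster):

$ oc -n test-ns get pvc test-pvc -o jsonpath='{.status.conditions}{"\n"}'
# While the controller-side (Ceph) expansion is done but no pod has the volume
# mounted, a Filesystem-mode PVC typically reports a FileSystemResizePending
# condition; a Block-mode PVC needs no node-side filesystem resize, which is
# why unattached Block-mode PVCs can expand without waiting.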
> * Is this fix intended to be backported?
>
>> Not sure about it as this case exists on all ODF versions and we have a workaround. If there is an ask for it, this needs to be decided by the program.

IMO, this does not qualify/satisfy a backport request, so it is very unlikely to be considered.
I tried the scenario described in Comment#22 (expanded the PVC a number of times, then changed the pod count on the deployment from 0 to 1, also a number of times) and verified that the expansion is successful. However, I did not manage to reach a state in which the staging_target_path is missing in the NodeExpandVolume RPC call.

From the logs:

[ypersky@ypersky ocs-ci]$ oc logs csi-rbdplugin-n4xl8 -c csi-rbdplugin | grep NodeExpandVolume
I0418 05:34:46.538397 15996 utils.go:195] ID: 22 Req-ID: 0001-0011-openshift-storage-0000000000000001-8a2bf746-5abe-4120-95f4-7ad0de0a855e GRPC call: /csi.v1.Node/NodeExpandVolume
I0418 08:27:47.455884 15996 utils.go:195] ID: 53 Req-ID: 0001-0011-openshift-storage-0000000000000001-8a2bf746-5abe-4120-95f4-7ad0de0a855e GRPC call: /csi.v1.Node/NodeExpandVolume
I0418 08:27:47.629071 15996 utils.go:195] ID: 56 Req-ID: 0001-0011-openshift-storage-0000000000000001-8a2bf746-5abe-4120-95f4-7ad0de0a855e GRPC call: /csi.v1.Node/NodeExpandVolume
I0418 09:27:27.334022 15996 utils.go:195] ID: 170 Req-ID: 0001-0011-openshift-storage-0000000000000001-8a2bf746-5abe-4120-95f4-7ad0de0a855e GRPC call: /csi.v1.Node/NodeExpandVolume
I0418 09:27:27.529704 15996 utils.go:195] ID: 173 Req-ID: 0001-0011-openshift-storage-0000000000000001-8a2bf746-5abe-4120-95f4-7ad0de0a855e GRPC call: /csi.v1.Node/NodeExpandVolume

[ypersky@ypersky ocs-ci]$ oc logs csi-rbdplugin-n4xl8 -c csi-rbdplugin | grep "ID: 173 "
I0418 09:27:27.529704 15996 utils.go:195] ID: 173 Req-ID: 0001-0011-openshift-storage-0000000000000001-8a2bf746-5abe-4120-95f4-7ad0de0a855e GRPC call: /csi.v1.Node/NodeExpandVolume
I0418 09:27:27.529896 15996 utils.go:206] ID: 173 Req-ID: 0001-0011-openshift-storage-0000000000000001-8a2bf746-5abe-4120-95f4-7ad0de0a855e GRPC request: {"capacity_range":{"required_bytes":12884901888},"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.rbd.csi.ceph.com/d299fc5020760f39dbd52b7b0bf3d33f93036abba4c82e664b28257c5c71ab52/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":7}},"volume_id":"0001-0011-openshift-storage-0000000000000001-8a2bf746-5abe-4120-95f4-7ad0de0a855e","volume_path":"/var/lib/kubelet/pods/293d17fb-5196-41de-a694-8a2034959b20/volumes/kubernetes.io~csi/pvc-639692f7-45ec-4e83-99bc-c3a5bf66c461/mount"}
I0418 09:27:27.592903 15996 cephcmds.go:105] ID: 173 Req-ID: 0001-0011-openshift-storage-0000000000000001-8a2bf746-5abe-4120-95f4-7ad0de0a855e command succeeded: rbd [device list --format=json --device-type krbd]
I0418 09:27:27.614409 15996 utils.go:212] ID: 173 Req-ID: 0001-0011-openshift-storage-0000000000000001-8a2bf746-5abe-4120-95f4-7ad0de0a855e GRPC response: {}
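To run the same check across every csi-rbdplugin pod at once, a rough sketch (assuming the DaemonSet pods carry the usual app=csi-rbdplugin label; the pipeline relies on the GRPC request line being logged directly after its call line, as in the snippet above, so it is an illustration rather than a robust parser):

$ for p in $(oc -n openshift-storage get pods -l app=csi-rbdplugin -o name); do
    echo "== ${p} =="
    oc -n openshift-storage logs "${p}" -c csi-rbdplugin \
      | grep -A1 'GRPC call: /csi.v1.Node/NodeExpandVolume' \
      | grep 'GRPC request' \
      | grep -v 'staging_target_path'
  done
# Any line printed here would be a NodeExpandVolume request missing
# staging_target_path; no output means every observed request included it.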
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Data Foundation 4.13.0 enhancement and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:3742