Description of problem:

In a Managed Services consumer cluster, app pods take longer than expected to reach the Running state because of the error "failed to get connection: connecting failed: rados: ret=-13, Permission denied". This was observed with RBD PVCs. [Creation of pods with CephFS PVCs is currently blocked by bug #2184068.] The pod eventually reaches the Running state. The same error appears in the RBD PVC events, and the PVC still reaches the Bound state.

Events from the pod:

Events:
  Type     Reason                  Age                  From                     Message
  ----     ------                  ----                 ----                     -------
  Normal   Scheduled               2m42s                default-scheduler        Successfully assigned app-namespace/pod-pvc-rbd-pr2 to ip-10-0-23-25.us-east-2.compute.internal
  Normal   SuccessfulAttachVolume  2m42s                attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-437cf4a0-bd99-4c6a-95c1-2540a95512e0"
  Warning  FailedMount             91s (x8 over 2m38s)  kubelet                  MountVolume.MountDevice failed for volume "pvc-437cf4a0-bd99-4c6a-95c1-2540a95512e0" : rpc error: code = Internal desc = failed to establish the connection: failed to get connection: connecting failed: rados: ret=-13, Permission denied
  Warning  FailedMount             40s                  kubelet                  Unable to attach or mount volumes: unmounted volumes=[mypvc], unattached volumes=[kube-api-access-r7s4c mypvc]: timed out waiting for the condition
  Normal   AddedInterface          20s                  multus                   Add eth0 [10.128.2.33/23] from ovn-kubernetes
  Normal   Pulling                 20s                  kubelet                  Pulling image "quay.io/ocsci/nginx:latest"
  Normal   Pulled                  17s                  kubelet                  Successfully pulled image "quay.io/ocsci/nginx:latest" in 3.41486044s (3.414875171s including waiting)
  Normal   Created                 16s                  kubelet                  Created container web-server
  Normal   Started                 16s                  kubelet                  Started container web-server

PVC:

$ oc describe pvc pvc2-rbd-pr2 -n app-namespace
Name:          pvc2-rbd-pr2
Namespace:     app-namespace
StorageClass:  odf-storage-sc-rbd-odf-storage3
Status:        Bound
Volume:        pvc-50b43d63-cb40-4f0d-8e63-a2571bba6234
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: odf-storage.rbd.csi.ceph.com
               volume.kubernetes.io/storage-provisioner: odf-storage.rbd.csi.ceph.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      100Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       <none>
Events:
  Type     Reason                 Age                From  Message
  ----     ------                 ----               ----  -------
  Normal   ExternalProvisioning   77s (x2 over 82s)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "odf-storage.rbd.csi.ceph.com" or manually created by system administrator
  Warning  ProvisioningFailed     73s (x5 over 82s)  odf-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-5d9dc758b8-jwq6j_27048bf2-7e7d-48fd-945b-4ec2b0b1e141  failed to provision volume with StorageClass "odf-storage-sc-rbd-odf-storage3": rpc error: code = Internal desc = failed to get connection: connecting failed: rados: ret=-13, Permission denied
  Normal   Provisioning           65s (x6 over 82s)  odf-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-5d9dc758b8-jwq6j_27048bf2-7e7d-48fd-945b-4ec2b0b1e141  External provisioner is provisioning volume for claim "app-namespace/pvc2-rbd-pr2"
  Normal   ProvisioningSucceeded  65s                odf-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-5d9dc758b8-jwq6j_27048bf2-7e7d-48fd-945b-4ec2b0b1e141  Successfully provisioned volume pvc-50b43d63-cb40-4f0d-8e63-a2571bba6234

Logs from "csi-cephfsplugin-provisioner-cc87b9bf-s47jl" pod.
I0405 13:22:52.935580 1 controller.go:1337] provision "app-namespace/pvc2-rbd-pr2" class "odf-storage-sc-rbd-odf-storage3": started
I0405 13:22:52.935713 1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"app-namespace", Name:"pvc2-rbd-pr2", UID:"50b43d63-cb40-4f0d-8e63-a2571bba6234", APIVersion:"v1", ResourceVersion:"430394", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "app-namespace/pvc2-rbd-pr2"
I0405 13:22:52.940152 1 connection.go:183] GRPC call: /csi.v1.Controller/CreateVolume
I0405 13:22:52.940164 1 connection.go:184] GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-50b43d63-cb40-4f0d-8e63-a2571bba6234","parameters":{"clusterID":"odf-storage-sc-rbd-odf-storage3","csi.storage.k8s.io/pv/name":"pvc-50b43d63-cb40-4f0d-8e63-a2571bba6234","csi.storage.k8s.io/pvc/name":"pvc2-rbd-pr2","csi.storage.k8s.io/pvc/namespace":"app-namespace","imageFeatures":"layering,deep-flatten,exclusive-lock,object-map,fast-diff","imageFormat":"2","pool":"cephblockpool-storageconsumer-083b4692-81bb-4ffb-ba3d-6c7a72fb8733-a2940157"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I0405 13:22:53.230894 1 connection.go:186] GRPC response: {}
I0405 13:22:53.230933 1 connection.go:187] GRPC error: rpc error: code = Internal desc = failed to get connection: connecting failed: rados: ret=-13, Permission denied
I0405 13:22:53.230945 1 controller.go:802] CreateVolume failed, supports topology = false, node selected false => may reschedule = false => state = Finished: rpc error: code = Internal desc = failed to get connection: connecting failed: rados: ret=-13, Permission denied
I0405 13:22:53.230975 1 controller.go:1075] Final error received, removing PVC 50b43d63-cb40-4f0d-8e63-a2571bba6234 from claims in progress
W0405 13:22:53.230986 1 controller.go:934] Retrying syncing claim "50b43d63-cb40-4f0d-8e63-a2571bba6234", failure 4
E0405 13:22:53.231002 1 controller.go:957] error syncing claim "50b43d63-cb40-4f0d-8e63-a2571bba6234": failed to provision volume with StorageClass "odf-storage-sc-rbd-odf-storage3": rpc error: code = Internal desc = failed to get connection: connecting failed: rados: ret=-13, Permission denied
I0405 13:22:53.231018 1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"app-namespace", Name:"pvc2-rbd-pr2", UID:"50b43d63-cb40-4f0d-8e63-a2571bba6234", APIVersion:"v1", ResourceVersion:"430394", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "odf-storage-sc-rbd-odf-storage3": rpc error: code = Internal desc = failed to get connection: connecting failed: rados: ret=-13, Permission denied

============================================================================

Version-Release number of selected component (if applicable):

How reproducible:
7/10

Steps to Reproduce:
1. Create a provider and a consumer cluster in the ODF to ODF on ROSA configuration.
2. Create a storageclient on the consumer.
3. Create an RBD PVC.
4. Create a pod that uses the PVC.
5. Describe the PVC and the pod to check for the error events.

Actual results:
error: connecting failed: rados: ret=-13, Permission denied

Expected results:
No error events that delay the creation of the PVC or the pod.

Additional info:
must-gather logs are not attached because must-gather does not collect pod logs when the namespace is not openshift-storage.
Version-Release number of selected component: OCP 4.12.9 ODF 4.13.0-124
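Since the issue is intermittent (7/10), it may help to count how often the error shows up in saved `oc describe` output or provisioner logs when comparing runs. The following is only a rough sketch, not part of ODF or ceph-csi tooling: the function names `summarize_rados_errors` and `describe_ret` are made up for illustration, and it assumes the `ret` value is a negated errno, which is consistent with `ret=-13` being reported next to "Permission denied".

```python
import errno
import os
import re

# Matches the return code in the error text quoted in this report,
# e.g. "rados: ret=-13, Permission denied".
RADOS_RET = re.compile(r"rados: ret=(-?\d+)")


def summarize_rados_errors(text):
    """Count occurrences of each rados return code in event or log text."""
    counts = {}
    for match in RADOS_RET.finditer(text):
        code = int(match.group(1))
        counts[code] = counts.get(code, 0) + 1
    return counts


def describe_ret(code):
    """Map a return code to (errno name, message), assuming it is a negated errno."""
    number = abs(code)
    return errno.errorcode.get(number, "UNKNOWN"), os.strerror(number)


# Two lines copied from the events in this report, used as sample input.
SAMPLE = (
    "Warning FailedMount 91s (x8 over 2m38s) kubelet MountVolume.MountDevice "
    "failed: failed to get connection: connecting failed: rados: ret=-13, "
    "Permission denied\n"
    "Warning ProvisioningFailed 73s (x5 over 82s) failed to provision volume: "
    "failed to get connection: connecting failed: rados: ret=-13, "
    "Permission denied\n"
)

if __name__ == "__main__":
    for code, count in summarize_rados_errors(SAMPLE).items():
        name, message = describe_ret(code)
        print(f"ret={code} ({name}: {message}) seen in {count} line(s)")
```

The sample text can be replaced with the saved output of the describe commands from step 5. Note that this counts matching lines only; the "(x8 over 2m38s)" repeat counters in the events are not parsed.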
Hi Madhu, I couldn't reproduce this issue in another cluster.