Description of problem (please be as detailed as possible and provide log snippets):

Failed to create a pod on an ocs-storagecluster-cephfs volume.

Version of all relevant components (if applicable):
openshift installer: 4.17.0-0.nightly-2024-08-09-031511
ocs-registry: 4.17.0-70

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Yes

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes (2/2)

Can this issue be reproduced from the UI?
Not tried

If this is a regression, please provide more details to justify this:
Yes

Steps to Reproduce:
1. Install ODF using ocs-ci.
2. Create a pod using the storage class "ocs-storagecluster-cephfs" (a minimal example manifest is sketched under "Additional info" below).
3.

Actual results:

$ oc describe pod image-registry-557c89c7b9-8gtdx -n openshift-image-registry
Name:                 image-registry-557c89c7b9-8gtdx
Namespace:            openshift-image-registry
Priority:             2000000000
Priority Class Name:  system-cluster-critical
.
.
  Warning  FailedMount  4m39s  kubelet  MountVolume.MountDevice failed for volume "pvc-99a88fc8-18bb-4c9f-9b17-619803be4721" : rpc error: code = Internal desc = an error (exit status 32) occurred while running mount args: [-t ceph 172.30.211.105:3300,172.30.248.126:3300,172.30.233.144:3300:/volumes/csi/csi-vol-59580dd3-824d-4d3f-b5df-1d2aae13829c/0a9fdeac-ffa6-4f02-b223-be8e0a622b2e /var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/f4c977bb751a5bd86b137e853bec376625ce8faccee6b0e3e15c6a5120020a9e/globalmount -o name=csi-cephfs-node,secretfile=/tmp/csi/keys/keyfile-3526911127,mds_namespace=ocs-storagecluster-cephfilesystem,ms_mode=prefer-crc,read_from_replica=localize,crush_location=host:compute-2|rack:rack2,_netdev] stderr: unable to get monitor info from DNS SRV with service name: ceph-mon
2024-08-13T05:17:39.311+0000 7f74dea48000 -1 failed for service _ceph-mon._tcp
mount error: no mds (Metadata Server) is up. The cluster might be laggy, or you may not be authorized

  Warning  FailedMount  3m37s  kubelet  MountVolume.MountDevice failed for volume "pvc-99a88fc8-18bb-4c9f-9b17-619803be4721" : rpc error: code = Internal desc = an error (exit status 32) occurred while running mount args: [-t ceph 172.30.211.105:3300,172.30.248.126:3300,172.30.233.144:3300:/volumes/csi/csi-vol-59580dd3-824d-4d3f-b5df-1d2aae13829c/0a9fdeac-ffa6-4f02-b223-be8e0a622b2e /var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/f4c977bb751a5bd86b137e853bec376625ce8faccee6b0e3e15c6a5120020a9e/globalmount -o name=csi-cephfs-node,secretfile=/tmp/csi/keys/keyfile-3461892681,mds_namespace=ocs-storagecluster-cephfilesystem,ms_mode=prefer-crc,read_from_replica=localize,crush_location=host:compute-2|rack:rack2,_netdev] stderr: unable to get monitor info from DNS SRV with service name: ceph-mon
2024-08-13T05:18:42.272+0000 7f6df8ce3000 -1 failed for service _ceph-mon._tcp
mount error: no mds (Metadata Server) is up. The cluster might be laggy, or you may not be authorized

  Warning  FailedMount  2m36s  kubelet  MountVolume.MountDevice failed for volume "pvc-99a88fc8-18bb-4c9f-9b17-619803be4721" : rpc error: code = Internal desc = an error (exit status 32) occurred while running mount args: [-t ceph 172.30.211.105:3300,172.30.248.126:3300,172.30.233.144:3300:/volumes/csi/csi-vol-59580dd3-824d-4d3f-b5df-1d2aae13829c/0a9fdeac-ffa6-4f02-b223-be8e0a622b2e /var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/f4c977bb751a5bd86b137e853bec376625ce8faccee6b0e3e15c6a5120020a9e/globalmount -o name=csi-cephfs-node,secretfile=/tmp/csi/keys/keyfile-1980131443,mds_namespace=ocs-storagecluster-cephfilesystem,ms_mode=prefer-crc,read_from_replica=localize,crush_location=host:compute-2|rack:rack2,_netdev] stderr: unable to get monitor info from DNS SRV with service name: ceph-mon

Expected results:
The pod should be running.

Additional info:
Tried creating the demo pod (nginx) and hit the same issue.
job: https://url.corp.redhat.com/23da020
must-gather: https://url.corp.redhat.com/8fb8363
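For reference, a minimal reproducer of the kind described in step 2 might look like the sketch below. Only the storageClassName comes from this report; the object names, image, access mode, and size are illustrative assumptions, not taken from the failing job.

# Hypothetical reproducer: a PVC on the ocs-storagecluster-cephfs storage class
# plus an nginx pod that mounts it. Names, size, and image are illustrative.
oc apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-demo-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: ocs-storagecluster-cephfs
---
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        - name: cephfs-vol
          mountPath: /usr/share/nginx/html
  volumes:
    - name: cephfs-vol
      persistentVolumeClaim:
        claimName: cephfs-demo-pvc
EOF

# The pod then stays in ContainerCreating with FailedMount events like the ones above.
oc describe pod demo-pod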
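Since the mount error reports "no mds (Metadata Server) is up", the filesystem/MDS state can also be checked from the Ceph side. A sketch, assuming a default ODF/Rook deployment with the rook-ceph tools pod available in the openshift-storage namespace (pod labels are assumptions, not taken from this cluster):

# Check that the MDS pods are running and that the filesystem reports an active MDS.
oc -n openshift-storage get pods -l app=rook-ceph-mds
TOOLS=$(oc -n openshift-storage get pod -l app=rook-ceph-tools -o name | head -n 1)
oc -n openshift-storage rsh "$TOOLS" ceph fs status ocs-storagecluster-cephfilesystem
oc -n openshift-storage rsh "$TOOLS" ceph mds stat
oc -n openshift-storage rsh "$TOOLS" ceph health detail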
Please update the RDT flag/text appropriately.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.17.0 Security, Enhancement, & Bug Fix Update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2024:8676