Created attachment 1713881
OSP manila for NFS share network topology

Description of problem:
manila-csi is able to create the required share on OSP, and the PV is bound to the PVC, but the containers are not able to mount the created volume. The NFS share is reachable through a private network that is attached to the OCP workers as a second interface. The share can also be mounted manually on the worker. The issue is that pod/csi-nodeplugin-nfsplugin cannot see the worker's host network, so it cannot mount the share automatically. (A suggested solution is given under "Expected results".)

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Install OSP integrated with Ceph and enable manila for file shares, following the standard architecture design with the network topology shown in the attachment.
2. Install OCP on OSP.
3. Install the manila-csi operator and driver, attaching a second interface to the workers for StorageNFS reachability.
4. Create a PVC using the newly generated manila storage class.
5. Attach the PVC to one of the containers, such as image-registry.

Actual results:

$ oc get pods -n openshift-image-registry
NAME                                               READY   STATUS              RESTARTS   AGE
cluster-image-registry-operator-7c7c9d6bf6-99hg2   2/2     Running             0          43h
image-pruner-1599350400-9gtp8                      0/1     Completed           0          12h
image-registry-747ccb4b66-b8wjf                    0/1     ContainerCreating   0          9s    <-- stuck in ContainerCreating
node-ca-9cw9w                                      1/1     Running             0          43h
node-ca-cngrl                                      1/1     Running             0          43h
node-ca-ffpmz                                      1/1     Running             0          39h
node-ca-kvgwh                                      1/1     Running             0          39h
node-ca-pcqt4                                      1/1     Running             0          43h
node-ca-vwxth                                      1/1     Running             0          43h
node-ca-w74rh                                      1/1     Running             0          43h

4m7s    Warning   FailedMount   pod/image-registry-747ccb4b66-b8wjf   MountVolume.SetUp failed for volume "pvc-15a5342e-6870-4950-9d1a-0eba0378746e" : rpc error: code = DeadlineExceeded desc = context deadline exceeded
20m     Warning   FailedMount   pod/image-registry-747ccb4b66-b8wjf   Unable to attach or mount volumes: unmounted volumes=[registry-storage], unattached volumes=[registry-certificates trusted-ca installation-pull-secrets registry-token-gkswx registry-storage registry-tls]: timed out waiting for the condition
7m45s   Warning   FailedMount   pod/image-registry-747ccb4b66-b8wjf   Unable to attach or mount volumes: unmounted volumes=[registry-storage], unattached volumes=[registry-storage registry-tls registry-certificates trusted-ca installation-pull-secrets registry-token-gkswx]: timed out waiting for the condition
5m42s   Warning   FailedMount   pod/image-registry-747ccb4b66-b8wjf   Unable to attach or mount volumes: unmounted volumes=[registry-storage], unattached volumes=[registry-tls registry-certificates trusted-ca installation-pull-secrets registry-token-gkswx registry-storage]: timed out waiting for the condition
13m     Warning   FailedMount   pod/image-registry-747ccb4b66-b8wjf   Unable to attach or mount volumes: unmounted volumes=[registry-storage], unattached volumes=[trusted-ca installation-pull-secrets registry-token-gkswx registry-storage registry-tls registry-certificates]: timed out waiting for the condition
3m39s   Warning   FailedMount   pod/image-registry-747ccb4b66-b8wjf   Unable to attach or mount volumes: unmounted volumes=[registry-storage], unattached volumes=[installation-pull-secrets registry-token-gkswx registry-storage registry-tls registry-certificates trusted-ca]: timed out waiting for the condition
3m38s   Warning   FailedMount   pod/image-registry-747ccb4b66-b8wjf   MountVolume.SetUp failed for volume "pvc-15a5342e-6870-4950-9d1a-0eba0378746e" : rpc error: code = Unavailable desc = transport is closing

$ oc logs csi-nodeplugin-nfsplugin-f8c9w -n openshift-manila-csi-driver
I0906 12:02:25.734215 1 nfs.go:49] Driver: nfs.csi.k8s.io version: 2.0.0
I0906 12:02:25.734918 1 nfs.go:99] Enabling volume access mode: SINGLE_NODE_WRITER
I0906 12:02:25.734923 1 nfs.go:99] Enabling volume access mode: SINGLE_NODE_READER_ONLY
I0906 12:02:25.734925 1 nfs.go:99] Enabling volume access mode: MULTI_NODE_READER_ONLY
I0906 12:02:25.734928 1 nfs.go:99] Enabling volume access mode: MULTI_NODE_SINGLE_WRITER
I0906 12:02:25.734930 1 nfs.go:99] Enabling volume access mode: MULTI_NODE_MULTI_WRITER
I0906 12:02:25.734936 1 nfs.go:110] Enabling controller service capability: UNKNOWN
I0906 12:02:25.736767 1 server.go:92] Listening for connections on address: &net.UnixAddr{Name:"/plugin/csi.sock", Net:"unix"}
E0906 12:02:27.668921 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:03:16.979205 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:03:17.280677 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:05:19.048241 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:05:19.349408 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:05:52.343538 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:05:52.343774 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:07:21.119619 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:07:21.417581 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:07:52.507238 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:07:52.507226 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:08:13.255901 1 mount_linux.go:139] Mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs -o nfsvers=4.1 10.206.120.6:/volumes/_nogroup/9bcfc69b-788f-4ac7-a201-cead05379743 /var/lib/kubelet/pods/0de89c17-75b0-4702-bcdc-41c87e02ad7b/volumes/kubernetes.io~csi/pvc-15a5342e-6870-4950-9d1a-0eba0378746e/mount
Output: mount.nfs: Connection timed out
E0906 12:08:13.257048 1 utils.go:50] GRPC error: rpc error: code = Internal desc = mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs -o nfsvers=4.1 10.206.120.6:/volumes/_nogroup/9bcfc69b-788f-4ac7-a201-cead05379743 /var/lib/kubelet/pods/0de89c17-75b0-4702-bcdc-41c87e02ad7b/volumes/kubernetes.io~csi/pvc-15a5342e-6870-4950-9d1a-0eba0378746e/mount
Output: mount.nfs: Connection timed out
E0906 12:09:23.196106 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:09:23.497019 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:09:52.687017 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:09:52.687208 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:11:25.291848 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted
E0906 12:11:25.593596 1 utils.go:50] GRPC error: rpc error: code = NotFound desc = Volume not mounted

Expected results:
The container should be able to mount the NFS share.

$ oc get pods -n openshift-image-registry
NAME                                               READY   STATUS      RESTARTS   AGE
cluster-image-registry-operator-7c7c9d6bf6-99hg2   2/2     Running     0          44h
image-pruner-1599350400-9gtp8                      0/1     Completed   0          12h
image-registry-747ccb4b66-b8wjf                    1/1     Running     0          47m   <-- up and running
node-ca-9cw9w                                      1/1     Running     0          44h
node-ca-cngrl                                      1/1     Running     0          44h
node-ca-ffpmz                                      1/1     Running     0          39h
node-ca-kvgwh                                      1/1     Running     0          39h
node-ca-pcqt4                                      1/1     Running     0          44h
node-ca-vwxth                                      1/1     Running     0          44h
node-ca-w74rh                                      1/1     Running     0          44h

A suggested solution is to add "spec.hostNetwork: true" to the pod template of the csi-nodeplugin-nfsplugin daemonset, as marked below:

$ oc get daemonset.apps/csi-nodeplugin-nfsplugin -o yaml -n openshift-manila-csi-driver
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "4"
  labels:
    app: openstack-manila-csi
    component: nfs-nodeplugin
  name: csi-nodeplugin-nfsplugin
  namespace: openshift-manila-csi-driver
  resourceVersion: "1289489"
  selfLink: /apis/apps/v1/namespaces/openshift-manila-csi-driver/daemonsets/csi-nodeplugin-nfsplugin
  uid: b538297c-3074-4fdc-8a59-bc7d8eb6427a
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: openstack-manila-csi
      component: nfs-nodeplugin
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: openstack-manila-csi
        component: nfs-nodeplugin
    spec:
      containers:
      - args:
        - --nodeid=$(NODE_ID)
        - --endpoint=unix://plugin/csi.sock
        - --mount-permissions=0777
        env:
        - name: NODE_ID
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        image: registry.redhat.io/openshift4/ose-csi-driver-nfs-rhel7@sha256:da67709ab66079b798914f1fe5cf867d6a050635534d2e56588164d8d9189183
        imagePullPolicy: IfNotPresent
        name: nfs
        resources: {}
        securityContext:
          allowPrivilegeEscalation: true
          capabilities:
            add:
            - SYS_ADMIN
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /plugin
          name: plugin-dir
        - mountPath: /var/lib/kubelet/pods
          mountPropagation: Bidirectional
          name: pods-mount-dir
      dnsPolicy: ClusterFirst
      hostNetwork: true    # <-- should be added so that the manila NFS plugin can see the worker/host networks
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: csi-nodeplugin
      serviceAccountName: csi-nodeplugin
      terminationGracePeriodSeconds: 30
      volumes:
      - hostPath:
          path: /var/lib/kubelet/plugins/csi-nfsplugin
          type: DirectoryOrCreate
        name: plugin-dir
      - hostPath:
          path: /var/lib/kubelet/pods
          type: Directory
        name: pods-mount-dir
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
status:
  currentNumberScheduled: 4
  desiredNumberScheduled: 4
  numberAvailable: 4
  numberMisscheduled: 0
  numberReady: 4
  observedGeneration: 4
  updatedNumberScheduled: 4

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:
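For anyone who wants to try the suggested change by hand, the following is a minimal sketch rather than a supported fix. It assumes the daemonset name and namespace shown in the report, and that `oc` is logged in with sufficient privileges. The variable names are illustrative only. Since the daemonset is presumably managed by the manila-csi operator, a manual patch may be reverted on the next reconcile.

```shell
# Sketch only: set hostNetwork on the pod template of the NFS node plugin.
# Names are taken from the report; adjust them if your cluster differs.
NAMESPACE="openshift-manila-csi-driver"
DAEMONSET="csi-nodeplugin-nfsplugin"

# JSON merge patch that adds only spec.template.spec.hostNetwork.
PATCH='{"spec":{"template":{"spec":{"hostNetwork":true}}}}'

if command -v oc >/dev/null 2>&1; then
  # Apply the patch, then wait for the daemonset pods to be rolled out again.
  oc patch daemonset "$DAEMONSET" -n "$NAMESPACE" --type merge -p "$PATCH"
  oc rollout status daemonset "$DAEMONSET" -n "$NAMESPACE"
else
  # Without oc on the PATH, just show what would be applied.
  echo "oc not found; would apply to $NAMESPACE/$DAEMONSET: $PATCH"
fi
```

After the rollout, the image-registry pod would be expected to leave ContainerCreating once the kubelet retries the mount, as in the "Expected results" output.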
Thanks for the report. It looks like a dup of bug #1867152: different symptoms, but the same solution.

*** This bug has been marked as a duplicate of bug 1867152 ***