Bug 1944655

Summary: [manila-csi-driver-operator] openstack-manila-csi-nodeplugin pods stuck with ".. still connecting to unix:///var/lib/kubelet/plugins/csi-nfsplugin/csi.sock"
Product: OpenShift Container Platform
Reporter: Mauro Oddi <moddi>
Component: Storage
Storage sub component: OpenStack CSI Drivers
Assignee: Matthew Booth <mbooth>
QA Contact: rlobillo
Status: CLOSED ERRATA
Severity: high
Priority: medium
CC: aos-bugs, asimonel, atn, eduen, gcharot, mbooth, mfedosin, pprinett, rlobillo, tbarron, vimartin
Version: 4.7
Keywords: Triaged
Target Milestone: ---
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Last Closed: 2021-07-27 22:56:34 UTC
Type: Bug
Bug Blocks: 1969345

Description Mauro Oddi 2021-03-30 12:20:33 UTC
Description of problem:

After Manila is enabled at the OSP level, the Manila CSI operator creates all the resources for the Manila CSI driver, but many of them end up in an error state.


Version-Release number of selected component (if applicable):
OCP 4.7.2

How reproducible:


Steps to Reproduce:
1. Deploy OCP 4.7.2 on OpenStack (UPI)
2. Enable Manila at the OSP level

Actual results:
Many pods, such as the openstack-manila-csi-nodeplugin ones, fail to connect to their csi.sock:

W0322 07:39:21.229527       1 builder.go:82] still connecting to unix:///var/lib/kubelet/plugins/csi-nfsplugin/csi.sock



Comment 10 Mauro Oddi 2021-05-10 14:27:40 UTC
Thanks Matt, great progress.
I will let the customer know.
If you have any idea for a possible workaround, please let me know.

Cheers,
Mauro.

Comment 11 Matthew Booth 2021-05-11 13:29:45 UTC
Submitted an upstream patch to CPO (cloud-provider-openstack).

Comment 12 Mike Fedosin 2021-05-26 17:49:05 UTC
*** Bug 1952182 has been marked as a duplicate of this bug. ***

Comment 17 rlobillo 2021-06-16 07:32:30 UTC
Verified on 4.8.0-0.nightly-2021-06-14-145150 on OSP16.1 (RHOS-16.1-RHEL-8-20210506.n.1) with manila enabled.

Steps:

1 - Perform an IPI installation and confirm that Manila works:

$ oc get pods
NAME                           READY   STATUS    RESTARTS   AGE
demo-manila-7b5f47f4f8-nm2l7   1/1     Running   0          17s

$ oc get pvc
NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS         AGE
pvc-manila   Bound    pvc-0e55a9be-bae0-45e1-854c-18f74a4cfd23   1Gi        RWO            csi-manila-default   46s

$ manila list
+--------------------------------------+------------------------------------------+------+-------------+-----------+-----------+-----------------+------+-------------------+
| ID                                   | Name                                     | Size | Share Proto | Status    | Is Public | Share Type Name | Host | Availability Zone |
+--------------------------------------+------------------------------------------+------+-------------+-----------+-----------+-----------------+------+-------------------+
| 508f3d29-4b28-45c1-9940-c6e9e1e15d2e | pvc-0e55a9be-bae0-45e1-854c-18f74a4cfd23 | 1    | NFS         | available | False     | default         |      | nova              |
+--------------------------------------+------------------------------------------+------+-------------+-----------+-----------+-----------------+------+-------------------+

2 - Configure the proxy:

$ cat proxy_cluster.yaml 
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
  name: cluster
spec:
  httpProxy: http://squid.corp.redhat.com:3128/

$ oc apply -f proxy_cluster.yaml

3 - Wait until the pods in the openshift-manila-csi-driver namespace have restarted.

4 - Check nodeplugin logs:

[stack@undercloud-0 ~]$ for i in $(oc get pods -n openshift-manila-csi-driver -l component=nodeplugin -o NAME); do echo "***$i"; oc logs -n openshift-manila-csi-driver "$i" -c csi-driver; echo; done
***pod/openstack-manila-csi-nodeplugin-49cwb
I0616 07:03:44.459084       1 driver.go:127] Driver: manila.csi.openstack.org
I0616 07:03:44.460595       1 driver.go:128] Driver version: 0.9.0@
I0616 07:03:44.460600       1 driver.go:129] CSI spec version: 1.2.0
I0616 07:03:44.460723       1 driver.go:132] Operating on NFS shares
I0616 07:03:44.460734       1 driver.go:137] Topology awareness disabled
I0616 07:03:44.463810       1 driver.go:200] Enabling controller service capability: CREATE_DELETE_VOLUME
I0616 07:03:44.463871       1 driver.go:200] Enabling controller service capability: CREATE_DELETE_SNAPSHOT
I0616 07:03:44.463877       1 driver.go:219] Enabling volume access mode: MULTI_NODE_MULTI_WRITER
I0616 07:03:44.463881       1 driver.go:219] Enabling volume access mode: MULTI_NODE_SINGLE_WRITER
I0616 07:03:44.463885       1 driver.go:219] Enabling volume access mode: MULTI_NODE_READER_ONLY
I0616 07:03:44.463889       1 driver.go:219] Enabling volume access mode: SINGLE_NODE_WRITER
I0616 07:03:44.463892       1 driver.go:219] Enabling volume access mode: SINGLE_NODE_READER_ONLY
I0616 07:03:44.480066       1 connection.go:261] Probing CSI driver for readiness
I0616 07:03:44.497773       1 driver.go:266] proxying CSI driver nfs.csi.k8s.io version 2.0.0
I0616 07:03:44.498643       1 driver.go:230] Enabling node service capability: UNKNOWN
I0616 07:03:44.499925       1 driver.go:330] listening for connections on &net.UnixAddr{Name:"/var/lib/kubelet/plugins/manila.csi.openstack.org/csi.sock", Net:"unix"}

***pod/openstack-manila-csi-nodeplugin-5grpp
I0616 07:05:57.921084       1 driver.go:127] Driver: manila.csi.openstack.org
I0616 07:05:57.921219       1 driver.go:128] Driver version: 0.9.0@
I0616 07:05:57.921225       1 driver.go:129] CSI spec version: 1.2.0
I0616 07:05:57.921231       1 driver.go:132] Operating on NFS shares
I0616 07:05:57.921239       1 driver.go:137] Topology awareness disabled
I0616 07:05:57.928725       1 driver.go:200] Enabling controller service capability: CREATE_DELETE_VOLUME
I0616 07:05:57.928796       1 driver.go:200] Enabling controller service capability: CREATE_DELETE_SNAPSHOT
I0616 07:05:57.928803       1 driver.go:219] Enabling volume access mode: MULTI_NODE_MULTI_WRITER
I0616 07:05:57.928808       1 driver.go:219] Enabling volume access mode: MULTI_NODE_SINGLE_WRITER
I0616 07:05:57.928811       1 driver.go:219] Enabling volume access mode: MULTI_NODE_READER_ONLY
I0616 07:05:57.928815       1 driver.go:219] Enabling volume access mode: SINGLE_NODE_WRITER
I0616 07:05:57.928820       1 driver.go:219] Enabling volume access mode: SINGLE_NODE_READER_ONLY
W0616 07:05:58.931612       1 builder.go:88] still connecting to unix:///var/lib/kubelet/plugins/csi-nfsplugin/csi.sock
I0616 07:05:58.950407       1 connection.go:261] Probing CSI driver for readiness
I0616 07:05:58.970823       1 driver.go:266] proxying CSI driver nfs.csi.k8s.io version 2.0.0
I0616 07:05:58.971556       1 driver.go:230] Enabling node service capability: UNKNOWN
I0616 07:05:58.971842       1 driver.go:330] listening for connections on &net.UnixAddr{Name:"/var/lib/kubelet/plugins/manila.csi.openstack.org/csi.sock", Net:"unix"}

***pod/openstack-manila-csi-nodeplugin-fs9fh
I0616 07:11:26.731968       1 driver.go:127] Driver: manila.csi.openstack.org
I0616 07:11:26.732074       1 driver.go:128] Driver version: 0.9.0@
I0616 07:11:26.732080       1 driver.go:129] CSI spec version: 1.2.0
I0616 07:11:26.732086       1 driver.go:132] Operating on NFS shares
I0616 07:11:26.732093       1 driver.go:137] Topology awareness disabled
I0616 07:11:26.738040       1 driver.go:200] Enabling controller service capability: CREATE_DELETE_VOLUME
I0616 07:11:26.738074       1 driver.go:200] Enabling controller service capability: CREATE_DELETE_SNAPSHOT
I0616 07:11:26.738079       1 driver.go:219] Enabling volume access mode: MULTI_NODE_MULTI_WRITER
I0616 07:11:26.738083       1 driver.go:219] Enabling volume access mode: MULTI_NODE_SINGLE_WRITER
I0616 07:11:26.738087       1 driver.go:219] Enabling volume access mode: MULTI_NODE_READER_ONLY
I0616 07:11:26.738090       1 driver.go:219] Enabling volume access mode: SINGLE_NODE_WRITER
I0616 07:11:26.738130       1 driver.go:219] Enabling volume access mode: SINGLE_NODE_READER_ONLY
W0616 07:11:27.743462       1 builder.go:88] still connecting to unix:///var/lib/kubelet/plugins/csi-nfsplugin/csi.sock
I0616 07:11:27.751600       1 connection.go:261] Probing CSI driver for readiness
I0616 07:11:27.775556       1 driver.go:266] proxying CSI driver nfs.csi.k8s.io version 2.0.0
I0616 07:11:27.776124       1 driver.go:230] Enabling node service capability: UNKNOWN
I0616 07:11:27.788033       1 driver.go:330] listening for connections on &net.UnixAddr{Name:"/var/lib/kubelet/plugins/manila.csi.openstack.org/csi.sock", Net:"unix"}

***pod/openstack-manila-csi-nodeplugin-jxrwb
I0616 07:09:05.540960       1 driver.go:127] Driver: manila.csi.openstack.org
I0616 07:09:05.541254       1 driver.go:128] Driver version: 0.9.0@
I0616 07:09:05.541266       1 driver.go:129] CSI spec version: 1.2.0
I0616 07:09:05.541276       1 driver.go:132] Operating on NFS shares
I0616 07:09:05.541297       1 driver.go:137] Topology awareness disabled
I0616 07:09:05.541316       1 driver.go:200] Enabling controller service capability: CREATE_DELETE_VOLUME
I0616 07:09:05.541325       1 driver.go:200] Enabling controller service capability: CREATE_DELETE_SNAPSHOT
I0616 07:09:05.541333       1 driver.go:219] Enabling volume access mode: MULTI_NODE_MULTI_WRITER
I0616 07:09:05.541341       1 driver.go:219] Enabling volume access mode: MULTI_NODE_SINGLE_WRITER
I0616 07:09:05.541348       1 driver.go:219] Enabling volume access mode: MULTI_NODE_READER_ONLY
I0616 07:09:05.541355       1 driver.go:219] Enabling volume access mode: SINGLE_NODE_WRITER
I0616 07:09:05.541362       1 driver.go:219] Enabling volume access mode: SINGLE_NODE_READER_ONLY
W0616 07:09:06.542965       1 builder.go:88] still connecting to unix:///var/lib/kubelet/plugins/csi-nfsplugin/csi.sock
I0616 07:09:06.554994       1 connection.go:261] Probing CSI driver for readiness
I0616 07:09:06.575799       1 driver.go:266] proxying CSI driver nfs.csi.k8s.io version 2.0.0
I0616 07:09:06.581834       1 driver.go:230] Enabling node service capability: UNKNOWN
I0616 07:09:06.583346       1 driver.go:330] listening for connections on &net.UnixAddr{Name:"/var/lib/kubelet/plugins/manila.csi.openstack.org/csi.sock", Net:"unix"}

***pod/openstack-manila-csi-nodeplugin-w5m4h
I0616 07:08:41.411371       1 driver.go:127] Driver: manila.csi.openstack.org
I0616 07:08:41.411982       1 driver.go:128] Driver version: 0.9.0@
I0616 07:08:41.412039       1 driver.go:129] CSI spec version: 1.2.0
I0616 07:08:41.412071       1 driver.go:132] Operating on NFS shares
I0616 07:08:41.412123       1 driver.go:137] Topology awareness disabled
I0616 07:08:41.412196       1 driver.go:200] Enabling controller service capability: CREATE_DELETE_VOLUME
I0616 07:08:41.412225       1 driver.go:200] Enabling controller service capability: CREATE_DELETE_SNAPSHOT
I0616 07:08:41.412248       1 driver.go:219] Enabling volume access mode: MULTI_NODE_MULTI_WRITER
I0616 07:08:41.412281       1 driver.go:219] Enabling volume access mode: MULTI_NODE_SINGLE_WRITER
I0616 07:08:41.412301       1 driver.go:219] Enabling volume access mode: MULTI_NODE_READER_ONLY
I0616 07:08:41.412337       1 driver.go:219] Enabling volume access mode: SINGLE_NODE_WRITER
I0616 07:08:41.412377       1 driver.go:219] Enabling volume access mode: SINGLE_NODE_READER_ONLY
I0616 07:08:41.442953       1 connection.go:261] Probing CSI driver for readiness
I0616 07:08:41.515822       1 driver.go:266] proxying CSI driver nfs.csi.k8s.io version 2.0.0
I0616 07:08:41.521301       1 driver.go:230] Enabling node service capability: UNKNOWN
I0616 07:08:41.522350       1 driver.go:330] listening for connections on &net.UnixAddr{Name:"/var/lib/kubelet/plugins/manila.csi.openstack.org/csi.sock", Net:"unix"}

***pod/openstack-manila-csi-nodeplugin-wxkp9
I0616 07:03:56.309662       1 driver.go:127] Driver: manila.csi.openstack.org
I0616 07:03:56.309811       1 driver.go:128] Driver version: 0.9.0@
I0616 07:03:56.309817       1 driver.go:129] CSI spec version: 1.2.0
I0616 07:03:56.309823       1 driver.go:132] Operating on NFS shares
I0616 07:03:56.309830       1 driver.go:137] Topology awareness disabled
I0616 07:03:56.313585       1 driver.go:200] Enabling controller service capability: CREATE_DELETE_VOLUME
I0616 07:03:56.313606       1 driver.go:200] Enabling controller service capability: CREATE_DELETE_SNAPSHOT
I0616 07:03:56.313626       1 driver.go:219] Enabling volume access mode: MULTI_NODE_MULTI_WRITER
I0616 07:03:56.313631       1 driver.go:219] Enabling volume access mode: MULTI_NODE_SINGLE_WRITER
I0616 07:03:56.313634       1 driver.go:219] Enabling volume access mode: MULTI_NODE_READER_ONLY
I0616 07:03:56.313654       1 driver.go:219] Enabling volume access mode: SINGLE_NODE_WRITER
I0616 07:03:56.313657       1 driver.go:219] Enabling volume access mode: SINGLE_NODE_READER_ONLY
I0616 07:03:56.332411       1 connection.go:261] Probing CSI driver for readiness
I0616 07:03:56.356088       1 driver.go:266] proxying CSI driver nfs.csi.k8s.io version 2.0.0
I0616 07:03:56.357005       1 driver.go:230] Enabling node service capability: UNKNOWN
I0616 07:03:56.357403       1 driver.go:330] listening for connections on &net.UnixAddr{Name:"/var/lib/kubelet/plugins/manila.csi.openstack.org/csi.sock", Net:"unix"}
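
One way to read the logs above: a nodeplugin pod looks healthy once its csi-driver log reaches "listening for connections", even if a transient "still connecting" warning appears first, whereas a pod that only ever logs the warning matches the original report. The Python sketch below encodes that reading. The function name classify_nodeplugin_log and the three labels are hypothetical and not part of the driver or of oc, and the pass criterion itself is an assumption rather than something stated in this bug.

```python
import re

# klog header: severity letter, mmdd, time, thread id, file:line], then the message.
KLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<source>[^\s\]]+:\d+)\] (?P<message>.*)$"
)


def classify_nodeplugin_log(log_text):
    """Return 'ready', 'stuck', or 'unknown' for one csi-driver container log.

    Assumption (hypothetical criterion): 'ready' means the driver logged that
    it is listening on its socket; 'stuck' means it only logged warnings about
    still connecting to the NFS plugin socket.
    """
    still_connecting = 0
    listening = False
    for line in log_text.splitlines():
        match = KLOG_LINE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a klog-formatted line
        message = match.group("message")
        if message.startswith("still connecting to unix://"):
            still_connecting += 1
        elif message.startswith("listening for connections on"):
            listening = True
    if listening:
        return "ready"
    if still_connecting > 0:
        return "stuck"
    return "unknown"
```

Fed the output of `oc logs ... -c csi-driver` for each pod, this would report "ready" for all six pods listed above and "stuck" for a log that contains only the warning from the original description.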

5 - Create a new pod and PVC using Manila:

$ cat manila_1.yaml 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "pvc-manila-1"
  namespace: "topologyaware-test"
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-manila-default
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-manila-1
  namespace: "topologyaware-test"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-manila-1
      cinder-az: nova
      nova-az: AZ-0
  template:
    metadata:
      labels:
        app: demo-manila-1
        cinder-az: nova
        nova-az: AZ-0
    spec:
      containers:
      - name: demo
        image: quay.io/kuryr/demo
        ports:
        - containerPort: 80
          protocol: TCP
        volumeMounts:
          - mountPath: /var/lib/www/data
            name: mydata
      nodeSelector:
        topology.cinder.csi.openstack.org/zone: AZ-0
      volumes:
        - name: mydata
          persistentVolumeClaim:
            claimName: pvc-manila-1
            readOnly: false

$ oc apply -f manila_1.yaml 

$ oc get pods,pvc 
NAME                                 READY   STATUS    RESTARTS   AGE
pod/demo-manila-1-5b9c4d65f8-kd9bk   1/1     Running   0          25m
pod/demo-manila-7b5f47f4f8-ws2kk     1/1     Running   0          25m

NAME                                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS         AGE
persistentvolumeclaim/pvc-manila     Bound    pvc-0e55a9be-bae0-45e1-854c-18f74a4cfd23   1Gi        RWO            csi-manila-default   46m
persistentvolumeclaim/pvc-manila-1   Bound    pvc-2c637ad3-9d61-4693-953f-b686f8d8f046   1Gi        RWO            csi-manila-default   28m
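
The final check (both claims Bound) can also be scripted instead of read off the table. The sketch below assumes the claims are fetched as JSON, for example with `oc get pvc -n topologyaware-test -o json`, and relies only on the standard `items[].metadata.name` and `items[].status.phase` fields of a PersistentVolumeClaim list. The helper name unbound_claims is hypothetical.

```python
import json


def unbound_claims(pvc_list_json):
    """Return the names of claims whose status.phase is not 'Bound'.

    pvc_list_json is the text of a PersistentVolumeClaim list in JSON form
    (assumed here to come from `oc get pvc -o json`).
    """
    document = json.loads(pvc_list_json)
    return [
        item["metadata"]["name"]
        for item in document.get("items", [])
        if item.get("status", {}).get("phase") != "Bound"
    ]
```

An empty result corresponds to the state shown in the table above, where pvc-manila and pvc-manila-1 are both Bound.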

Comment 19 errata-xmlrpc 2021-07-27 22:56:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438