Bug 1902546 - Cinder csi driver node pod doesn't run on master node
Summary: Cinder csi driver node pod doesn't run on master node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Martin André
QA Contact: Wei Duan
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-11-30 02:51 UTC by Wei Duan
Modified: 2021-02-24 15:36 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:36:28 UTC
Target Upstream Version:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift openstack-cinder-csi-driver-operator pull 18 | 0 | None | open | Bug 1902546: Allow cinder-csi-driver-node pods to run everywhere | 2021-01-12 12:43:43 UTC
Red Hat Product Errata | RHSA-2020:5633 | 0 | None | None | None | 2021-02-24 15:36:52 UTC

Description Wei Duan 2020-11-30 02:51:09 UTC
Description of problem:
The cinder-csi-driver-node pod does not run on master nodes.

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2020-11-29-133728

Steps to Reproduce:
1. Install a cluster on OSP with the Cinder CSI driver installed.

2. Check CSI driver pods:
   oc -n openshift-cluster-csi-drivers get pod -o wide

3. Create a pod on a master that uses PVC.
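
Step 3 can be sketched as follows. This is a minimal example, not the manifest used in this report: the PVC name mypvc, the container image, and the storage class name standard-csi are assumptions, so adjust them to the cluster. The pod carries its own node selector and toleration so that the scheduler places it on a master.

```shell
# Print a PVC plus a pod pinned to a master node that mounts the PVC.
# $1: storage class name (assumption: defaults to standard-csi).
render_master_pvc_pod() {
  sc="${1:-standard-csi}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ${sc}
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  # Pin the pod to a master and tolerate the master taint.
  nodeSelector:
    node-role.kubernetes.io/master: ""
  tolerations:
  - key: node-role.kubernetes.io/master
    operator: Exists
    effect: NoSchedule
  containers:
  - name: app
    image: registry.access.redhat.com/ubi8/ubi-minimal
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: mypvc
EOF
}
```

To create the objects, pipe the output to the client: `render_master_pvc_pod | oc apply -f -`.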

Actual results:
1. CSI driver node pods run only on worker nodes
$ oc -n openshift-cluster-csi-drivers get pod -o wide
...
openstack-cinder-csi-driver-node-42svt                    2/2     Running   0          49m   192.168.2.110   wduan-1130a-fcw45-worker-0-jvtg8   <none>           <none>
openstack-cinder-csi-driver-node-72flp                    2/2     Running   1          47m   192.168.3.54    wduan-1130a-fcw45-worker-0-vnqmn   <none>           <none>
openstack-cinder-csi-driver-node-mh9js                    2/2     Running   0          49m   192.168.0.14    wduan-1130a-fcw45-worker-0-qh5dx   <none>           <none>
...

2. Masters can't use a PVC provided by the CSI driver

Expected results:
Masters should run an openstack-cinder-csi-driver-node pod so that pods on masters can use a PVC provided by the CSI driver.

Comment 1 Martin André 2020-12-10 09:18:05 UTC
We believe this might have been an infra issue. Mike to double check.

Comment 2 Martin André 2021-01-11 18:12:10 UTC
Hey Wei Duan, just to make sure that I understand the issue, do we expect the cinder-csi-driver-node pods to run on the master nodes all the time or only when they are schedulable?

If it's the former, I think we can add the following toleration to the Deployment spec:

    tolerations:
    - key: node-role.kubernetes.io/master
      operator: Exists
      effect: "NoSchedule"

And I suppose it's the same for the cinder-csi-driver-controller pods in https://bugzilla.redhat.com/show_bug.cgi?id=1902547 ?
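
To check whether a toleration like this is missing, one option is to compare the taints on the nodes with the tolerations on the driver's node workload. The sketch below assumes the node pods are managed by a DaemonSet named openstack-cinder-csi-driver-node; that name is inferred from the pod name prefix and is not confirmed in this report.

```shell
# Show each node together with the keys of its taints.
node_taints() {
  oc get nodes -o 'custom-columns=NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}

# Print the tolerations set on the driver's node workload.
# $1: DaemonSet name (assumption: openstack-cinder-csi-driver-node).
driver_tolerations() {
  oc -n openshift-cluster-csi-drivers get daemonset \
    "${1:-openstack-cinder-csi-driver-node}" \
    -o jsonpath='{.spec.template.spec.tolerations}'
}
```

If `node_taints` lists node-role.kubernetes.io/master for the masters and `driver_tolerations` shows no matching entry, the node pods cannot be scheduled there.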

Comment 3 Wei Duan 2021-01-12 02:27:09 UTC
Hi, I replied in Slack. Let's discuss there and make a decision.

Comment 5 Wei Duan 2021-01-13 07:28:31 UTC
Verified: passed on 4.7.0-0.nightly-2021-01-12-203716

$ oc -n openshift-cluster-csi-drivers get pod -o wide | grep "cinder-csi-driver-node"
openstack-cinder-csi-driver-node-8p9t6                    2/2     Running   0          20m   192.168.2.181   wduan-0113b-x98wd-master-2         <none>           <none>
openstack-cinder-csi-driver-node-fbmv9                    2/2     Running   0          20m   192.168.1.51    wduan-0113b-x98wd-master-1         <none>           <none>
openstack-cinder-csi-driver-node-lll9s                    2/2     Running   0          20m   192.168.2.208   wduan-0113b-x98wd-worker-0-nb4rq   <none>           <none>
openstack-cinder-csi-driver-node-nblb7                    2/2     Running   0          20m   192.168.1.22    wduan-0113b-x98wd-worker-0-q7xfx   <none>           <none>
openstack-cinder-csi-driver-node-nxkv7                    2/2     Running   0          20m   192.168.3.40    wduan-0113b-x98wd-worker-0-qjshv   <none>           <none>
openstack-cinder-csi-driver-node-pnddv                    2/2     Running   0          19m   192.168.3.129   wduan-0113b-x98wd-master-0         <none>           <none>

A test pod can also run on a master:
$ oc get pod -o wide -w
NAME    READY   STATUS    RESTARTS   AGE   IP            NODE                         NOMINATED NODE   READINESS GATES
mypod   1/1     Running   0          23s   10.128.0.93   wduan-0113b-x98wd-master-0   <none>           <none>
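
The per-node check above can also be scripted. The sketch below lists nodes that have no driver node pod. It assumes the `-o wide` column layout shown in this report, where NODE is the seventh field; client versions that print restart times in the RESTARTS column would shift the fields. The function names are hypothetical.

```shell
# Read `oc get pod -o wide` output on stdin and print the nodes
# that run a driver node pod (NODE is assumed to be field 7).
driver_nodes() {
  awk '$1 ~ /^openstack-cinder-csi-driver-node-/ { print $7 }' | sort -u
}

# Print nodes without a driver node pod; prints nothing when all are covered.
uncovered_nodes() {
  covered="$(mktemp)"
  all="$(mktemp)"
  oc -n openshift-cluster-csi-drivers get pod -o wide | driver_nodes > "$covered"
  # `-o name` prints node/<name>; strip the prefix before comparing.
  oc get nodes -o name | sed 's|^node/||' | sort -u > "$all"
  comm -23 "$all" "$covered"
  rm -f "$covered" "$all"
}
```

Running `uncovered_nodes` against a fixed cluster is expected to print nothing, since every master and worker then has a driver node pod.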

Comment 9 errata-xmlrpc 2021-02-24 15:36:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

