Bug 1902546

Summary: Cinder csi driver node pod doesn't run on master node
Product: OpenShift Container Platform
Component: Storage
Storage sub component: OpenStack CSI Drivers
Reporter: Wei Duan <wduan>
Assignee: Martin André <m.andre>
QA Contact: Wei Duan <wduan>
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
CC: aos-bugs, m.andre, pprinett
Version: 4.7
Keywords: UpcomingSprint
Target Release: 4.7.0
Hardware: Unspecified
OS: Unspecified
Doc Type: No Doc Update
Type: Bug
Last Closed: 2021-02-24 15:36:28 UTC

Description Wei Duan 2020-11-30 02:51:09 UTC
Description of problem:
Cinder-csi-driver-node pod doesn't run on master node

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2020-11-29-133728

Steps to Reproduce:
1. Install a cluster on OpenStack (OSP); the Cinder CSI driver is installed as part of the installation.

2. Check CSI driver pods:
   oc -n openshift-cluster-csi-drivers get pod -o wide

3. Create a pod on a master that uses PVC.
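For step 3, a minimal reproducer could look like the following. This is a sketch, not the exact manifests used in the report: the storage class name "standard-csi", the image, and the object names are assumptions; substitute values from your cluster.

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: mypvc
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
      storageClassName: standard-csi   # assumed Cinder CSI storage class name
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: mypod
    spec:
      nodeSelector:
        node-role.kubernetes.io/master: ""   # pin the pod to a master node
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: app
        image: registry.access.redhat.com/ubi8/ubi-minimal   # assumed image
        command: ["sleep", "infinity"]
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: mypvc

Because the volume can only be attached and mounted by the CSI node plugin running on the target node, this pod stays unschedulable or stuck in ContainerCreating on a master that has no openstack-cinder-csi-driver-node pod.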

Actual results:
1. CSI driver node pods run only on worker nodes
$ oc -n openshift-cluster-csi-drivers get pod -o wide
...
openstack-cinder-csi-driver-node-42svt                    2/2     Running   0          49m   192.168.2.110   wduan-1130a-fcw45-worker-0-jvtg8   <none>           <none>
openstack-cinder-csi-driver-node-72flp                    2/2     Running   1          47m   192.168.3.54    wduan-1130a-fcw45-worker-0-vnqmn   <none>           <none>
openstack-cinder-csi-driver-node-mh9js                    2/2     Running   0          49m   192.168.0.14    wduan-1130a-fcw45-worker-0-qh5dx   <none>           <none>
...

2. Masters can't use a PVC provided by the CSI driver

Expected results:
Masters should run an openstack-cinder-csi-driver-node pod, so that pods on masters can use a PVC provided by the CSI driver.

Comment 1 Martin André 2020-12-10 09:18:05 UTC
We believe this might have been an infra issue. Mike to double check.

Comment 2 Martin André 2021-01-11 18:12:10 UTC
Hey Wei Duan, just to make sure that I understand the issue, do we expect the cinder-csi-driver-node pods to run on the master nodes all the time or only when they are schedulable?

If it's the former, I think we can add the following toleration to the DaemonSet's pod spec:

    tolerations:
    - key: node-role.kubernetes.io/master
      operator: Exists
      effect: "NoSchedule"

And I suppose it's the same for the cinder-csi-driver-controller pods in https://bugzilla.redhat.com/show_bug.cgi?id=1902547 ?
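For context, the toleration above sits under the pod template of the node DaemonSet, roughly as follows. This is a sketch based on the standard apps/v1 DaemonSet schema; the object name and namespace are inferred from the pod listings above.

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: openstack-cinder-csi-driver-node
      namespace: openshift-cluster-csi-drivers
    spec:
      template:
        spec:
          tolerations:
          - key: node-role.kubernetes.io/master
            operator: Exists
            effect: "NoSchedule"

With operator: Exists and no value, the toleration matches the master taint regardless of its value, so the scheduler places a node pod on every master as well as every worker.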

Comment 3 Wei Duan 2021-01-12 02:27:09 UTC
Hi, I replied in Slack; let's discuss there and make a decision.

Comment 5 Wei Duan 2021-01-13 07:28:31 UTC
Verified pass on 4.7.0-0.nightly-2021-01-12-203716

$ oc -n openshift-cluster-csi-drivers get pod -o wide | grep "cinder-csi-driver-node"
openstack-cinder-csi-driver-node-8p9t6                    2/2     Running   0          20m   192.168.2.181   wduan-0113b-x98wd-master-2         <none>           <none>
openstack-cinder-csi-driver-node-fbmv9                    2/2     Running   0          20m   192.168.1.51    wduan-0113b-x98wd-master-1         <none>           <none>
openstack-cinder-csi-driver-node-lll9s                    2/2     Running   0          20m   192.168.2.208   wduan-0113b-x98wd-worker-0-nb4rq   <none>           <none>
openstack-cinder-csi-driver-node-nblb7                    2/2     Running   0          20m   192.168.1.22    wduan-0113b-x98wd-worker-0-q7xfx   <none>           <none>
openstack-cinder-csi-driver-node-nxkv7                    2/2     Running   0          20m   192.168.3.40    wduan-0113b-x98wd-worker-0-qjshv   <none>           <none>
openstack-cinder-csi-driver-node-pnddv                    2/2     Running   0          19m   192.168.3.129   wduan-0113b-x98wd-master-0         <none>           <none>

And a test pod can run on a master:
$ oc get pod -o wide -w
NAME    READY   STATUS              RESTARTS   AGE   IP       NODE                         NOMINATED NODE   READINESS GATES
mypod   1/1     Running             0          23s   10.128.0.93   wduan-0113b-x98wd-master-0   <none>           <none>

Comment 9 errata-xmlrpc 2021-02-24 15:36:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633