Bug 2229670 - All the Controller Operations should reach the one Controller (active) not multiple Controllers
Summary: All the Controller Operations should reach the one Controller (active) not multiple Controllers
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: csi-addons
Version: 4.14
Hardware: All
OS: All
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ODF 4.15.0
Assignee: Niels de Vos
QA Contact: jpinto
URL:
Whiteboard:
Depends On:
Blocks: 2165941 2246375
 
Reported: 2023-08-07 09:41 UTC by Madhu Rajanna
Modified: 2024-03-19 15:22 UTC
CC List: 7 users

Fixed In Version: 4.15.0-117
Doc Type: Enhancement
Doc Text:
.All controller operations to reach one controller
Previously, the Kubernetes CSI-Addons Operator connected to a random registered CSI-Addons sidecar and made RPC calls to that sidecar. This could create problems if the CSI-driver implemented an internal locking mechanism or kept a local cache for the lifetime of that instance. Now, when a CSI-driver provides the `CONTROLLER_SERVICE` capability, the sidecar tries to become the leader by obtaining a lease based on the name of the CSI-driver. NetworkFence (and other CSI-Addons) operations are only sent to a CSI-Addons sidecar that has the `CONTROLLER_SERVICE` capability; there is a single leader among the CSI-Addons sidecars that support it, and the leader can be identified by the Lease object for the CSI-driver name.
Clone Of:
Environment:
Last Closed: 2024-03-19 15:22:34 UTC
Embargoed:




Links
Github csi-addons/kubernetes-csi-addons issue 422 (closed): All the Controller Operations should reach the one Controller (active) not multiple Controllers - last updated 2023-12-21 12:02:28 UTC
Red Hat Product Errata RHSA-2024:1383 - last updated 2024-03-19 15:22:45 UTC

Description Madhu Rajanna 2023-08-07 09:41:53 UTC
As of today, kubernetes-csi-addons connects to a random registered controller and makes the RPC calls to that controller. This can create a problem if the CSI driver has implemented some internal locking mechanism or keeps a local cache for the lifetime of that instance.

Example as below:-

CephCSI runs deployments for Replication/ReclaimSpace etc., and we will have two instances running. CephCSI internally takes a lock and processes one request at a time based on its internal logic. With the current Kubernetes sidecars this is not a problem, because each sidecar runs with leader election and only one instance processes a request. With kubernetes-csi-addons, however, it becomes a problem, as we do not have any such mechanism to reach the same controller/deployment that is processing the requests.

The request is to provide this kind of functionality, which will help CSI drivers that have this requirement and do not want to run an active/active model, as that can lead to many different failure modes.


Examples:

For example, in a Controller ReclaimSpace operation, if the space is being reclaimed by one CSI driver instance and there is a CR update, we might end up making one more call for the same volume to another CSI driver instance.

Mainly for VolumeReplication operations, where we have different RPC calls for enable, disable, promote, demote, get, etc., we might end up issuing different RPC calls for the same volume to different CSI driver instances.

Comment 4 Niels de Vos 2023-12-07 12:35:42 UTC
The following approach is in development:

- the CSI-Addons sidecar will set up leader election when the driver supports CONTROLLER_SERVICE

- leader election creates a Lease object that:
  1. is unique per CSI driver name
  2. identifies a single CSI-Addons Pod to be the leader

- the CSI-Addons controller will get the leader from the lease

- the CSI-Addons controller will send NetworkFence requests to the leader only


Leader election comes with default options for timeout, retry, and so on. OpenShift prefers to reconfigure these options to less aggressive values to reduce the load on the API-server. These options will need to be set in the deployment for the Ceph-CSI driver (ocs-operator).
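
As a quick way to see which sidecar is the current leader (and which lease timings are in effect), the Lease object can be inspected directly. A minimal sketch with oc, assuming the lease name that was later observed during verification; the spec values in the comments are illustrative only:

$ oc get lease openshift-storage-rbd-csi-ceph-com-csi-addons -n openshift-storage -o yaml
# relevant spec fields (values illustrative):
#   holderIdentity: csi-rbdplugin-provisioner-f99d45b67-hbh4h   <- pod of the active CSI-Addons sidecar
#   leaseDurationSeconds: 137                                   <- reflects the reconfigured, less aggressive timings
#   renewTime: "2024-02-19T09:32:40.000000Z"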

Comment 5 Niels de Vos 2023-12-07 16:12:07 UTC
https://github.com/csi-addons/kubernetes-csi-addons/pull/492 is the upstream PR

Comment 6 Niels de Vos 2023-12-20 11:39:32 UTC
Available in the release-4.15 branch of kubernetes-csi-addons.

Comment 8 Niels de Vos 2024-01-03 13:44:08 UTC
To verify that this works correctly:

1. make sure there are multiple Ceph-CSI RBD provisioners
2. run NetworkFence operations (can be done with non-workernode IP-addresses to keep the cluster functional)
3. verify that the NetworkFence operations are handled by the Ceph-CSI provisioner that holds the csi-addons-..drivername.. lease
4. delete the Ceph-CSI provisioner that holds the lease
5. run more NetworkFence operations
6. verify that the operations stay pending until another Ceph-CSI RBD provisioner obtains the lease and handles the operations

While checking that the provisioner that holds the lease handles the NetworkFence operations, none of the other Ceph-CSI RBD provisioners should handle the NetworkFence operations. The logs of the kubernetes-csi-addons-operator contain details about which csi-addons sidecar the operations are sent to.
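
A rough sketch of steps 2 and 3 with oc; the NetworkFence spec below is only illustrative, and the CIDR, the secret name, and the csi-addons container name are assumptions to adapt to the cluster under test:

# find the current leader for the driver from its Lease
$ oc get lease openshift-storage-rbd-csi-ceph-com-csi-addons -n openshift-storage \
    -o jsonpath='{.spec.holderIdentity}{"\n"}'

# create a NetworkFence for an IP that is not a worker node (illustrative spec)
$ cat <<EOF | oc create -f -
apiVersion: csiaddons.openshift.io/v1alpha1
kind: NetworkFence
metadata:
  name: test-fence
spec:
  driver: openshift-storage.rbd.csi.ceph.com
  fenceState: Fenced
  cidrs:
    - "198.51.100.10/32"              # assumption: an IP that is not a worker node
  secret:
    name: rook-csi-rbd-provisioner    # assumption: provisioner secret name
    namespace: openshift-storage
  parameters:
    clusterID: openshift-storage
EOF

# confirm that only the leader's csi-addons container logged the FenceClusterNetwork call
$ oc logs -n openshift-storage <leader-pod> -c csi-addons | grep FenceClusterNetwork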

Comment 13 Joy John Pinto 2024-02-19 09:49:22 UTC
Verified with OCP build 4.15.0-0.nightly-2024-02-14-214710 and ODF 4.15.0-142

1. On a 3M-6W node cluster, created an RBD NetworkFence by tainting the node that has the RBD volume, using the following command: 'oc adm taint nodes compute-5 node.kubernetes.io/out-of-service=nodeshutdown:NoExecute'
   [jopinto@jopinto 5mon]$ oc get pvc
   NAME                                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
   logwriter-rbd-new-logwriter-rbd-new-0   Bound    pvc-a8442931-3254-4a04-a33b-e54bd154e912   10Gi       RWO            ocs-storagecluster-ceph-rbd   38s

   [jopinto@jopinto 5mon]$ oc adm taint nodes compute-5 node.kubernetes.io/out-of-service=nodeshutdown:NoExecute
   node/compute-5 tainted

2. The NetworkFence gets created as expected, and 'oc get lease -n openshift-storage' shows the RBD provisioner leader pod.
   [jopinto@jopinto 5mon]$ oc get networkfences.csiaddons.openshift.io  
   NAME                              DRIVER                               CIDRS               FENCESTATE   AGE   RESULT
   compute-5-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.6/32"]   Fenced       6s    Succeeded

   [jopinto@jopinto 5mon]$ oc get leases -n openshift-storage
   NAME                                                                HOLDER                                                                                  AGE
   4fd470de.openshift.io                                               odf-operator-controller-manager-799b94c5bd-2gxtz_99e29a55-d118-494c-9606-1d22d327a253   2d19h
   .
   .
   openshift-storage-rbd-csi-ceph-com-csi-addons                       csi-rbdplugin-provisioner-f99d45b67-hbh4h                                               2d19h

The csi-addons container log of pod 'csi-rbdplugin-provisioner-f99d45b67-hbh4h' also logged the NetworkFence creation status

3. Delete the pod that has the lease (csi-rbdplugin-provisioner-f99d45b67-hbh4h) and create a NetworkFence on a new node
   [jopinto@jopinto 5mon]$ oc get pods -o wide
   NAME                   READY   STATUS        RESTARTS   AGE    IP            NODE        NOMINATED NODE   READINESS GATES
   logwriter-rbd-new-0    0/1     Terminating   0          18m    10.128.2.43   compute-5   <none>           <none>
   logwriter-rbd-new1-0   1/1     Running       0          111s   10.130.2.39   compute-4   <none>           <none>
   logwriter-rbd-new1-1   1/1     Running       0          98s    10.129.2.33   compute-3   <none>           <none>

4. [jopinto@jopinto 5mon]$ oc delete pod csi-rbdplugin-provisioner-f99d45b67-hbh4h -n openshift-storage
   pod "csi-rbdplugin-provisioner-f99d45b67-hbh4h" deleted
   [jopinto@jopinto 5mon]$ oc adm taint nodes compute-4 node.kubernetes.io/out-of-service=nodeshutdown:NoExecute
   node/compute-4 tainted
   [jopinto@jopinto 5mon]$ oc get networkfences.csiaddons.openshift.io  
   NAME                              DRIVER                               CIDRS               FENCESTATE   AGE   RESULT
   compute-4-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.8/32"]   Fenced       7s    
   compute-5-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.6/32"]   Fenced       21m   Succeeded
   [jopinto@jopinto 5mon]$ oc get networkfences.csiaddons.openshift.io  
   NAME                              DRIVER                               CIDRS               FENCESTATE   AGE   RESULT
   compute-4-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.8/32"]   Fenced       18s   
   compute-5-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.6/32"]   Fenced       21m   Succeeded

The NetworkFence result stays in a pending state until a new leader is elected

5. Get the new leader with the 'oc get lease -n openshift-storage' command; once a new leader gets elected, the NetworkFence result goes to the Succeeded state
   [jopinto@jopinto 5mon]$ oc get networkfences.csiaddons.openshift.io  
   NAME                              DRIVER                               CIDRS               FENCESTATE   AGE     RESULT
   compute-4-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.8/32"]   Fenced       2m49s   
   compute-5-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.6/32"]   Fenced       23m     Succeeded
   [jopinto@jopinto 5mon]$ oc get leases -n openshift-storage
   NAME                                                                HOLDER                                                                                  AGE
   4fd470de.openshift.io                                               odf-operator-controller-manager-799b94c5bd-2gxtz_99e29a55-d118-494c-9606-1d22d327a253   2d20h
   .
   .
   openshift-storage-rbd-csi-ceph-com-csi-addons                       csi-rbdplugin-provisioner-f99d45b67-chcph                                               2d20h
   [jopinto@jopinto 5mon]$ oc get networkfences.csiaddons.openshift.io  
   NAME                              DRIVER                               CIDRS               FENCESTATE   AGE     RESULT
   compute-4-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.8/32"]   Fenced       3m31s   Succeeded
   compute-5-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.6/32"]   Fenced       24m     Succeeded

The same can be verified from the csi-addons container log of the newly elected leader pod:

I0219 09:31:33.161210       1 leaderelection.go:255] failed to acquire lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:32:05.711435       1 leaderelection.go:354] lock is held by csi-rbdplugin-provisioner-f99d45b67-hbh4h and has not yet expired
I0219 09:32:05.711492       1 leaderelection.go:255] failed to acquire lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:32:40.459571       1 leaderelection.go:260] successfully acquired lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:32:40.459878       1 leader_election.go:184] new leader detected, current leader: csi-rbdplugin-provisioner-f99d45b67-chcph
I0219 09:32:40.460579       1 leader_election.go:177] became leader, starting
I0219 09:32:40.460611       1 main.go:140] Obtained leader status: lease name "openshift-storage-rbd-csi-ceph-com-csi-addons", receiving CONTROLLER_SERVICE requests
I0219 09:32:40.469957       1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:32:49.234141       1 connection.go:244] GRPC call: /fence.FenceController/FenceClusterNetwork
I0219 09:32:49.234177       1 connection.go:245] GRPC request: {"cidrs":[{"cidr":"100.64.0.8/32"}],"parameters":{"clusterID":"openshift-storage"},"secrets":"***stripped***"}
I0219 09:32:55.683768       1 connection.go:251] GRPC response: {}
I0219 09:32:55.683793       1 connection.go:252] GRPC error: <nil>
I0219 09:33:06.488275       1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:33:32.566363       1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:33:58.613941       1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:34:24.678497       1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:34:50.721443       1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:35:16.763902       1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:35:42.787795       1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:36:08.810169       1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
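
For reference, a log excerpt like the one above can be pulled with something along these lines (assuming the sidecar container in the provisioner pod is named csi-addons):

$ oc logs -n openshift-storage csi-rbdplugin-provisioner-f99d45b67-chcph -c csi-addons | grep -E 'lease|leader|Fence'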

This is as described in the verification steps (https://bugzilla.redhat.com/show_bug.cgi?id=2229670#c8), hence marking the bug as verified.

Comment 14 errata-xmlrpc 2024-03-19 15:22:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.15.0 security, enhancement, & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:1383

