Bug 2229670 - All the Controller Operations should reach the one Controller (active) not multiple Controllers
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: csi-addons
Version: 4.14
Hardware: All
OS: All
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Niels de Vos
QA Contact: krishnaram Karthick
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-08-07 09:41 UTC by Madhu Rajanna
Modified: 2023-08-09 16:37 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github csi-addons kubernetes-csi-addons issues 422 0 None open All the Controller Operations should reach the one Controller (active) not multiple Controllers 2023-08-15 12:43:01 UTC

Description Madhu Rajanna 2023-08-07 09:41:53 UTC
As of today, kubernetes-csi-addons connects to a random controller among those that are registered and sends its RPC calls to that controller. This can cause problems if the CSI driver implements an internal locking mechanism or keeps a local cache for the lifetime of an instance.

Example:

CephCSI runs deployments for Replication, ReclaimSpace, etc., each with two instances running. CephCSI internally takes a lock and processes one request at a time based on its internal logic. With the current Kubernetes sidecars this is not a problem, because each sidecar runs with leader election and only one instance can process a request. With kubernetes-csi-addons, however, it becomes a problem, because there is no such mechanism to reach the same controller/deployment that is already processing the requests.
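
To make the locking problem concrete, here is a minimal sketch in Go. It is not CephCSI code: driverInstance, tryLock, and the controller and volume names are hypothetical, and the only assumption is that each driver instance keeps its volume locks in its own memory.

```go
package main

import (
	"fmt"
	"sync"
)

// driverInstance is a hypothetical stand-in for one CSI driver controller pod.
// Its volume locks live only in this instance's memory.
type driverInstance struct {
	name   string
	mu     sync.Mutex
	locked map[string]bool
}

func newDriverInstance(name string) *driverInstance {
	return &driverInstance{name: name, locked: make(map[string]bool)}
}

// tryLock reports whether this instance could take its local lock for volumeID.
func (d *driverInstance) tryLock(volumeID string) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.locked[volumeID] {
		return false // an operation on this volume is already in progress here
	}
	d.locked[volumeID] = true
	return true
}

func main() {
	a := newDriverInstance("controller-a")
	b := newDriverInstance("controller-b")

	// Two operations on the same volume sent to the same instance:
	// the second is rejected while the first is still in progress.
	fmt.Println(a.tryLock("vol-1")) // true
	fmt.Println(a.tryLock("vol-1")) // false

	// The same second operation sent to the other instance is accepted,
	// because controller-b knows nothing about controller-a's lock.
	fmt.Println(b.tryLock("vol-1")) // true
}
```

The last call is the case this bug is about: a per-instance lock only serializes requests that arrive at the same instance.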

The request is to provide this functionality for CSI drivers that have this kind of requirement, so that they do not have to run in an active/active model, which can lead to many different problems.


Example:

In a ControllerReclaimSpace operation, if space is being reclaimed by one CSI driver instance and the CR is updated in the meantime, we might end up making one more call for the same volume to another CSI driver instance.

This matters mainly for VolumeReplication operations, where we have different RPC calls for enable, disable, promote, demote, get, etc., and we might end up issuing different RPC calls for the same volume to different CSI driver instances.
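
One possible shape of the requested behaviour, as a rough sketch only: the types and functions below (controllerConn, pickRandom, pickActive), the pod names, and the endpoints are all hypothetical and are not the kubernetes-csi-addons API. The sketch assumes the manager can learn the identity of the active controller somehow, for example from the holder of a leader-election lease.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// controllerConn is a hypothetical handle for one registered controller endpoint.
type controllerConn struct {
	podName  string
	endpoint string
}

var errNoController = errors.New("no matching controller registered")

// pickRandom mirrors the behaviour described in this bug:
// any registered controller may be chosen for a given call.
func pickRandom(conns []controllerConn) (controllerConn, error) {
	if len(conns) == 0 {
		return controllerConn{}, errNoController
	}
	return conns[rand.Intn(len(conns))], nil
}

// pickActive returns only the controller whose pod name matches leader.
// How leader is determined is an assumption of this sketch.
func pickActive(conns []controllerConn, leader string) (controllerConn, error) {
	for _, c := range conns {
		if c.podName == leader {
			return c, nil
		}
	}
	return controllerConn{}, fmt.Errorf("leader %q: %w", leader, errNoController)
}

func main() {
	conns := []controllerConn{
		{podName: "ctrl-0", endpoint: "10.0.0.1:9070"},
		{podName: "ctrl-1", endpoint: "10.0.0.2:9070"},
	}

	// Every replication RPC for a volume goes to the same, active instance.
	for _, op := range []string{"enable", "promote", "demote", "get"} {
		c, err := pickActive(conns, "ctrl-1")
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%s -> %s\n", op, c.endpoint)
	}
}
```

With pickActive, a sequence such as enable, promote, demote for one volume always lands on the same instance, so the driver's internal lock and local cache stay meaningful. With pickRandom, each call in the sequence may land on a different instance.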

