As of today, the kubernetes-csi-addons controller connects to an arbitrary registered sidecar and sends its RPC calls there. This is a problem when the CSI driver implements an internal locking mechanism or keeps a local cache for the lifetime of a single instance.

Example: Ceph-CSI runs deployments for Replication/ReclaimSpace etc. with two instances. Internally, Ceph-CSI takes a lock and processes one request at a time based on its own logic. With the standard Kubernetes sidecars this is not a problem, because the sidecars run with leader election and only one of them processes requests. With kubernetes-csi-addons, however, there is no mechanism to always reach the same controller/deployment that is already processing requests.

The request is to provide similar functionality, so that CSI drivers with this requirement do not have to support an active/active model, which can lead to several problem scenarios:
- For the ControllerReclaimSpace operation, while space is being reclaimed by one CSI driver instance, a CR update can trigger a second call for the same volume to another instance.
- For VolumeReplication operations there are separate RPC calls (enable, disable, promote, demote, get, etc.), and these can end up being issued for the same volume to different CSI driver instances.
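To illustrate why a per-instance lock does not help here, a minimal sketch (the class and names are hypothetical, not Ceph-CSI code): each driver instance only holds an in-process lock, so a lock taken in one instance does not exclude another instance from acting on the same volume.

```python
import threading

# Hypothetical sketch: each CSI driver instance guards its work with an
# in-process lock, scoped to that one process only.
class DriverInstance:
    def __init__(self):
        self.lock = threading.Lock()

instance_a = DriverInstance()
instance_b = DriverInstance()

# Instance A starts a long-running operation (e.g. ReclaimSpace) on "vol-1" ...
instance_a.lock.acquire()

# ... yet instance B can still take its own lock and act on the same volume,
# because nothing coordinates the two instances.
print(instance_b.lock.acquire(blocking=False))  # True
```

Leader election removes this window by ensuring only one instance receives the requests in the first place.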
The following approach is in development:
- the CSI-Addons sidecar sets up leader election when the driver supports CONTROLLER_SERVICE
- leader election creates a Lease object that
  1. is unique per CSI driver name
  2. identifies a single CSI-Addons Pod as the leader
- the CSI-Addons controller gets the leader from the Lease
- the CSI-Addons controller sends NetworkFence requests to the leader only

Leader election comes with default options for timeout/retry and so on. OpenShift prefers to reconfigure these options to less aggressive values to reduce the load on the API-server. These options will need to be set in the deployment for the Ceph-CSI driver (ocs-operator).
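For readability, the per-driver Lease name observed in this cluster looks like the CSI driver name with dots replaced by dashes plus a "-csi-addons" suffix. The sketch below derives the name that way; this is an assumption inferred from the objects seen during verification, not a documented guarantee of the sidecar.

```python
# Assumed naming scheme (inferred from the Lease objects seen on this
# cluster, not a documented guarantee): CSI driver name with "." replaced
# by "-", plus a "-csi-addons" suffix.
def lease_name(driver_name: str) -> str:
    return driver_name.replace(".", "-") + "-csi-addons"

print(lease_name("openshift-storage.rbd.csi.ceph.com"))
# -> openshift-storage-rbd-csi-ceph-com-csi-addons
```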
https://github.com/csi-addons/kubernetes-csi-addons/pull/492 is the upstream PR
Available in the release-4.15 branch of kubernetes-csi-addons.
To verify the correct working:
1. make sure there are multiple Ceph-CSI RBD provisioners
2. run NetworkFence operations (can be done with non-workernode IP-addresses to keep the cluster functional)
3. verify that the NetworkFence operations are handled by the Ceph-CSI provisioner that holds the csi-addons-..drivername.. lease
4. delete the Ceph-CSI provisioner that holds the lease
5. run more NetworkFence operations
6. verify that the operations stay pending until another Ceph-CSI RBD provisioner obtains the lease and handles the operations

While checking that the provisioner that holds the lease handles the NetworkFence operations, none of the other Ceph-CSI RBD provisioners should handle them. The logs of the kubernetes-csi-addons operator contain details about the selected csi-addons sidecar that operations are sent to.
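For step 3, the current leader can be read from the Lease object's `spec.holderIdentity` field (part of the coordination.k8s.io/v1 API). A minimal sketch, using an illustrative pod name and a Lease trimmed to the relevant fields, as it would come back from `oc get lease -n openshift-storage <name> -o json`:

```python
import json

# Sketch: extract the holder (the leading csi-addons sidecar pod) from a
# Lease fetched with `oc get lease ... -o json`.
def lease_holder(lease: dict) -> str:
    return lease["spec"]["holderIdentity"]

# Illustrative Lease object, trimmed to the relevant fields.
example = json.loads("""
{
  "metadata": {"name": "openshift-storage-rbd-csi-ceph-com-csi-addons"},
  "spec": {"holderIdentity": "csi-rbdplugin-provisioner-f99d45b67-hbh4h"}
}
""")
print(lease_holder(example))  # csi-rbdplugin-provisioner-f99d45b67-hbh4h
```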
Verified with OCP build 4.15.0-0.nightly-2024-02-14-214710 and ODF 4.15.0-142.

1. On a 3M-6W node cluster, created an RBD NetworkFence by tainting the node that has the RBD volume, using the command 'oc adm taint nodes compute-5 node.kubernetes.io/out-of-service=nodeshutdown:NoExecute':

[jopinto@jopinto 5mon]$ oc get pvc
NAME                                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
logwriter-rbd-new-logwriter-rbd-new-0   Bound    pvc-a8442931-3254-4a04-a33b-e54bd154e912   10Gi       RWO            ocs-storagecluster-ceph-rbd   38s

[jopinto@jopinto 5mon]$ oc adm taint nodes compute-5 node.kubernetes.io/out-of-service=nodeshutdown:NoExecute
node/compute-5 tainted

2. The NetworkFence gets created as expected, and 'oc get lease -n openshift-storage' shows the RBD provisioner leader pod.

[jopinto@jopinto 5mon]$ oc get networkfences.csiaddons.openshift.io
NAME                              DRIVER                               CIDRS               FENCESTATE   AGE   RESULT
compute-5-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.6/32"]   Fenced       6s    Succeeded

[jopinto@jopinto 5mon]$ oc get leases -n openshift-storage
NAME                                            HOLDER                                                                                  AGE
4fd470de.openshift.io                           odf-operator-controller-manager-799b94c5bd-2gxtz_99e29a55-d118-494c-9606-1d22d327a253   2d19h
.
.
openshift-storage-rbd-csi-ceph-com-csi-addons   csi-rbdplugin-provisioner-f99d45b67-hbh4h                                               2d19h

The csi-addons container log of pod 'csi-rbdplugin-provisioner-f99d45b67-hbh4h' also logged the NetworkFence creation status.

3. Delete the pod that holds the lease (csi-rbdplugin-provisioner-f99d45b67-hbh4h) and create a NetworkFence on a new node.

[jopinto@jopinto 5mon]$ oc get pods -o wide
NAME                   READY   STATUS        RESTARTS   AGE    IP            NODE        NOMINATED NODE   READINESS GATES
logwriter-rbd-new-0    0/1     Terminating   0          18m    10.128.2.43   compute-5   <none>           <none>
logwriter-rbd-new1-0   1/1     Running       0          111s   10.130.2.39   compute-4   <none>           <none>
logwriter-rbd-new1-1   1/1     Running       0          98s    10.129.2.33   compute-3   <none>           <none>

4.

[jopinto@jopinto 5mon]$ oc delete pod csi-rbdplugin-provisioner-f99d45b67-hbh4h -n openshift-storage
pod "csi-rbdplugin-provisioner-f99d45b67-hbh4h" deleted

[jopinto@jopinto 5mon]$ oc adm taint nodes compute-4 node.kubernetes.io/out-of-service=nodeshutdown:NoExecute
node/compute-4 tainted

[jopinto@jopinto 5mon]$ oc get networkfences.csiaddons.openshift.io
NAME                              DRIVER                               CIDRS               FENCESTATE   AGE   RESULT
compute-4-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.8/32"]   Fenced       7s
compute-5-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.6/32"]   Fenced       21m   Succeeded

[jopinto@jopinto 5mon]$ oc get networkfences.csiaddons.openshift.io
NAME                              DRIVER                               CIDRS               FENCESTATE   AGE   RESULT
compute-4-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.8/32"]   Fenced       18s
compute-5-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.6/32"]   Fenced       21m   Succeeded

The NetworkFence result stays pending until a new leader is elected.

5. Get the new leader with 'oc get lease -n openshift-storage'; once the new leader is elected, the NetworkFence result goes to Succeeded.

[jopinto@jopinto 5mon]$ oc get networkfences.csiaddons.openshift.io
NAME                              DRIVER                               CIDRS               FENCESTATE   AGE     RESULT
compute-4-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.8/32"]   Fenced       2m49s
compute-5-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.6/32"]   Fenced       23m     Succeeded

[jopinto@jopinto 5mon]$ oc get leases -n openshift-storage
NAME                                            HOLDER                                                                                  AGE
4fd470de.openshift.io                           odf-operator-controller-manager-799b94c5bd-2gxtz_99e29a55-d118-494c-9606-1d22d327a253   2d20h
.
.
openshift-storage-rbd-csi-ceph-com-csi-addons   csi-rbdplugin-provisioner-f99d45b67-chcph                                               2d20h

[jopinto@jopinto 5mon]$ oc get networkfences.csiaddons.openshift.io
NAME                              DRIVER                               CIDRS               FENCESTATE   AGE     RESULT
compute-4-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.8/32"]   Fenced       3m31s   Succeeded
compute-5-rbd-openshift-storage   openshift-storage.rbd.csi.ceph.com   ["100.64.0.6/32"]   Fenced       24m     Succeeded

The same can be verified from the csi-addons container log of the newly elected leader pod:

I0219 09:31:33.161210 1 leaderelection.go:255] failed to acquire lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:32:05.711435 1 leaderelection.go:354] lock is held by csi-rbdplugin-provisioner-f99d45b67-hbh4h and has not yet expired
I0219 09:32:05.711492 1 leaderelection.go:255] failed to acquire lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:32:40.459571 1 leaderelection.go:260] successfully acquired lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:32:40.459878 1 leader_election.go:184] new leader detected, current leader: csi-rbdplugin-provisioner-f99d45b67-chcph
I0219 09:32:40.460579 1 leader_election.go:177] became leader, starting
I0219 09:32:40.460611 1 main.go:140] Obtained leader status: lease name "openshift-storage-rbd-csi-ceph-com-csi-addons", receiving CONTROLLER_SERVICE requests
I0219 09:32:40.469957 1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:32:49.234141 1 connection.go:244] GRPC call: /fence.FenceController/FenceClusterNetwork
I0219 09:32:49.234177 1 connection.go:245] GRPC request: {"cidrs":[{"cidr":"100.64.0.8/32"}],"parameters":{"clusterID":"openshift-storage"},"secrets":"***stripped***"}
I0219 09:32:55.683768 1 connection.go:251] GRPC response: {}
I0219 09:32:55.683793 1 connection.go:252] GRPC error: <nil>
I0219 09:33:06.488275 1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:33:32.566363 1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:33:58.613941 1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:34:24.678497 1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:34:50.721443 1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:35:16.763902 1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:35:42.787795 1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons
I0219 09:36:08.810169 1 leaderelection.go:281] successfully renewed lease openshift-storage/openshift-storage-rbd-csi-ceph-com-csi-addons

This matches the verification steps in https://bugzilla.redhat.com/show_bug.cgi?id=2229670#c8, hence marking the bug as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.15.0 security, enhancement, & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2024:1383