Description of problem (please be as detailed as possible and provide log snippets):

After running a set of automated tests, it was observed that the deployments csi-rbdplugin-provisioner and csi-cephfsplugin-provisioner and the daemonsets csi-cephfsplugin and csi-rbdplugin were deployed on the provider cluster even though ROOK_CSI_DISABLE_DRIVER is "true". Upon further investigation, it was found that deleting the rook-ceph-operator pod caused this. If ocs-client-operator-controller-manager is restarted afterwards, these deployments and daemonsets, which are owned by the rook-ceph-operator deployment, are deleted again.

Initial status:

% oc get daemonsets
NAME                                               DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
openshift-storage.cephfs.csi.ceph.com-nodeplugin   3         3         3       3            3           <none>          41h
openshift-storage.rbd.csi.ceph.com-nodeplugin      3         3         3       3            3           <none>          41h

% oc get deployments | grep -E "provisioner|ctrlplugin"
openshift-storage.cephfs.csi.ceph.com-ctrlplugin   2/2     2            2           41h
openshift-storage.rbd.csi.ceph.com-ctrlplugin      2/2     2            2           41h

% oc get pods | grep ocs-client-operator-controller-manager
ocs-client-operator-controller-manager-9bd575ccb-rxgjs   2/2     Running   0          5m2s

% oc get pods | grep rook-ceph-operator
rook-ceph-operator-8654886f75-vz9z7   1/1     Running   0          21h

Delete the rook-ceph-operator pod:

% oc delete pod rook-ceph-operator-8654886f75-vz9z7
pod "rook-ceph-operator-8654886f75-vz9z7" deleted

% oc get pods | grep rook-ceph-operator
rook-ceph-operator-8654886f75-qxs7g   1/1     Running   0          33s

New CSI daemonsets and deployments are created by Rook:

% oc get daemonsets
NAME                                               DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
csi-cephfsplugin                                   3         3         3       3            3           <none>          27s
csi-rbdplugin                                      3         3         0       3            0           <none>          27s
openshift-storage.cephfs.csi.ceph.com-nodeplugin   3         3         3       3            3           <none>          41h
openshift-storage.rbd.csi.ceph.com-nodeplugin      3         3         3       3            3           <none>          41h

% oc get deployments | grep -E "provisioner|ctrlplugin"
csi-cephfsplugin-provisioner                       2/2     2            2           43s
csi-rbdplugin-provisioner                          2/2     2            2           43s
openshift-storage.cephfs.csi.ceph.com-ctrlplugin   2/2     2            2           41h
openshift-storage.rbd.csi.ceph.com-ctrlplugin      2/2     2            2           41h

Delete the ocs-client-operator-controller-manager pod:

% oc delete pod ocs-client-operator-controller-manager-9bd575ccb-rxgjs
pod "ocs-client-operator-controller-manager-9bd575ccb-rxgjs" deleted

% oc get pods | grep ocs-client-operator-controller-manager
ocs-client-operator-controller-manager-9bd575ccb-dpvfc   0/2     Init:0/1          0          69s

% oc get pods | grep ocs-client-operator-controller-manager
ocs-client-operator-controller-manager-9bd575ccb-dpvfc   0/2     PodInitializing   0          91s

% oc get pods | grep ocs-client-operator-controller-manager
ocs-client-operator-controller-manager-9bd575ccb-dpvfc   2/2     Running           0          102s

The previously created daemonsets and deployments are deleted automatically:
% oc get daemonsets
NAME                                               DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
openshift-storage.cephfs.csi.ceph.com-nodeplugin   3         3         3       3            3           <none>          41h
openshift-storage.rbd.csi.ceph.com-nodeplugin      3         3         3       3            3           <none>          41h

% oc get deployments | grep -E "provisioner|ctrlplugin"
openshift-storage.cephfs.csi.ceph.com-ctrlplugin   2/2     2            2           41h
openshift-storage.rbd.csi.ceph.com-ctrlplugin      2/2     2            2           41h

% oc get cm ocs-operator-config -o yaml
apiVersion: v1
data:
  CSI_CLUSTER_NAME: 883d5e66-3214-42cf-8dec-17630a5f4328
  CSI_DISABLE_HOLDER_PODS: "true"
  CSI_ENABLE_TOPOLOGY: "false"
  CSI_TOPOLOGY_DOMAIN_LABELS: ""
  ROOK_CSI_DISABLE_DRIVER: "true"
  ROOK_CSI_ENABLE_NFS: "false"
  ROOK_CURRENT_NAMESPACE_ONLY: "true"
kind: ConfigMap
metadata:
  creationTimestamp: "2024-09-18T19:47:39Z"
  name: ocs-operator-config
  namespace: openshift-storage
  ownerReferences:
  - apiVersion: ocs.openshift.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: OCSInitialization
    name: ocsinit
    uid: b1fdeeaf-706f-4897-a906-a595452b123c
  resourceVersion: "90610"
  uid: 62d174d7-5275-41cc-96f8-a227192a5da9

% oc exec rook-ceph-operator-8654886f75-vz9z7 -- printenv | grep -i disable_driver
ROOK_CSI_DISABLE_DRIVER=true

The ownerReferences for the provisioner deployments and the csi-cephfsplugin and csi-rbdplugin daemonsets are:

  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: false
    controller: true
    kind: Deployment
    name: rook-ceph-operator

=======================================================

Version of all relevant components (if applicable):

% oc get csv
NAME                                         DISPLAY                            VERSION               REPLACES                                     PHASE
cephcsi-operator.v4.17.0-101.stable          CephCSI operator                   4.17.0-101.stable                                                  Succeeded
ingress-node-firewall.v4.16.0-202408262007   Ingress Node Firewall Operator     4.16.0-202408262007   ingress-node-firewall.v4.16.0-202409051837   Succeeded
mcg-operator.v4.17.0-101.stable              NooBaa Operator                    4.17.0-101.stable                                                  Succeeded
metallb-operator.v4.17.0-202409182235        MetalLB Operator                   4.17.0-202409182235   metallb-operator.v4.17.0-202409161407        Succeeded
ocs-client-operator.v4.17.0-101.stable       OpenShift Data Foundation Client   4.17.0-101.stable                                                  Succeeded
ocs-operator.v4.17.0-101.stable              OpenShift Container Storage        4.17.0-101.stable                                                  Succeeded
odf-csi-addons-operator.v4.17.0-101.stable   CSI Addons                         4.17.0-101.stable                                                  Succeeded
odf-operator.v4.17.0-101.stable              OpenShift Data Foundation          4.17.0-101.stable                                                  Succeeded
odf-prometheus-operator.v4.17.0-101.stable   Prometheus Operator                4.17.0-101.stable                                                  Succeeded
recipe.v4.17.0-101.stable                    Recipe                             4.17.0-101.stable                                                  Succeeded
rook-ceph-operator.v4.17.0-101.stable        Rook-Ceph                          4.17.0-101.stable                                                  Succeeded

% oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.16.10   True        False         39h     Cluster version is 4.16.10

==================================================

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Can this issue be reproduced?
Reporting the first occurrence

Can this issue be reproduced from the UI?

If this is a regression, please provide more details to justify this:
Yes

============================================

Steps to Reproduce:
1. On a 4.17 provider cluster, delete the rook-ceph-operator pod.
2. Check for the deployments csi-cephfsplugin-provisioner and csi-rbdplugin-provisioner and the daemonsets csi-cephfsplugin and csi-rbdplugin. These should not be present (see the scripted check below).
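The check in step 2 can be scripted; a minimal sketch, assuming the resources would be created in the openshift-storage namespace:

for res in deployment/csi-cephfsplugin-provisioner deployment/csi-rbdplugin-provisioner \
           daemonset/csi-cephfsplugin daemonset/csi-rbdplugin; do
  # prints nothing for resources that do not exist
  oc get "$res" -n openshift-storage --ignore-not-found
done

If the loop prints nothing, the driver stayed disabled; any of the four resources being listed reproduces the bug.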
==============================================

Actual results:
The deployments csi-rbdplugin-provisioner and csi-cephfsplugin-provisioner and the daemonsets csi-cephfsplugin and csi-rbdplugin were deployed.

Expected results:
Rook should not deploy the CSI drivers as long as ROOK_CSI_DISABLE_DRIVER is "true".

Additional info:
Must-gather log: http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-2313736/
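To confirm which controller created the stray workloads while they exist, their ownerReferences can be read directly; a minimal check, assuming they were created in the openshift-storage namespace:

% oc get deployment csi-rbdplugin-provisioner -n openshift-storage \
    -o jsonpath='{range .metadata.ownerReferences[*]}{.kind}/{.name}{"\n"}{end}'
% oc get daemonset csi-cephfsplugin -n openshift-storage \
    -o jsonpath='{range .metadata.ownerReferences[*]}{.kind}/{.name}{"\n"}{end}'

Both queries should print Deployment/rook-ceph-operator, matching the ownerReferences snippet captured above.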
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.17.0 Security, Enhancement, & Bug Fix Update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:8676
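For re-verification on the erratum build, the scenario can be replayed end to end; a minimal sketch, assuming the rook-ceph-operator deployment runs in openshift-storage and its pods carry the usual app=rook-ceph-operator label:

% oc delete pod -n openshift-storage -l app=rook-ceph-operator
% oc wait -n openshift-storage --for=condition=Ready pod -l app=rook-ceph-operator --timeout=180s
% sleep 60   # give the operator time to reconcile
% oc get deployment,daemonset -n openshift-storage | grep -E '/csi-(rbdplugin|cephfsplugin)' \
    || echo "no stray csi-* workloads (expected)"

No csi-*-provisioner deployments or csi-* plugin daemonsets should appear after the operator pod restarts.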