Bug 2219033

Summary: rook-ceph-operator overwrites CSI driver DaemonSet and clears tolerations when it's restarted
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: adpawar
Component: rook Assignee: Madhu Rajanna <mrajanna>
Status: CLOSED NOTABUG QA Contact: Neha Berry <nberry>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.11 CC: brgardne, mrajanna, muagarwa, ocs-bugs, odf-bz-bot
Target Milestone: --- Flags: mrajanna: needinfo? (adpawar)
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-07-12 06:58:46 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description adpawar 2023-07-01 06:17:32 UTC
Description of problem (please be as detailed as possible and provide log snippets):

In testing with OCP 4.11.42 and ODF 4.11.8, rook-ceph-operator overwrites the CSI driver DaemonSet and clears any manually added tolerations whenever the operator is restarted.


Does this issue impact your ability to continue to work with the product?
(please explain in detail what is the user impact)?
Every time the operator restarts, the customer has to re-add the toleration manually.

Is there any workaround available to the best of your knowledge?
Re-adding the toleration manually after each operator restart.

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?
Yes, I have provided a reproducer.

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:


Steps to Reproduce:

1. Install OCP 4.11.42 and ODF 4.11.8
2. Run `oc -n openshift-storage edit daemonset csi-rbdplugin` and add a new toleration to it.
3. Observe that the toleration persists; nothing removes it.
4. Restart rook-ceph-operator ( oc scale --replicas=0 -n openshift-storage deployment/rook-ceph-operator; sleep 1; oc scale --replicas=1 -n openshift-storage deployment/rook-ceph-operator )
5. Observe that the toleration has been removed.
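
The steps above can be sketched as a small script. This is only an illustration, not something taken from the customer environment: the toleration key `example.com/repro`, the `DRY_RUN` switch, and the 60-second wait are assumptions made for the sketch, and the JSON patch assumes the DaemonSet already has a `tolerations` list. By default the script only prints the commands; set `DRY_RUN=0` on a test cluster with a logged-in `oc` session to execute them.

```shell
#!/bin/sh
# Sketch of the reproducer. Prints commands by default (DRY_RUN=1);
# set DRY_RUN=0 to run them against a cluster where `oc` is logged in.
set -eu

NS=openshift-storage
DS=csi-rbdplugin
# Hypothetical toleration, used only to have something to look for later.
TOLERATION='{"key":"example.com/repro","operator":"Exists","effect":"NoSchedule"}'
# JSON patch that appends to the list; assumes tolerations already exists.
PATCH="[{\"op\":\"add\",\"path\":\"/spec/template/spec/tolerations/-\",\"value\":$TOLERATION}]"

run() {
  # Echo the command, and execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

reproduce() {
  # Step 2: add a toleration to the CSI RBD plugin DaemonSet.
  run oc -n "$NS" patch daemonset "$DS" --type=json -p "$PATCH"
  # Step 3: show the tolerations; the new one should be listed.
  run oc -n "$NS" get daemonset "$DS" -o 'jsonpath={.spec.template.spec.tolerations}'
  # Step 4: restart the operator by scaling it down and back up.
  run oc -n "$NS" scale --replicas=0 deployment/rook-ceph-operator
  run sleep 1
  run oc -n "$NS" scale --replicas=1 deployment/rook-ceph-operator
  run oc -n "$NS" rollout status deployment/rook-ceph-operator
  # Give the operator time to reconcile (arbitrary wait, an assumption).
  run sleep 60
  # Step 5: show the tolerations again; the added one is expected to be gone.
  run oc -n "$NS" get daemonset "$DS" -o 'jsonpath={.spec.template.spec.tolerations}'
}

reproduce
```

Comparing the two `get` outputs shows whether the manually added toleration survived the operator restart.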

Version of all relevant components (if applicable):

OCP 4.11.42 and ODF 4.11.8

Actual results:
Tolerations are getting wiped out after the operator restarts.

Expected results:
Tolerations should remain intact after the operator restarts, regardless of the reason for the restart.

Additional info:
This was mainly seen during the OCP upgrade. However, I don't think it is specific to upgrades; an upgrade is just one of many events that cause rook-ceph-operator to restart.

Comment 3 Blaine Gardner 2023-07-11 15:11:12 UTC
@mrajanna do you think there should be a doc update here, or close with NOTABUG?

@muagarwa thoughts?