Bug 1992472 - How to add toleration to OCS pods for any non OCS taints?
Summary: How to add toleration to OCS pods for any non OCS taints?
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: rook
Version: 4.8
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ODF 4.9.0
Assignee: Subham Rai
QA Contact: Shrivaibavi Raghaventhiran
URL:
Whiteboard:
Duplicates: 1949553 (view as bug list)
Depends On: 2005937
Blocks: 1999158
 
Reported: 2021-08-11 07:43 UTC by Bipin Kunal
Modified: 2023-08-09 17:03 UTC
CC: 20 users

Fixed In Version: v4.9.0-158.ci
Doc Type: Bug Fix
Doc Text:
.Adding toleration to OpenShift Container Storage pods for any non OpenShift Container Storage taints
Previously, pods could not be scheduled on nodes with non OpenShift Container Storage taints, as the tolerations were not applied. With this update, the tolerations are applied successfully and the pods can be scheduled on nodes with non OpenShift Container Storage taints.
Clone Of:
Cloned to: 1999158 (view as bug list)
Environment:
Last Closed: 2021-12-13 17:44:58 UTC
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Github red-hat-storage ocs-ci pull 5206 0 None Merged <Closed loop BZ: 1992472> Non-ocs Taint and toleration 2022-01-05 11:27:35 UTC
Github rook rook pull 8566 0 None None None 2021-08-19 11:29:06 UTC
Red Hat Knowledge Base (Article) 6408481 0 None None None 2021-10-29 13:56:52 UTC
Red Hat Product Errata RHSA-2021:5086 0 None None None 2021-12-13 17:45:57 UTC

Description Bipin Kunal 2021-08-11 07:43:22 UTC
Describe the issue:

We do not have any documented way to add toleration to the OCS-specific pods for any random non-OCS taints. 


Describe the task you were trying to accomplish:


The use case here is to allow OCS pods to run on worker nodes which have some non-OCS taints.


Suggestions for improvement:

We need to have the steps documented so that customers can use them when needed.

Comment 7 David Juran 2021-08-11 11:14:26 UTC
*** Bug 1949553 has been marked as a duplicate of this bug. ***

Comment 11 Subham Rai 2021-08-12 13:45:56 UTC
Hi Bipul,

try `storageClassDeviceSets` instead of `storageDeviceSets`
In Rook, we set placement for osds/prepare-osd under `storageClassDeviceSets`

See `StorageClassDeviceSets []StorageClassDeviceSet` (JSON tag `storageClassDeviceSets,omitempty`):
https://github.com/rook/rook/blob/master/pkg/apis/ceph.rook.io/v1/types.go#L1913
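
For illustration, a minimal CephCluster fragment along those lines (a sketch only; the device-set name `set1` and the `xyz` taint key are placeholders):

```
spec:
  storage:
    storageClassDeviceSets:
    - name: set1
      count: 3
      placement:             # placement for the OSD pods of this device set
        tolerations:
        - key: xyz
          operator: Equal
          value: "true"
          effect: NoSchedule
      preparePlacement:      # optional separate placement for the prepare-osd jobs
        tolerations:
        - key: xyz
          operator: Equal
          value: "true"
          effect: NoSchedule
```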

Comment 12 Subham Rai 2021-08-12 13:47:34 UTC
sorry (In reply to subham from comment #11)
> Hi Bipul,
sorry Bipin
> 
> try `storageClassDeviceSets` instead of `storageDeviceSets`
> In Rook, we set placement for osds/prepare-osd under `storageClassDeviceSets`
> 
> See `StorageClassDeviceSets []StorageClassDeviceSet` (JSON tag
> `storageClassDeviceSets,omitempty`):
> https://github.com/rook/rook/blob/master/pkg/apis/ceph.rook.io/v1/types.go#L1913

Comment 13 Subham Rai 2021-08-16 04:55:44 UTC
The workaround that I commented on in the Google Chat (link mentioned in c6) will not work: it only reads the `"osd"` key if `noPlacement` is true and `supportTSC` is false, which is not the case here.

Comment 15 Jose A. Rivera 2021-08-16 17:15:27 UTC
When using a StorageCluster to create and manage a CephCluster, you *MUST NOT* try to modify the CephCluster CR directly: ocs-operator will always revert any and all changes on each iteration of its reconcile loop. To have any changes persist, you must either: 1. Only interact with the StorageCluster, or 2. Scale the ocs-operator Deployment to 0 so the operator is no longer running.

By default we specify three Placements: all, mon, and arbiter. See: https://github.com/openshift/ocs-operator/blob/release-4.8/controllers/storagecluster/cephcluster.go#L306-L308

With the "all" Placement, rook-ceph-operator should be trying to merge it with any other Placements for more specific components, giving preference to the values in the more specific Placements. As such, even *if* we specify or generate Placements for the osd and osd-prepare Pods, the values in "all" (specifically for Tolerations) should be included in the Placements calculated by rook-ceph-operator.

For completeness, here is where we generate the OSD Placements: https://github.com/openshift/ocs-operator/blob/release-4.8/controllers/storagecluster/cephcluster.go#L523-L624

@tnielsen Could you have a look to see if ocs-operator is doing something wrong? If not, we may have a bug with StorageClassDeviceSets in Rook.

Comment 16 Travis Nielsen 2021-08-16 18:15:03 UTC
Bipin, can you attach the full CephCluster CR? It will help us understand what Rook is actually trying to reconcile.

If the tolerations are specified both on "all" and on the storageClassDeviceSet placement or preparePlacement, only one of them will be applied. I believe the storageClassDeviceSet tolerations will have higher precedence than the "all" tolerations. But the tolerations are not merged. Only nodeAffinity is merged for OSDs with "all".

Comment 17 Prasad Desala 2021-08-17 05:32:04 UTC
Setting the needinfo on Vaibhavi to provide the requested information in comment 16

Comment 20 Travis Nielsen 2021-08-18 16:46:46 UTC
From the cluster.yaml attached, I see that the tolerations are specified both in "all" and under the storageClassDeviceSets, and the tolerations from "all" are not being applied. 
      all:
        tolerations:
        - effect: NoSchedule
          key: xyz
          operator: Equal
          value: "true"
      storageClassDeviceSets:
      - count: 3
        encrypted: true
        name: ocs-deviceset-localblock-0
        placement:
          tolerations:
          - effect: NoSchedule
            key: node.ocs.openshift.io/storage
            operator: Equal
            value: "true"


Subham, can you take a look at merging the tolerations when they are specified in both places? ApplyToPodSpec() only takes one set, then ignores the tolerations from "all". But, similar to the node affinity, if onlyApplyOSDPlacement: true we would not want to merge them.
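
Schematically, the desired outcome for the spec above would be something like the following sketch (the expected merge, not the current behavior, assuming onlyApplyOSDPlacement is not set):

```
# Desired tolerations on the rendered OSD pod spec: the device-set
# toleration plus the one inherited from "all".
tolerations:
- effect: NoSchedule
  key: node.ocs.openshift.io/storage
  operator: Equal
  value: "true"
- effect: NoSchedule
  key: xyz
  operator: Equal
  value: "true"
```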

Comment 25 Mudit Agarwal 2021-09-01 05:42:23 UTC
Yes Bipin, it is available in the latest 4.9 downstream build

Comment 29 Shrivaibavi Raghaventhiran 2021-09-03 12:01:47 UTC
Can you please let us know the steps to perform on OCS 4.9 ?

Comment 30 Subham Rai 2021-09-03 13:57:28 UTC
(In reply to Shrivaibavi Raghaventhiran from comment #29)
> Can you please let us know the steps to perform on OCS 4.9 ?

1. Add taints on the nodes (ex: oc adm taint nodes node1 xyz=true:NoSchedule)
2. Add tolerations under placement in the storagecluster yaml (see the sketch below)

Note: 
1) If you want the toleration to be applied to all the Ceph pods (like MON, OSD, MGR), then you can add the toleration directly under placement.all.
2) If you just want to add a toleration for the OSDs, then add the toleration under Storage.StorageClassDeviceSets.Placement
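
For illustration, a minimal StorageCluster fragment combining both notes (a sketch; the `xyz` key and the device-set name are examples, and the device-set field appears as `storageDeviceSets` in the StorageCluster CR, as in the specs shown later in this bug):

```
spec:
  placement:
    all:                     # tolerations for all the Ceph pods (MON, OSD, MGR, ...)
      tolerations:
      - key: xyz
        operator: Equal
        value: "true"
        effect: NoSchedule
  storageDeviceSets:
  - name: ocs-deviceset
    placement:               # tolerations for the OSD pods only
      tolerations:
      - key: xyz
        operator: Equal
        value: "true"
        effect: NoSchedule
```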

Comment 31 Travis Nielsen 2021-09-03 14:26:30 UTC
Also see https://bugzilla.redhat.com/show_bug.cgi?id=1992472#c20 for the placement that was causing the tolerations specified in "all" to not be applied.

Comment 47 Santosh Pillai 2021-09-17 08:00:22 UTC
This could be an issue with OCS when we only pass `tolerations` for `mds` in the StorageCluster spec and don't pass any podAntiAffinity along with it.

For example, the below spec is added to the StorageCluster yaml:

```
placement:
    all:
      tolerations:
      - effect: NoSchedule
        key: xyz
        operator: Equal
        value: "true"
    mds:
      tolerations:
      - effect: NoSchedule
        key: xyz
        operator: Equal
        value: "true"
```


This causes a nil pointer exception while ensuring the CephFilesystem in ocs (https://github.com/red-hat-storage/ocs-operator/blob/32158124bba496f625d6c2f01c31affde8713fa7/controllers/storagecluster/placement.go#L66)


Error:

```
{"level":"info","ts":1631864507.7231574,"logger":"controllers.StorageCluster","msg":"Adding topology label from Node.","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster","Node":"ip-10-0-175-201.ec2.internal","Label":"failure-domain.beta.kubernetes.io/zone","Value":"us-east-1c"}
{"level":"info","ts":1631864507.7231665,"logger":"controllers.StorageCluster","msg":"Adding topology label from Node.","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster","Node":"ip-10-0-139-228.ec2.internal","Label":"failure-domain.beta.kubernetes.io/zone","Value":"us-east-1a"}
{"level":"info","ts":1631864507.7235525,"logger":"controllers.StorageCluster","msg":"Restoring original CephBlockPool.","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster","CephBlockPool":"openshift-storage/ocs-storagecluster-cephblockpool"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x163b1b9]

goroutine 1200 [running]:
github.com/openshift/ocs-operator/controllers/storagecluster.getPlacement(0xc000877180, 0x1a76153, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/remote-source/app/controllers/storagecluster/placement.go:66 +0x299
github.com/openshift/ocs-operator/controllers/storagecluster.(*StorageClusterReconciler).newCephFilesystemInstances(0xc0005ee3c0, 0xc000877180, 0xc00012a008, 0x1ce8cc0, 0xc000401680, 0x0, 0x0)
	/remote-source/app/controllers/storagecluster/cephfilesystem.go:42 +0x1fd
github.com/openshift/ocs-operator/controllers/storagecluster.(*ocsCephFilesystems).ensureCreated(0x283e6d0, 0xc0005ee3c0, 0xc000877180, 0x0, 0x0)
	/remote-source/app/controllers/storagecluster/cephfilesystem.go:68 +0x85
github.com/openshift/ocs-operator/controllers/storagecluster.(*StorageClusterReconciler).reconcilePhases(0xc0005ee3c0, 0xc000877180, 0xc000fcf6c8, 0x11, 0xc000fcf6b0, 0x12, 0x0, 0x0, 0xc000877180, 0x0)
	/remote-source/app/controllers/storagecluster/reconcile.go:375 +0xc7f
github.com/openshift/ocs-operator/controllers/storagecluster.(*StorageClusterReconciler).Reconcile(0xc0005ee3c0, 0x1cc58d8, 0xc002414030, 0xc000fcf6c8, 0x11, 0xc000fcf6b0, 0x12, 0xc002414000, 0x0, 0x0, ...)
	/remote-source/app/controllers/storagecluster/reconcile.go:160 +0x6c5
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000178d20, 0x1cc5830, 0xc000d37900, 0x18d32c0, 0xc000af77a0)
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298 +0x30d
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000178d20, 0x1cc5830, 0xc000d37900, 0xc000a75f00)
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2(0xc0002bd090, 0xc000178d20, 0x1cc5830, 0xc000d37900)
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214 +0x6b
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:210 +0x425

```

Comment 49 Travis Nielsen 2021-09-20 18:39:09 UTC
We need a fix from the OCS operator as tracked in the new BZ for applying the tolerations for MDS: https://bugzilla.redhat.com/show_bug.cgi?id=2005937.
The CSI driver can have tolerations applied in the operator settings override as mentioned previously. 
This issue tracks the fix for merging the tolerations for all other Ceph daemons besides the mds. 
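For reference, the operator settings override mentioned above is the rook-ceph-operator-config ConfigMap; a minimal sketch (the `xyz` taint key is an example, and the key names match the ones used in the verification below):

```
apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-ceph-operator-config
  namespace: openshift-storage
data:
  CSI_PLUGIN_TOLERATIONS: |
    - key: xyz
      operator: Equal
      value: "true"
      effect: NoSchedule
  CSI_PROVISIONER_TOLERATIONS: |
    - key: xyz
      operator: Equal
      value: "true"
      effect: NoSchedule
```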

@Bipin Anything else needed from this BZ besides QE validation?

Comment 53 Mudit Agarwal 2021-09-24 16:03:52 UTC
Fix can be verified only after BZ #2005937 is fixed.

Comment 57 Shrivaibavi Raghaventhiran 2021-10-05 11:55:51 UTC
Tested environment:
-------------------
VMWARE 3M, 3W

Versions:
----------
OCP - 4.9.0-0.nightly-2021-10-01-034521
ODF - odf-operator.v4.9.0-164.ci

Steps Performed :
------------------
1. Tainted all nodes masters and workers with taint 'xyz'

2. Edited storagecluster yaml with below values

  placement:
    all:
      tolerations:
      - effect: NoSchedule
        key: xyz
        operator: Equal
        value: "true"
      - effect: NoSchedule
        key: node.ocs.openshift.io/storage
        operator: Equal
        value: "true"
    mds:
      tolerations:
      - effect: NoSchedule
        key: xyz
        operator: Equal
        value: "true"
      - effect: NoSchedule
        key: node.ocs.openshift.io/storage
        operator: Equal
        value: "true"
    noobaa-core:
      tolerations:
      - effect: NoSchedule
        key: xyz
        operator: Equal
        value: "true"
      - effect: NoSchedule
        key: node.ocs.openshift.io/storage
        operator: Equal
        value: "true"
    rgw:
      tolerations:
      - effect: NoSchedule
        key: xyz
        operator: Equal
        value: "true"
      - effect: NoSchedule
        key: node.ocs.openshift.io/storage
        operator: Equal
        value: "true"

3. Noticed the pods under openshift-storage respin after updating the storagecluster yaml (Expected)

4. Also edited the configmap rook-ceph-operator-config with below values

  CSI_PLUGIN_TOLERATIONS: |2-

    - key: node.ocs.openshift.io/storage
      operator: Equal
      value: "true"
      effect: NoSchedule
    - key: xyz
      operator: Equal
      value: "true"
      effect: NoSchedule
  CSI_PROVISIONER_TOLERATIONS: |2-

    - key: node.ocs.openshift.io/storage
      operator: Equal
      value: "true"
      effect: NoSchedule
    - key: xyz
      operator: Equal
      value: "true"
      effect: NoSchedule


Few Points to be noted:
------------------------

1. Before editing the storagecluster yaml, there were OCS tolerations present on almost every pod (except noobaa-operator, odf-console, odf-operator-controller-manager). But after applying non-ocs tolerations via the storagecluster, the existing OCS toleration was overridden on the pods which come under the "all" placement group.
>> So the current solution is to apply both the default and the new tolerations in the storagecluster yaml, as mentioned in step 2

2. The pods odf-console, odf-operator-controller-manager and noobaa-operator don't have OCS tolerations on them by default

3. Where to set tolerations for non-ocs taint for the tool-box pod?

4. When setting tolerations under placement "all", the osd-prepare pods are not updated with the 'xyz' toleration

5. Setting tolerations for operators is currently unknown

Console o/p
------------
After editing the storagecluster yaml, below are the pods which were not updated:

$ oc get pods -n openshift-storage | grep 4d1h
ocs-metrics-exporter-7566789b65-5n25b                             1/1     Running     0               4d1h
ocs-operator-8588554d5-dlrzp                                      1/1     Running     0               4d1h
odf-console-5c7446d49f-nk7v7                                      1/1     Running     0               4d1h
odf-operator-controller-manager-67fc478859-fj6rm                  2/2     Running     8 (7h41m ago)   4d1h
rook-ceph-operator-749d46bd8-5jbg4                                1/1     Running     0               4d1h
rook-ceph-osd-prepare-ocs-deviceset-0-data-05xb9n--1-rvdx7        0/1     Completed   0               4d1h
rook-ceph-osd-prepare-ocs-deviceset-1-data-0hgnq9--1-t5gl4        0/1     Completed   0               4d1h
rook-ceph-osd-prepare-ocs-deviceset-2-data-0sdsd2--1-bckwq        0/1     Completed   0               4d1h
rook-ceph-tools-f57d97cc6-q4thh                                   1/1     Running     0               4d1h

Storagecluster yaml before edit : http://pastebin.test.redhat.com/998781
Storagecluster yaml after edit : http://pastebin.test.redhat.com/998782

Console o/p command before[1] and after edit[2]:
for i in $(oc get deployment.apps -n openshift-storage|awk '{print$1}') ; do echo $i; echo  "==============" ;oc -n openshift-storage get deployment.apps $i -o yaml | grep -v NAME| grep  tolerations -A 16 ; done
[1] http://pastebin.test.redhat.com/998783
[2] http://pastebin.test.redhat.com/998784

Comment 58 Subham Rai 2021-10-05 13:04:12 UTC
(In reply to Shrivaibavi Raghaventhiran from comment #57)
> Few Points to be noted:
>
> 3. Where to set tolerations for non-ocs taint for the tool-box pod?

I'll check and update for rook-ceph-operator and the toolbox.

> 5. Setting tolerations for operators is currently unknown

I'll update.

> After editing the storagecluster yaml, below are the pods which were not updated:
>
> $ oc get pods -n openshift-storage | grep 4d1h
> rook-ceph-osd-prepare-ocs-deviceset-0-data-05xb9n--1-rvdx7        0/1     Completed   0   4d1h
> rook-ceph-osd-prepare-ocs-deviceset-1-data-0hgnq9--1-t5gl4        0/1     Completed   0   4d1h
> rook-ceph-osd-prepare-ocs-deviceset-2-data-0sdsd2--1-bckwq        0/1     Completed   0   4d1h

According to step 1 of your steps performed ("Tainted all nodes masters and workers with taint 'xyz'"), I see that the `rook-ceph-osd-prepare` pods did run.

> Console o/p command before[1] and after edit[2]:
> for i in $(oc get deployment.apps -n openshift-storage|awk '{print$1}') ; do echo $i; echo  "==============" ;oc -n openshift-storage get deployment.apps $i -o yaml | grep -v NAME| grep  tolerations -A 16 ; done
> [1] http://pastebin.test.redhat.com/998783
> [2] http://pastebin.test.redhat.com/998784

This command covers only Deployments, and `osd-prepare` is a Job, so I don't see enough evidence to confirm that the toleration was not applied to the `rook-ceph-osd-prepare` pod. Can I get the prepare pod yaml and the operator logs to confirm? I have tested multiple times with upstream Rook, and the tolerations are applied to the prepare pod and the other pods.

Comment 59 Shrivaibavi Raghaventhiran 2021-10-05 13:51:08 UTC
(In reply to Subham Rai from comment #58)
> Can I get the prepare pod yaml and the operator logs to confirm?

I checked the pod yaml too, but did not see a toleration applied for the 'xyz' taint.

```
  tolerations:
  - effect: NoSchedule
    key: node.ocs.openshift.io/storage
    operator: Equal
    value: "true"
  - effect: NoSchedule
    key: node.ocs.openshift.io/storage
    operator: Equal
    value: "true"
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
```
Pod yaml: http://pastebin.test.redhat.com/998865

Let me know if you need a separate bz to track this.

For the other logs, I have sent the live cluster details in gchat.

Comment 60 Subham Rai 2021-10-06 13:48:47 UTC
After talking with Sebastien about how we reconcile `osd-prepare`:

It is expected that updating the placement will not update the `osd-prepare` pods, as each one is just a Job that has finished its work. If we want to test this scenario on a live cluster (where we already have a storagecluster), we first have to add a new OSD; the new `osd-prepare` pod will have the latest toleration, but the older `osd-prepare` pods will still not be updated.
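
For example, a sketch (the device-set name is illustrative): bumping the device-set count in the StorageCluster creates a new OSD, and its prepare Job is rendered with the current placement:

```
spec:
  storageDeviceSets:
  - name: ocs-deviceset
    count: 2   # increased from 1; the new osd-prepare Job picks up the current tolerations
```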

Comment 61 Shrivaibavi Raghaventhiran 2021-10-08 07:33:48 UTC
Version used:
ODF - 4.9.0-164.ci
OCP - 4.9.0-0.nightly-2021-09-27-105859

Test steps:
-----------
1. Added tolerations for taint 'xyz' in the subscription (for operators), the storagecluster (for OCS pods) and the configmap rook-ceph-operator-config (for the CSI plugin and provisioner pods); see the Subscription sketch after these steps
2. Add-capacity by editing storagecluster (increasing the count to 2 in StorageDeviceSets)
3. Respinned operator pods and other pods
4. Rebooted nodes one by one
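
A minimal sketch of the Subscription change from step 1, assuming the subscription is named odf-operator (OLM applies spec.config.tolerations to the operator pods it manages):

```
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: odf-operator          # assumed subscription name; may differ per cluster
  namespace: openshift-storage
spec:
  config:
    tolerations:
    - key: xyz
      operator: Equal
      value: "true"
      effect: NoSchedule
```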

All of the above test steps passed and no issues were noticed. The newly added OSDs also had the tolerations and were up and running; node reboots did not cause any issues either, and all tolerations stayed intact.

With the above verifications, moving the BZ to verified.

The tool-box was in the Pending state because it did not have any tolerations. This will be tracked in a separate BZ.

Comment 64 errata-xmlrpc 2021-12-13 17:44:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.9.0 enhancement, security, and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:5086

