Bug 2010200 - Not able to add toleration for NooBaa pods
Summary: Not able to add toleration for NooBaa pods
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: Multi-Cloud Object Gateway
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Nimrod Becker
QA Contact: Raz Tamir
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-10-04 07:45 UTC by Bipin Kunal
Modified: 2023-08-09 16:49 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-04 10:34:40 UTC
Embargoed:



Description Bipin Kunal 2021-10-04 07:45:32 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

We have a specific taint on the worker and master nodes, and in order to run NooBaa/ODF the pods must tolerate that taint. To achieve this we are trying to add a toleration to the NooBaa pods via the StorageCluster CR, but it doesn't seem to be working.

In the StorageCluster CR we have the following tolerations:

=========================================================
spec:
  placement:
    all:
      tolerations:
      - effect: NoSchedule
        key: xyz
        operator: Equal
        value: "true"
    mds:
      tolerations:
      - effect: NoSchedule
        key: xyz
        operator: Equal
        value: "true"
    noobaa-core:
      tolerations:
      - effect: NoSchedule
        key: xyz
        operator: Equal
        value: "true"
      - effect: NoSchedule
        key: node.ocs.openshift.io/storage
        operator: Equal
        value: "true"

=========================================================

Neither "all" nor "noobaa-core" works.

Version of all relevant components (if applicable):
All versions, but my testing was done with ODF 4.9.0-164.ci

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?
Yes


Steps to Reproduce:
1. Install ODF 4.9
2. Add an arbitrary taint to the worker nodes
3. Add a matching toleration to the StorageCluster CR
4. Respin the NooBaa pods
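
As a rough illustration of the toleration step, the placement section can be built as a JSON merge patch. This is only a sketch: the taint `xyz=true:NoSchedule` is inferred from the tolerations in this report, `placement_patch` is a hypothetical helper name, and the patch targets only the `noobaa-core` placement key.

```python
import json

# Assumed custom taint from this report: xyz=true:NoSchedule.
CUSTOM = {"effect": "NoSchedule", "key": "xyz",
          "operator": "Equal", "value": "true"}
# Storage-node toleration, listed explicitly as in the CR above.
STORAGE = {"effect": "NoSchedule", "key": "node.ocs.openshift.io/storage",
           "operator": "Equal", "value": "true"}


def placement_patch(components):
    """Return a merge-patch dict that sets both tolerations on each placement key."""
    return {
        "spec": {
            "placement": {
                name: {"tolerations": [CUSTOM, STORAGE]} for name in components
            }
        }
    }


patch = placement_patch(["noobaa-core"])
print(json.dumps(patch, indent=2))
```

The printed JSON could then be passed to `oc patch storagecluster <name> -n <namespace> --type merge -p '<json>'`, where the name and namespace are placeholders for the actual cluster values.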


Actual results:

NooBaa pods get stuck in the Pending state, complaining about untolerated taints


Expected results:

NooBaa pods should come up fine, tolerating the taint.


Additional info: Check comments below.

Comment 2 Bipin Kunal 2021-10-04 10:20:40 UTC
$ oc get noobaa noobaa -o yaml | grep -A 10 "tolerations:" 
  tolerations:
  - effect: NoSchedule
    key: xyz
    operator: Equal
    value: "true"
  - effect: NoSchedule
    key: node.ocs.openshift.io/storage
    operator: Equal
    value: "true"
status:
  accounts:


$ oc get statefulset noobaa-core -oyaml | grep -A 10  "tolerations:" 
      tolerations:
      - effect: NoSchedule
        key: xyz
        operator: Equal
        value: "true"
      - effect: NoSchedule
        key: node.ocs.openshift.io/storage
        operator: Equal
        value: "true"
      volumes:
      - emptyDir: {}


$ oc get pod noobaa-core-0 -oyaml |grep -A 10  "tolerations:"
  tolerations:
  - effect: NoSchedule
    key: xyz
    operator: Equal
    value: "true"
  - effect: NoSchedule
    key: node.ocs.openshift.io/storage
    operator: Equal
    value: "true"
  - effect: NoExecute
    key: node.kubernetes.io/not-ready


$ oc get statefulset noobaa-core  -oyaml |grep -A 10  "tolerations:"
      tolerations:
      - effect: NoSchedule
        key: xyz
        operator: Equal
        value: "true"
      - effect: NoSchedule
        key: node.ocs.openshift.io/storage
        operator: Equal
        value: "true"
      volumes:
      - emptyDir: {}
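
The output above can be sanity-checked against the node taints with a simplified model of Kubernetes toleration matching. This is a sketch, not the scheduler's code: the taint list is an assumption inferred from the tolerations in this report, and the truncated `node.kubernetes.io/not-ready` entry is left out because its remaining fields are not shown.

```python
def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint."""
    # An empty effect on the toleration matches every taint effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        # Exists with an empty key matches every taint key.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    # Equal: key and value must both match.
    return (toleration.get("key") == taint["key"]
            and toleration.get("value", "") == taint.get("value", ""))


def untolerated(tolerations, taints):
    """Return the taints that no toleration in the list covers."""
    return [t for t in taints if not any(tolerates(tol, t) for tol in tolerations)]


# Tolerations as reported on pod noobaa-core-0.
pod_tolerations = [
    {"effect": "NoSchedule", "key": "xyz", "operator": "Equal", "value": "true"},
    {"effect": "NoSchedule", "key": "node.ocs.openshift.io/storage",
     "operator": "Equal", "value": "true"},
]
# Assumed node taints (not shown verbatim in this report).
node_taints = [
    {"key": "xyz", "value": "true", "effect": "NoSchedule"},
    {"key": "node.ocs.openshift.io/storage", "value": "true", "effect": "NoSchedule"},
]

print(untolerated(pod_tolerations, node_taints))      # nothing left untolerated
print(untolerated(pod_tolerations[:1], node_taints))  # storage taint is not covered
```

With both tolerations present, no taint remains uncovered; with only the `xyz` toleration, the storage taint is left over, which would keep the pod unschedulable on such a node.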

Comment 3 Bipin Kunal 2021-10-04 10:30:15 UTC
It looks like it is working for the NooBaa pods as well. All three NooBaa pods (noobaa-core, noobaa-db-pg, and noobaa-endpoint) are running. The only problem is that adding a toleration for noobaa-core overrides the default *node.ocs.openshift.io/storage* toleration for the NooBaa pods, so the toleration for *node.ocs.openshift.io/storage* needs to be added explicitly in the StorageCluster CR, like:

========================

    noobaa-core:
      tolerations:
      - effect: NoSchedule
        key: xyz
        operator: Equal
        value: "true"
      - effect: NoSchedule
        key: node.ocs.openshift.io/storage
        operator: Equal
        value: "true"

========================
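
The override behavior described in this comment can be modeled in a few lines. This is a sketch of the observed behavior only, not the operator's actual code; `observed_effective` and `with_defaults` are hypothetical names used for illustration.

```python
# Default toleration that NooBaa pods get when no placement is specified.
STORAGE = {"effect": "NoSchedule", "key": "node.ocs.openshift.io/storage",
           "operator": "Equal", "value": "true"}
# Custom toleration from this report.
CUSTOM = {"effect": "NoSchedule", "key": "xyz",
          "operator": "Equal", "value": "true"}

DEFAULT_TOLERATIONS = [STORAGE]


def observed_effective(user_tolerations):
    """Model of the observed behavior: a user-supplied list replaces the default."""
    return list(user_tolerations) if user_tolerations else list(DEFAULT_TOLERATIONS)


def with_defaults(user_tolerations):
    """Workaround: append any default toleration missing from the user list."""
    merged = list(user_tolerations)
    for default in DEFAULT_TOLERATIONS:
        if default not in merged:
            merged.append(default)
    return merged


print(observed_effective([CUSTOM]))                 # default storage toleration is lost
print(observed_effective(with_defaults([CUSTOM])))  # both tolerations are kept
```

Listing the default explicitly, as in the CR snippet above, corresponds to passing the result of `with_defaults` rather than the bare custom list.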

Comment 4 Bipin Kunal 2021-10-04 10:34:40 UTC
I am closing this bug for now. I will open a new bug to ensure that passing additional tolerations in the StorageCluster CR doesn't override the default or existing tolerations.

