Bug 1862878 - Tolerations do not work for the file-integrity-operator
Summary: Tolerations do not work for the file-integrity-operator
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: File Integrity Operator
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Matt Rogers
QA Contact: xiyuan
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-08-03 05:41 UTC by xiyuan
Modified: 2021-01-04 14:42 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:22:03 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:4196 0 None None None 2020-10-27 16:22:22 UTC

Description xiyuan 2020-08-03 05:41:38 UTC
Description of problem:
Tolerations do not work for the file-integrity-operator.

Version-Release number (cluster version):
4.6.0-0.nightly-2020-08-02-091622

How reproducible:
Always

Steps to Reproduce:
1. Install the file-integrity-operator:
$ git clone https://github.com/openshift/file-integrity-operator.git
$ oc login -u kubeadmin -p <pw>
$ oc create -f file-integrity-operator/deploy/ns.yaml
$ oc project openshift-file-integrity
$ for l in file-integrity-operator/deploy/crds/*crd.yaml; do oc create -f "$l"; done
$ oc create -f file-integrity-operator/deploy/

2. Add a taint to one worker node:
$ kubectl taint nodes ip-10-0-135-150.us-east-2.compute.internal key1=value1:NoSchedule
node/ip-10-0-135-150.us-east-2.compute.internal tainted

3. Create a FileIntegrity with tolerations configured:
 $ oc create -f - << EOF
 apiVersion: fileintegrity.openshift.io/v1alpha1
 kind: FileIntegrity
 metadata:
   name: example-fileintegrity6
   namespace: openshift-file-integrity
 spec:
   config:
     name: myconf
     namespace: openshift-file-integrity
     key: aide-conf
     gracePeriod: 20
   debug: true
   nodeSelector:
     node-role.kubernetes.io/worker: ""
   tolerations:
   - key: "key1"
     value: "value1"
     operator: "Equal"
     effect: "NoSchedule"
 EOF
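For reference, the toleration above should let pods land on the tainted node because it matches the taint on key, value, operator, and effect. A minimal sketch of the Kubernetes matching rule (illustrative only; this is not the operator's code):

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if a single toleration matches a single taint,
    following the Kubernetes matching rule."""
    # An empty key with operator Exists tolerates every taint.
    if not toleration.get("key") and toleration.get("operator") == "Exists":
        return True
    if toleration.get("key") != taint.get("key"):
        return False
    # An empty effect on the toleration matches any taint effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    # Operator defaults to Equal; Exists ignores the value.
    if toleration.get("operator", "Equal") == "Exists":
        return True
    return toleration.get("value") == taint.get("value")

# The taint and toleration from the reproduction steps above:
taint = {"key": "key1", "value": "value1", "effect": "NoSchedule"}
toleration = {"key": "key1", "operator": "Equal",
              "value": "value1", "effect": "NoSchedule"}
print(tolerates(toleration, taint))  # → True
```

Since the toleration matches, the scheduler should place the aide-ds pods on the tainted node; the bug is that the operator's child pods did not carry this toleration through.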


Actual results:
No aide-ds pod is scheduled on the tainted node.
$ oc get pod -o wide
NAME                                                   READY   STATUS      RESTARTS   AGE   IP            NODE                                         NOMINATED NODE   READINESS GATES
aide-ds-example-fileintegrity6-4z9gf                   2/2     Running     0          11m   10.128.2.12   ip-10-0-187-86.us-east-2.compute.internal    <none>           <none>
aide-ds-example-fileintegrity6-jc58b                   2/2     Running     0          11m   10.129.2.24   ip-10-0-213-238.us-east-2.compute.internal   <none>           <none>
file-integrity-operator-65db875847-xmvwz               1/1     Running     0          29m   10.130.0.8    ip-10-0-208-173.us-east-2.compute.internal   <none>           <none>
ip-10-0-135-150.us-east-2.compute.internal-rmholdoff   0/1     Completed   0          29m   10.131.0.19   ip-10-0-135-150.us-east-2.compute.internal   <none>           <none>
ip-10-0-150-159.us-east-2.compute.internal-rmholdoff   0/1     Completed   0          29m   10.128.0.21   ip-10-0-150-159.us-east-2.compute.internal   <none>           <none>
ip-10-0-185-181.us-east-2.compute.internal-rmholdoff   0/1     Completed   0          29m   10.129.0.42   ip-10-0-185-181.us-east-2.compute.internal   <none>           <none>
ip-10-0-187-86.us-east-2.compute.internal-rmholdoff    0/1     Completed   0          29m   10.128.2.4    ip-10-0-187-86.us-east-2.compute.internal    <none>           <none>
ip-10-0-208-173.us-east-2.compute.internal-rmholdoff   0/1     Completed   0          29m   10.130.0.9    ip-10-0-208-173.us-east-2.compute.internal   <none>           <none>
ip-10-0-213-238.us-east-2.compute.internal-rmholdoff   0/1     Completed   0          29m   10.129.2.16   ip-10-0-213-238.us-east-2.compute.internal   <none>           <none>

Expected results:
The tolerations should work for the file-integrity-operator, and the aide-ds pod should be scheduled on the tainted node.

Comment 2 Juan Antonio Osorio 2020-08-17 11:30:53 UTC
Tolerations don't work properly with the holdoff pods. We need to fix that.
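The implied fix direction is to propagate the tolerations from the FileIntegrity spec into the holdoff pods as well as the daemon set. As a rough illustration (the operator is written in Go; the function and field layout here are hypothetical, not the actual implementation), the pod builder must copy spec.tolerations through:

```python
def build_holdoff_pod(fi_spec: dict, node_name: str) -> dict:
    """Build a minimal holdoff pod manifest for one node.

    Hypothetical sketch: the key point is that the pod inherits the
    tolerations declared on the FileIntegrity CR; without this copy
    (the bug), holdoff pods cannot be scheduled onto tainted nodes.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"{node_name}-rmholdoff",
            "namespace": "openshift-file-integrity",
        },
        "spec": {
            "nodeName": node_name,
            # The missing piece: carry the CR's tolerations through.
            "tolerations": fi_spec.get("tolerations", []),
            "restartPolicy": "Never",
            # Container image is a placeholder; the real image is elided.
            "containers": [{"name": "rmholdoff", "image": "<operator-image>"}],
        },
    }
```

With this in place, a holdoff pod built for a tainted worker carries the same `key1=value1:NoSchedule` toleration as the aide-ds pods and can be scheduled there.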

Comment 3 xiyuan 2020-08-27 12:26:44 UTC
Tried with the latest build; the issue in https://bugzilla.redhat.com/show_bug.cgi?id=1862878#c1 was not reproduced, and the tolerations worked as expected.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-08-25-222652   True        False         32h     Cluster version is 4.6.0-0.nightly-2020-08-25-222652

$ oc get node
NAME                                                 STATUS   ROLES    AGE   VERSION
qe-weinliu46-6-vvwbv-m-0.c.openshift-qe.internal     Ready    master   32h   v1.19.0-rc.2+aaf4ce1-dirty
qe-weinliu46-6-vvwbv-m-1.c.openshift-qe.internal     Ready    master   32h   v1.19.0-rc.2+aaf4ce1-dirty
qe-weinliu46-6-vvwbv-m-2.c.openshift-qe.internal     Ready    master   32h   v1.19.0-rc.2+aaf4ce1-dirty
qe-weinliu46-6-vvwbv-w-a-0.c.openshift-qe.internal   Ready    worker   32h   v1.19.0-rc.2+aaf4ce1-dirty
qe-weinliu46-6-vvwbv-w-b-1.c.openshift-qe.internal   Ready    worker   32h   v1.19.0-rc.2+aaf4ce1-dirty

$ kubectl taint node qe-weinliu46-6-vvwbv-w-a-0.c.openshift-qe.internal key1=value1:NoSchedule
node/qe-weinliu46-6-vvwbv-w-a-0.c.openshift-qe.internal tainted
$ oc create -f - << EOF
> apiVersion: fileintegrity.openshift.io/v1alpha1
> kind: FileIntegrity
> metadata:
>   name: example-fileintegrity5
>   namespace: openshift-file-integrity
> spec:
>   config:
>     name: myconf
>     namespace: openshift-file-integrity
>     key: aide-conf
>     gracePeriod: 20
>   debug: true
>   nodeSelector:
>     node-role.kubernetes.io/worker: ""
> EOF
fileintegrity.fileintegrity.openshift.io/example-fileintegrity5 created
$ oc get pod
NAME                                                               READY   STATUS      RESTARTS   AGE
pod/aide-ds-example-fileintegrity5-mwgr9                           1/1     Running     0          10s
pod/file-integrity-operator-bcd5f54c8-vrnvs                        1/1     Running     0          3h20m
pod/qe-weinliu46-6-vvwbv-m-0.c.openshift-qe.internal-rmholdoff     0/1     Completed   0          3h20m
pod/qe-weinliu46-6-vvwbv-m-1.c.openshift-qe.internal-rmholdoff     0/1     Completed   0          3h20m
pod/qe-weinliu46-6-vvwbv-m-2.c.openshift-qe.internal-rmholdoff     0/1     Completed   0          3h20m
pod/qe-weinliu46-6-vvwbv-w-a-0.c.openshift-qe.internal-rmholdoff   0/1     Completed   0          3h20m
pod/qe-weinliu46-6-vvwbv-w-b-1.c.openshift-qe.internal-rmholdoff   0/1     Completed   0          3h20m
$ oc delete fileintegrity example-fileintegrity5
fileintegrity.fileintegrity.openshift.io "example-fileintegrity5" deleted

$ oc create -f - << EOF
> apiVersion: fileintegrity.openshift.io/v1alpha1
> kind: FileIntegrity
> metadata:
>   name: example-fileintegrity6
>   namespace: openshift-file-integrity
> spec:
>   config:
>     name: myconf
>     namespace: openshift-file-integrity
>     key: aide-conf
>     gracePeriod: 20
>   debug: true
>   nodeSelector:
>     node-role.kubernetes.io/worker: ""
>   tolerations:
>   - key: "key1"
>     value: "value1"
>     operator: "Equal"
>     effect: "NoSchedule"
> EOF
fileintegrity.fileintegrity.openshift.io/example-fileintegrity6 created
$ oc get pod -o wide
NAME                                                           READY   STATUS      RESTARTS   AGE     IP             NODE                                                 NOMINATED NODE   READINESS GATES
aide-ds-example-fileintegrity6-f6lp8                           1/1     Running     0          96s     10.131.1.76    qe-weinliu46-6-vvwbv-w-a-0.c.openshift-qe.internal   <none>           <none>
aide-ds-example-fileintegrity6-n5jl2                           1/1     Running     0          104s    10.128.3.253   qe-weinliu46-6-vvwbv-w-b-1.c.openshift-qe.internal   <none>           <none>
file-integrity-operator-bcd5f54c8-vrnvs                        1/1     Running     0          3h23m   10.129.0.100   qe-weinliu46-6-vvwbv-m-0.c.openshift-qe.internal     <none>           <none>
qe-weinliu46-6-vvwbv-m-0.c.openshift-qe.internal-rmholdoff     0/1     Completed   0          3h23m   10.129.0.101   qe-weinliu46-6-vvwbv-m-0.c.openshift-qe.internal     <none>           <none>
qe-weinliu46-6-vvwbv-m-1.c.openshift-qe.internal-rmholdoff     0/1     Completed   0          3h23m   10.130.0.91    qe-weinliu46-6-vvwbv-m-1.c.openshift-qe.internal     <none>           <none>
qe-weinliu46-6-vvwbv-m-2.c.openshift-qe.internal-rmholdoff     0/1     Completed   0          3h23m   10.128.0.87    qe-weinliu46-6-vvwbv-m-2.c.openshift-qe.internal     <none>           <none>
qe-weinliu46-6-vvwbv-w-a-0.c.openshift-qe.internal-rmholdoff   0/1     Completed   0          3h23m   10.131.0.73    qe-weinliu46-6-vvwbv-w-a-0.c.openshift-qe.internal   <none>           <none>
qe-weinliu46-6-vvwbv-w-b-1.c.openshift-qe.internal-rmholdoff   0/1     Completed   0          3h23m   10.128.3.175   qe-weinliu46-6-vvwbv-w-b-1.c.openshift-qe.internal   <none>           <none>

Comment 4 Juan Antonio Osorio 2020-09-09 17:28:33 UTC
So, should we close this?

Comment 7 errata-xmlrpc 2020-10-27 16:22:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

