Description of problem:
Tolerations do not work for the file-integrity-operator.

Version-Release number of selected component (if applicable):
Cluster version 4.6.0-0.nightly-2020-08-02-091622

How reproducible:
Always

Steps to Reproduce:
1. Install the file-integrity-operator:

$ git clone git:openshift/file-integrity-operator.git
$ oc login -u kubeadmin -p <pw>
$ oc create -f file-integrity-operator/deploy/ns.yaml
$ oc project openshift-file-integrity
$ for l in `ls -1 file-integrity-operator/deploy/crds/*crd.yaml`; do oc create -f $l; done
$ oc create -f file-integrity-operator/deploy/

2. Add a taint to one worker node:

$ kubectl taint nodes ip-10-0-135-150.us-east-2.compute.internal key1=value1:NoSchedule
node/ip-10-0-135-150.us-east-2.compute.internal tainted

3. Create a FileIntegrity with tolerations configured:

$ oc create -f - << EOF
apiVersion: fileintegrity.openshift.io/v1alpha1
kind: FileIntegrity
metadata:
  name: example-fileintegrity6
  namespace: openshift-file-integrity
spec:
  config:
    name: myconf
    namespace: openshift-file-integrity
    key: aide-conf
    gracePeriod: 20
  debug: true
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  tolerations:
  - key: "key1"
    value: "value1"
    operator: "Equal"
    effect: "NoSchedule"
EOF

Actual results:
No aide-ds pod is scheduled on the tainted node.

$ oc get pod -o wide
NAME                                                   READY   STATUS      RESTARTS   AGE   IP            NODE                                         NOMINATED NODE   READINESS GATES
aide-ds-example-fileintegrity6-4z9gf                   2/2     Running     0          11m   10.128.2.12   ip-10-0-187-86.us-east-2.compute.internal    <none>           <none>
aide-ds-example-fileintegrity6-jc58b                   2/2     Running     0          11m   10.129.2.24   ip-10-0-213-238.us-east-2.compute.internal   <none>           <none>
file-integrity-operator-65db875847-xmvwz               1/1     Running     0          29m   10.130.0.8    ip-10-0-208-173.us-east-2.compute.internal   <none>           <none>
ip-10-0-135-150.us-east-2.compute.internal-rmholdoff   0/1     Completed   0          29m   10.131.0.19   ip-10-0-135-150.us-east-2.compute.internal   <none>           <none>
ip-10-0-150-159.us-east-2.compute.internal-rmholdoff   0/1     Completed   0          29m   10.128.0.21   ip-10-0-150-159.us-east-2.compute.internal   <none>           <none>
ip-10-0-185-181.us-east-2.compute.internal-rmholdoff   0/1     Completed   0          29m   10.129.0.42   ip-10-0-185-181.us-east-2.compute.internal   <none>           <none>
ip-10-0-187-86.us-east-2.compute.internal-rmholdoff    0/1     Completed   0          29m   10.128.2.4    ip-10-0-187-86.us-east-2.compute.internal    <none>           <none>
ip-10-0-208-173.us-east-2.compute.internal-rmholdoff   0/1     Completed   0          29m   10.130.0.9    ip-10-0-208-173.us-east-2.compute.internal   <none>           <none>
ip-10-0-213-238.us-east-2.compute.internal-rmholdoff   0/1     Completed   0          29m   10.129.2.16   ip-10-0-213-238.us-east-2.compute.internal   <none>           <none>

Expected results:
Tolerations should work for the file-integrity-operator, and the aide-ds pods should be able to schedule on the tainted node.
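To narrow down where the toleration is lost, one check is whether it was propagated from the FileIntegrity CR into the generated aide DaemonSet. A sketch; the DaemonSet name aide-ds-example-fileintegrity6 is an assumption inferred from the aide-ds-example-fileintegrity6-* pod names above:

# Assumed DaemonSet name, inferred from the pod names in the listing above
$ oc -n openshift-file-integrity get ds aide-ds-example-fileintegrity6 \
    -o jsonpath='{.spec.template.spec.tolerations}{"\n"}'
# Compare with the tolerations requested in the CR
$ oc -n openshift-file-integrity get fileintegrity example-fileintegrity6 \
    -o jsonpath='{.spec.tolerations}{"\n"}'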
Tolerations don't work properly with the holdoff pods. We need to fix that.
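The per-node holdoff pods appear to be created directly by the operator rather than by the DaemonSet, so a quick way to confirm this on an affected cluster is to dump the tolerations on one of them. A sketch; the <node-name>-rmholdoff pod name is taken from the listing in the description, and an empty result would mean the CR's tolerations were not copied over:

# Assumed pod name, taken from the listing in the description
$ oc -n openshift-file-integrity get pod \
    ip-10-0-135-150.us-east-2.compute.internal-rmholdoff \
    -o jsonpath='{.spec.tolerations}{"\n"}'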
Tried with the latest build; the issue in https://bugzilla.redhat.com/show_bug.cgi?id=1862878#c1 was not reproduced, and the tolerations also worked as expected.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-08-25-222652   True        False         32h     Cluster version is 4.6.0-0.nightly-2020-08-25-222652

$ oc get node
NAME                                                 STATUS   ROLES    AGE   VERSION
qe-weinliu46-6-vvwbv-m-0.c.openshift-qe.internal     Ready    master   32h   v1.19.0-rc.2+aaf4ce1-dirty
qe-weinliu46-6-vvwbv-m-1.c.openshift-qe.internal     Ready    master   32h   v1.19.0-rc.2+aaf4ce1-dirty
qe-weinliu46-6-vvwbv-m-2.c.openshift-qe.internal     Ready    master   32h   v1.19.0-rc.2+aaf4ce1-dirty
qe-weinliu46-6-vvwbv-w-a-0.c.openshift-qe.internal   Ready    worker   32h   v1.19.0-rc.2+aaf4ce1-dirty
qe-weinliu46-6-vvwbv-w-b-1.c.openshift-qe.internal   Ready    worker   32h   v1.19.0-rc.2+aaf4ce1-dirty

$ kubectl taint node qe-weinliu46-6-vvwbv-w-a-0.c.openshift-qe.internal key1=value1:NoSchedule
node/qe-weinliu46-6-vvwbv-w-a-0.c.openshift-qe.internal tainted

$ oc create -f - << EOF
> apiVersion: fileintegrity.openshift.io/v1alpha1
> kind: FileIntegrity
> metadata:
>   name: example-fileintegrity5
>   namespace: openshift-file-integrity
> spec:
>   config:
>     name: myconf
>     namespace: openshift-file-integrity
>     key: aide-conf
>     gracePeriod: 20
>   debug: true
>   nodeSelector:
>     node-role.kubernetes.io/worker: ""
> EOF
fileintegrity.fileintegrity.openshift.io/example-fileintegrity5 created

$ oc get pod
NAME                                                             READY   STATUS      RESTARTS   AGE
pod/aide-ds-example-fileintegrity5-mwgr9                         1/1     Running     0          10s
pod/file-integrity-operator-bcd5f54c8-vrnvs                      1/1     Running     0          3h20m
pod/qe-weinliu46-6-vvwbv-m-0.c.openshift-qe.internal-rmholdoff   0/1     Completed   0          3h20m
pod/qe-weinliu46-6-vvwbv-m-1.c.openshift-qe.internal-rmholdoff   0/1     Completed   0          3h20m
pod/qe-weinliu46-6-vvwbv-m-2.c.openshift-qe.internal-rmholdoff   0/1     Completed   0          3h20m
pod/qe-weinliu46-6-vvwbv-w-a-0.c.openshift-qe.internal-rmholdoff 0/1     Completed   0          3h20m
pod/qe-weinliu46-6-vvwbv-w-b-1.c.openshift-qe.internal-rmholdoff 0/1     Completed   0          3h20m

$ oc delete fileintegrity example-fileintegrity5
fileintegrity.fileintegrity.openshift.io "example-fileintegrity5" deleted

$ oc create -f - << EOF
> apiVersion: fileintegrity.openshift.io/v1alpha1
> kind: FileIntegrity
> metadata:
>   name: example-fileintegrity6
>   namespace: openshift-file-integrity
> spec:
>   config:
>     name: myconf
>     namespace: openshift-file-integrity
>     key: aide-conf
>     gracePeriod: 20
>   debug: true
>   nodeSelector:
>     node-role.kubernetes.io/worker: ""
>   tolerations:
>   - key: "key1"
>     value: "value1"
>     operator: "Equal"
>     effect: "NoSchedule"
> EOF
fileintegrity.fileintegrity.openshift.io/example-fileintegrity6 created

$ oc get pod -o wide
NAME                                                           READY   STATUS      RESTARTS   AGE     IP             NODE                                                 NOMINATED NODE   READINESS GATES
aide-ds-example-fileintegrity6-f6lp8                           1/1     Running     0          96s     10.131.1.76    qe-weinliu46-6-vvwbv-w-a-0.c.openshift-qe.internal   <none>           <none>
aide-ds-example-fileintegrity6-n5jl2                           1/1     Running     0          104s    10.128.3.253   qe-weinliu46-6-vvwbv-w-b-1.c.openshift-qe.internal   <none>           <none>
file-integrity-operator-bcd5f54c8-vrnvs                        1/1     Running     0          3h23m   10.129.0.100   qe-weinliu46-6-vvwbv-m-0.c.openshift-qe.internal     <none>           <none>
qe-weinliu46-6-vvwbv-m-0.c.openshift-qe.internal-rmholdoff     0/1     Completed   0          3h23m   10.129.0.101   qe-weinliu46-6-vvwbv-m-0.c.openshift-qe.internal     <none>           <none>
qe-weinliu46-6-vvwbv-m-1.c.openshift-qe.internal-rmholdoff     0/1     Completed   0          3h23m   10.130.0.91    qe-weinliu46-6-vvwbv-m-1.c.openshift-qe.internal     <none>           <none>
qe-weinliu46-6-vvwbv-m-2.c.openshift-qe.internal-rmholdoff     0/1     Completed   0          3h23m   10.128.0.87    qe-weinliu46-6-vvwbv-m-2.c.openshift-qe.internal     <none>           <none>
qe-weinliu46-6-vvwbv-w-a-0.c.openshift-qe.internal-rmholdoff   0/1     Completed   0          3h23m   10.131.0.73    qe-weinliu46-6-vvwbv-w-a-0.c.openshift-qe.internal   <none>           <none>
qe-weinliu46-6-vvwbv-w-b-1.c.openshift-qe.internal-rmholdoff   0/1     Completed   0          3h23m   10.128.3.175   qe-weinliu46-6-vvwbv-w-b-1.c.openshift-qe.internal   <none>           <none>
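As an optional extra check on the fixed build, the toleration from the CR can also be confirmed on the aide pod that landed on the tainted node (a sketch; the pod name is taken from the output above, and DaemonSet pods will additionally show the default node.kubernetes.io/* tolerations):

# Pod name taken from the -o wide listing above
$ oc -n openshift-file-integrity get pod aide-ds-example-fileintegrity6-f6lp8 \
    -o jsonpath='{.spec.tolerations}{"\n"}'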
So, should we close this?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196