Description of problem:
There is no way to define a nodeSelector for the hawkular-metrics-schema job. All the other components can have a nodeSelector:
- openshift_metrics_hawkular_nodeselector
- openshift_metrics_cassandra_nodeselector
- openshift_metrics_heapster_nodeselector
- openshift_metrics_hawkular_agent_nodeselector
This job should have one for consistency.

Version-Release number of the following components:
All 3.10 branch

How reproducible:
Always

Steps to Reproduce:
-

Actual results:
There is no way to define a nodeSelector for hawkular-metrics-schema.

Expected results:
There should be a way to define a nodeSelector for it.

Additional info:
Hawkular has its own component, moving this to that component. Monitoring is about Prometheus.
PR for 3.11 branch: https://github.com/openshift/openshift-ansible/pull/10526
PR for 3.10 branch: https://github.com/openshift/openshift-ansible/pull/10527
OCP 3.6-3.10 is no longer on full support [1]. Marking un-triaged bugs CLOSED DEFERRED. If you have a customer case with a support exception or have reproduced on 3.11+, please reopen and include those details. When reopening, please set the Version to the appropriate version where reproduced. [1]: https://access.redhat.com/support/policy/updates/openshift
There is no nodeSelector on 3.11 either: https://github.com/openshift/openshift-ansible/blob/release-3.11/roles/openshift_metrics/templates/hawkular_metrics_schema_job.j2
I have reopened the following PR since it had gone stale: https://github.com/openshift/openshift-ansible/pull/10526 Please take a look when you have a chance.
Could we get a /LGTM to finish this process? The PR for the 3.10 branch got one, but it was closed because that branch no longer accepts updates, as expected. For OCP 3.11 we just need another LGTM. Please let us know, Jan. Much appreciated.
I'm not sure who is best to contact to get this merged; I don't have permission to do that myself. I pinged Vadim, who might be able to help. If you know of anyone else, feel free to chime in on the PR.
The hawkular-metrics-schema pod can be deployed on a node that matches the nodeSelector. For example, with openshift_metrics_schema_job_nodeselector={'deploy': 'metrics-schema'}:

# oc -n openshift-infra get pod -o wide
NAME                            READY   STATUS      RESTARTS   AGE   IP            NODE                              NOMINATED NODE
hawkular-cassandra-1-thwww      1/1     Running     0          14m   10.129.0.10   juzhao-errnode-registry-router-1  <none>
hawkular-metrics-mhmfd          1/1     Running     0          14m   10.131.0.14   juzhao-errnode-2                  <none>
hawkular-metrics-schema-946nf   0/1     Completed   0          14m   10.130.0.8    juzhao-errnode-1                  <none>
heapster-dmgz5                  1/1     Running     0          14m   10.128.0.13   juzhao-errmaster-etcd-nfs-1       <none>

# oc -n openshift-infra get po hawkular-metrics-schema-946nf -oyaml | grep nodeSelector -A2
  nodeSelector:
    deploy: metrics-schema
  priority: 0

# oc -n openshift-infra get job/hawkular-metrics-schema -oyaml | grep nodeSelector -A2
      nodeSelector:
        deploy: metrics-schema
      restartPolicy: OnFailure

# oc get node juzhao-errnode-1 --show-labels | grep deploy=metrics-schema
juzhao-errnode-1 Ready compute 1h v1.11.0+d4cacc0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=a9acc2de-39d7-4148-8d16-413c3b696e9d,beta.kubernetes.io/os=linux,deploy=metrics-schema,failure-domain.beta.kubernetes.io/region=regionOne,failure-domain.beta.kubernetes.io/zone=nova,kubernetes.io/hostname=juzhao-errnode-1,node-role.kubernetes.io/compute=true,role=node

# rpm -qa | grep openshift-ansible
openshift-ansible-3.11.218-1.git.0.6f55149.el7.noarch
openshift-ansible-docs-3.11.218-1.git.0.6f55149.el7.noarch
openshift-ansible-roles-3.11.218-1.git.0.6f55149.el7.noarch
openshift-ansible-playbooks-3.11.218-1.git.0.6f55149.el7.noarch
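For anyone reproducing this verification, a minimal sketch of the inventory setting exercised above. The variable name and value are taken from this comment; the [OSEv3:vars] placement and the openshift_metrics_install_metrics line are assumptions about a typical openshift-ansible 3.11 inventory, not something confirmed in this bug. The target node is assumed to already carry the deploy=metrics-schema label.

```ini
# Hypothetical inventory excerpt; adjust to your own inventory layout.
[OSEv3:vars]
# Assumed: metrics installation is enabled in this inventory.
openshift_metrics_install_metrics=true
# From this comment: schedule the hawkular-metrics-schema job onto nodes labeled deploy=metrics-schema.
openshift_metrics_schema_job_nodeselector={'deploy': 'metrics-schema'}
```

If no node carries the label, the job's pod would be expected to stay Pending, so it is worth checking the label with "oc get node --show-labels" first, as done above.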
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2215