Document URL: https://docs.openshift.com/container-platform/4.2/machine_management/creating-infrastructure-machinesets.html#infrastructure-moving-monitoring_creating-infrastructure-machinesets

Section Number and Name: Creating infrastructure MachineSets - Moving the monitoring solution | Machine management | OpenShift Container Platform 4.2

Describe the issue:

Q1:
I was able to move the monitoring pods (excluding openshift-state-metrics-7f4bdfbdf9-qpdn4) to the infra nodes by referring to [1], but is this procedure supported and applicable in a vSphere environment without MachineSets?

[cloud-user@rhcl-0 ~]$ oc get pod -n openshift-monitoring -o wide
NAME                                           READY   STATUS    RESTARTS   AGE     IP               NODE                         NOMINATED NODE   READINESS GATES
alertmanager-main-0                            3/3     Running   0          79s     10.130.2.17      infra-0.test4.example.com    <none>           <none>
alertmanager-main-1                            3/3     Running   0          99s     10.131.2.13      infra-1.test4.example.com    <none>           <none>
alertmanager-main-2                            3/3     Running   0          2m20s   10.130.2.13      infra-0.test4.example.com    <none>           <none>
cluster-monitoring-operator-6bf7c89799-m5jn5   1/1     Running   0          7d1h    10.130.0.19      master-2.test4.example.com   <none>           <none>
grafana-59dffb4f5-8hsvn                        2/2     Running   0          2m19s   10.130.2.14      infra-0.test4.example.com    <none>           <none>
kube-state-metrics-68bc45c96b-98pg4            3/3     Running   0          2m33s   10.131.2.9       infra-1.test4.example.com    <none>           <none>
node-exporter-6xsgm                            2/2     Running   0          7d1h    172.16.231.188   master-1.test4.example.com   <none>           <none>
node-exporter-77mqn                            2/2     Running   0          7d1h    172.16.231.195   worker-1.test4.example.com   <none>           <none>
node-exporter-95kdc                            2/2     Running   0          7d1h    172.16.231.187   master-0.test4.example.com   <none>           <none>
node-exporter-9n49n                            2/2     Running   2          25h     172.16.231.190   infra-0.test4.example.com    <none>           <none>
node-exporter-kq6q2                            2/2     Running   0          7d1h    172.16.231.189   master-2.test4.example.com   <none>           <none>
node-exporter-kz9mh                            2/2     Running   2          25h     172.16.231.191   infra-1.test4.example.com    <none>           <none>
node-exporter-l2qtm                            2/2     Running   0          7d1h    172.16.231.196   worker-2.test4.example.com   <none>           <none>
node-exporter-s68xg                            2/2     Running   0          7d1h    172.16.231.194   worker-0.test4.example.com   <none>           <none>
openshift-state-metrics-7f4bdfbdf9-qpdn4       3/3     Running   0          7d1h    10.131.0.4       worker-0.test4.example.com   <none>           <none>
prometheus-adapter-5546dc5fb4-27v8q            1/1     Running   0          2m      10.130.2.15      infra-0.test4.example.com    <none>           <none>
prometheus-adapter-5546dc5fb4-rlb42            1/1     Running   0          2m20s   10.131.2.11      infra-1.test4.example.com    <none>           <none>
prometheus-k8s-0                               6/6     Running   1          87s     10.130.2.16      infra-0.test4.example.com    <none>           <none>
prometheus-k8s-1                               6/6     Running   1          2m17s   10.131.2.12      infra-1.test4.example.com    <none>           <none>
prometheus-operator-b95584fbb-7qzdc            1/1     Running   0          2m33s   10.130.2.12      infra-0.test4.example.com    <none>           <none>
telemeter-client-64955f868f-rm7lp              3/3     Running   0          2m24s   10.131.2.10      infra-1.test4.example.com    <none>           <none>
[cloud-user@rhcl-0 ~]$

Q2:
Please tell me why the "openshift-state-metrics-7f4bdfbdf9-qpdn4" pod does not move to an infrastructure node.
- If there is a problem in the implementation procedure, please tell me how to deal with it.
- If there is a reason for it not to move to an infrastructure node, please let us know.

Q3:
I suspect that this procedure is not the correct procedure for a vSphere environment (without MachineSets) and that there is something more appropriate. I couldn't find any other documentation; would you please let me know if there are any more appropriate instructions?

Suggestions for improvement:
I wrote and provided the following answer to the customer, based on the results from our Sr. SME (please see Additional information).
This should be doable, as our Sr. SME reviewed my proposed answer and responded that the answers look correct and reasonable. Please explicitly describe the exact steps and results for moving the monitoring solution.

> Q1:
> I was able to move the monitoring pods (excluding openshift-state-metrics-7f4bdfbdf9-qpdn4) to the infra nodes by referring to [1], but is this procedure supported and applicable in a vSphere environment without MachineSets?
> ...

Yes, this is supported without MachineSets, as this is a vSphere environment. Please note that creating the MachineConfigPool is also very important.

> ...
>
> Q2:
> Please tell me why the "openshift-state-metrics-7f4bdfbdf9-qpdn4" pod does not move to an infrastructure node.
> - If there is a problem in the implementation procedure, please tell me how to deal with it.
> - If there is a reason for it not to move to an infrastructure node, please let us know.

During testing by our Sr. SME, we found that some pods need to be deleted so that they are rescheduled onto the infra nodes, but the procedure works and is supported, as answered for Q1.

> Q3:
> I suspect that this procedure is not the correct procedure for a vSphere environment (without MachineSets) and that there is something more appropriate.
> I couldn't find any other documentation, would you please let me know if there are any more appropriate instructions?

This is supported and doable (the pod just needs to be deleted once so that it is rescheduled).

Additional information:
Here is the result which our Sr. SME (rvanderp) verified on sfdc#02546889:

  There is a procedure in article [ref 0] which describes the process of creating an infra node. The article you linked covers the creation of the new machine config pool, which is very important as well. You might try deleting all the pods out of the openshift-monitoring namespace. I ran through the process this morning of creating infra nodes and configuring the monitoring stack to schedule on the infra nodes. Some pods did not reschedule until I deleted them.
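For reference, the MachineConfigPool mentioned in the Q1 answer usually looks like the following. This is a minimal sketch following the standard infra-node pattern from the linked article; the pool name "infra" and the selector values are assumptions, not copied from the customer's cluster:

  apiVersion: machineconfiguration.openshift.io/v1
  kind: MachineConfigPool
  metadata:
    name: infra    # assumed pool name, matching the infra node role used above
  spec:
    machineConfigSelector:
      matchExpressions:
        # render both worker and infra MachineConfigs into this pool
        - key: machineconfiguration.openshift.io/role
          operator: In
          values: [worker, infra]
    nodeSelector:
      matchLabels:
        # nodes carrying the infra role label join this pool
        node-role.kubernetes.io/infra: ""

Likewise, the "delete some pods" step from the Q2 answer can be done in one command, matching what the Sr. SME described (deleting all the pods out of the openshift-monitoring namespace); the exact invocation shown here is an assumption:

  [cloud-user@rhcl-0 ~]$ oc -n openshift-monitoring delete pods --all

The Deployments and StatefulSets in the namespace then recreate the pods, and the new pods are scheduled according to the nodeSelector entries in cluster-monitoring-config.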
References:

[0] - https://access.redhat.com/solutions/4287111

[1] - configmap

[shadowman@gss-ose-3-openshift ~]$ oc -n openshift-monitoring get configmap cluster-monitoring-config -o yaml
apiVersion: v1
data:
  config.yaml: |
    prometheusOperator:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    prometheusK8s:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    alertmanagerMain:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    kubeStateMetrics:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    grafana:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    telemeterClient:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    k8sPrometheusAdapter:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
kind: ConfigMap
metadata:
  creationTimestamp: "2019-12-31T17:16:26Z"
  name: cluster-monitoring-config
  namespace: openshift-monitoring
  resourceVersion: "3739349"
  selfLink: /api/v1/namespaces/openshift-monitoring/configmaps/cluster-monitoring-config
  uid: 46b4d51a-2bf1-11ea-a09e-5254002ffd64

[2] - oc get nodes

[shadowman@gss-ose-3-openshift ~]$ oc get nodes
NAME       STATUS   ROLES           AGE   VERSION
master-0   Ready    master,worker   9d    v1.14.6+cebabbf4a
master-1   Ready    master,worker   9d    v1.14.6+cebabbf4a
master-2   Ready    master,worker   9d    v1.14.6+cebabbf4a
worker-0   Ready    infra           9d    v1.14.6+cebabbf4a
worker-1   Ready    infra           9d    v1.14.6+cebabbf4a
worker-2   Ready    worker          9d    v1.14.6+cebabbf4a

[3] - oc get pods -o wide

[shadowman@gss-ose-3-openshift ~]$ oc get pods -o wide
NAME                                           READY   STATUS    RESTARTS   AGE     IP               NODE       NOMINATED NODE   READINESS GATES
alertmanager-main-0                            3/3     Running   0          5m20s   10.128.2.54      worker-0   <none>           <none>
alertmanager-main-1                            3/3     Running   0          5m46s   10.131.0.64      worker-1   <none>           <none>
alertmanager-main-2                            3/3     Running   0          6m12s   10.131.0.61      worker-1   <none>           <none>
cluster-monitoring-operator-598656fd74-79ndn   1/1     Running   0          6m49s   10.129.0.131     master-2   <none>           <none>
grafana-77bf6d8bf-lvkjx                        2/2     Running   0          6m10s   10.128.2.52      worker-0   <none>           <none>
kube-state-metrics-6b9f864976-f2ft9            3/3     Running   0          6m20s   10.128.2.48      worker-0   <none>           <none>
node-exporter-8wd5w                            2/2     Running   0          6m35s   192.168.100.10   master-0   <none>           <none>
node-exporter-dn77v                            2/2     Running   0          6m41s   192.168.100.21   worker-1   <none>           <none>
node-exporter-fqpxr                            2/2     Running   0          6m38s   192.168.100.22   worker-2   <none>           <none>
node-exporter-k6gcj                            2/2     Running   0          6m37s   192.168.100.20   worker-0   <none>           <none>
node-exporter-lzfv5                            2/2     Running   0          6m39s   192.168.100.12   master-2   <none>           <none>
node-exporter-xskn7                            2/2     Running   0          6m37s   192.168.100.11   master-1   <none>           <none>
openshift-state-metrics-65488cbc6f-2znkm       3/3     Running   0          6m45s   10.131.0.59      worker-1   <none>           <none>
prometheus-adapter-79884dd577-4bmmd            1/1     Running   0          5m56s   10.131.0.63      worker-1   <none>           <none>
prometheus-adapter-79884dd577-sx4zt            1/1     Running   0          6m11s   10.128.2.51      worker-0   <none>           <none>
prometheus-k8s-0                               6/6     Running   1          5m18s   10.128.2.55      worker-0   <none>           <none>
prometheus-k8s-1                               6/6     Running   1          5m40s   10.131.0.65      worker-1   <none>           <none>
prometheus-operator-7494cfc564-74rtp           1/1     Running   0          6m21s   10.128.2.47      worker-0   <none>           <none>
telemeter-client-67c86784bd-m9fsf              3/3     Running   0          6m14s   10.128.2.49      worker-0   <none>           <none>
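Note on [2]: in an environment without MachineSets there is no MachineSet to apply the infra role, so a plausible sketch of how worker-0 and worker-1 were labeled (assuming manual labeling; the node names are taken from [2]) is:

  [shadowman@gss-ose-3-openshift ~]$ oc label node worker-0 node-role.kubernetes.io/infra=
  [shadowman@gss-ose-3-openshift ~]$ oc label node worker-1 node-role.kubernetes.io/infra=
  [shadowman@gss-ose-3-openshift ~]$ oc label node worker-0 node-role.kubernetes.io/worker-
  [shadowman@gss-ose-3-openshift ~]$ oc label node worker-1 node-role.kubernetes.io/worker-

The trailing "-" form removes the worker role label, which would explain why worker-0 and worker-1 report only the infra role in [2].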
(In reply to Maxim Svistunov from comment #1)
<...>
> Unless I am mistaken, the rest of the customer's questions have been
> positively answered by rvanderp and do not require changes in documentation.
<...>

Hi Maxim Svistunov,

Thank you for your help and support on this BZ.

In addition to what Richard Vanderpool kindly answered on sfdc#02546889, I think we need to have a note here as well, as Pawel Krupa suggested on https://bugzilla.redhat.com/show_bug.cgi?id=1807852#c6 (or should we instead cover everything related to openshiftStateMetrics on Bug 1808183?). Anyway, I'm leaving this comment as a cross-reference between this BZ and Bug 1808183.

- https://bugzilla.redhat.com/show_bug.cgi?id=1807852#c6
  Pawel Krupa 2020-02-27 23:17:04 JST

  This feature is supported since 4.2 (or maybe earlier). You need to add:

    openshiftStateMetrics:
      nodeSelector:
        node-role.kubernetes.io/infra: ""

  to the cluster-monitoring-config ConfigMap, the same way as for the other components.

Thank you,
BR, Masaki
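To make the suggested note concrete, the edit adds one more stanza to the config.yaml shown in [1]. A sketch follows; the oc edit invocation is one standard way to apply it, and the surrounding k8sPrometheusAdapter key is copied from [1] only to show placement:

  [shadowman@gss-ose-3-openshift ~]$ oc -n openshift-monitoring edit configmap cluster-monitoring-config

    config.yaml: |
      ...
      k8sPrometheusAdapter:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
      # new stanza, per the comment above, so that openshift-state-metrics
      # is also scheduled onto the infra nodes
      openshiftStateMetrics:
        nodeSelector:
          node-role.kubernetes.io/infra: ""

As with the other components, the running openshift-state-metrics pod may still need to be deleted once before it is rescheduled onto an infra node.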