Description of problem:

Label all nodes with monitoring=true:
# for i in $(kubectl get node --no-headers | awk '{print $1}'); do echo $i; kubectl label node $i monitoring=true --overwrite=true; done

Taint all nodes with NoExecute except the node where the cluster-monitoring-operator pod is running (a sketch for setting ${CMO_NODE} follows the describe output at the end of this description):
# for i in $(kubectl get node --no-headers | grep -v ${CMO_NODE} | awk '{print $1}'); do echo $i; kubectl taint nodes $i monitoring=true:NoExecute; done

Create toleration.yaml to tolerate the monitoring=true:NoExecute taint (no nodeSelector is needed for node-exporter); content is below:
************************************************
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    alertmanagerMain:
      nodeSelector:
        monitoring: "true"
      tolerations:
      - key: "monitoring"
        operator: "Exists"
        effect: "NoExecute"
    grafana:
      nodeSelector:
        monitoring: "true"
      tolerations:
      - key: "monitoring"
        operator: "Exists"
        effect: "NoExecute"
    kubeStateMetrics:
      nodeSelector:
        monitoring: "true"
      tolerations:
      - key: "monitoring"
        operator: "Exists"
        effect: "NoExecute"
    k8sPrometheusAdapter:
      nodeSelector:
        monitoring: "true"
      tolerations:
      - key: "monitoring"
        operator: "Exists"
        effect: "NoExecute"
    nodeExporter:
      tolerations:
      - key: "monitoring"
        operator: "Exists"
        effect: "NoExecute"
    prometheusK8s:
      nodeSelector:
        monitoring: "true"
      tolerations:
      - key: "monitoring"
        operator: "Exists"
        effect: "NoExecute"
    prometheusOperator:
      nodeSelector:
        monitoring: "true"
      tolerations:
      - key: "monitoring"
        operator: "Exists"
        effect: "NoExecute"
    telemeterClient:
      nodeSelector:
        monitoring: "true"
      tolerations:
      - key: "monitoring"
        operator: "Exists"
        effect: "NoExecute"
************************************************

After creating the cluster-monitoring-config configmap, alertmanager-main-0 is stuck in Pending status:
# oc -n openshift-monitoring get pod
NAME                                           READY   STATUS    RESTARTS   AGE
alertmanager-main-0                            0/3     Pending   0          8m38s
cluster-monitoring-operator-6cd9cd6d86-xf6rg   1/1     Running   0          27m
grafana-5967db758b-c8f7z                       2/2     Running   0          2m10s
kube-state-metrics-757745bf56-swvmv            3/3     Running   0          2m21s
node-exporter-46b98                            2/2     Running   0          27m
node-exporter-bljh5                            2/2     Running   0          27m
node-exporter-j9cxh                            2/2     Running   0          27m
node-exporter-mxnvm                            2/2     Running   0          27m
node-exporter-nkrr5                            2/2     Running   0          27m
node-exporter-w4z2k                            2/2     Running   0          27m
prometheus-adapter-7584485777-8j4jm            1/1     Running   0          2m11s
prometheus-adapter-7584485777-dkrhm            1/1     Running   0          2m3s
prometheus-k8s-0                               6/6     Running   1          108s
prometheus-k8s-1                               6/6     Running   1          119s
prometheus-operator-5859876475-8kf8z           1/1     Running   0          2m21s
telemeter-client-68fd784876-8fcxt              3/3     Running   0          2m15s
************************************************
# oc -n openshift-monitoring describe pod alertmanager-main-0
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                         From               Message
  ----     ------            ----                        ----               -------
  Warning  FailedScheduling  <invalid> (x18 over 5m17s)  default-scheduler  0/6 nodes are available: 6 node(s) had taints that the pod didn't tolerate.
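One way to set the ${CMO_NODE} placeholder used in the taint loop above is a minimal sketch like the following; it assumes the operator pod carries the app=cluster-monitoring-operator label (adjust the selector if the labels differ in your cluster):

# capture the name of the node hosting the cluster-monitoring-operator pod
CMO_NODE=$(oc -n openshift-monitoring get pod -l app=cluster-monitoring-operator \
  -o jsonpath='{.items[0].spec.nodeName}')
echo "cluster-monitoring-operator runs on: ${CMO_NODE}"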
************************************************
There is no key: "monitoring" toleration in the alertmanager-main-0 pod (see Additional info below for a quick jsonpath comparison):
# oc -n openshift-monitoring get pod alertmanager-main-0 -oyaml | grep tolerations -A13
  tolerations:
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: config-volume
***************************************************************************
The key: "monitoring" toleration is already in the alertmanager-main statefulset:
# oc -n openshift-monitoring get statefulset alertmanager-main -oyaml | grep tolerations -A13
      tolerations:
      - effect: NoExecute
        key: monitoring
        operator: Exists
      volumes:
      - name: config-volume
        secret:
          defaultMode: 420
          secretName: alertmanager-main
      - name: secret-alertmanager-main-tls
        secret:
          defaultMode: 420
          secretName: alertmanager-main-tls
      - name: secret-alertmanager-main-proxy
***************************************************************************
After deleting the alertmanager-main-0 pod, the alertmanager-main pods start:
# oc -n openshift-monitoring delete pod alertmanager-main-0
pod "alertmanager-main-0" deleted
# oc -n openshift-monitoring get pod
NAME                                           READY   STATUS    RESTARTS   AGE
alertmanager-main-0                            3/3     Running   0          3m8s
alertmanager-main-1                            3/3     Running   0          2m59s
alertmanager-main-2                            3/3     Running   0          2m50s
cluster-monitoring-operator-6cd9cd6d86-xf6rg   1/1     Running   0          33m
grafana-5967db758b-c8f7z                       2/2     Running   0          7m40s
kube-state-metrics-757745bf56-swvmv            3/3     Running   0          7m51s
node-exporter-46b98                            2/2     Running   0          33m
node-exporter-bljh5                            2/2     Running   0          32m
node-exporter-j9cxh                            2/2     Running   0          32m
node-exporter-mxnvm                            2/2     Running   0          33m
node-exporter-nkrr5                            2/2     Running   0          33m
node-exporter-w4z2k                            2/2     Running   0          32m
prometheus-adapter-7584485777-8j4jm            1/1     Running   0          7m41s
prometheus-adapter-7584485777-dkrhm            1/1     Running   0          7m33s
prometheus-k8s-0                               6/6     Running   1          7m18s
prometheus-k8s-1                               6/6     Running   1          7m29s
prometheus-operator-5859876475-8kf8z           1/1     Running   0          7m51s
telemeter-client-68fd784876-8fcxt              3/3     Running   0          7m45s

The key: "monitoring" toleration is now present in the alertmanager-main pods:
# oc -n openshift-monitoring get pod alertmanager-main-0 -oyaml | grep tolerations -A15
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  - effect: NoExecute
    key: monitoring
    operator: Exists
  volumes:

Version-Release number of selected component (if applicable):
4.2.0-0.ci-2019-06-19-023510

How reproducible:
Always

Steps to Reproduce:
1. See the Description section above.

Actual results:
alertmanager-main-0 stays in Pending; the toleration from cluster-monitoring-config is not applied to the existing pod until the pod is deleted manually.

Expected results:
The toleration should be rolled out to the alertmanager-main pods automatically so they can schedule onto the tainted nodes.

Additional info:
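A quick way to compare the desired and the running tolerations without reading full YAML, as referenced above; this is only a sketch built from the objects shown in this report:

# tolerations in the statefulset pod template (what newly created pods get)
oc -n openshift-monitoring get statefulset alertmanager-main \
  -o jsonpath='{.spec.template.spec.tolerations}{"\n"}'
# tolerations in the currently running pod
oc -n openshift-monitoring get pod alertmanager-main-0 \
  -o jsonpath='{.spec.tolerations}{"\n"}'

If the first output contains the monitoring key and the second does not, the running pod predates the template change and has to be recreated before it can schedule onto the tainted nodes.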
The PRs are merged, but the issue is not fixed. payload: 4.2.0-0.nightly-2019-06-30-221852

# oc -n openshift-monitoring logs prometheus-operator-54cddf9c9d-dd977 | grep "Starting Prometheus Operator version"
ts=2019-07-01T09:07:05.850590352Z caller=main.go:181 msg="Starting Prometheus Operator version '0.31.1'."

# oc -n openshift-monitoring get pod alertmanager-main-0 -oyaml | grep terminationGracePeriod
  terminationGracePeriodSeconds: 120

# oc -n openshift-monitoring get pod
NAME                  READY   STATUS    RESTARTS   AGE
alertmanager-main-0   0/3     Pending   0          5m12s

# oc -n openshift-monitoring describe pod alertmanager-main-0
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  44s (x8 over 2m28s)  default-scheduler  0/6 nodes are available: 6 node(s) had taints that the pod didn't tolerate.

# oc -n openshift-monitoring get pod alertmanager-main-0 -oyaml | grep tolerations -A13
  tolerations:
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: config-volume

After deleting the alertmanager-main-0 pod, the toleration config is applied:
# oc -n openshift-monitoring delete pod alertmanager-main-0
pod "alertmanager-main-0" deleted
# oc -n openshift-monitoring get pod | grep alertmanager-main
alertmanager-main-0   3/3   Running   0   45s
alertmanager-main-1   3/3   Running   0   36s
alertmanager-main-2   3/3   Running   0   27s
# oc -n openshift-monitoring get pod alertmanager-main-0 -oyaml | grep tolerations -A13
  tolerations:
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  - effect: NoExecute
    key: monitoring
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
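To double-check that the toleration reached every alertmanager pod after the restart, a small loop like this can be used (a sketch that assumes the pods carry the alertmanager=main label):

for p in $(oc -n openshift-monitoring get pod -l alertmanager=main -o name); do
  echo "== ${p}"
  # print only the toleration keys of each pod
  oc -n openshift-monitoring get "${p}" -o jsonpath='{.spec.tolerations[*].key}{"\n"}'
done

Each pod should list monitoring among its toleration keys.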
Could you share prometheus-operator logs as well as the statefulset generated by the prometheus-operator?
Created attachment 1586479 [details] monitoring dump
Created attachment 1586560 [details] controller_scheduler logs
That looks like the openshift-controller-manager; could you share the kube-controller-manager logs?
Looks like https://github.com/openshift/prometheus-operator/pull/35 fixed things.
Followed the steps in Comment 0; the alertmanager-main pods now pick up the toleration config.

# oc -n openshift-monitoring get pod | grep alertmanager-main
alertmanager-main-0   3/3   Running   0   55m
alertmanager-main-1   3/3   Running   0   55m
alertmanager-main-2   3/3   Running   0   55m

payload: 4.2.0-0.nightly-2019-07-29-154123
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922