Description of problem:
After deploying cluster monitoring, the prometheus-k8s pod is stuck in ContainerCreating.

# oc -n openshift-monitoring get pod
NAME                                          READY     STATUS              RESTARTS   AGE
cluster-monitoring-operator-bb9c969fd-c2jmj   1/1       Running             0          13m
grafana-867fb88f6-wcr28                       2/2       Running             0          10m
prometheus-k8s-0                              0/4       ContainerCreating   0          9m
prometheus-operator-7b9988b85d-jhr5k          1/1       Running             0          12m

# oc -n openshift-monitoring describe pod prometheus-k8s-0
Events:
  Type     Reason                  Age                From                                Message
  ----     ------                  ----               ----                                -------
  Warning  FailedScheduling        10m (x3 over 10m)  default-scheduler                   pod has unbound PersistentVolumeClaims (repeated 2 times)
  Normal   Scheduled               10m                default-scheduler                   Successfully assigned openshift-monitoring/prometheus-k8s-0 to preserved-juzhao-40-nrr-1
  Normal   SuccessfulAttachVolume  10m                attachdetach-controller             AttachVolume.Attach succeeded for volume "pvc-ce393e74-de70-11e8-bf6a-fa163e8ec639"
  Warning  FailedMount             1m (x12 over 10m)  kubelet, preserved-juzhao-40-nrr-1  MountVolume.SetUp failed for volume "secret-kube-etcd-client-certs" : secrets "kube-etcd-client-certs" not found

# oc -n openshift-monitoring get secret kube-etcd-client-certs
No resources found.
Error from server (NotFound): secrets "kube-etcd-client-certs" not found

Version-Release number of selected component (if applicable):
ose-prometheus-operator-v4.0.0-0.43.0.0

How reproducible:
Always

Steps to Reproduce:
1. Deploy cluster monitoring 4.0
2.
3.

Actual results:
The secret "kube-etcd-client-certs" is not created.

Expected results:
The secret "kube-etcd-client-certs" should be created.

Additional info:
NOTE: The ose-cluster-monitoring-operator image does not yet package the telemeter client.
What is the content of the `cluster-monitoring-config` configmap in the `openshift-monitoring` namespace?
Created attachment 1502796 [details]
cluster-monitoring-config configmap

Used origin images:
quay.io/openshift/origin-cluster-monitoring-operator:v4.0
openshift/prometheus:v2.4.2
quay.io/coreos/prometheus-config-reloader:v0.25.0
openshift/oauth-proxy:v1.1.0
quay.io/coreos/kube-rbac-proxy:v0.4.0
quay.io/coreos/prom-label-proxy:v0.1.0
quay.io/coreos/configmap-reload:v0.0.1
This blocks installation with origin images.
Strange, this does look like a bug, we will have to investigate. etcd monitoring is not enabled in the configmap so it shouldn't be attempting to mount the secret.
(In reply to Frederic Branczyk from comment #6)
> Strange, this does look like a bug, we will have to investigate. etcd
> monitoring is not enabled in the configmap so it shouldn't be attempting to
> mount the secret.

Is it related to the grafana-dashboard-etcd configmap being created? From the attachment in Comment 3:

oc -n openshift-monitoring get cm
NAME                                        DATA      AGE
cluster-monitoring-config                   1         11m
grafana-dashboard-etcd                      1         10m
grafana-dashboard-k8s-cluster-rsrc-use      1         10m
grafana-dashboard-k8s-node-rsrc-use         1         10m
grafana-dashboard-k8s-resources-cluster     1         10m
grafana-dashboard-k8s-resources-namespace   1         10m
grafana-dashboard-k8s-resources-pod         1         10m
grafana-dashboards                          1         10m
prometheus-k8s-rulefiles-0                  1         10m
prometheus-serving-certs-ca-bundle          1         10m
Workaround: create the kube-etcd-client-certs secret manually, after which the prometheus-k8s pod starts up.

# cat kube-etcd-client-certs.yaml
apiVersion: v1
data:
  etcd-client-ca.crt: ""
  etcd-client.crt: ""
  etcd-client.key: ""
kind: Secret
metadata:
  name: kube-etcd-client-certs
  namespace: openshift-monitoring
type: Opaque
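The secret can then be created from that file, for example with:

# oc apply -f kube-etcd-client-certs.yaml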
Note: The previous issue happens when installing cluster monitoring with openshift-ansible 4.0. I also installed OCP on libvirt using the Next-Gen installer, where the issue does not happen: although etcd monitoring is not enabled there either, the kube-etcd-client-certs secret and the grafana-dashboard-etcd configmap are still created.

$ oc -n openshift-monitoring get cm cluster-monitoring-config -oyaml
apiVersion: v1
data:
  config.yaml: |
    prometheusOperator:
      baseImage: quay.io/coreos/prometheus-operator
      prometheusConfigReloaderBaseImage: quay.io/coreos/prometheus-config-reloader
      configReloaderBaseImage: quay.io/coreos/configmap-reload
    prometheusK8s:
      baseImage: openshift/prometheus
    alertmanagerMain:
      baseImage: openshift/prometheus-alertmanager
    nodeExporter:
      baseImage: openshift/prometheus-node-exporter
    kubeRbacProxy:
      baseImage: quay.io/coreos/kube-rbac-proxy
    kubeStateMetrics:
      baseImage: quay.io/coreos/kube-state-metrics
    grafana:
      baseImage: grafana/grafana
    auth:
      baseImage: openshift/oauth-proxy
kind: ConfigMap
metadata:
  creationTimestamp: 2018-11-14T04:24:10Z
  name: cluster-monitoring-config
  namespace: openshift-monitoring
  resourceVersion: "6231"
  selfLink: /api/v1/namespaces/openshift-monitoring/configmaps/cluster-monitoring-config
  uid: 2244f71d-e7c5-11e8-83ba-5282253f2bb7

$ oc -n openshift-monitoring get secret kube-etcd-client-certs -oyaml
apiVersion: v1
data:
  etcd-client-ca.crt: ""
  etcd-client.crt: ""
  etcd-client.key: ""
kind: Secret
metadata:
  creationTimestamp: 2018-11-14T04:24:10Z
  name: kube-etcd-client-certs
  namespace: openshift-monitoring
  resourceVersion: "6235"
  selfLink: /api/v1/namespaces/openshift-monitoring/secrets/kube-etcd-client-certs
  uid: 224914ed-e7c5-11e8-83ba-5282253f2bb7
type: Opaque

$ oc -n openshift-monitoring get cm
NAME                                        DATA      AGE
cluster-monitoring-config                   1         1h
grafana-dashboard-etcd                      1         1h
grafana-dashboard-k8s-cluster-rsrc-use      1         1h
grafana-dashboard-k8s-node-rsrc-use         1         1h
grafana-dashboard-k8s-resources-cluster     1         1h
grafana-dashboard-k8s-resources-namespace   1         1h
grafana-dashboard-k8s-resources-pod         1         1h
grafana-dashboards                          1         1h
prometheus-k8s-rulefiles-0                  1         1h
prometheus-serving-certs-ca-bundle          1         1h
I think this is due to the semantics of the configuration file. If no etcd configuration is specified, etcd monitoring defaults to true: https://github.com/openshift/cluster-monitoring-operator/blob/master/pkg/manifests/config.go#L108-L115. We could change the semantics of the defaulting, or change the default config to have `etcd.enabled=false`.
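For illustration, a minimal sketch of that defaulting behaviour (the type and field names here are hypothetical, not copied from the operator):

package main

import "fmt"

// EtcdConfig is a hypothetical stand-in for the etcd section of config.yaml.
type EtcdConfig struct {
	Enabled *bool `json:"enabled"`
}

// Config is a hypothetical stand-in for the parsed cluster-monitoring-config.
type Config struct {
	Etcd *EtcdConfig `json:"etcd"`
}

// etcdEnabled mirrors the defaulting described above: a missing etcd section
// (nil pointer) is treated as "enabled", which is why the secret mount gets
// generated even though the configmap never mentions etcd.
func etcdEnabled(c *Config) bool {
	if c.Etcd == nil || c.Etcd.Enabled == nil {
		return true // the proposed fix would return false here instead
	}
	return *c.Etcd.Enabled
}

func main() {
	fmt.Println(etcdEnabled(&Config{})) // prints true under the current semantics
}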
Agreed. When no config is given we should default to not monitoring etcd.
Now the grafana-dashboard-etcd configmap is not created by default.

$ oc -n openshift-monitoring get cm
NAME                                        DATA      AGE
adapter-config                              1         18m
cluster-monitoring-config                   1         29m
grafana-dashboard-k8s-cluster-rsrc-use      1         28m
grafana-dashboard-k8s-node-rsrc-use         1         28m
grafana-dashboard-k8s-resources-cluster     1         28m
grafana-dashboard-k8s-resources-namespace   1         28m
grafana-dashboard-k8s-resources-pod         1         28m
grafana-dashboards                          1         28m
prometheus-adapter-prometheus-config        1         18m
prometheus-k8s-rulefiles-0                  1         21m
serving-certs-ca-bundle                     1         27m
sharing-config                              3         17m
telemeter-client-serving-certs-ca-bundle    1         18m

$ oc version
oc v4.0.0-alpha.0+9d2874f-759
kubernetes v1.11.0+9d2874f

Used images:
docker.io/grafana/grafana:5.2.4
docker.io/openshift/oauth-proxy:v1.1.0
docker.io/openshift/prometheus-alertmanager:v0.15.2
docker.io/openshift/prometheus-node-exporter:v0.16.0
docker.io/openshift/prometheus:v2.5.0
quay.io/coreos/configmap-reload:v0.0.1
quay.io/coreos/kube-rbac-proxy:v0.4.0
quay.io/coreos/kube-state-metrics:v1.4.0
quay.io/coreos/prom-label-proxy:v0.1.0
quay.io/coreos/prometheus-config-reloader:v0.26.0
quay.io/coreos/prometheus-operator:v0.26.0
quay.io/openshift/origin-configmap-reload:v3.11
quay.io/openshift/origin-telemeter:v4.0
quay.io/surbania/k8s-prometheus-adapter-amd64:326bf3c
quay.io/openshift-release-dev/ocp-v4.0@sha256:4f94db8849ed915994678726680fc39bdb47722d3dd570af47b666b0160602e5
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758