Bug 1651835

Summary: telemeter-client-serving-certs-ca-bundle configmap is missing for telemeter-client pod
Product: OpenShift Container Platform
Component: Monitoring
Version: 4.1.0
Target Release: 4.1.0
Status: CLOSED ERRATA
Severity: high
Priority: high
Keywords: TestBlocker
Reporter: Junqi Zhao <juzhao>
Assignee: lserven
QA Contact: Junqi Zhao <juzhao>
Type: Bug
Last Closed: 2019-06-04 10:41:02 UTC

Description Junqi Zhao 2018-11-21 01:39:30 UTC
Description of problem:
This bug is cloned from https://jira.coreos.com/browse/MON-457
Filed again so the QE team can track this monitoring issue in Bugzilla.

The telemeter-client pod is stuck in ContainerCreating status because the configmap "telemeter-client-serving-certs-ca-bundle" is not found.
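
To confirm the failure mode, two hedged checks (assuming the Deployment behind this pod is named telemeter-client, which the pod name suggests):

# oc -n openshift-monitoring get configmap telemeter-client-serving-certs-ca-bundle
# oc -n openshift-monitoring get deployment telemeter-client -o jsonpath='{.spec.template.spec.volumes}'

While the bug is present, the first command should return a NotFound error, and the second shows that the pod spec still references the missing configmap as a volume.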

# oc -n openshift-monitoring get pod
NAME                                          READY     STATUS              RESTARTS   AGE
alertmanager-main-0                           3/3       Running             0          38m
alertmanager-main-1                           3/3       Running             0          38m
alertmanager-main-2                           3/3       Running             0          37m
cluster-monitoring-operator-8fbbc8d47-96hm7   1/1       Running             0          49m
grafana-5647c9bdf9-hz78b                      2/2       Running             0          49m
kube-state-metrics-548d4df845-krgsr           3/3       Running             0          35m
node-exporter-bcz2k                           2/2       Running             0          37m
node-exporter-gcxz7                           2/2       Running             0          37m
node-exporter-jvx7t                           2/2       Running             0          37m
prometheus-k8s-0                              6/6       Running             1          48m
prometheus-k8s-1                              6/6       Running             1          39m
prometheus-operator-775f4798f7-l6tmt          1/1       Running             0          39m
telemeter-client-747b776f55-p7rrm             0/3       ContainerCreating   0          35m

# oc -n openshift-monitoring describe pod telemeter-client-747b776f55-p7rrm
************************snipped************************
Events:
  Type     Reason       Age                 From                                   Message
  ----     ------       ----                ----                                   -------
  Normal   Scheduled    35m                 default-scheduler                      Successfully assigned openshift-monitoring/telemeter-client-747b776f55-p7rrm to ip-172-18-12-60.ec2.internal
  Warning  FailedMount  13m (x10 over 33m)  kubelet, ip-172-18-12-60.ec2.internal  Unable to mount volumes for pod "telemeter-client-747b776f55-p7rrm_openshift-monitoring(e7e2c52e-e7cf-11e8-b201-0e1fbfa3d478)": timeout expired waiting for volumes to attach or mount for pod "openshift-monitoring"/"telemeter-client-747b776f55-p7rrm". list of unmounted volumes=[serving-certs-ca-bundle]. list of unattached volumes=[serving-certs-ca-bundle secret-telemeter-client telemeter-client-tls telemeter-client-token-m9277]
  Warning  FailedMount  4m (x23 over 35m)   kubelet, ip-172-18-12-60.ec2.internal  MountVolume.SetUp failed for volume "serving-certs-ca-bundle" : configmaps "telemeter-client-serving-certs-ca-bundle" not found
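
The configmap is expected to be created by the monitoring operator rather than by hand, so a reasonable next step (a hedged suggestion; the deployment name is inferred from the pod list above, and the exact log wording is not guaranteed) is to look for related errors in the cluster-monitoring-operator logs:

# oc -n openshift-monitoring logs deployment/cluster-monitoring-operator | grep -i telemeter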

# oc -n openshift-monitoring get cm
NAME                                        DATA      AGE
cluster-monitoring-config                   1         50m
grafana-dashboard-etcd                      1         49m
grafana-dashboard-k8s-cluster-rsrc-use      1         49m
grafana-dashboard-k8s-node-rsrc-use         1         49m
grafana-dashboard-k8s-resources-cluster     1         49m
grafana-dashboard-k8s-resources-namespace   1         49m
grafana-dashboard-k8s-resources-pod         1         49m
grafana-dashboards                          1         49m
prometheus-k8s-rulefiles-0                  1         49m
prometheus-serving-certs-ca-bundle          1         49m

Version-Release number of selected component (if applicable):
quay.io/openshift/origin-telemeter:v4.0
quay.io/openshift/origin-configmap-reload:v3.11
quay.io/coreos/kube-rbac-proxy:v0.4.0

How reproducible:
Always

Steps to Reproduce:
1. Deploy cluster monitoring
2.
3.

Actual results:
telemeter-client pod is stuck in ContainerCreating status

Expected results:
telemeter-client pod should be healthy

Additional info:

Comment 1 Junqi Zhao 2018-11-21 01:41:26 UTC
Blocks telemeter testing

Comment 2 Junqi Zhao 2018-11-21 01:41:59 UTC
Tested with the new installer on a libvirt installation; no telemeter-client pod is created.

$ oc -n openshift-monitoring get pod
NAME                                           READY     STATUS    RESTARTS   AGE
alertmanager-main-0                            3/3       Running   0          5m
alertmanager-main-1                            3/3       Running   0          4m
alertmanager-main-2                            3/3       Running   0          3m
cluster-monitoring-operator-78f6f75c4b-cwkjm   1/1       Running   0          5m
grafana-57f595895d-ffs97                       2/2       Running   0          5m
kube-state-metrics-dcf7dc56d-zf42g             3/3       Running   0          58s
node-exporter-d54z4                            2/2       Running   0          3m
node-exporter-hltnm                            2/2       Running   0          3m
prometheus-k8s-0                               6/6       Running   0          1h
prometheus-k8s-1                               6/6       Running   1          25m
prometheus-operator-7ff6f965f-7l6cd            1/1       Running   0          5m

$ oc -n openshift-monitoring get cm
NAME                                        DATA      AGE
cluster-monitoring-config                   1         4h
grafana-dashboard-etcd                      1         4h
grafana-dashboard-k8s-cluster-rsrc-use      1         4h
grafana-dashboard-k8s-node-rsrc-use         1         4h
grafana-dashboard-k8s-resources-cluster     1         4h
grafana-dashboard-k8s-resources-namespace   1         4h
grafana-dashboard-k8s-resources-pod         1         4h
grafana-dashboards                          1         4h
prometheus-k8s-rulefiles-0                  1         1h
prometheus-serving-certs-ca-bundle          1         4h
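
A hedged way to narrow this case down, assuming this build only deploys the telemeter client when telemeter settings are present in cluster-monitoring-config (key names can differ between builds):

$ oc -n openshift-monitoring get deployment telemeter-client
$ oc -n openshift-monitoring get configmap cluster-monitoring-config -o yaml

The first command confirms whether a telemeter-client Deployment exists at all; the second shows whether any telemeter configuration was supplied.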

Comment 3 Junqi Zhao 2018-12-11 02:54:01 UTC
Tested with the new installer on AWS; the telemeter-client-serving-certs-ca-bundle configmap is created.
$ oc -n openshift-monitoring get pod
NAME                                           READY     STATUS    RESTARTS   AGE
alertmanager-main-0                            3/3       Running   0          12h
alertmanager-main-1                            3/3       Running   0          12h
alertmanager-main-2                            3/3       Running   0          12h
cluster-monitoring-operator-5fb87b895f-cmzpd   1/1       Running   1          12h
grafana-58456d859d-tpjkn                       2/2       Running   0          12h
kube-state-metrics-dcf7dc56d-dp2gv             3/3       Running   0          12h
node-exporter-64vgt                            2/2       Running   0          12h
node-exporter-6bdwr                            2/2       Running   0          12h
node-exporter-f4rmk                            2/2       Running   0          12h
node-exporter-j4c7d                            2/2       Running   0          12h
node-exporter-lnb9m                            2/2       Running   0          12h
node-exporter-n7nqx                            2/2       Running   0          12h
prometheus-adapter-bdc5f58cb-fv6rb             1/1       Running   0          12h
prometheus-k8s-0                               6/6       Running   1          12h
prometheus-k8s-1                               6/6       Running   1          12h
prometheus-operator-5456f94fb9-2jql6           1/1       Running   1          12h
telemeter-client-747b776f55-dcddl              3/3       Running   0          12h
$ oc -n openshift-monitoring get cm
NAME                                        DATA      AGE
adapter-config                              1         12h
cluster-monitoring-config                   1         12h
grafana-dashboard-k8s-cluster-rsrc-use      1         12h
grafana-dashboard-k8s-node-rsrc-use         1         12h
grafana-dashboard-k8s-resources-cluster     1         12h
grafana-dashboard-k8s-resources-namespace   1         12h
grafana-dashboard-k8s-resources-pod         1         12h
grafana-dashboards                          1         12h
prometheus-adapter-prometheus-config        1         12h
prometheus-k8s-rulefiles-0                  1         12h
serving-certs-ca-bundle                     1         12h
sharing-config                              3         12h
telemeter-client-serving-certs-ca-bundle    1         12h
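
To double-check that the recreated configmap actually carries a CA bundle, a quick hedged verification (grepping for a PEM header avoids guessing the exact data key name, which may vary by release):

$ oc -n openshift-monitoring get configmap telemeter-client-serving-certs-ca-bundle -o yaml | grep -c 'BEGIN CERTIFICATE'

A non-zero count means the serving CA was injected and the pod can mount the bundle.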

Used images:
docker.io/grafana/grafana:5.2.4
docker.io/openshift/oauth-proxy:v1.1.0
docker.io/openshift/prometheus-alertmanager:v0.15.2
docker.io/openshift/prometheus-node-exporter:v0.16.0
docker.io/openshift/prometheus:v2.5.0
quay.io/coreos/configmap-reload:v0.0.1
quay.io/coreos/k8s-prometheus-adapter-amd64:v0.4.0
quay.io/coreos/kube-rbac-proxy:v0.4.0
quay.io/coreos/kube-state-metrics:v1.4.0
quay.io/coreos/prom-label-proxy:v0.1.0
quay.io/coreos/prometheus-config-reloader:v0.26.0
quay.io/coreos/prometheus-operator:v0.26.0
quay.io/openshift/origin-configmap-reload:v3.11
quay.io/openshift/origin-telemeter:v4.0
registry.svc.ci.openshift.org/openshift/origin-v4.0-2018-12-07-201539@sha256:38cdc34ca6a8af948423655ca5f6a10eafa4c85763dd606c894ed6017d1e446e

$ oc version
oc v4.0.0-alpha.0+c5644a8-763
kubernetes v1.11.0+c5644a8
features: Basic-Auth GSSAPI Kerberos SPNEGO


The telemeter-client pod not being created at all (the libvirt case in comment 2) is tracked in Bug 1655841.

Comment 6 errata-xmlrpc 2019-06-04 10:41:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758