Description of problem:
Deploy cluster monitoring; the kube-state-metrics pod/service/deployment/replicaset are not created.

# kubectl -n openshift-monitoring get all
NAME                                               READY   STATUS    RESTARTS   AGE
pod/alertmanager-main-0                            3/3     Running   0          1h
pod/alertmanager-main-1                            3/3     Running   0          1h
pod/alertmanager-main-2                            3/3     Running   0          1h
pod/cluster-monitoring-operator-6d7c9f5759-256xw   1/1     Running   0          1h
pod/grafana-7476cc5c4b-pkxpb                       2/2     Running   0          1h
pod/node-exporter-42n9x                            2/2     Running   0          1h
pod/node-exporter-9s8b9                            2/2     Running   0          1h
pod/node-exporter-wxr46                            2/2     Running   0          1h
pod/prometheus-k8s-0                               4/4     Running   1          1h
pod/prometheus-k8s-1                               4/4     Running   1          1h
pod/prometheus-operator-7cbb8d577f-zk9w4           1/1     Running   0          1h

NAME                                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/alertmanager-main             ClusterIP   172.30.187.172   <none>        9094/TCP            1h
service/alertmanager-operated         ClusterIP   None             <none>        9093/TCP,6783/TCP   1h
service/cluster-monitoring-operator   ClusterIP   None             <none>        8080/TCP            1h
service/grafana                       ClusterIP   172.30.142.91    <none>        3000/TCP            1h
service/node-exporter                 ClusterIP   None             <none>        9100/TCP            1h
service/prometheus-k8s                ClusterIP   172.30.87.55     <none>        9091/TCP            1h
service/prometheus-operated           ClusterIP   None             <none>        9090/TCP            1h
service/prometheus-operator           ClusterIP   None             <none>        8080/TCP            1h

NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                 AGE
daemonset.apps/node-exporter   3         3         3       3            3           beta.kubernetes.io/os=linux   1h

NAME                                          DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cluster-monitoring-operator   1         1         1            1           1h
deployment.apps/grafana                       1         1         1            1           1h
deployment.apps/prometheus-operator           1         1         1            1           1h

NAME                                                     DESIRED   CURRENT   READY   AGE
replicaset.apps/cluster-monitoring-operator-6d7c9f5759   1         1         1       1h
replicaset.apps/grafana-7476cc5c4b                       1         1         1       1h
replicaset.apps/prometheus-operator-7cbb8d577f           1         1         1       1h

NAME                                 DESIRED   CURRENT   AGE
statefulset.apps/alertmanager-main   3         3         1h
statefulset.apps/prometheus-k8s      2         2         1h

NAME                                         HOST/PORT                                                             PATH   SERVICES            PORT    TERMINATION   WILDCARD
route.route.openshift.io/alertmanager-main   alertmanager-main-openshift-monitoring.apps.0816-xud.qe.rhcloud.com          alertmanager-main   web     reencrypt     None
route.route.openshift.io/grafana             grafana-openshift-monitoring.apps.0816-xud.qe.rhcloud.com                    grafana             https   reencrypt     None
route.route.openshift.io/prometheus-k8s      prometheus-k8s-openshift-monitoring.apps.0816-xud.qe.rhcloud.com             prometheus-k8s      web     reencrypt     None

Version-Release number of selected component (if applicable):
ose-prometheus-operator:v3.11.0-0.16.0.0
# openshift version
openshift v3.11.0-0.16.0

How reproducible:
Always

Steps to Reproduce:
1. Deploy cluster monitoring
2.
3.

Actual results:
No kube-state-metrics pod is created.

Expected results:
There should be a kube-state-metrics pod.

Additional info:
# parameters
openshift_cluster_monitoring_operator_install=true
openshift_cluster_monitoring_operator_node_selector={'role': 'node'}
Created attachment 1476373 [details] No CPU/Memory/IO data
Could you share the logs of the cluster-monitoring-operator Pod? Thanks!
cluster-monitoring-operator fails the task "Updating kube-state-metrics".

// cluster-monitoring-operator logs
I0816 08:22:58.371739 1 tasks.go:37] running task Updating kube-state-metrics
I0816 08:22:58.371807 1 decoder.go:224] decoding stream as YAML
I0816 08:22:58.379803 1 decoder.go:224] decoding stream as YAML
I0816 08:22:58.467631 1 decoder.go:224] decoding stream as YAML
E0816 08:22:58.482241 1 operator.go:206] Syncing "openshift-monitoring/cluster-monitoring-config" failed
E0816 08:22:58.482266 1 operator.go:207] sync "openshift-monitoring/cluster-monitoring-config" failed: running task Updating kube-state-metrics failed: reconciling kube-state-metrics ClusterRole failed: creating ClusterRole object failed: clusterroles.rbac.authorization.k8s.io "kube-state-metrics" is forbidden: attempt to grant extra privileges: [{[list] [apps] [replicasets] [] []} {[watch] [apps] [replicasets] [] []}] user=&{system:serviceaccount:openshift-monitoring:cluster-monitoring-operator 21b2106a-a11b-11e8-8686-0eee3f410034 [system:serviceaccounts system:serviceaccounts:openshift-monitoring system:authenticated] map[]} ownerrules=[{[get] [ user.openshift.io] [users] [~] []} {[list] [ project.openshift.io] [projectrequests] [] []} {[get list] [ authorization.openshift.io] [clusterroles] [] []} {[get list watch] [rbac.authorization.k8s.io] [clusterroles] [] []} {[get list] [storage.k8s.io] [storageclasses] [] []} {[list watch] [ project.openshift.io] [projects] [] []} {[create] [ authorization.openshift.io] [selfsubjectrulesreviews] [] []} {[create] [authorization.k8s.io] [selfsubjectaccessreviews] [] []} {[create get list watch update delete] [rbac.authorization.k8s.io] [roles rolebindings clusterroles clusterrolebindings] [] []} {[create get list watch update delete] [] [serviceaccounts] [] []} {[create get list watch update delete] [apps] [deployments daemonsets] [] []} {[create get list watch update delete] [route.openshift.io] [routes] [] []} {[create get list watch update delete] [security.openshift.io] [securitycontextconstraints] [] []} {[create] [authentication.k8s.io] [tokenreviews] [] []} {[create] [authorization.k8s.io] [subjectaccessreviews] [] []} {[list watch] [] [nodes pods services resourcequotas replicationcontrollers limitranges persistentvolumeclaims persistentvolumes namespaces endpoints] [] []} {[list watch] [extensions] [daemonsets deployments replicasets] [] []} {[list watch] [apps] [statefulsets] [] []} {[list watch] [batch] [cronjobs jobs] [] []} {[list watch] [autoscaling] [horizontalpodautoscalers] [] []} {[create] [authentication.k8s.io] [tokenreviews] [] []} {[create] [authorization.k8s.io] [subjectaccessreviews] [] []} {[get] [] [pods] [] []} {[get update] [extensions] [deployments] [kube-state-metrics] []} {[create] [authentication.k8s.io] [tokenreviews] [] []} {[create] [authorization.k8s.io] [subjectaccessreviews] [] []} {[get] [] [] [] [/metrics]} {[create] [authentication.k8s.io] [tokenreviews] [] []} {[create] [authorization.k8s.io] [subjectaccessreviews] [] []} {[get] [] [namespaces nodes/metrics] [] []} {[get list watch] [] [nodes services endpoints pods] [] []} {[get] [] [configmaps] [] []} {[*] [extensions] [thirdpartyresources] [] []} {[*] [apiextensions.k8s.io] [customresourcedefinitions] [] []} {[*] [monitoring.coreos.com] [alertmanagers prometheuses prometheuses/finalizers alertmanagers/finalizers servicemonitors prometheusrules] [] []} {[*] [apps] [statefulsets] [] []} {[*] [] [configmaps secrets] [] []} {[list delete] [] [pods] [] []} {[get create update] [] [services endpoints] [] []} {[list watch] [] [nodes] [] []} {[list] [] [namespaces] [] []} {[get] [] [] [] [/healthz /healthz/*]} {[get] [] [] [] [/version /version/* /api /api/* /apis /apis/* /oapi /oapi/* /openapi/v2 /swaggerapi /swaggerapi/* /swagger.json /swagger-2.0.0.pb-v1 /osapi /osapi/ /.well-known /.well-known/* /]} {[create] [ authorization.openshift.io] [selfsubjectrulesreviews] [] []} {[create] [authorization.k8s.io] [selfsubjectaccessreviews] [] []} {[list watch get] [servicecatalog.k8s.io] [clusterserviceclasses clusterserviceplans] [] []} {[create] [authorization.k8s.io] [selfsubjectaccessreviews selfsubjectrulesreviews] [] []} {[create] [ build.openshift.io] [builds/docker builds/optimizeddocker] [] []} {[create] [ build.openshift.io] [builds/jenkinspipeline] [] []} {[create] [ build.openshift.io] [builds/source] [] []} {[get] [] [] [] [/version /version/* /api /api/* /apis /apis/* /oapi /oapi/* /openapi/v2 /swaggerapi /swaggerapi/* /swagger.json /swagger-2.0.0.pb-v1 /osapi /osapi/ /.well-known /.well-known/* /]} {[get] [] [] [] [/version /version/* /api /api/* /apis /apis/* /oapi /oapi/* /openapi/v2 /swaggerapi /swaggerapi/* /swagger.json /swagger-2.0.0.pb-v1 /osapi /osapi/ /.well-known /.well-known/* /]} {[delete] [ oauth.openshift.io] [oauthaccesstokens oauthauthorizetokens] [] []} {[impersonate] [authentication.k8s.io] [userextras/scopes.authorization.openshift.io] [] []} {[create get] [ build.openshift.io] [buildconfigs/webhooks] [] []}] ruleResolutionErrors=[]
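For reference, the denied privileges in the error are list/watch on replicasets in the apps API group. Below is a minimal sketch, reconstructed from the error message rather than the exact manifest shipped by the operator, of the rule the kube-state-metrics ClusterRole needs. Under Kubernetes RBAC escalation prevention, the operator's service account can only create a ClusterRole containing permissions it already holds itself, which is why the create is forbidden:

```yaml
# Hypothetical sketch reconstructed from the forbidden-privileges error;
# not the exact manifest the operator ships.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
  # Exactly the privileges the operator was denied permission to grant:
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["list", "watch"]
```

The fix therefore has to land on the operator's own service account (via openshift-ansible), so that it holds apps/replicasets list+watch before it tries to grant them.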
The pull request to fix this was merged: https://github.com/openshift/openshift-ansible/pull/9626
The issue is fixed:

# oc get pod | grep kube-state-metrics
kube-state-metrics-776f9667b-dzmsz   3/3   Running   0   7m

But in the Grafana UI, no instance is listed under Nodes.
Created attachment 1476606 [details] no instance listed under Nodes and CPU/Memory/Dish data is empty
This defect can be set to ON_QA; the issues mentioned in Comment 5 - Comment 6 are tracked in Bug 1619132.
The issue is fixed.

cluster monitoring component images version: v3.11.0-0.17.0.0
# openshift version
openshift v3.11.0-0.17.0
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2652