1865742 – Failed to watch errors in kube-state-metrics container logs

Summary: Failed to watch errors in kube-state-metrics container logs

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	kube-apiserver
Sub Component:
Version:	4.6
Hardware:	Unspecified
OS:	Unspecified
Priority:	low
Severity:	low
Target Milestone:	---
Target Release:	4.6.0
Assignee:	Sergiusz Urbaniak
QA Contact:	Junqi Zhao
Docs Contact:
URL:
Whiteboard:	LifecycleReset
Depends On:
Blocks:

Reported:	2020-08-04 03:39 UTC by Junqi Zhao
Modified:	2020-10-27 16:23 UTC
CC List:	11 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2020-10-27 16:23:11 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2020:4196	0	None	None	None	2020-10-27 16:23:27 UTC

Description Junqi Zhao 2020-08-04 03:39:31 UTC

Description of problem:
"Failed to watch" errors appear in the kube-state-metrics container logs:
# oc -n openshift-monitoring logs kube-state-metrics-7d8f88b5f7-2w8sc -c kube-state-metrics
I0804 02:14:25.340948       1 main.go:86] Using default collectors
I0804 02:14:25.341072       1 main.go:98] Using all namespace
I0804 02:14:25.341093       1 main.go:139] metric white-blacklisting: blacklisting the following items: kube_secret_labels
W0804 02:14:25.341111       1 client_config.go:543] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0804 02:14:25.343113       1 main.go:186] Testing communication with server
I0804 02:14:25.355312       1 main.go:191] Running with Kubernetes cluster version: v4.6+. git version: v4.6.0-202008030720.p0-dirty. git tree state: dirty. commit: 64529ef6458777ac400f4c1bf78b1dabea082fa4. platform: linux/amd64
I0804 02:14:25.355340       1 main.go:193] Communication with server successful
I0804 02:14:25.355504       1 main.go:227] Starting metrics server: 127.0.0.1:8081
I0804 02:14:25.355755       1 metrics_handler.go:96] Autosharding disabled
I0804 02:14:25.356897       1 builder.go:156] Active collectors: certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,limitranges,mutatingwebhookconfigurations,namespaces,networkpolicies,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,validatingwebhookconfigurations,volumeattachments
I0804 02:14:25.357258       1 main.go:202] Starting kube-state-metrics self metrics server: 127.0.0.1:8082
E0804 02:27:43.432795       1 reflector.go:307] k8s.io/kube-state-metrics/internal/store/builder.go:346: Failed to watch *v1.Deployment: unknown (get deployments.apps)
E0804 02:50:42.390754       1 reflector.go:307] k8s.io/kube-state-metrics/internal/store/builder.go:346: Failed to watch *v1.Job: unknown (get jobs.batch)
E0804 02:50:43.395802       1 reflector.go:153] k8s.io/kube-state-metrics/internal/store/builder.go:346: Failed to list *v1.Job: jobs.batch is forbidden: User "system:serviceaccount:openshift-monitoring:kube-state-metrics" cannot list resource "jobs" in API group "batch" at the cluster scope
E0804 02:57:02.406234       1 reflector.go:307] k8s.io/kube-state-metrics/internal/store/builder.go:346: Failed to watch *v1.StorageClass: unknown (get storageclasses.storage.k8s.io)
E0804 02:57:03.412278       1 reflector.go:307] k8s.io/kube-state-metrics/internal/store/builder.go:346: Failed to watch *v1.StorageClass: unknown (get storageclasses.storage.k8s.io)

# oc explain Deployment
KIND:     Deployment
VERSION:  apps/v1
...
************************
# oc explain StorageClass
KIND:     StorageClass
VERSION:  storage.k8s.io/v1
...
***********
# oc explain jobs
KIND:     Job
VERSION:  batch/v1
...
Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-08-03-143208
# /usr/bin/kube-state-metrics --version
version.Version{GitCommit:"f113959", BuildDate:"2020-08-02T15:17:20Z", Release:"v1.9.7", GoVersion:"go1.14.4", Compiler:"gc", Platform:"linux/amd64"}


How reproducible:
always

Steps to Reproduce:
1. Check the kube-state-metrics container logs as shown in the description.

Actual results:


Expected results:


Additional info:

Comment 1 Sergiusz Urbaniak 2020-08-04 06:45:53 UTC

@junqi Is the error persistent, and is there any degradation in functionality? Or are the above log lines just temporary?

Comment 2 Sergiusz Urbaniak 2020-08-05 14:57:51 UTC

I just rechecked on my cluster (openshift-install-linux-4.6.0-0.nightly-2020-08-05-013608.tar.gz). While not pretty, it is just a few log lines at the start of the pods, and it appears to be a race with API objects.

Lowering severity to low and reassigning to the API server team to assess whether there are currently chicken-and-egg problems in 4.6 with the availability of API objects at early startup stages.
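
To judge whether such errors are only a handful of log lines or a recurring pattern, the reflector failures can be counted per resource kind. The following is a minimal sketch, not part of the original report: the function name `summarize_reflector_errors` is hypothetical, and the patterns assume the klog line format shown in the description (error lines prefixed with `E` and emitted from `reflector.go`).

```shell
#!/bin/sh
# Hypothetical helper: summarize reflector errors from kube-state-metrics
# logs read on stdin. Prints one line per (action, kind) pair in the form
# "<count> <action> <kind>", for example "2 watch v1.StorageClass".
summarize_reflector_errors() {
  # klog error lines start with "E"; keep only reflector list/watch failures.
  grep -E '^E[0-9]{4} .*reflector\.go:[0-9]+\].*Failed to (list|watch) \*' |
    # Reduce each line to "<action> <kind>", e.g. "watch v1.Deployment".
    sed -E 's/.*Failed to (list|watch) \*([^:]+):.*/\1 \2/' |
    # Count identical pairs and normalize the whitespace added by uniq -c.
    sort | uniq -c | awk '{print $1, $2, $3}'
}
```

The function reads from standard input, so it could be fed by the same command used in the description, for example `oc -n openshift-monitoring logs <pod> -c kube-state-metrics | summarize_reflector_errors`, where `<pod>` is the kube-state-metrics pod name. An empty result means no list or watch failures were logged.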

Comment 5 Michal Fojtik 2020-09-05 12:18:12 UTC

This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen into Keywords if you think this bug should never be marked as stale. Please consult with bug assignee before you do that.

Comment 6 Junqi Zhao 2020-09-07 01:17:26 UTC

Tested with 4.6.0-0.nightly-2020-09-05-015624; the error no longer appears.
# oc -n openshift-monitoring logs $(oc -n openshift-monitoring get po | grep kube-state-metrics | awk '{print $1}') -c kube-state-metrics

I0906 23:36:29.112878       1 main.go:86] Using default collectors
I0906 23:36:29.112978       1 main.go:98] Using all namespace
I0906 23:36:29.112998       1 main.go:139] metric white-blacklisting: blacklisting the following items: kube_secret_labels
W0906 23:36:29.113011       1 client_config.go:543] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0906 23:36:29.115697       1 main.go:186] Testing communication with server
I0906 23:36:29.128256       1 main.go:191] Running with Kubernetes cluster version: v1.19+. git version: v1.19.0-rc.2+068702d. git tree state: clean. commit: 068702de7d48739e835ea41b7ca959b5252de432. platform: linux/amd64
I0906 23:36:29.128271       1 main.go:193] Communication with server successful
I0906 23:36:29.128411       1 main.go:227] Starting metrics server: 127.0.0.1:8081
I0906 23:36:29.128449       1 main.go:202] Starting kube-state-metrics self metrics server: 127.0.0.1:8082
I0906 23:36:29.129188       1 metrics_handler.go:96] Autosharding disabled
I0906 23:36:29.130246       1 builder.go:156] Active collectors: certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,limitranges,mutatingwebhookconfigurations,namespaces,networkpolicies,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,validatingwebhookconfigurations,volumeattachments

Comment 7 Michal Fojtik 2020-09-07 01:18:18 UTC

The LifecycleStale keyword was removed because the bug moved to QE and the bug got commented on recently.
The bug assignee was notified.

Comment 10 errata-xmlrpc 2020-10-27 16:23:11 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196
