Created attachment 1768427 [details]
prometheus-operator container logs

Description of problem:
Too many "Failed to watch" error logs in prometheus-operator.

# oc -n openshift-monitoring logs prometheus-operator-77b774f8b7-shshz -c prometheus-operator | grep "Failed to"
...
level=error ts=2021-04-02T01:28:00.07726081Z caller=klog.go:96 component=k8s_client_runtime func=ErrorDepth msg="github.com/coreos/prometheus-operator/pkg/informers/informers.go:75: Failed to watch *v1alpha1.AlertmanagerConfig: the server has received too many requests and has asked us to try again later (get alertmanagerconfigs.monitoring.coreos.com)"
level=error ts=2021-04-02T01:28:00.077321503Z caller=klog.go:96 component=k8s_client_runtime func=ErrorDepth msg="github.com/coreos/prometheus-operator/pkg/informers/informers.go:75: Failed to watch *v1.PodMonitor: the server has received too many requests and has asked us to try again later (get podmonitors.monitoring.coreos.com)"
level=error ts=2021-04-02T01:28:00.077390741Z caller=klog.go:96 component=k8s_client_runtime func=ErrorDepth msg="github.com/coreos/prometheus-operator/pkg/informers/informers.go:75: Failed to watch *v1.PrometheusRule: the server has received too many requests and has asked us to try again later (get prometheusrules.monitoring.coreos.com)"
level=error ts=2021-04-02T01:28:00.077447557Z caller=klog.go:96 component=k8s_client_runtime func=ErrorDepth msg="github.com/coreos/prometheus-operator/pkg/informers/informers.go:75: Failed to watch *v1.PrometheusRule: the server has received too many requests and has asked us to try again later (get prometheusrules.monitoring.coreos.com)"
level=error ts=2021-04-02T01:28:00.077730349Z caller=klog.go:96 component=k8s_client_runtime func=ErrorDepth msg="github.com/coreos/prometheus-operator/pkg/informers/informers.go:75: Failed to watch *v1.ServiceMonitor: the server has received too many requests and has asked us to try again later (get servicemonitors.monitoring.coreos.com)"
level=error ts=2021-04-02T01:28:10.3644703Z caller=klog.go:96 component=k8s_client_runtime func=ErrorDepth msg="github.com/coreos/prometheus-operator/pkg/informers/informers.go:75: Failed to watch *v1.Probe: the server has received too many requests and has asked us to try again later (get probes.monitoring.coreos.com)"
level=error ts=2021-04-02T01:28:10.002522818Z caller=klog.go:96 component=k8s_client_runtime func=ErrorDepth msg="github.com/coreos/prometheus-operator/pkg/informers/informers.go:75: Failed to watch *v1.Secret: the server has received too many requests and has asked us to try again later (get secrets)"
level=error ts=2021-04-02T01:28:02.253620081Z caller=klog.go:96 component=k8s_client_runtime func=ErrorDepth msg="github.com/coreos/prometheus-operator/pkg/informers/informers.go:75: Failed to watch *v1.ConfigMap: the server has received too many requests and has asked us to try again later (get configmaps)"
...

Resources affected by the "Failed to watch" errors:
alertmanagerconfigs.monitoring.coreos.com
podmonitors.monitoring.coreos.com
prometheusrules.monitoring.coreos.com
servicemonitors.monitoring.coreos.com
probes.monitoring.coreos.com
configmaps
secrets

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-04-01-022345
Prometheus Operator 0.45.0

How reproducible:
always

Steps to Reproduce:
1. See the description.

Actual results:
Too many "Failed to watch" error logs in prometheus-operator.

Expected results:
No such error logs.

Additional info:
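To quantify the noise, the "Failed to watch" lines can be grouped by watched resource with standard shell tools. A quick sketch against the same pod as above (substitute your own prometheus-operator pod name):

# oc -n openshift-monitoring logs prometheus-operator-77b774f8b7-shshz -c prometheus-operator | grep "Failed to watch" | grep -oE '\(get [a-z.]+\)' | sort | uniq -c | sort -rn

This prints a count per "(get <resource>)" suffix, which makes it easy to see whether all informers are throttled equally or one resource dominates.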
Was it a fresh install? Did any alert fire?
(In reply to Simon Pasquier from comment #1)
> Was it a fresh install? Did any alert fire?

Yes, fresh install, and no alerts fired apart from the expected baseline ones:

# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/alerts' | jq '.data.alerts[] | {alertname: .labels.alertname, state: .state}'
{
  "alertname": "Watchdog",
  "state": "firing"
}
{
  "alertname": "AlertmanagerReceiversNotConfigured",
  "state": "firing"
}
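For reference, Watchdog always fires by design and AlertmanagerReceiversNotConfigured is expected until Alertmanager receivers are set up, so both are normal on a fresh install. For later checks, a jq filter that hides these two baseline alerts makes anything unexpected stand out (same token/curl pattern as above; just a sketch):

# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname != "Watchdog" and .labels.alertname != "AlertmanagerReceiversNotConfigured") | {alertname: .labels.alertname, state: .state}'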
Setting severity to low since there's no functional impact as far as we can tell.
This seems to be a side-effect of bug 1948311.
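The "too many requests" text corresponds to HTTP 429 responses from the kube-apiserver, so the throttling can be cross-checked from Prometheus itself. A sketch reusing the token/curl pattern from comment #2, assuming the standard apiserver_request_total metric (with its code label) is being scraped:

# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -G -H "Authorization: Bearer $token" 'https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/query' --data-urlencode 'query=sum by (resource) (rate(apiserver_request_total{code="429"}[5m]))' | jq '.data.result'

A sustained non-zero 429 rate for the monitoring.coreos.com resources would line up with the informer errors above.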
Also seen in a multi-replica cluster:

# oc -n openshift-monitoring logs -c prometheus-operator $(oc -n openshift-monitoring get pod | grep prometheus-operator | awk '{print $1}') | grep "Failed to"
...
level=error ts=2021-05-18T05:34:42.20180187Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="github.com/coreos/prometheus-operator/pkg/informers/informers.go:75: Failed to watch *v1.ConfigMap: the server has received too many requests and has asked us to try again later (get configmaps)"
level=error ts=2021-05-18T05:34:42.337929929Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="github.com/coreos/prometheus-operator/pkg/informers/informers.go:75: Failed to watch *v1.Probe: the server has received too many requests and has asked us to try again later (get probes.monitoring.coreos.com)"
level=error ts=2021-05-18T05:34:42.456446011Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="github.com/coreos/prometheus-operator/pkg/informers/informers.go:75: Failed to watch *v1alpha1.AlertmanagerConfig: the server has received too many requests and has asked us to try again later (get alertmanagerconfigs.monitoring.coreos.com)"
level=error ts=2021-05-18T05:34:42.465221578Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="github.com/coreos/prometheus-operator/pkg/informers/informers.go:75: Failed to watch *v1.Secret: the server has received too many requests and has asked us to try again later (get secrets)"
...
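As a side note, selecting the pod by label is less fragile than grep/awk if there were ever multiple matches. A sketch, assuming the deployment's pods carry the app.kubernetes.io/name=prometheus-operator label:

# oc -n openshift-monitoring logs -l app.kubernetes.io/name=prometheus-operator -c prometheus-operator | grep "Failed to"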
*** This bug has been marked as a duplicate of bug 1957190 ***