Bug 1888595 - cluster-policy-controller logs shows error which reads initial monitor sync has error
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-controller-manager
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Maciej Szulik
QA Contact: RamaKasturi
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-10-15 09:47 UTC by RamaKasturi
Modified: 2021-02-24 15:26 UTC
CC List: 2 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:26:15 UTC
Target Upstream Version:
Embargoed:




Links:
Red Hat Product Errata RHSA-2020:5633 (last updated 2021-02-24 15:26:32 UTC)

Description RamaKasturi 2020-10-15 09:47:05 UTC
Description of Problem:
The error 'initial monitor sync has error: couldn't start monitor for resource "monitoring.coreos.com/v1, Resource=prometheusrules": unable to monitor quota' is shown in the cluster-policy-controller log files:

+ exec cluster-policy-controller start --config=/etc/kubernetes/static-pod-resources/configmaps/cluster-policy-controller-config/config.yaml
I1015 07:47:53.315701       1 policy_controller.go:41] Starting controllers on 0.0.0.0:10357 (3fd48871)
I1015 07:47:53.320474       1 standalone_apiserver.go:103] Started health checks at 0.0.0.0:10357
I1015 07:47:53.323100       1 leaderelection.go:243] attempting to acquire leader lease  openshift-kube-controller-manager/cluster-policy-controller...
I1015 07:47:53.359024       1 leaderelection.go:253] successfully acquired lease openshift-kube-controller-manager/cluster-policy-controller
I1015 07:47:53.359446       1 event.go:282] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"openshift-kube-controller-manager", Name:"cluster-policy-controller", UID:"d4277735-57ce-4dad-b606-b1a73c329af3", APIVersion:"v1", ResourceVersion:"53538", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' ip-10-0-187-93 became leader
I1015 07:47:53.522689       1 policy_controller.go:144] Started "openshift.io/resourcequota"
I1015 07:47:53.522943       1 resource_quota_controller.go:272] Starting resource quota controller
I1015 07:47:53.522953       1 shared_informer.go:240] Waiting for caches to sync for resource quota
I1015 07:47:54.720517       1 request.go:645] Throttling request took 1.040959323s, request: GET:https://api-int.knarra1015.qe.devcluster.openshift.com:6443/apis/security.openshift.io/v1?timeout=32s
I1015 07:47:56.621219       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for deployments.apps
I1015 07:47:56.621371       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for endpointslices.discovery.k8s.io
I1015 07:47:56.621405       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for ingresses.extensions
I1015 07:47:56.621434       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for controllerrevisions.apps
I1015 07:47:56.632770       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for ingresses.networking.k8s.io
I1015 07:47:56.632811       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for horizontalpodautoscalers.autoscaling
I1015 07:47:56.632834       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for cronjobs.batch
I1015 07:47:56.632895       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for statefulsets.apps
I1015 07:47:56.632920       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for poddisruptionbudgets.policy
I1015 07:47:56.632956       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for rolebindings.rbac.authorization.k8s.io
I1015 07:47:56.633030       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for serviceaccounts
I1015 07:47:56.633060       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for events.events.k8s.io
I1015 07:47:56.633112       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for deploymentconfigs.apps.openshift.io
I1015 07:47:56.633581       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for buildconfigs.build.openshift.io
I1015 07:47:56.638424       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for routes.route.openshift.io
I1015 07:47:56.638774       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for replicasets.apps
I1015 07:47:56.638860       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for limitranges
I1015 07:47:56.638925       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for jobs.batch
I1015 07:47:56.638963       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for leases.coordination.k8s.io
I1015 07:47:56.639054       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for endpoints
I1015 07:47:56.639132       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for daemonsets.apps
I1015 07:47:56.639183       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for networkpolicies.networking.k8s.io
I1015 07:47:56.639224       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for roles.rbac.authorization.k8s.io
I1015 07:47:56.639319       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for builds.build.openshift.io
I1015 07:47:56.639398       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for podtemplates
E1015 07:47:56.639532       1 reconciliation_controller.go:123] initial monitor sync has error: [couldn't start monitor for resource "monitoring.coreos.com/v1, Resource=prometheusrules": unable to monitor quota for resource "monitoring.coreos.com/v1, Resource=prometheusrules", couldn't start monitor for resource "template.openshift.io/v1, Resource=templates": unable to monitor quota for resource "template.openshift.io/v1, Resource=templates", couldn't start monitor for resource "monitoring.coreos.com/v1, Resource=alertmanagers": unable to monitor quota for resource "monitoring.coreos.com/v1, Resource=alertmanagers", couldn't start monitor for resource "whereabouts.cni.cncf.io/v1alpha1, Resource=ippools": unable to monitor quota for resource "whereabouts.cni.cncf.io/v1alpha1, Resource=ippools", couldn't start monitor for resource "authorization.openshift.io/v1, Resource=rolebindingrestrictions": unable to monitor quota for resource "authorization.openshift.io/v1, Resource=rolebindingrestrictions", couldn't start monitor for resource "autoscaling.openshift.io/v1beta1, Resource=machineautoscalers": unable to monitor quota for resource "autoscaling.openshift.io/v1beta1, Resource=machineautoscalers", couldn't start monitor for resource "cloudcredential.openshift.io/v1, Resource=credentialsrequests": unable to monitor quota for resource "cloudcredential.openshift.io/v1, Resource=credentialsrequests", couldn't start monitor for resource "tuned.openshift.io/v1, Resource=profiles": unable to monitor quota for resource "tuned.openshift.io/v1, Resource=profiles", couldn't start monitor for resource "operators.coreos.com/v1, Resource=operatorgroups": unable to monitor quota for resource "operators.coreos.com/v1, Resource=operatorgroups", couldn't start monitor for resource "snapshot.storage.k8s.io/v1beta1, Resource=volumesnapshots": unable to monitor quota for resource "snapshot.storage.k8s.io/v1beta1, Resource=volumesnapshots", couldn't start monitor for resource "template.openshift.io/v1, Resource=templateinstances": unable to monitor quota for resource "template.openshift.io/v1, Resource=templateinstances", couldn't start monitor for resource "monitoring.coreos.com/v1, Resource=probes": unable to monitor quota for resource "monitoring.coreos.com/v1, Resource=probes", couldn't start monitor for resource "operator.openshift.io/v1, Resource=ingresscontrollers": unable to monitor quota for resource "operator.openshift.io/v1, Resource=ingresscontrollers", couldn't start monitor for resource "operators.coreos.com/v1alpha1, Resource=catalogsources": unable to monitor quota for resource "operators.coreos.com/v1alpha1, Resource=catalogsources", couldn't start monitor for resource "k8s.cni.cncf.io/v1, Resource=network-attachment-definitions": unable to monitor quota for resource "k8s.cni.cncf.io/v1, Resource=network-attachment-definitions", couldn't start monitor for resource "operators.coreos.com/v1alpha1, Resource=subscriptions": unable to monitor quota for resource "operators.coreos.com/v1alpha1, Resource=subscriptions", couldn't start monitor for resource "operators.coreos.com/v1alpha1, Resource=installplans": unable to monitor quota for resource "operators.coreos.com/v1alpha1, Resource=installplans", couldn't start monitor for resource "machine.openshift.io/v1beta1, Resource=machines": unable to monitor quota for resource "machine.openshift.io/v1beta1, Resource=machines", couldn't start monitor for resource "monitoring.coreos.com/v1, Resource=prometheuses": unable to monitor quota for resource 
"monitoring.coreos.com/v1, Resource=prometheuses", couldn't start monitor for resource "monitoring.coreos.com/v1, Resource=servicemonitors": unable to monitor quota for resource "monitoring.coreos.com/v1, Resource=servicemonitors", couldn't start monitor for resource "network.operator.openshift.io/v1, Resource=operatorpkis": unable to monitor quota for resource "network.operator.openshift.io/v1, Resource=operatorpkis", couldn't start monitor for resource "metal3.io/v1alpha1, Resource=baremetalhosts": unable to monitor quota for resource "metal3.io/v1alpha1, Resource=baremetalhosts", couldn't start monitor for resource "whereabouts.cni.cncf.io/v1alpha1, Resource=overlappingrangeipreservations": unable to monitor quota for resource "whereabouts.cni.cncf.io/v1alpha1, Resource=overlappingrangeipreservations", couldn't start monitor for resource "monitoring.coreos.com/v1, Resource=thanosrulers": unable to monitor quota for resource "monitoring.coreos.com/v1, Resource=thanosrulers", couldn't start monitor for resource "ingress.operator.openshift.io/v1, Resource=dnsrecords": unable to monitor quota for resource "ingress.operator.openshift.io/v1, Resource=dnsrecords", couldn't start monitor for resource "tuned.openshift.io/v1, Resource=tuneds": unable to monitor quota for resource "tuned.openshift.io/v1, Resource=tuneds", couldn't start monitor for resource "operators.coreos.com/v1alpha1, Resource=clusterserviceversions": unable to monitor quota for resource "operators.coreos.com/v1alpha1, Resource=clusterserviceversions", couldn't start monitor for resource "network.openshift.io/v1, Resource=egressnetworkpolicies": unable to monitor quota for resource "network.openshift.io/v1, Resource=egressnetworkpolicies", couldn't start monitor for resource "machine.openshift.io/v1beta1, Resource=machinehealthchecks": unable to monitor quota for resource "machine.openshift.io/v1beta1, Resource=machinehealthchecks", couldn't start monitor for resource "monitoring.coreos.com/v1, Resource=podmonitors": unable to monitor quota for resource "monitoring.coreos.com/v1, Resource=podmonitors", couldn't start monitor for resource "controlplane.operator.openshift.io/v1alpha1, Resource=podnetworkconnectivitychecks": unable to monitor quota for resource "controlplane.operator.openshift.io/v1alpha1, Resource=podnetworkconnectivitychecks", couldn't start monitor for resource "machine.openshift.io/v1beta1, Resource=machinesets": unable to monitor quota for resource "machine.openshift.io/v1beta1, Resource=machinesets"]
I1015 07:47:56.639591       1 policy_controller.go:144] Started "openshift.io/cluster-quota-reconciliation"
I1015 07:47:56.639658       1 clusterquotamapping.go:127] Starting ClusterQuotaMappingController controller
I1015 07:47:56.639765       1 reconciliation_controller.go:136] Starting the cluster quota reconciliation controller
I1015 07:47:56.639801       1 resource_quota_monitor.go:303] QuotaMonitor running
I1015 07:47:56.684820       1 policy_controller.go:144] Started "openshift.io/namespace-security-allocation"
I1015 07:47:56.684928       1 policy_controller.go:147] Started Origin Controllers
I1015 07:47:56.725427       1 shared_informer.go:247] Caches are synced for resource quota 
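
Note: every resource in the failed-monitor list above is served by a CustomResourceDefinition (monitoring.coreos.com, operators.coreos.com, whereabouts.cni.cncf.io, etc.) rather than a built-in API, which fits a startup race against CRD discovery. One way to confirm those groups are discoverable once the cluster settles (a sketch; group names taken from the error above):

$ oc api-resources --api-group=monitoring.coreos.com
$ oc api-resources --api-group=operators.coreos.com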


Version-Release number of selected component (if applicable):
[knarra@knarra openshift-client-linux-4.7.0-0.nightly-2020-10-15-011122]$ ./oc version
Client Version: 4.7.0-0.nightly-2020-10-15-011122
Server Version: 4.7.0-0.nightly-2020-10-15-011122
Kubernetes Version: v1.19.0+1110e21


How Reproducible:
Always

Steps to Reproduce:
1. Install the latest 4.7 cluster.
2. Run oc logs -f <KCM_pod> -c cluster-policy-controller -n openshift-kube-controller-manager (one way to resolve <KCM_pod> is shown below).
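
For step 2, a minimal sketch for resolving the <KCM_pod> placeholder (the app=kube-controller-manager label selector is an assumption):

$ KCM_POD=$(oc get pods -n openshift-kube-controller-manager -l app=kube-controller-manager -o name | head -1)
$ oc logs -f "$KCM_POD" -c cluster-policy-controller -n openshift-kube-controller-manager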

Actual Results:
Errors seen in the logs which read 'initial monitor sync has error: couldn't start monitor for resource "monitoring.coreos.com/v1, Resource=prometheusrules": unable to monitor quota'.

Expected Results:
No errors should be seen in the cluster-policy-controller logs.

Additional Info:
No such errors were seen in the cluster-policy-controller logs of a 4.6 cluster.

Comment 1 Maciej Szulik 2020-10-23 10:51:49 UTC
Looks like this was a temporary issue with monitoring. I've just verified this against cluster 4.7.0-0.nightly-2020-10-21-001511 and I'm not seeing any such problems.
Moving to QA for verification.

Comment 3 RamaKasturi 2020-10-28 06:12:27 UTC
Verified the bug with the payload below. As confirmed by dev, this is a temporary problem seen while the cluster is starting up and just a timing issue, when one component is faster than the other, which in a distributed system is perfectly normal and not a problem.

[knarra@knarra openshift-client-linux-4.7.0-0.nightly-2020-10-27-051128]$ ./oc version
Client Version: 4.7.0-0.nightly-2020-10-27-051128
Server Version: 4.7.0-0.nightly-2020-10-27-051128
Kubernetes Version: v1.19.0+e67f5dc
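
To double-check, the cluster-policy-controller logs can be scanned for the original error string; a minimal check (pod resolution as in the reproduction steps above):

$ oc logs "$KCM_POD" -c cluster-policy-controller -n openshift-kube-controller-manager | grep -i "initial monitor sync"

An empty result indicates the startup error is no longer logged.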

Reading further in the logs we see the lines below, which mean the cluster-policy-controller was requested to terminate (probably due to the previous errors) and restarted cleanly:

+ timeout 3m /bin/bash -exuo pipefail -c 'while [ -n "$(ss -Htanop \( sport = 10357 \))" ]; do sleep 1; done'
++ ss -Htanop '(' sport = 10357 ')'
+ '[' -n '' ']'
+ exec cluster-policy-controller start --config=/etc/kubernetes/static-pod-resources/configmaps/cluster-policy-controller-config/config.yaml
I1027 12:50:11.213831       1 policy_controller.go:41] Starting controllers on 0.0.0.0:10357 (a9cad6a4)
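
For context, the timeout/ss loop above waits (up to 3 minutes) for any previous process to release port 10357, the cluster-policy-controller serving port seen at startup, before the new process is exec'd. The same wait in isolation, with the check annotated:

$ # loop while any socket on local port 10357 still exists; give up after 3 minutes
$ timeout 3m /bin/bash -c 'while [ -n "$(ss -Htanop \( sport = 10357 \))" ]; do sleep 1; done'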


Based on the above, moving the bug to verified state.

Comment 6 errata-xmlrpc 2021-02-24 15:26:15 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

