Bug 2018222

Summary: cluster-monitoring-operator produces more watch requests than expected
Product: OpenShift Container Platform
Reporter: Adam Kaplan <adam.kaplan>
Component: Monitoring
Assignee: Sunil Thaha <sthaha>
Status: CLOSED CURRENTRELEASE
QA Contact: Junqi Zhao <juzhao>
Severity: medium
Priority: unspecified
Version: 4.10
CC: amuller, anpicker, aos-bugs, erooth, janantha, jfajersk, spasquie
Target Release: 4.10.0
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Last Closed: 2021-11-12 16:42:48 UTC
Type: Bug
Bug Blocks: 2026802    

Description Adam Kaplan 2021-10-28 14:27:07 UTC
Description of problem:

CI tests for 4.10 are flaking: the following test reports that cluster-monitoring-operator creates too many watches:

[sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel] 

"Operator \"cluster-monitoring-operator\" produces more watch requests than expected: watchrequestcount=69, upperbound=66, ratio=1.0454545454545454"
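
To show how narrow the miss is, here is a minimal sketch that pulls the numbers out of the failure message and recomputes the ratio. It is not the e2e test's own code. The helper name `parse_watch_failure` is hypothetical, and the message is written here without the escaped quotes shown in the log.

```python
import re

# Failure message from the report, with the inner quotes unescaped.
MESSAGE = (
    'Operator "cluster-monitoring-operator" produces more watch requests '
    'than expected: watchrequestcount=69, upperbound=66, '
    'ratio=1.0454545454545454'
)


def parse_watch_failure(message):
    """Hypothetical helper: extract the operator, observed count and bound."""
    match = re.search(
        r'Operator "(?P<operator>[^"]+)".*?'
        r'watchrequestcount=(?P<count>\d+), upperbound=(?P<bound>\d+)',
        message,
    )
    if match is None:
        raise ValueError("unrecognized message format")
    count = int(match.group("count"))
    bound = int(match.group("bound"))
    return {
        "operator": match.group("operator"),
        "count": count,
        "bound": bound,
        "excess": count - bound,   # watches above the allowed bound
        "ratio": count / bound,    # same quantity the message calls "ratio"
    }


result = parse_watch_failure(MESSAGE)
print(result["operator"], result["excess"], result["ratio"])
```

With the values from this run, the operator is 3 watch requests over the bound, roughly 4.5% above the limit.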

Version-Release number of selected component (if applicable): 4.10


How reproducible: Sometimes


Steps to Reproduce:

Mainly non-IPI tests seem to be impacted (for example, 4.10 aws-upgrade).

Actual results:

[sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel] test fails


Expected results:

Tests pass


Additional info:

https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/22451/rehearse-22451-pull-ci-openshift-builder-release-4.10-openshift-e2e-aws-builds-techpreview/1452970495040294912

Comment 1 Jan Fajerski 2021-10-29 06:35:28 UTC
The fix for https://bugzilla.redhat.com/show_bug.cgi?id=2016352 might improve this.

Comment 2 Simon Pasquier 2021-11-12 16:42:48 UTC
IIUC, this is a generic issue with the e2e test that checks that operators don't create too many watches; it is not specific to the cluster monitoring operator.

The following PRs should improve the situation:
https://github.com/openshift/origin/pull/26583
https://github.com/openshift/origin/pull/26601