Bug 2060406
| Summary: | Test 'operators should not create watch channels very often' fails | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jan Chaloupka <jchaloup> |
| Component: | Monitoring | Assignee: | Jan Chaloupka <jchaloup> |
| Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao> |
| Severity: | low | Docs Contact: | |
| Priority: | low | ||
| Version: | 4.11 | CC: | amuller, anpicker, aos-bugs, juzhao, spasquie, sthaha |
| Target Milestone: | --- | ||
| Target Release: | 4.11.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | No Doc Update | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-08-10 10:52:11 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 2059845 | ||
| Attachments: | |||
|
Description
Jan Chaloupka
2022-03-03 13:19:21 UTC
searched with https://search.ci.openshift.org/?search=operators+should+not+create+watch+channels+very+often&maxAge=48h&context=1&type=bug%2Bjunit&name=4.11.*&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job still have issue with 1. VSphere Platform https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-vsphere-techpreview-serial/1501277588121718784 : [sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel] expand_less 1s fail [github.com/openshift/origin/test/extended/apiserver/api_requests.go:448]: Expected <[]string | len:1, cap:1>: [ "Operator \"cluster-monitoring-operator\" produces more watch requests than expected: watchrequestcount=87, upperbound=84, ratio=1.0357142857142858", code: https://github.com/openshift/origin/blob/master/test/extended/apiserver/api_requests.go#L235 2. single node AWS cluster https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-openshift-e2e-aws-single-node-workers-upgrade-conformance/1501289618778099712 : [sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel] expand_less Run #0: Failed expand_less 2s fail [github.com/openshift/origin/test/extended/apiserver/api_requests.go:448]: Expected <[]string | len:1, cap:1>: [ "Operator \"cluster-monitoring-operator\" produces more watch requests than expected: watchrequestcount=115, upperbound=96, ratio=1.1979166666666667", ] code: https://github.com/openshift/origin/blob/master/test/extended/apiserver/api_requests.go#L299 Created attachment 1869931 [details]
periodic-ci-openshift-release-master-nightly-4.11-e2e-vsphere-techpreview-serial CMO
Created attachment 1869932 [details]
periodic-ci-openshift-release-master-nightly-4.11-openshift-e2e-aws-single-node-workers-upgrade-conformance CMO
# periodic-ci-openshift-release-master-nightly-4.11-openshift-e2e-aws-single-node-workers-upgrade-conformance sequence 69 68 63 61 62 64 78 60 69 53 66 52 73 64 66 57 61 59 64 68 65 67 52 58 60 68 57 64 57 57 71 65 86 Giving 64 as the median # periodic-ci-openshift-release-master-nightly-4.11-e2e-vsphere-techpreview-serial sequence 101 111 94 86 101 90 109 77 115 138 105 91 102 104 93 86 90 78 65 92 88 110 110 84 100 113 89 100 99 95 99 87 107 86 89 71 108 90 124 95 62 93 96 90 94 90 87 88 80 96 85 88 105 109 107 88 85 91 130 98 102 81 93 94 There are some spikes but mostly holding around 93/94 based on the median. searched with https://search.ci.openshift.org/?search=operators+should+not+create+watch+channels+very+often&maxAge=48h&context=1&type=bug%2Bjunit&name=4.11.*&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job we also need to update the upper bound for GCP platform https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-gcp-ovn-upgrade/1512265217537478656 : [sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel] { fail [github.com/openshift/origin/test/extended/apiserver/api_requests.go:449]: Expected <[]string | len:1, cap:1>: [ "Operator \"cluster-monitoring-operator\" produces more watch requests than expected: watchrequestcount=97, upperbound=96, ratio=1.0104166666666667", ] code: https://github.com/openshift/origin/blob/master/test/extended/apiserver/api_requests.go#L179 Created attachment 1871841 [details]
GCP CMO api WATCH request counts
we should also bump the apirequests upper bounds for aws single node, most of the time 64 is fine, sometimes would be failed, example: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-openshift-e2e-aws-single-node-workers-upgrade-conformance/1513613641847410688 : [sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel] Run #0: Failed expand_less 2s { fail [github.com/openshift/origin/test/extended/apiserver/api_requests.go:449]: Expected <[]string | len:1, cap:1>: [ "Operator \"cluster-monitoring-operator\" produces more watch requests than expected: watchrequestcount=131, upperbound=128, ratio=1.0234375", ] code: https://github.com/openshift/origin/blob/master/test/extended/apiserver/api_requests.go#L300 same for openstack https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-shiftstack-shiftstack-ci-main-periodic-4.11-e2e-openstack-ccm-install/1505593005082415104 : [sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel] { fail [github.com/openshift/origin/test/extended/apiserver/api_requests.go:448]: Expected <[]string | len:1, cap:1>: [ "Operator \"cluster-monitoring-operator\" produces more watch requests than expected: watchrequestcount=107, upperbound=96, ratio=1.1145833333333333", ] code: https://github.com/openshift/origin/blob/master/test/extended/apiserver/api_requests.go#L266 GCP/VSphere are fine, no issues now This could go on forever. Let's keep it as it is change the upper bounds in another iteration. Having flakes from time to time is acceptable. In some case the sudden increases are expected when many tests fail. Retrying and waiting can lead to higher number of watch requests due to tests/operators running longer. As long as the flakes does not appear often/regularly, it is acceptable to ignore occasional blips. thanks, based on Comment 14 - 17, since most of time there is not issue with the upper bound, and checked for aws SNO run: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.11-informing#periodic-ci-openshift-release-master-nightly-4.11-openshift-e2e-aws-single-node-workers-upgrade-conformance for case "openshift-tests.[sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel]", 1 in 23 case failures is related to cluster-monitoring-operator for openstack ccm, the failure ratio is 6/9, other configurations on openstack does not have such high failed ratio. https://testgrid.k8s.io/redhat-openshift-ocp-release-4.11-informing#periodic-ci-shiftstack-shiftstack-ci-main-periodic-4.11-e2e-openstack-ccm-install 6 out of 9 failures are related to cluster-monitoring-operator 1. https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-shiftstack-shiftstack-ci-main-periodic-4.11-e2e-openstack-ccm-install/1504505512811565056 : [sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel] { fail [github.com/openshift/origin/test/extended/apiserver/api_requests.go:448]: Expected <[]string | len:1, cap:1>: [ "Operator \"cluster-monitoring-operator\" produces more watch requests than expected: watchrequestcount=104, upperbound=96, ratio=1.0833333333333333", ] 2. https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-shiftstack-shiftstack-ci-main-periodic-4.11-e2e-openstack-ccm-install/1505230613387415552 : [sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel] { fail [github.com/openshift/origin/test/extended/apiserver/api_requests.go:448]: Expected <[]string | len:1, cap:1>: [ "Operator \"cluster-monitoring-operator\" produces more watch requests than expected: watchrequestcount=98, upperbound=96, ratio=1.0208333333333333", ] 3. https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-shiftstack-shiftstack-ci-main-periodic-4.11-e2e-openstack-ccm-install/1505593005082415104 : [sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel] { fail [github.com/openshift/origin/test/extended/apiserver/api_requests.go:448]: Expected <[]string | len:1, cap:1>: [ "Operator \"cluster-monitoring-operator\" produces more watch requests than expected: watchrequestcount=107, upperbound=96, ratio=1.1145833333333333", ] 4. https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-shiftstack-shiftstack-ci-main-periodic-4.11-e2e-openstack-ccm-install/1508130335605395456 : [sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel] { fail [github.com/openshift/origin/test/extended/apiserver/api_requests.go:448]: Expected <[]string | len:1, cap:1>: [ "Operator \"cluster-monitoring-operator\" produces more watch requests than expected: watchrequestcount=102, upperbound=96, ratio=1.0625", ] 5. https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-shiftstack-shiftstack-ci-main-periodic-4.11-e2e-openstack-ccm-install/1510305447666520064 : [sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel] { fail [github.com/openshift/origin/test/extended/apiserver/api_requests.go:448]: Expected <[]string | len:1, cap:1>: [ "Operator \"cluster-monitoring-operator\" produces more watch requests than expected: watchrequestcount=101, upperbound=96, ratio=1.0520833333333333", ] 6. https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-shiftstack-shiftstack-ci-main-periodic-4.11-e2e-openstack-ccm-install/1511392966680252416 : [sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel] { fail [github.com/openshift/origin/test/extended/apiserver/api_requests.go:448]: Expected <[]string | len:1, cap:1>: [ "Operator \"cluster-monitoring-operator\" produces more watch requests than expected: watchrequestcount=98, upperbound=96, ratio=1.0208333333333333", ] Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069 |