periodic-ci-openshift-release-master-nightly-4.9-upgrade-from-stable-4.8-e2e-aws-upgrade is failing frequently in CI, see:
https://testgrid.k8s.io/redhat-openshift-ocp-release-4.9-informing#periodic-ci-openshift-release-master-nightly-4.9-upgrade-from-stable-4.8-e2e-aws-upgrade

Example job failure:
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.9-upgrade-from-stable-4.8-e2e-aws-upgrade/1463913643795025920
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.9-upgrade-from-stable-4.8-e2e-aws-upgrade/1463768020231917568

"[sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel]" is failing

Log snippet:

started: (3/2746/2751) "[sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel]"
started: (3/2747/2751) "[sig-arch][Late] clients should not use APIs that are removed in upcoming releases [Suite:openshift/conformance/parallel]"
started: (3/2748/2751) "[sig-etcd] etcd leader changes are not excessive [Late] [Suite:openshift/conformance/parallel]"
started: (3/2749/2751) "[sig-api-machinery][Feature:APIServer][Late] API LBs follow /readyz of kube-apiserver and don't send request early [Suite:openshift/conformance/parallel]"
started: (3/2750/2751) "[sig-instrumentation][Late] Alerts shouldn't exceed the 500 series limit of total series sent via telemetry from each cluster [Skipped:Disconnected] [Suite:openshift/conformance/parallel]"
started: (3/2751/2751) "[sig-storage][Late] Metrics should report short mount times [Skipped:Disconnected] [Suite:openshift/conformance/parallel]"
passed: (400ms) 2021-11-25T19:36:01 "[sig-api-machinery][Feature:APIServer][Late] kube-apiserver terminates within graceful termination period [Suite:openshift/conformance/parallel]"
passed: (500ms) 2021-11-25T19:36:01 "[sig-api-machinery][Feature:APIServer][Late] API LBs follow /readyz of kube-apiserver and don't send request early [Suite:openshift/conformance/parallel]"
passed: (500ms) 2021-11-25T19:36:01 "[sig-api-machinery][Feature:APIServer][Late] API LBs follow /readyz of kube-apiserver and stop sending requests [Suite:openshift/conformance/parallel]"
passed: (500ms) 2021-11-25T19:36:01 "[sig-api-machinery][Feature:APIServer][Late] kubelet terminates kube-apiserver gracefully [Suite:openshift/conformance/parallel]"
passed: (500ms) 2021-11-25T19:36:01 "[sig-etcd] etcd leader changes are not excessive [Late] [Suite:openshift/conformance/parallel]"

[BeforeEach] [Top Level]
  github.com/openshift/origin/test/extended/util/framework.go:1453
[BeforeEach] [Top Level]
  github.com/openshift/origin/test/extended/util/framework.go:1453
[BeforeEach] [Top Level]
  github.com/openshift/origin/test/extended/util/test.go:61
[BeforeEach] [sig-arch][Late]
  github.com/openshift/origin/test/extended/util/client.go:142
STEP: Creating a kubernetes client
[It] operators should not create watch channels very often [Suite:openshift/conformance/parallel]
  github.com/openshift/origin/test/extended/apiserver/api_requests.go:93
Nov 25 19:36:02.395: INFO: operator=ingress-operator, watchrequestcount=396, upperbound=742, ratio=0.5336927223719676
Nov 25 19:36:02.395: INFO: operator=authentication-operator, watchrequestcount=373, upperbound=616, ratio=0.6055194805194806
Nov 25 19:36:02.395: INFO: operator=kube-apiserver-operator, watchrequestcount=321, upperbound=520, ratio=0.6173076923076923
Nov 25 19:36:02.395: INFO: operator=openshift-apiserver-operator, watchrequestcount=310, upperbound=452, ratio=0.6858407079646017
Nov 25 19:36:02.395: INFO: operator=cluster-storage-operator, watchrequestcount=293, upperbound=310, ratio=0.9451612903225807
Nov 25 19:36:02.395: INFO: operator=openshift-kube-scheduler-operator, watchrequestcount=192, upperbound=358, ratio=0.5363128491620112
Nov 25 19:36:02.395: INFO: operator=kube-controller-manager-operator, watchrequestcount=191, upperbound=290, ratio=0.6586206896551724
Nov 25 19:36:02.395: INFO: operator=etcd-operator, watchrequestcount=182, upperbound=250, ratio=0.728
Nov 25 19:36:02.395: INFO: operator=openshift-controller-manager-operator, watchrequestcount=178, upperbound=298, ratio=0.5973154362416108
Nov 25 19:36:02.395: INFO: operator=prometheus-operator, watchrequestcount=147, upperbound=180, ratio=0.8166666666666667
Nov 25 19:36:02.395: INFO: operator=console-operator, watchrequestcount=139, upperbound=292, ratio=0.476027397260274
Nov 25 19:36:02.395: INFO: operator=aws-ebs-csi-driver-operator, watchrequestcount=100, upperbound=216, ratio=0.46296296296296297
Nov 25 19:36:02.395: INFO: operator=cluster-image-registry-operator, watchrequestcount=91, upperbound=238, ratio=0.38235294117647056
Nov 25 19:36:02.395: INFO: operator=service-ca-operator, watchrequestcount=88, upperbound=214, ratio=0.411214953271028
Nov 25 19:36:02.395: INFO: operator=cluster-monitoring-operator, watchrequestcount=72, upperbound=66, ratio=1.0909090909090908
Nov 25 19:36:02.395: INFO: Operator cluster-monitoring-operator produces more watch requests than expected
Nov 25 19:36:02.395: INFO: operator=openshift-config-operator, watchrequestcount=64, upperbound=94, ratio=0.6808510638297872
Nov 25 19:36:02.395: INFO: operator=machine-api-operator, watchrequestcount=64, upperbound=96, ratio=0.6666666666666666
Nov 25 19:36:02.395: INFO: operator=csi-snapshot-controller-operator, watchrequestcount=57, upperbound=104, ratio=0.5480769230769231
Nov 25 19:36:02.395: INFO: operator=cloud-credential-operator, watchrequestcount=54, upperbound=138, ratio=0.391304347826087
Nov 25 19:36:02.395: INFO: operator=dns-operator, watchrequestcount=54, upperbound=118, ratio=0.4576271186440678
Nov 25 19:36:02.395: INFO: operator=cluster-autoscaler-operator, watchrequestcount=35, upperbound=88, ratio=0.3977272727272727
Nov 25 19:36:02.395: INFO: operator=cluster-node-tuning-operator, watchrequestcount=32, upperbound=78, ratio=0.41025641025641024
Nov 25 19:36:02.395: INFO: operator=kube-storage-version-migrator-operator, watchrequestcount=31, upperbound=116, ratio=0.2672413793103448
Nov 25 19:36:02.395: INFO: operator=cluster-samples-operator, watchrequestcount=30, upperbound=46, ratio=0.6521739130434783
Nov 25 19:36:02.395: INFO: operator=cluster-baremetal-operator, watchrequestcount=30, upperbound=62, ratio=0.4838709677419355
Nov 25 19:36:02.395: INFO: operator=marketplace-operator, watchrequestcount=16, upperbound=30, ratio=0.5333333333333333
[AfterEach] [sig-arch][Late]
  github.com/openshift/origin/test/extended/util/client.go:140
[AfterEach] [sig-arch][Late]
  github.com/openshift/origin/test/extended/util/client.go:141
fail [github.com/openshift/origin/test/extended/apiserver/api_requests.go:437]: Expected <bool>: true not to be true

failed: (1.4s) 2021-11-25T19:36:02 "[sig-arch][Late] operators should not create watch channels very often [Suite:openshift/conformance/parallel]"

The key log entry to note is:
"Nov 25 19:36:02.395: INFO: Operator cluster-monitoring-operator produces more watch requests than expected"
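
For triaging other runs of this job, a minimal Python sketch like the following can pull the offending operators out of a build log. It assumes the INFO line format shown in the snippet above, and it assumes an operator is flagged when watchrequestcount is greater than upperbound (consistent with the ratio above 1 here, but the exact comparison in api_requests.go is not shown in this report). The function name operators_over_bound is made up for illustration.

```python
import re

# Matches the per-operator INFO lines from the log snippet in this report.
LINE_RE = re.compile(
    r"operator=(?P<operator>[\w-]+), "
    r"watchrequestcount=(?P<count>\d+), "
    r"upperbound=(?P<bound>\d+)"
)


def operators_over_bound(log_text):
    """Return (operator, count, bound, ratio) for each operator over its bound.

    Assumption: "over" means count > bound; the real test may compare differently.
    """
    over = []
    for match in LINE_RE.finditer(log_text):
        count = int(match.group("count"))
        bound = int(match.group("bound"))
        if count > bound:
            over.append((match.group("operator"), count, bound, count / bound))
    return over


# Two lines copied from the log snippet in this report.
sample = (
    "Nov 25 19:36:02.395: INFO: operator=service-ca-operator, "
    "watchrequestcount=88, upperbound=214, ratio=0.411214953271028\n"
    "Nov 25 19:36:02.395: INFO: operator=cluster-monitoring-operator, "
    "watchrequestcount=72, upperbound=66, ratio=1.0909090909090908\n"
)

for name, count, bound, ratio in operators_over_bound(sample):
    print(f"{name}: {count} > {bound} (ratio {ratio:.3f})")
```

On the two sample lines, only cluster-monitoring-operator is reported, matching the "produces more watch requests than expected" entry.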
Looks like it's due to https://github.com/openshift/cluster-monitoring-operator/pull/1472, which added a few more watch requests. The watch limits were increased on master in the PRs below; backporting them to 4.9 should fix this:
https://github.com/openshift/origin/pull/26583
https://github.com/openshift/origin/pull/26601
Related bug: https://bugzilla.redhat.com/show_bug.cgi?id=2018222
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.9.18 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:0279