periodic-ci-openshift-release-master-ci-4.11-e2e-azure-techpreview-serial is failing frequently in CI, see:
https://testgrid.k8s.io/redhat-openshift-ocp-release-4.11-informing#periodic-ci-openshift-release-master-ci-4.11-e2e-azure-techpreview-serial

Many techpreview 4.11 jobs have started to fail on "[sig-instrumentation][Late] Alerts shouldn't exceed the 500 series limit of total series sent via telemetry from each cluster [Skipped:Disconnected] [Suite:openshift/conformance/parallel]". For example:
https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-capi-operator/46/pull-ci-openshift-cluster-capi-operator-main-e2e-aws-capi-techpreview/1511259398029185024

Snippet from the job log:

{  fail [github.com/openshift/origin/test/extended/prometheus/prometheus.go:497]: Unexpected error:
    <errors.aggregate | len:1, cap:1>: [
        {
            s: "promQL query returned unexpected results:\navg_over_time(cluster:telemetry_selected_series:count[34m51s]) >= 600\n[\n {\n \"metric\": {\n \"prometheus\": \"openshift-monitoring/k8s\"\n },\n \"value\": [\n 1649152361.962,\n \"606.5857142857142\"\n ]\n }\n]",
        },
    ]
    promQL query returned unexpected results:
    avg_over_time(cluster:telemetry_selected_series:count[34m51s]) >= 600
    [
      {
        "metric": {
          "prometheus": "openshift-monitoring/k8s"
        },
        "value": [
          1649152361.962,
          "606.5857142857142"
        ]
      }
    ]
occurred}

The test threshold is hardcoded at 600 series. We should revise the value, given that more telemetry metrics are being added over time.
https://testgrid.k8s.io/redhat-openshift-ocp-release-4.11-informing#periodic-ci-openshift-release-master-ci-4.11-e2e-azure-techpreview-serial

The threshold on the number of telemetry series has been changed to 650, and the test is now:
openshift-tests.[sig-instrumentation][Late] Alerts shouldn't exceed the 650 series limit of total series sent via telemetry from each cluster [Suite:openshift/conformance/parallel]

No failures of the series-limit test have been observed so far.
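
To make the effect of the threshold change concrete, here is a minimal Python sketch of the check as it appears in the job log. It is not the Go code from openshift/origin; the function names (format_duration, build_series_limit_query, exceeds_limit) are hypothetical and exist only for illustration. The metric name, the 34m51s window, the observed average of 606.5857142857142, and the 600 and 650 limits are taken from this report.

```python
def format_duration(seconds: int) -> str:
    """Render whole seconds in the h/m/s form seen in the log, e.g. 2091 -> '34m51s'."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    if secs or not parts:
        parts.append(f"{secs}s")
    return "".join(parts)


def build_series_limit_query(window_seconds: int, limit: int) -> str:
    """Build a PromQL expression shaped like the one in the failing job log.

    The expression returns a sample only when the average number of
    telemetry series over the window is at or above the limit.
    """
    window = format_duration(window_seconds)
    return (
        f"avg_over_time(cluster:telemetry_selected_series:count[{window}])"
        f" >= {limit}"
    )


def exceeds_limit(avg_series: float, limit: int) -> bool:
    """Mirror the '>=' comparison: True means the check would report a failure."""
    return avg_series >= limit


if __name__ == "__main__":
    observed = 606.5857142857142  # value reported in the job log
    window = 34 * 60 + 51         # the 34m51s window from the log, in seconds
    for limit in (600, 650):
        print(build_series_limit_query(window, limit))
        print("  fails" if exceeds_limit(observed, limit) else "  passes")
```

With the observed average from the log, the comparison trips at a limit of 600 but not at 650, which is consistent with the test no longer failing after the change.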
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069