Bug 1893201 - e2e-operator flakes with "TestMetricsAccessible: prometheus returned unexpected results: timed out waiting for the condition"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-scheduler
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.6.0
Assignee: Mike Dame
QA Contact: RamaKasturi
URL:
Whiteboard:
Depends On:
Blocks: 1893202
 
Reported: 2020-10-30 14:31 UTC by Mike Dame
Modified: 2020-11-09 15:51 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1893202
Environment:
Last Closed: 2020-11-09 15:50:59 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:4339 0 None None None 2020-11-09 15:51:11 UTC

Description Mike Dame 2020-10-30 14:31:45 UTC
The e2e-operator test frequently fails with a Prometheus timeout:

 --- FAIL: TestMetricsAccessible (30.57s)
    scheduler_test.go:317: prometheus returned unexpected results: timed out waiting for the condition
FAIL
FAIL	github.com/openshift/cluster-kube-scheduler-operator/test/e2e	35.059s
FAIL

Example: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-kube-scheduler-operator/290/pull-ci-openshift-cluster-kube-scheduler-operator-release-4.5-e2e-aws-operator/1322155133357789184

We have tried extending the timeout for this test (https://github.com/openshift/cluster-kube-scheduler-operator/pull/299). The test seems to fail for many runs before eventually passing.

Comment 1 Mike Dame 2020-10-30 14:33:45 UTC
Going to try backporting the change from #299 to release-4.5, where we're currently seeing the problem.

For QE, this bug should be verifiable by confirming that this test hasn't had any recent failures in CI.

Comment 4 RamaKasturi 2020-11-03 07:46:57 UTC
Will wait a couple more days to check the results and then move the bug to the verified state.

Comment 5 RamaKasturi 2020-11-05 06:56:46 UTC
Did not see any errors for the last three days, so moving the bug to the verified state.

Verified in the link here: https://search.ci.openshift.org/?search=scheduler_test.go%3A317%3A+prometheus+returned+unexpected+result&maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job

Comment 7 errata-xmlrpc 2020-11-09 15:50:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.3 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4339

