Bug 1757807 - [GCP] [flake] [Feature:Prometheus][Conformance] Prometheus when installed on the cluster should provide ingress metrics [Suite:openshift/conformance/parallel/minimal]
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Routing
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.2.z
Assignee: Dan Mace
QA Contact: Hongan Li
Depends On: 1755936
Reported: 2019-10-02 13:30 UTC by Dan Mace
Modified: 2019-11-19 13:49 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1755936
Last Closed: 2019-11-19 13:49:01 UTC
Target Upstream Version:

System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift origin pull 23901 | 0 | 'None' | closed | Bug 1757807: e2e: stabilize ingress metrics tests | 2020-02-17 03:02:57 UTC
Red Hat Product Errata | RHBA-2019:3869 | 0 | None | None | None | 2019-11-19 13:49:14 UTC

Description Dan Mace 2019-10-02 13:30:00 UTC
+++ This bug was initially created as a clone of Bug #1755936 +++

The "[Feature:Prometheus][Conformance] Prometheus when installed on the cluster should provide ingress metrics [Suite:openshift/conformance/parallel/minimal]" test is the most frequent flake on GCP tests.  It rarely fails twice in a run, but it has failed numerous times.

One example job here: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/canary-openshift-ocp-installer-e2e-gcp-4.2/404

Test grid here: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.2-informing#canary-openshift-ocp-installer-e2e-gcp-4.2

Based on the pattern, I wonder if we're seeing an ingress problem.  Please reach out if you find strong evidence that cluster ingress is failing.  The OAuth server failure looks that way.

--- Additional comment from Frederic Branczyk on 2019-09-26 13:37:10 UTC ---

I checked the Prometheus dump and it seems these two metrics are the ones that are not found: https://github.com/openshift/origin/blob/4b9f648354a2dcb2832e3765caa571028f99ce00/test/extended/prometheus/prometheus.go#L286-L287

We didn't write these tests (and I'd personally prefer that they live in the component's test suite rather than this one, since, as this example shows, the ownership is unclear). Moving to routing component.

Comment 1 Dan Mace 2019-10-23 14:26:16 UTC
CI search and test grid data don't seem to indicate this is happening often enough to warrant our attention. If it recurs, we can open another bug.

Comment 5 errata-xmlrpc 2019-11-19 13:49:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

