Bug 1867230

Summary: [sig-instrumentation] Prometheus when installed on the cluster should start and expose a secured proxy and unsecured metrics
Product: OpenShift Container Platform
Reporter: David Eads <deads>
Component: Monitoring
Assignee: Sergiusz Urbaniak <surbania>
Status: CLOSED DUPLICATE
QA Contact: Junqi Zhao <juzhao>
Severity: high
Docs Contact:
Priority: high
Version: 4.6
CC: alegrand, anpicker, erooth, kakkoyun, lcosic, mloibl, pkrupa, surbania
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment: [sig-instrumentation] Prometheus when installed on the cluster should start and expose a secured proxy and unsecured metrics
Last Closed: 2020-08-10 05:50:55 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description David Eads 2020-08-07 18:25:03 UTC
test:
[sig-instrumentation] Prometheus when installed on the cluster should start and expose a secured proxy and unsecured metrics 

is failing frequently in CI, see search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=%5C%5Bsig-instrumentation%5C%5D+Prometheus+when+installed+on+the+cluster+should+start+and+expose+a+secured+proxy+and+unsecured+metrics
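
For anyone who wants to rerun this search for a different test name, the link above can be rebuilt from its query parameters. The following is a minimal sketch, not an official client: the parameter names and values are copied from the URL above, while `build_search_url` and `escape_brackets` are hypothetical helper names introduced only for illustration.

```python
from urllib.parse import urlencode

# Host taken from the search link in this report.
SEARCH_BASE = "https://search.ci.openshift.org/"


def escape_brackets(test_name: str) -> str:
    """Backslash-escape square brackets so the search treats them literally."""
    return test_name.replace("[", r"\[").replace("]", r"\]")


def build_search_url(test_name: str, max_age: str = "168h") -> str:
    """Hypothetical helper: rebuild the CI search link for one test name."""
    # Parameter names and defaults mirror the query string in the report.
    params = {
        "maxAge": max_age,
        "context": "1",
        "type": "bug+junit",
        "name": "",
        "maxMatches": "5",
        "maxBytes": "20971520",
        "groupBy": "job",
        "search": escape_brackets(test_name),
    }
    # urlencode applies form encoding: spaces become "+", "+" becomes "%2B".
    return SEARCH_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url(
        "[sig-instrumentation] Prometheus when installed on the cluster "
        "should start and expose a secured proxy and unsecured metrics"
    ))
```

Changing `max_age` (168h in the link above) narrows or widens the time window that is searched.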


specific job here: https://prow.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-gcp-4.6/1291710582575075328

Sippy reports this test as failing 10% of the time. It appears to fail even when no non-Prometheus tests are failing.

I often see it in concert with 
test: [sig-instrumentation] Prometheus when installed on the cluster should have important platform topology metrics [Suite:openshift/conformance/parallel] 




fail [github.com/openshift/origin/test/extended/prometheus/prometheus.go:253]: possibly some services didn't register ServiceMonitors to allow metrics collection
Unexpected error:
    <*errors.errorString | 0xc0002a08a0>: {
        s: "timed out waiting for the condition",
    }
    timed out waiting for the condition
occurred
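
The "timed out waiting for the condition" message is the generic error that the Kubernetes wait helpers return when a polled condition never becomes true before the deadline. It only says that the check kept failing, not why. The pattern can be sketched as below. This is an illustration in Python, not the code at prometheus.go:253 (the origin test is written in Go), and `poll_until` and `WaitTimeout` are hypothetical names.

```python
import time
from typing import Callable


class WaitTimeout(Exception):
    """Raised when the condition never became true before the deadline."""


def poll_until(
    condition: Callable[[], bool],
    interval: float,
    timeout: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Call condition() every `interval` seconds until it returns True.

    Raises WaitTimeout once `timeout` seconds have elapsed without success.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return
        if clock() >= deadline:
            # Same wording as the error in the failure output above.
            raise WaitTimeout("timed out waiting for the condition")
        sleep(interval)


if __name__ == "__main__":
    # A condition that never succeeds, standing in for a check such as
    # "all expected scrape targets are present".
    try:
        poll_until(lambda: False, interval=0.01, timeout=0.05)
    except WaitTimeout as err:
        print(err)
```

Because the timeout error carries no detail about the last failed check, the test wraps it with its own hint ("possibly some services didn't register ServiceMonitors").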

Comment 1 Sergiusz Urbaniak 2020-08-10 05:50:55 UTC
Thanks a lot for the catch; this is a great source for investigation. We have already identified the issue: prometheus-operator can miss reconciling underlying CRDs. Closing as a duplicate to reduce BZ noise.

*** This bug has been marked as a duplicate of bug 1856189 ***