Bug 1855325
Summary: | [Feature:Prometheus][Conformance] Prometheus when installed on the cluster [Top Level] [Feature:Prometheus][Conformance] Prometheus when installed on the cluster should report telemetry if a cloud.openshift.com token is present | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Varsha <vnarsing> |
Component: | Monitoring | Assignee: | Pawel Krupa <pkrupa> |
Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao> |
Severity: | low | Docs Contact: | |
Priority: | low | ||
Version: | 4.4 | CC: | alegrand, anpicker, bparees, erooth, kakkoyun, lcosic, pkrupa, spasquie, surbania |
Target Milestone: | --- | Keywords: | Reopened |
Target Release: | 4.7.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | No Doc Update | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | [Feature:Prometheus][Conformance] Prometheus when installed on the cluster [Top Level] [Feature:Prometheus][Conformance] Prometheus when installed on the cluster should report telemetry if a cloud.openshift.com token is present |
Last Closed: | 2021-02-24 15:13:57 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
Description
Varsha
2020-07-09 14:55:29 UTC
I suspect the issue is bad timing between when the test is executed and when telemetry metrics become available from Prometheus. The telemeter-client logs [1] show that it didn't retrieve any metrics at 11:40:29.8762. This is consistent with the Prometheus logs [2][3], which show that both Prometheus instances were starting around that time. The Prometheus dump also shows that the telemeter client sent samples to the telemetry backend after the 11:45:18 mark, while the test reported the failure at 11:45:03.430. Given that the telemeter client sends data every 4min30s and the test that checks whether telemetry data has been sent does 5 retries at an interval of 10 seconds, this would explain the failure: the retries can run out before the next send happens.

The issue is probably rare and less visible in 4.6, since failing tests are retried to eliminate flakes. We should still look into making the test more predictable.

[1] https://storage.googleapis.com/origin-ci-test/logs/release-openshift-ocp-installer-e2e-vsphere-upi-4.4/1281185853179170816/artifacts/e2e-vsphere-upi/pods/openshift-monitoring_telemeter-client-66dbfd95b7-zgv7k_telemeter-client.log
[2] https://storage.googleapis.com/origin-ci-test/logs/release-openshift-ocp-installer-e2e-vsphere-upi-4.4/1281185853179170816/artifacts/e2e-vsphere-upi/pods/openshift-monitoring_prometheus-k8s-0_prometheus.log
[3] https://storage.googleapis.com/origin-ci-test/logs/release-openshift-ocp-installer-e2e-vsphere-upi-4.4/1281185853179170816/artifacts/e2e-vsphere-upi/pods/openshift-monitoring_prometheus-k8s-1_prometheus.log

This was a tollbooth issue which has been resolved. There are only a couple of recent flaky failures; most of the failures are over a day old (from before the issue was addressed) and will slowly fall off the test history.
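The timing argument in the description (a 4min30s send interval against 5 retries at 10-second spacing) can be checked with a short calculation. The sketch below uses only the figures and timestamps quoted in this report; the function names are hypothetical, and it assumes the test waits one interval per attempt, so the 50-second window is an upper bound rather than a measured value.

```python
import math
from datetime import datetime

# Figures quoted in the report.
SEND_INTERVAL_S = 4 * 60 + 30   # telemeter client sends every 4min30s
RETRIES = 5                     # attempts made by the test
RETRY_INTERVAL_S = 10           # seconds between attempts


def retry_window_s(retries: int, interval_s: int) -> int:
    """Upper bound on how long the test keeps polling (one wait per attempt)."""
    return retries * interval_s


def retries_to_cover(send_interval_s: int, interval_s: int) -> int:
    """Smallest number of attempts whose window spans one full send interval."""
    return math.ceil(send_interval_s / interval_s)


window = retry_window_s(RETRIES, RETRY_INTERVAL_S)
needed = retries_to_cover(SEND_INTERVAL_S, RETRY_INTERVAL_S)

# Timestamps from the failing run: when the failure was reported, and the
# mark after which the telemeter client is known to have sent samples.
failed_at = datetime.strptime("11:45:03.430", "%H:%M:%S.%f")
sent_after = datetime.strptime("11:45:18", "%H:%M:%S")
gap_s = (sent_after - failed_at).total_seconds()

print(f"polling window: {window}s of a {SEND_INTERVAL_S}s send interval")
print(f"attempts needed at {RETRY_INTERVAL_S}s spacing: {needed}")
print(f"test gave up at least {gap_s:.2f}s before the first send")
```

Under these assumptions the test polls for less than a fifth of one send interval, so whether it passes depends on where in the interval it happens to start. Extending the polling window to cover a full interval is one way to make the test more predictable.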
Checked the 4.7 CI results; there are no failures for this test case:
https://search.ci.openshift.org/?search=%5C%5BFeature%3APrometheus%5C%5D%5C%5BConformance%5C%5D+Prometheus+when+installed+on+the+cluster+%5C%5BTop+Level%5C%5D+%5C%5BFeature%3APrometheus%5C%5D%5C%5BConformance%5C%5D+Prometheus+when+installed+on+the+cluster+should+report+telemetry+if+a+cloud%5C.openshift%5C.com+token+is+present&maxAge=36h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job
Closing this bug; feel free to reopen it if the failure happens again.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2020:5633