| Summary: | Prometheus fails to insert reporting metrics when the sample limit is met | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Simon Pasquier <spasquie> |
| Component: | Monitoring | Assignee: | Arunprasad Rajkumar <arajkuma> |
| Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao> |
| Severity: | medium | Docs Contact: | Brian Burt <bburt> |
| Priority: | medium | | |
| Version: | 4.10 | CC: | amuller, anpicker, aos-bugs, bburt, erooth |
| Target Milestone: | --- | | |
| Target Release: | 4.10.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Previously, if reporting metrics failed to be ingested because the configured sample limit had been reached, the target would still appear with a status of `Up` in the web console even though its metrics were missing. With this release, Prometheus bypasses the sample limit for reporting metrics, and these metrics now appear regardless of the sample limit setting (see the sketch after this table). | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-03-10 16:35:23 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
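For context on the doc text above, the "reporting metrics" are the synthetic series that Prometheus appends after every scrape: `up`, `scrape_duration_seconds`, `scrape_samples_scraped`, `scrape_samples_post_metric_relabeling`, and `scrape_series_added`. A minimal sketch of checking that all of them survive a sample-limit breach, reusing the `ns1` namespace and the `$token` bearer token that appear in the verification commands below (neither is prescribed here):

```
# Assumes $token holds a bearer token with cluster-monitoring-view access
# and that the ns1 example workload from the verification is deployed.
# Each of Prometheus's per-scrape reporting metrics should return a result
# for the ns1 target even when its sample limit is exceeded.
for m in up scrape_duration_seconds scrape_samples_scraped \
         scrape_samples_post_metric_relabeling scrape_series_added; do
  echo "== ${m} =="
  oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- \
    curl -ksG -H "Authorization: Bearer $token" \
      --data-urlencode "query=${m}{namespace=\"ns1\"}" \
      'https://thanos-querier.openshift-monitoring.svc:9091/api/v1/query' | jq '.data.result'
done
```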
Description
Simon Pasquier 2021-12-20 11:33:31 UTC
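The reproduction steps referenced later as "comment 0" amount to deploying a sample application under user-workload monitoring with a ServiceMonitor whose sample limit is lower than the number of series the application exposes. A minimal sketch, assuming the `ns1` namespace, `prometheus-example-app` deployment, and `prometheus-example-monitor` ServiceMonitor names taken from the verification output below:

```
# Assumes user-workload monitoring is already enabled and a
# prometheus-example-app Deployment/Service is running in ns1.
oc apply -f - <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-example-monitor
  namespace: ns1
spec:
  endpoints:
  - interval: 30s
    port: web
    scheme: http
  # A limit of 1 is lower than the number of series the example app
  # exposes, so every scrape exceeds it.
  sampleLimit: 1
  selector:
    matchLabels:
      app: prometheus-example-app
EOF
```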
Tested with 4.10.0-0.nightly-2021-12-21-130047, following the steps in comment 0; the `up` metric is now visible.

```
# oc -n openshift-monitoring logs -c prometheus prometheus-k8s-0
ts=2021-12-21T23:54:33.692Z caller=main.go:532 level=info msg="Starting Prometheus" version="(version=2.32.1, branch=rhaos-4.10-rhel-8, revision=2003b6cb83d933ad154a6dcd6bc6b497488b8501)"
```

The generated scrape configuration for the user-workload Prometheus carries the sample limit:

```
# oc -n openshift-user-workload-monitoring exec -c prometheus prometheus-user-workload-0 -- cat /etc/prometheus/config_out/prometheus.env.yaml
scrape_configs:
- job_name: serviceMonitor/ns1/prometheus-example-monitor/0
  ...
  - source_labels:
    - __tmp_hash
    regex: 0
    action: keep
  sample_limit: 1
  metric_relabel_configs:
  - target_label: namespace
    replacement: ns1
```

Querying `up` for the `ns1` namespace through the Thanos querier returns the reporting metric even though the scrape exceeds the sample limit:

```
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-querier.openshift-monitoring.svc:9091/api/v1/query?query=up%7Bnamespace%3D%22ns1%22%7D' | jq
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "up",
          "endpoint": "web",
          "instance": "10.131.0.152:8080",
          "job": "prometheus-example-app",
          "namespace": "ns1",
          "pod": "prometheus-example-app-8659789999-nwh2k",
          "prometheus": "openshift-user-workload-monitoring/user-workload",
          "service": "prometheus-example-app"
        },
        "value": [
          1640145070.945,
          "1"
        ]
      }
    ]
  }
}
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056
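For completeness, one way to confirm that the scrape was genuinely exceeding the limit (rather than the target simply being healthy) is to watch Prometheus's own counter for dropped scrapes. A hedged sketch, reusing the `$token` from the commands above:

```
# prometheus_target_scrapes_exceeded_sample_limit_total increments each time
# a scrape is rejected because sample_limit was exceeded; a rising value for
# the user-workload Prometheus confirms the limit is actually being hit.
oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- \
  curl -ksG -H "Authorization: Bearer $token" \
    --data-urlencode 'query=prometheus_target_scrapes_exceeded_sample_limit_total' \
    'https://thanos-querier.openshift-monitoring.svc:9091/api/v1/query' | jq '.data.result'
```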