Bug 1889710
| Summary: | Prometheus metrics on disk take more space compared to OCP 4.5 | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Simon Pasquier <spasquie> | ||||||
| Component: | Monitoring | Assignee: | Simon Pasquier <spasquie> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao> | ||||||
| Severity: | medium | Docs Contact: | |||||||
| Priority: | unspecified | ||||||||
| Version: | 4.6 | CC: | alegrand, anpicker, erooth, juzhao, kakkoyun, lcosic, pkrupa, surbania, wking | ||||||
| Target Milestone: | --- | ||||||||
| Target Release: | 4.7.0 | ||||||||
| Hardware: | Unspecified | ||||||||
| OS: | Unspecified | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | No Doc Update | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | |||||||||
| : | 1889711 (view as bug list) | Environment: | |||||||
| Last Closed: | 2021-02-24 15:26:59 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Bug Depends On: | |||||||||
| Bug Blocks: | 1889711 | ||||||||
Description
Simon Pasquier
2020-10-20 12:44:52 UTC
Fixed in https://github.com/openshift/prometheus/pull/61 by bumping Prometheus to v2.22.0.
Tested with 4.7.0-0.nightly-2020-10-22-141237 in a upi-on-azure cluster; the Prometheus version is "2.22.0":
Element Value
prometheus_build_info{branch="rhaos-4.7-rhel-8",container="prometheus-proxy",endpoint="web",goversion="go1.15.0",instance="10.128.2.15:9091",job="prometheus-k8s",namespace="openshift-monitoring",pod="prometheus-k8s-1",revision="7014907b651c19701e46e21f622c2b113cae6cac",service="prometheus-k8s",version="2.22.0"} 1
prometheus_build_info{branch="rhaos-4.7-rhel-8",container="prometheus-proxy",endpoint="web",goversion="go1.15.0",instance="10.129.2.12:9091",job="prometheus-k8s",namespace="openshift-monitoring",pod="prometheus-k8s-0",revision="7014907b651c19701e46e21f622c2b113cae6cac",service="prometheus-k8s",version="2.22.0"} 1
Run Prometheus for at least 4 hours and measure the sample compression with:
rate(prometheus_tsdb_compaction_chunk_size_bytes_sum[8h]) / rate(prometheus_tsdb_compaction_chunk_samples_sum[8h])
The result is greater than 2 bytes per sample, not between 1 and 2 bytes:
Element Value
{container="prometheus-proxy",endpoint="web",instance="10.128.2.15:9091",job="prometheus-k8s",namespace="openshift-monitoring",pod="prometheus-k8s-1",service="prometheus-k8s"} 2.014416691936647
{container="prometheus-proxy",endpoint="web",instance="10.129.2.12:9091",job="prometheus-k8s",namespace="openshift-monitoring",pod="prometheus-k8s-0",service="prometheus-k8s"} 2.1713813708344802
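For reference, a minimal Python sketch of what the query above measures: the growth of the compacted chunk bytes divided by the growth of the compacted sample count over the same window. It assumes two raw readings of each counter and ignores counter resets and the extrapolation that `rate()` applies, so it only approximates the PromQL result. The function names and the 1 to 2 byte bounds as defaults are illustrative, not part of Prometheus.

```python
def bytes_per_sample(size_start, size_end, samples_start, samples_end):
    """Average bytes per compacted sample between two counter readings.

    size_*    : readings of prometheus_tsdb_compaction_chunk_size_bytes_sum
    samples_* : readings of prometheus_tsdb_compaction_chunk_samples_sum
    Assumes neither counter was reset between the two readings.
    """
    delta_samples = samples_end - samples_start
    if delta_samples <= 0:
        # No compaction happened in the window, so the ratio is undefined.
        raise ValueError("no samples were compacted in the window")
    return (size_end - size_start) / delta_samples


def within_expected(ratio, low=1.0, high=2.0):
    """Check a ratio against the 1 to 2 bytes per sample range cited in this bug."""
    return low <= ratio <= high


if __name__ == "__main__":
    # The two ratios reported in this comment fall outside the range.
    for pod, ratio in [("prometheus-k8s-1", 2.014416691936647),
                       ("prometheus-k8s-0", 2.1713813708344802)]:
        print(pod, ratio, "ok" if within_expected(ratio) else "too high")
```

Because few compactions have run shortly after startup, a ratio computed over a short window may not be representative.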
How long have you waited before running the query? I guess that the results can be skewed if not enough compactions have happened. I'll paste a screenshot for 4.6 which shows that the second compaction has better compression than the first one.
Created attachment 1723776 [details]
compression ratio on 4.6
(In reply to Simon Pasquier from comment #4)
> How long have you waited before running the query? I guess that the results
> can be skewed if not enough compactions have happened. I'll paste a
> screenshot for 4.6 which show that the second compaction has a better
> compression than the first one.
Waited for about 5 to 6 hours; will wait longer and monitor again.
With 4.7.0-0.nightly-2020-10-27-051128, let the cluster run for 8 hours, then ran "rate(prometheus_tsdb_compaction_chunk_size_bytes_sum[8h]) / rate(prometheus_tsdb_compaction_chunk_samples_sum[8h])". The result is between 1 and 2 bytes; see the attached picture.
Created attachment 1724998 [details]
4.7 compression ratio
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2020:5633