Bug 1674270 - PVs for Prometheus pods show different usage
Summary: PVs for Prometheus pods show different usage
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Frederic Branczyk
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-02-10 16:30 UTC by hgomes
Modified: 2019-10-22 07:55 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-02-27 14:59:02 UTC
Target Upstream Version:
Embargoed:
kgeorgie: needinfo-



Comment 1 minden 2019-02-12 12:46:51 UTC
That indeed looks like an inconsistency. Off the top of my head I could see two scenarios causing this:

1. One of the Prometheus instances was down for a given time.

2. One Prometheus compacts differently than the other.

Given that Prometheus scrapes itself, we could verify the former by looking at the `scrape_samples_scraped` metric. Would you mind querying this metric in the Prometheus UI and looking for anomalies over a long time range?
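
For reference, a minimal sketch of what such a check could look like against the Prometheus HTTP API (the URL, job label, time range and step below are assumptions; adjust them to your environment, e.g. after port-forwarding to each Prometheus pod):

import time
import requests

PROM_URL = "http://localhost:9090"  # assumed: e.g. a local port-forward to the Prometheus pod
end = time.time()
start = end - 7 * 24 * 3600  # look back one week
resp = requests.get(
    f"{PROM_URL}/api/v1/query_range",
    params={
        "query": 'scrape_samples_scraped{job="prometheus-k8s"}',  # job label is an assumption
        "start": start,
        "end": end,
        "step": "300",  # 5-minute resolution
    },
    timeout=30,
)
resp.raise_for_status()
for series in resp.json()["data"]["result"]:
    instance = series["metric"].get("instance", "<unknown>")
    # A replica that was healthy for the whole window should return roughly
    # (end - start) / step datapoints; a large shortfall means it was down.
    print(f'{instance}: {len(series["values"])} datapoints')

If one replica returns noticeably fewer datapoints than the other over the same window, it was most likely down for part of that time.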

I am also adding Krasi here. He is our Prometheus time series database expert. Krasi: Have you seen similar reports?

Comment 2 Krasi 2019-02-12 12:56:33 UTC
Nope, I haven't had such reports so far.

I would say there must be some difference somewhere: retention policy, compaction, recording rules, etc.

Compare the /config, /rules, /targets, and /flags HTTP endpoints and let us know if you see any difference there.
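
A minimal sketch of such a comparison, using the API equivalents of those pages and assuming both replicas are port-forwarded locally (replica names and ports below are placeholders):

import difflib
import json
import requests

# Assumed local port-forwards to the two Prometheus replicas; names/ports are placeholders.
REPLICAS = {
    "prometheus-k8s-0": "http://localhost:9090",
    "prometheus-k8s-1": "http://localhost:9091",
}
ENDPOINTS = [
    "/api/v1/status/config",  # API equivalent of the /config page
    "/api/v1/status/flags",   # API equivalent of the /flags page
    "/api/v1/rules",          # API equivalent of the /rules page
    "/api/v1/targets",        # API equivalent of the /targets page
]

def fetch(base, path):
    resp = requests.get(base + path, timeout=30)
    resp.raise_for_status()
    # Normalise to sorted, pretty-printed JSON so the diff is stable.
    return json.dumps(resp.json()["data"], indent=2, sort_keys=True).splitlines()

(name_a, base_a), (name_b, base_b) = REPLICAS.items()
for path in ENDPOINTS:
    diff = list(difflib.unified_diff(
        fetch(base_a, path),
        fetch(base_b, path),
        fromfile=name_a + path,
        tofile=name_b + path,
        lineterm="",
    ))
    print(f"{path}: {'identical' if not diff else 'DIFFERS'}")
    for line in diff[:40]:  # show at most the first 40 differing lines
        print(line)

Note that the /targets output contains per-scrape timestamps, so small differences there are expected; persistent differences in /config, /flags or /rules are the interesting ones.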

