Bug 1674270 - PVs for Prometheus pods show different usage
Summary: PVs for Prometheus pods show different usage
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Frederic Branczyk
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-02-10 16:30 UTC by hgomes
Modified: 2019-10-22 07:55 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-02-27 14:59:02 UTC
Target Upstream Version:
Embargoed:
kgeorgie: needinfo-



Comment 1 minden 2019-02-12 12:46:51 UTC
That indeed looks like an inconsistency. Off the top of my head I could see two scenarios causing this:

1. One of the Prometheus instances was down for a given time.

2. One Prometheus compacts differently than the other.

Given that Prometheus scrapes itself, we could verify the former by looking at the `scrape_samples_scraped` metric. Would you mind querying this metric in the Prometheus UI and looking for anomalies over a long time range?
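
For reference, a minimal sketch of what such a check could look like against the Prometheus HTTP API (the URL, job label, time range and step below are assumptions; adjust them to your environment, e.g. after port-forwarding to each Prometheus pod):

import time
import requests

PROM_URL = "http://localhost:9090"  # assumed: e.g. a local port-forward to the Prometheus pod
end = time.time()
start = end - 7 * 24 * 3600  # look back one week
resp = requests.get(
    f"{PROM_URL}/api/v1/query_range",
    params={
        "query": 'scrape_samples_scraped{job="prometheus-k8s"}',  # job label is an assumption
        "start": start,
        "end": end,
        "step": "300",  # 5-minute resolution
    },
    timeout=30,
)
resp.raise_for_status()
for series in resp.json()["data"]["result"]:
    instance = series["metric"].get("instance", "<unknown>")
    # A replica that was healthy for the whole window should return roughly
    # (end - start) / step datapoints; a large shortfall means it was down.
    print(f'{instance}: {len(series["values"])} datapoints')

If one replica returns noticeably fewer datapoints than the other over the same window, it was most likely down for part of that time.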

I am also adding Krasi here. He is our Prometheus time series database expert. Krasi: Have you seen similar reports?

Comment 2 Krasi 2019-02-12 12:56:33 UTC
Nope, I haven't had such reports so far.

I would say there must be some difference somewhere: retention policy, compaction, recording rules, etc.

Compare the /config, /rules, /targets, and /flags HTTP endpoints and let us know if you see any difference there.
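
A minimal sketch of such a comparison, using the API equivalents of those pages and assuming both replicas are port-forwarded locally (replica names and ports below are placeholders):

import difflib
import json
import requests

# Assumed local port-forwards to the two Prometheus replicas; names/ports are placeholders.
REPLICAS = {
    "prometheus-k8s-0": "http://localhost:9090",
    "prometheus-k8s-1": "http://localhost:9091",
}
ENDPOINTS = [
    "/api/v1/status/config",  # API equivalent of the /config page
    "/api/v1/status/flags",   # API equivalent of the /flags page
    "/api/v1/rules",          # API equivalent of the /rules page
    "/api/v1/targets",        # API equivalent of the /targets page
]

def fetch(base, path):
    resp = requests.get(base + path, timeout=30)
    resp.raise_for_status()
    # Normalise to sorted, pretty-printed JSON so the diff is stable.
    return json.dumps(resp.json()["data"], indent=2, sort_keys=True).splitlines()

(name_a, base_a), (name_b, base_b) = REPLICAS.items()
for path in ENDPOINTS:
    diff = list(difflib.unified_diff(
        fetch(base_a, path),
        fetch(base_b, path),
        fromfile=name_a + path,
        tofile=name_b + path,
        lineterm="",
    ))
    print(f"{path}: {'identical' if not diff else 'DIFFERS'}")
    for line in diff[:40]:  # show at most the first 40 differing lines
        print(line)

Note that the /targets output contains per-scrape timestamps, so small differences there are expected; persistent differences in /config, /flags or /rules are the interesting ones.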

