That indeed looks like an inconsistency. Off the top of my head I can see two scenarios that could cause this: 1. One of the Prometheus instances was down for some period of time. 2. One Prometheus compacts differently than the other. Given that Prometheus scrapes itself, we could check the former by looking at the `scrape_samples_scraped` metric. Would you mind querying this metric in the Prometheus UI on both instances and looking for gaps or other anomalies over a long time range? I am also adding Krasi here. He is our Prometheus time series database expert. Krasi: Have you seen similar reports?
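If eyeballing the graph over a long range is tedious, the gap check could also be scripted. This is only a rough sketch: it assumes you have already pulled `scrape_samples_scraped` for the self-scrape job through the `query_range` HTTP API and have the `values` list of `[timestamp, value]` pairs for one series. The `find_gaps` helper and its `tolerance` parameter are made up for illustration and are not part of Prometheus.

```python
def find_gaps(values, step_seconds, tolerance=1.5):
    """Return (last_seen, next_seen) timestamp pairs around missing samples.

    `values` is a list of [timestamp, value] pairs sorted by timestamp,
    the shape used for one series in a query_range response.
    `tolerance` is an arbitrary slack factor (an assumption) so that
    small jitter is not reported as a gap.
    """
    gaps = []
    for (prev_ts, _), (cur_ts, _) in zip(values, values[1:]):
        # A jump much larger than the query step means points are missing.
        if cur_ts - prev_ts > step_seconds * tolerance:
            gaps.append((prev_ts, cur_ts))
    return gaps


if __name__ == "__main__":
    # Hypothetical data: one sample per 60s, with a hole after t=120.
    sample = [[0, "5"], [60, "5"], [120, "5"], [600, "5"], [660, "5"]]
    for start, end in find_gaps(sample, step_seconds=60):
        print(f"no samples between {start} and {end}")
```

A gap that shows up on one instance but not the other would point at scenario 1.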
Nope, I haven't had such reports so far. I would say there should be some difference somewhere: retention policy, compaction, recording rules, etc. Compare the /config, /rules, /targets, and /flags HTTP endpoints on both instances and let us know if you see any difference there.
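For the flags comparison, something along these lines might save some manual diffing. It is a sketch under assumptions: it reads the flags from the `/api/v1/status/flags` HTTP API (the JSON counterpart of the /flags page), and the two base URLs in the usage part are placeholders for your own instances. `fetch_flags` and `diff_flags` are hypothetical helper names.

```python
import json
from urllib.request import urlopen


def fetch_flags(base_url):
    """Fetch the command-line flags of one Prometheus as a flat dict."""
    with urlopen(f"{base_url}/api/v1/status/flags", timeout=10) as resp:
        return json.load(resp)["data"]


def diff_flags(a, b):
    """Return {flag: (value_in_a, value_in_b)} for every flag that differs."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


if __name__ == "__main__":
    # Placeholder addresses; replace with the two instances being compared.
    first = fetch_flags("http://prometheus-0:9090")
    second = fetch_flags("http://prometheus-1:9090")
    for flag, (left, right) in diff_flags(first, second).items():
        print(f"{flag}: {left!r} != {right!r}")
```

The same idea should work for the config, rules, and targets endpoints, although their responses are nested and would need a deeper comparison than this flat diff.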