1752814 – Counter metrics are decreasing which should not be allowed

Summary: Counter metrics are decreasing which should not be allowed

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	3.11.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	4.6.0
Assignee:	Stephen Greene
QA Contact:	Arvind iyengar
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1890534 1890545

Reported:	2019-09-17 09:35 UTC by Anshul Verma
Modified:	2024-03-25 15:25 UTC
CC List:	19 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Cause: HAProxy is reloaded when the router changes the configuration file. Consequence: Some Prometheus counter metrics decreased across reloads, which explicitly violates the definition of a counter metric. Fix: The router's metrics reload code now notes the time of the last metrics scrape, to avoid scraping beyond the preserved counter values during a reload. Result: Counter metrics no longer show a sudden increase followed by a decrease when the router is reloading.
Clone Of:
Clones:	1890545
Environment:
Last Closed:	2020-10-27 15:54:19 UTC
Target Upstream Version:
Embargoed:


Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift router pull 179	0	None	closed	Bug 1752814: Fix decreasing counter metrics when reloading HAProxy	2020-12-07 09:39:11 UTC
Red Hat Product Errata	RHBA-2020:4196	0	None	None	None	2020-10-27 15:54:52 UTC

Description Anshul Verma 2019-09-17 09:35:14 UTC

Description of problem:

Metrics that should act as counters, such as haproxy_frontend_http_responses_total, should only increase or start over again from zero; they should never decrease.
The definition of a Prometheus counter can be found here: https://prometheus.io/docs/concepts/metric_types/#counter

Decreasing counters cause a problem for the Prometheus rate() function, which interprets any decrease as a counter restart from zero. If the counter does not actually restart from zero and only decreases a little, the rate() function produces a spike in the statistics.

Such counters have been observed decreasing, which should not be allowed.
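
To illustrate the spike described above, here is a minimal Python sketch of reset-aware counter arithmetic. It is a simplified model, not the Prometheus implementation: it ignores timestamps and extrapolation, and the function name and sample values are made up for illustration. It only assumes that any drop between two samples is treated as a restart from zero.

```python
def counter_increase(samples):
    """Total increase across samples, treating any drop as a counter reset.

    Simplified model (assumption): a drop means the counter restarted
    from zero, so the whole post-drop value counts as new increase.
    """
    total = 0.0
    prev = samples[0]
    for cur in samples[1:]:
        if cur < prev:
            # Drop detected: assume a restart from zero.
            total += cur
        else:
            total += cur - prev
        prev = cur
    return total


# Hypothetical scrape values for a counter such as
# haproxy_frontend_http_responses_total.
clean_reset = [1000, 1010, 0, 10, 20]       # counter restarts from zero
small_dip = [1000, 1010, 1005, 1015, 1025]  # counter dips slightly on reload

print(counter_increase(clean_reset))  # 30.0
print(counter_increase(small_dip))    # 1035.0
```

In the second series the counter only dipped by 5, but because the dip is read as a restart from zero, the full value 1005 is added as new increase, which is the kind of peak this bug describes.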

Comment 7 Daneyon Hansen 2020-01-13 23:13:32 UTC

@Selim,

Now that [2] is done, we should be able to fix this bz using the preferred method [3]. We will address [3] during our next sprint planning on 1/16.

Comment 17 Arvind iyengar 2020-09-30 12:42:41 UTC

Verified in the "4.6.0-0.nightly-2020-09-30-052433" release version. With this payload, the metric counters continue to increase or reset to zero during reload conditions, with no decrements observed.
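
The condition verified above can be sketched as a small Python check over a series of scraped values. This is only an illustration under a strict reading of "increase or reset to zero"; the function name and sample values are hypothetical, and it is not the procedure QA actually used.

```python
def find_decrements(samples):
    """Return indices where a sample dropped to a non-zero value.

    Assumption: a drop is acceptable only if the new value is exactly
    zero (a clean reset); any other drop counts as a decrement.
    """
    return [
        i
        for i in range(1, len(samples))
        if samples[i] < samples[i - 1] and samples[i] != 0
    ]


# Hypothetical scrape values taken across a router reload.
print(find_decrements([5, 9, 0, 3]))  # [] : increases and one clean reset
print(find_decrements([5, 9, 7, 8]))  # [2] : the drop from 9 to 7 is a decrement
```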

Comment 24 errata-xmlrpc 2020-10-27 15:54:19 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196
