Bug 2092880 - etcdHighNumberOfLeaderChanges returns incorrect number of leadership changes
Summary: etcdHighNumberOfLeaderChanges returns incorrect number of leadership changes
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Etcd
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.11.0
Assignee: Thomas Jungblut
QA Contact: ge liu
Duplicates: 2010989
Depends On:
Blocks: 2102793
Reported: 2022-06-02 13:21 UTC by Thomas Jungblut
Modified: 2022-08-16 16:04 UTC

Fixed In Version:
Doc Text:
Clone Of:
Last Closed: 2022-08-10 11:15:49 UTC
Target Upstream Version:

Attachments
screenshot (186.03 KB, image/png)
2022-06-02 13:21 UTC, Thomas Jungblut

System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-etcd-operator pull 851 | 0 | None | open | Bug 2092880: avoid extrapolation in leaderhip alert | 2022-06-10 06:35:24 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 11:16:05 UTC

Description Thomas Jungblut 2022-06-02 13:21:43 UTC
Created attachment 1886098

The current query:

> increase((max without (instance) (etcd_server_leader_changes_seen_total{job=~".*etcd.*"}) or 0*absent(etcd_server_leader_changes_seen_total{job=~".*etcd.*"}))[15m:1m])

returns bogus results: it alerts on values of 5.x when the number of leadership changes was actually only 4, as verified against the metric etcd_server_is_leader.

I believe this is an issue of extrapolation in the increase function, as described here: https://prometheus.io/docs/prometheus/latest/querying/functions/#increase
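
To illustrate how extrapolation can turn a whole-number counter delta into a fractional value, here is a minimal Python sketch. It is a simplified model of the extrapolation behaviour described in the Prometheus documentation, not the Prometheus implementation itself, and the sample data (15 one-minute samples, a counter starting at 10) is invented for illustration. It does not attempt to reproduce the exact 5.x values seen in this report.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def raw_increase(samples: List[Sample]) -> float:
    """Counter delta over the samples, with no extrapolation.

    A drop in value is treated as a counter reset.
    """
    total = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        total += cur - prev if cur >= prev else cur
    return total


def extrapolated_increase(
    samples: List[Sample], range_start: float, range_end: float
) -> float:
    """Simplified model of an extrapolating increase().

    The raw delta is scaled up to cover the gaps between the first/last
    sample and the window boundaries.
    """
    if len(samples) < 2:
        return 0.0
    result = raw_increase(samples)
    first_t, first_v = samples[0]
    last_t = samples[-1][0]
    sampled = last_t - first_t
    avg_gap = sampled / (len(samples) - 1)
    to_start = first_t - range_start
    to_end = range_end - last_t
    # A counter cannot be extrapolated below zero.
    if result > 0 and first_v >= 0:
        to_start = min(to_start, sampled * (first_v / result))
    threshold = avg_gap * 1.1
    covered = sampled
    covered += to_start if to_start < threshold else avg_gap / 2
    covered += to_end if to_end < threshold else avg_gap / 2
    return result * covered / sampled


# Hypothetical data: 15 samples, one per minute, inside a 15-minute window.
values = [10, 10, 10, 11, 11, 12, 12, 12, 13, 13, 13, 14, 14, 14, 14]
samples = [(60.0 * (i + 1), float(v)) for i, v in enumerate(values)]

raw = raw_increase(samples)
extrapolated = extrapolated_increase(samples, range_start=0.0, range_end=900.0)
print(raw, round(extrapolated, 2))
```

In this model the 15 samples span only 14 minutes, so the raw delta of 4 is scaled by 15/14 and no longer matches the number of changes that actually occurred.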

Comment 1 W. Trevor King 2022-06-02 20:32:37 UTC
Extrapolation is often helpful, because we may not have metrics coverage over the whole window, which we would need to calculate the exact number of leader elections.  I think we should keep the extrapolation, but adjust the wording from [1]:

  {{ $value }} leader changes within the last 15 minutes.

to talk about the extrapolated rate:

  Around {{ $value }} leader changes per 15 minutes.

Alternatively, you could flip it around and do something like:

  Leader elections every {{ FIXME: syntax }} minutes, averaging over the past 15 minutes.

with some gymnastics to get '15 / $value' in there.

[1]: https://github.com/openshift/cluster-etcd-operator/blob/d0ac0559067390d877af995039432481a9d44901/manifests/0000_90_etcd-operator_03_prometheusrule.yaml#L162-L163
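
The '15 / $value' arithmetic suggested above can be sketched in plain Python. This is only an illustration of the calculation, not alert template syntax; the function names are hypothetical, and in practice the division would more likely be done in the alert expression itself.

```python
def minutes_between_changes(changes_per_window: float,
                            window_minutes: float = 15.0) -> float:
    """Average minutes between leader changes, given changes per window."""
    if changes_per_window <= 0:
        # No changes observed: the interval is unbounded.
        return float("inf")
    return window_minutes / changes_per_window


def describe(changes_per_window: float) -> str:
    """Render the alternative alert wording for a given extrapolated value."""
    interval = minutes_between_changes(changes_per_window)
    return (f"Leader elections every {interval:.1f} minutes, "
            "averaging over the past 15 minutes.")


print(describe(5.0))
```

A value of 5 changes per 15 minutes corresponds to one election every 3 minutes on average.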

Comment 3 W. Trevor King 2022-06-06 02:48:49 UTC
4.10 run [1] has etcdHighNumberOfLeaderChanges firing early on.  I suspect the static pod controller should grow a new metric for config revision, and the etcdHighNumberOfLeaderChanges expr could be updated to say "when the leader churn is higher than what I'd expect given the revision churn" [2].  But that particular post-install situation would also be mitigated by [3].

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-cgroupsv2/1533617810465361920
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=2010989#c8
[3]: https://github.com/openshift/cluster-etcd-operator/pull/804

Comment 4 Thomas Jungblut 2022-06-07 08:06:06 UTC
TODO: evaluate whether we also need the initial 1h installation guard: https://github.com/openshift/cluster-etcd-operator/pull/843#discussion_r889135867

Comment 14 Thomas Jungblut 2022-07-20 13:14:25 UTC
*** Bug 2010989 has been marked as a duplicate of this bug. ***

Comment 15 errata-xmlrpc 2022-08-10 11:15:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

