Bug 1881082

Summary: etcd raft term is not available through metrics
Product: OpenShift Container Platform
Reporter: Dan Mace <dmace>
Component: Etcd
Assignee: Sam Batschelet <sbatsche>
Status: CLOSED ERRATA
QA Contact: ge liu <geliu>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.6
Target Milestone: ---
Target Release: 4.6.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-10-27 16:43:33 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1883268

Description Dan Mace 2020-09-21 14:04:38 UTC
Description of problem:

etcd does not currently expose the raft term through Prometheus metrics. This limits our ability to do granular post-mortem performance analysis with the CI data available to us. The etcd operator could re-expose this value as a metric itself, so that we have more information to inform our decisions.

Comment 4 Dan Mace 2020-09-28 13:33:38 UTC
Ge,

Good catch!

You can see the operator exposing the metrics with:

    $ oc exec --namespace openshift-etcd-operator deployments/etcd-operator -c etcd-operator -- /bin/bash -c 'curl -k -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://localhost:8443/metrics'

But it looks like we have an error in the etcd-operator scrape configuration causing the metrics to be dropped during collection. I've opened https://github.com/openshift/cluster-etcd-operator/pull/451 to fix the issue.
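
The output of the command above covers every metric the operator serves, so it can help to narrow it down. The following is only a minimal sketch: `filter_metrics` is a hypothetical helper, not part of `oc` or the operator, and the substring `raft_term` is an assumption, since this report does not state the exact metric name.

```shell
# filter_metrics SUBSTRING
# Reads Prometheus text-format metrics on stdin and prints only the sample
# lines whose first field (metric name plus any labels) contains SUBSTRING.
# Comment lines such as "# HELP" and "# TYPE" are skipped.
filter_metrics() {
    awk -v pat="$1" '!/^#/ && index($1, pat) > 0'
}
```

It could be combined with the command above like this (the `-s` flag only silences curl's progress output):

    $ oc exec --namespace openshift-etcd-operator deployments/etcd-operator -c etcd-operator -- /bin/bash -c 'curl -sk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://localhost:8443/metrics' | filter_metrics raft_term

This only checks what the operator endpoint serves; it says nothing about whether the samples survive collection by Prometheus.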

Comment 9 errata-xmlrpc 2020-10-27 16:43:33 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

Comment 10 Red Hat Bugzilla 2023-09-14 06:08:39 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.