Description of problem: etcd does not currently expose the raft term through Prometheus metrics. This limits our ability to do granular post-mortem performance analysis using the CI data available to us. The etcd operator could itself re-expose this metric so that we have more information to inform our decisions.
Ge, good catch! You can see the operator exposing the metrics with:

  $ oc exec --namespace openshift-etcd-operator deployments/etcd-operator -c etcd-operator -- /bin/bash -c 'curl -k -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://localhost:8443/metrics'

But it looks like we have an error in the etcd-operator scrape configuration that causes the metrics to be dropped during collection. I've opened https://github.com/openshift/cluster-etcd-operator/pull/451 to fix the issue.
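The /metrics endpoint returns every metric the operator serves, so it can help to narrow the output down to the raft term samples when checking this by hand. Below is a minimal sketch of such a filter. The helper name `filter_metric` and the search string `raft_term` are assumptions for illustration; this thread does not state the exact name under which the operator exposes the metric, so adjust the substring to whatever appears in the real output.

```shell
# filter_metric is a hypothetical helper, not part of oc or the operator.
# It reads Prometheus text exposition format on stdin and prints only the
# sample lines whose metric name contains the given substring.
filter_metric() {
  awk -v pat="$1" '
    /^#/ { next }                            # skip "# HELP" and "# TYPE" lines
    {
      name = $0
      sub(/[{ ].*/, "", name)                # metric name ends at "{" or the first space
    }
    index(name, pat) > 0 { print }           # keep samples whose name contains pat
  '
}

# Usage against a live cluster (the substring "raft_term" is an assumption):
#   oc exec --namespace openshift-etcd-operator deployments/etcd-operator -c etcd-operator -- \
#     /bin/bash -c 'curl -sk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://localhost:8443/metrics' \
#     | filter_metric raft_term
```

Matching on the metric name only, rather than on the whole line, avoids false hits from label values. Note that this checks only what the operator serves locally; it says nothing about whether the samples survive the scrape configuration.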
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days