Bug 1695903 - Could not monitor Elasticsearch with Prometheus with OCP 3.11
Summary: Could not monitor Elasticsearch with Prometheus with OCP 3.11
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 3.11.z
Assignee: Jeff Cantrill
QA Contact: Anping Li
Depends On:
Reported: 2019-04-03 21:22 UTC by hgomes
Modified: 2020-01-27 13:55 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The oauth-proxy was not passing the user's token.
Consequence: Elasticsearch did not have a token with which to evaluate whether a user could retrieve metrics.
Fix: Add the proper switch to the oauth-proxy.
Result: Users with the proper role can retrieve metrics.
Clone Of:
Last Closed: 2019-06-26 09:07:55 UTC
Target Upstream Version:


System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift openshift-ansible pull 11495 | 0 | None | closed | bug 1695903. fix es metrics | 2020-11-27 06:29:35 UTC
Red Hat Product Errata | RHBA-2019:1605 | 0 | None | None | None | 2019-06-26 09:08:04 UTC

Description hgomes 2019-04-03 21:22:04 UTC
Description of problem:

This might not even be a problem, but I would like to understand whether this could proceed as an RFE.

Elasticsearch and Prometheus are installed with the standard installation of OpenShift. I expect that:
- Prometheus can access the Elasticsearch metrics
- Prometheus is configured to scrape them
- alerts are configured (e.g. the disk will be full within one week)
- dashboards are present in Grafana
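
As an illustration of the "disk full within one week" alert, here is a minimal Python sketch of the underlying arithmetic: fit a straight line through recent free-space samples and extrapolate to zero. This is the same idea as a linear-regression prediction in an alerting rule. The function name and the sample values are made up for the example and do not come from any shipped rule.

```python
from typing import List, Optional, Tuple


def seconds_until_full(samples: List[Tuple[float, float]]) -> Optional[float]:
    """Estimate seconds from the last sample until free space reaches zero.

    samples: (timestamp_seconds, free_bytes) pairs, oldest first.
    Returns None when there is too little data or free space is not shrinking.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    # Ordinary least-squares slope and intercept of free_bytes over time.
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    if slope >= 0:
        return None
    intercept = mean_v - slope * mean_t
    t_zero = -intercept / slope  # time at which the fitted line crosses zero
    return t_zero - samples[-1][0]


WEEK_SECONDS = 7 * 24 * 3600

# Hypothetical data: free space drops by 100 units per day over three days.
history = [(0.0, 700.0), (86400.0, 600.0), (172800.0, 500.0)]
remaining = seconds_until_full(history)
should_alert = remaining is not None and remaining < WEEK_SECONDS
```

With the hypothetical data above the fitted line reaches zero in roughly five days, so `should_alert` is true.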

For instance, the project https://github.com/justwatchcom/elasticsearch_exporter provides what we expect. We also expect this for Fluentd and Kibana.

From the customer's perspective:
I consider this a bug. What is your roadmap to fix it? We and our clients need the solution now. Can you provide guidance and advice on working on this topic? How could we contribute our solution upstream?

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

Actual results:

Expected results:

Additional info:

Comment 1 Frederic Branczyk 2019-04-04 07:55:06 UTC
Each component is responsible for shipping its own monitoring, so I am reassigning this to the Logging component. As far as I am aware, though, the team has shipped scraping and alerting for 4.1, but I'd prefer that they confirm that.

Comment 2 Jeff Cantrill 2019-04-12 19:29:46 UTC
Metrics are available in 3.11, though while investigating another issue I discovered that we are unable to pull them through our proxy because of a missing switch. I will use this bz to fix that. I believe there may be a second issue, however, which we corrected in 4.x: even if you provide the correct service account, you will not have properly signed certs unless you ignore who signed them, because logging creates its own certs and builds its own truststore.

To set up:

1. Define the service account in your inventory file (openshift_prometheus_namespace, openshift_logging_elasticsearch_prometheus_sa), which will be bound to this role: prometheus-metrics-viewer
2. Deploy logging using the 3.11 fix that will be associated with this bz
3. Retrieve metrics like: curl -k https://<logging-es-prometheus service>/_prometheus/metrics -H "Authorization: Bearer $sa_token"
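
For anyone scripting step 3, a minimal Python sketch of the same request follows, using only the standard library. The host value is a placeholder for the logging-es-prometheus service address, and the function names are invented for this example. Disabling certificate verification mirrors `curl -k` and is only reasonable here because logging signs its own certs.

```python
import ssl
import urllib.request


def build_metrics_request(host: str, sa_token: str) -> urllib.request.Request:
    """Build a GET request for the Elasticsearch Prometheus endpoint."""
    url = f"https://{host}/_prometheus/metrics"
    # The header name must be exactly "Authorization", with no space before the colon.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {sa_token}"})


def fetch_metrics(host: str, sa_token: str, timeout: float = 10.0) -> str:
    """Fetch the metrics text, skipping certificate verification like `curl -k`."""
    context = ssl.create_default_context()
    context.check_hostname = False  # must be disabled before setting CERT_NONE
    context.verify_mode = ssl.CERT_NONE
    request = build_metrics_request(host, sa_token)
    with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_metrics("<service host>", sa_token)` with the service account token should return the metrics as plain text once the fix is deployed.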

I defer to the monitoring team on how to configure Prometheus, as I'm unfamiliar with that end. The following is the documentation we presently have regarding metrics [1].


Comment 6 Anping Li 2019-06-13 07:01:20 UTC
The metrics can be fetched using the token of the service account system:serviceaccount:openshift-monitoring:prometheus-k8s. @Frederic, you are correct: to display the Elasticsearch metrics and apply the rules, you need to provide rules files to Prometheus. That happens automatically in 4.x.
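
To sanity-check what the endpoint returns before wiring up Prometheus, the response can be read as Prometheus text exposition format. The Python sketch below is a simplified reader that assumes samples carry no trailing timestamp. The metric name in the sample is hypothetical and not a name the Elasticsearch plugin is known to expose.

```python
from typing import Dict


def parse_samples(text: str) -> Dict[str, float]:
    """Map each sample line ("name{labels} value") to its float value."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumes no timestamp, so the value is the last space-separated field.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hypothetical payload for illustration only.
SAMPLE = """\
# HELP es_example_disk_free_bytes Hypothetical metric for this example.
# TYPE es_example_disk_free_bytes gauge
es_example_disk_free_bytes{node="es-0"} 1.5e+09
es_example_disk_free_bytes{node="es-1"} 2.5e+09
"""

parsed = parse_samples(SAMPLE)
```

An empty result from a real response would suggest the token was rejected or the endpoint returned something other than metrics.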

Comment 8 Anping Li 2019-06-14 09:46:51 UTC
I'd like to close this bug, as Elasticsearch can expose the metrics via a token. For further requirements, such as displaying the metrics in Prometheus, please work around it yourself or file an RFE bug.

Comment 10 errata-xmlrpc 2019-06-26 09:07:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

