Bug 1382728 - Metrics not available in UI until Hawkular cert is manually accepted and no warning or message to indicate this
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Samuel Padgett
QA Contact: Yadan Pei
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-10-07 14:11 UTC by Ian Tewksbury
Modified: 2017-03-08 18:43 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The web console would not show any errors on the overview page when metrics were configured but not working. It would quietly fall back to the behavior used when metrics are not set up. The web console now shows an error message with a link to the metrics status URL to help diagnose problems such as invalid certificates. The alert can be permanently dismissed by users who don't wish to see it.
Clone Of:
Environment:
Last Closed: 2017-01-18 12:42:05 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:0066 0 normal SHIPPED_LIVE Red Hat OpenShift Container Platform 3.4 RPM Release Advisory 2017-01-18 17:23:26 UTC

Description Ian Tewksbury 2016-10-07 14:11:50 UTC
Description of problem:

When metrics are installed by setting "openshift_hosted_metrics_deploy" in the Ansible inventory, they are deployed with a self-signed certificate that browsers do not accept automatically.

This is a problem because the metrics won't show up in the OCP UI until the certificate is accepted. Additionally, there is no error indication in the OCP UI that the certificate needs to be accepted.

Therefore, after installing metrics, it just looks like metrics aren't working until you browse to the Hawkular endpoint and accept the certificate. Then everything starts working.

How reproducible:

100% of the time.


Steps to Reproduce:
1. Run the Ansible installer with "openshift_hosted_metrics_deploy: true"
2. Metrics install correctly
3. Browse to a project

Actual results:

* No metrics data is shown on the overview page
* Browsing to a pod and clicking metrics shows no metrics data
* No error messages indicate that there is an issue or what the issue is, just no metrics data

Expected results:

1) Ideally, there would be no need to accept the certificate and everything would just work.

2) If option 1 isn't possible, then there should at least be a warning or error message in the UI saying that metrics are not being displayed because the certificate needs to be accepted.

Additional info:

It was frustrating for us to figure out why metrics were working for some users and not others: some had accepted the cert and some had not.

It is doubly frustrating because nothing in the UI indicates a problem, so if you didn't know otherwise you wouldn't even know metrics were installed.

Comment 1 Matt Wringe 2016-10-07 14:36:06 UTC
Unfortunately, this is how it has to work, since the metrics component is under a different hostname than the console itself. It's not an ideal situation, and the metrics and console teams do not like it, but until there is a service we can use in OpenShift to proxy this information under the same hostname as the console, there is nothing the metrics team can do.

If you want the desired behaviour where the browser automatically accepts the certificate you need to provide your own certificates signed by a CA trusted by your browser. This can be done by setting the hawkular-metrics.pem and hawkular-metrics-ca.cert secrets.

The documentation explains how to do this.
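
As a rough diagnostic for the situation described above, one could check whether a metrics endpoint presents a certificate that verifies against a trust store. The following is a minimal sketch, not part of OpenShift or Hawkular: the function names are hypothetical, and it uses Python's default (system) trust store, which may differ from the one a given browser uses.

```python
import socket
import ssl


def classify_tls_error(exc):
    """Map a connection exception to a short diagnosis string."""
    if isinstance(exc, ssl.SSLCertVerificationError):
        # verify_message is only set when OpenSSL raised the error.
        reason = getattr(exc, "verify_message", None) or "verification failed"
        return "untrusted certificate: " + reason
    if isinstance(exc, ssl.SSLError):
        return "TLS error: " + str(exc)
    return "connection error: " + str(exc)


def check_certificate(host, port=443, timeout=5.0):
    """Return (trusted, detail) for a TLS endpoint.

    A self-signed certificate is expected to yield (False, "untrusted
    certificate: ..."), while a certificate signed by a CA in the system
    trust store yields (True, ...).
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True, "certificate chain verified"
    except OSError as exc:  # ssl.SSLError is a subclass of OSError
        return False, classify_tls_error(exc)
```

For example, `check_certificate("hawkular-metrics.example.com")` (a placeholder hostname) would return a `(trusted, detail)` pair that distinguishes an untrusted certificate from an unreachable host.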

The UI normally does have an error message and a link explaining this situation. If you could please describe where you are not seeing the error message that may help the UI team provide a better message here.

Comment 2 Jessica Forrester 2016-10-07 17:16:15 UTC
I'm guessing the issue is the overview. If you go to the monitoring page or the metrics tab for a deployment or pod, you'll see the error message that links you to the metrics splash page. But on the overview we just fall back and show you the info about the pod templates instead. Sam and I talked at one point about throwing an alert at the top of the page when we fail to load the metrics.

Comment 3 Ian Tewksbury 2016-10-07 17:19:02 UTC
What Jessica said.

Overview page shows nothing. You get a link if you go to the metrics page.

It would be nice to show a warning or error or something on the overview page to link people through to accept the cert if possible.

Comment 5 Troy Dawson 2016-10-28 19:57:06 UTC
This has been merged into ose and is in OSE v3.4.0.17 or newer.

Comment 7 Yanping Zhang 2016-10-31 07:41:01 UTC
Checked on OCP v3.4.0.17. After configuring metrics and logging in to the web console, the overview page of the project shows a warning at the top: "An error occurred getting metrics. Open metrics URL | Don't show me again". Clicking "Open metrics URL" goes to the metrics link, where the user can accept the cert.
The bug has been fixed, so moving it to verified.
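
The behavior verified above can be modeled roughly as follows. This is an illustrative sketch only and not the web console's actual code: the class, its attributes, and the `fetch_metrics` callable are hypothetical, and only the alert text and link labels are taken from this report.

```python
class OverviewPage:
    """Toy model of an overview page that loads metrics for a project."""

    ALERT_MESSAGE = "An error occurred getting metrics."

    def __init__(self, fetch_metrics, metrics_url, alert_dismissed=False):
        self.fetch_metrics = fetch_metrics      # hypothetical callable returning metrics data
        self.metrics_url = metrics_url          # hypothetical metrics status URL
        self.alert_dismissed = alert_dismissed  # the user's "Don't show me again" choice
        self.metrics = None
        self.alert = None

    def load(self):
        """Try to load metrics; on failure, fall back and surface an alert."""
        self.alert = None
        try:
            self.metrics = self.fetch_metrics()
        except Exception:
            # Fall back to the view used when metrics are not set up ...
            self.metrics = None
            # ... but tell the user why, unless they dismissed the alert.
            if not self.alert_dismissed:
                self.alert = {
                    "message": self.ALERT_MESSAGE,
                    "links": [
                        {"label": "Open metrics URL", "href": self.metrics_url},
                        {"label": "Don't show me again", "action": "dismiss"},
                    ],
                }

    def dismiss_alert(self):
        """Record the dismissal so later failures stay quiet."""
        self.alert_dismissed = True
        self.alert = None
```

Before the fix, the equivalent of the `except` branch only performed the fallback, which is why the overview looked as if metrics had never been installed.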

Comment 9 errata-xmlrpc 2017-01-18 12:42:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0066

Comment 10 Ian Tewksbury 2017-01-18 12:44:00 UTC
Excellent news.

