Bug 1492011 - [pro][pro-us-east-1] Metrics not showing in web console
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Hawkular
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Assignee: Joel Takvorian
QA Contact: Junqi Zhao
 
Reported: 2017-09-15 09:26 UTC by Will Gordon
Modified: 2018-07-26 19:34 UTC (History)

Fixed In Version: 3.7
Last Closed: 2017-11-09 18:46:27 UTC



Description Will Gordon 2017-09-15 09:26:29 UTC
Description of problem:
Browsing to the web console for pro-us-east-1 displays "An error occurred getting metrics". The browser developer tools show the following error: "XMLHttpRequest cannot load https://metrics.pro-us-east-1.openshift.com/hawkular/metrics. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'https://console.pro-us-east-1.openshift.com' is therefore not allowed access."

Version-Release number of selected component (if applicable):
Hawkular:          0.27.4.Final-redhat-1
OpenShift Master:  v3.6.173.0.21 (online version 3.5.1.76)
Kubernetes Master: v1.6.1+5115d708d7

How reproducible:
Most of the time. Oddly enough, I was able to get Metrics to work ONCE in incognito mode, but then a refresh caused them to disappear.

Steps to Reproduce:
1. View a project in pro-us-east-1
2. See error
3. Open Chrome Dev Tools for the XML error

Actual results:
Errors

Expected results:
No errors

Additional info:

Comment 1 Matt Wringe 2017-09-15 20:27:46 UTC
I am seeing an error on the overview page:

"An error occurred getting metrics. Open Metrics URL | Don't Show Me Again"

And in the browser console it shows:

"Failed to load https://metrics.pro-us-east-1.openshift.com/hawkular/metrics: No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'https://console.pro-us-east-1.openshift.com' is therefore not allowed access."

This only happens if I refresh the overview page with 'ctrl-shift-r'; if I follow a link to the overview page, I don't see this error. Either way, I don't see metrics on the overview page as expected.

If I open an incognito page, then I don't get any errors and I do see metrics on the overview page as expected.

From the non-incognito page, when going through the network tab, I do see a call out to Hawkular Metrics, and it's returning a 200 result. But it's also being loaded from disk cache.

@spadgett: is there any reason why this is being loaded from cache and not freshly loaded each time? When I load through an incognito page, then it works without issue.

Comment 2 Samuel Padgett 2017-09-15 21:18:26 UTC
Matt, the /metrics page is not setting the right HTTP response headers to prevent caching. So we have a few options:

1) Update /metrics to set the correct Cache-Control headers
2) Update the web console to hit a different endpoint to check if metrics are available
3) Add something unique to the URL like a random `preventCache` query parameter to prevent the browser from using the cached response.
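
As a rough illustration of option 3, the sketch below appends a unique query parameter so that the browser treats each status check as a new URL and cannot answer it from its cache. It is written in Python only to show the URL handling; `with_cache_buster` is a hypothetical helper, not the web console's code, and `preventCache` is simply the parameter name suggested above.

```python
import time
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_cache_buster(url: str, token: Optional[str] = None) -> str:
    """Return `url` with a `preventCache` query parameter appended.

    Hypothetical helper: if no token is given, a nanosecond timestamp is used
    so that successive calls normally produce different URLs.
    """
    parts = urlsplit(url)
    # Keep any query parameters the URL already has.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("preventCache", token if token is not None else str(time.time_ns())))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(with_cache_buster("https://metrics.pro-us-east-1.openshift.com/hawkular/metrics"))
```

This only works around the caching on the client side; the cached copy of the plain URL stays in the browser, so options 1 and 2 address the cause more directly.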

Let me know which approach you think works. If there is a better status endpoint, it might be better to switch to that.

Comment 3 Matt Wringe 2017-09-17 16:52:39 UTC
The cache-control header should be set to private (curl -v https://metrics.pro-us-east-1.openshift.com/hawkular/metrics), so the browser can cache this in its own cache.

What Cache-Control value should we be using here instead?

I think I might be missing something here as well: the cached result is returning a 200 response. What else should it be returning?

Comment 4 Samuel Padgett 2017-09-17 21:07:36 UTC
(In reply to Matt Wringe from comment #3)
> The cache-control header should be set to private (curl -v
> https://metrics.pro-us-east-1.openshift.com/hawkular/metrics), so the
> browser can cache this in its own cache.

Well, that's what's happening... I don't think it's what you want, though. The /metrics page shows current version and a status message ("Metrics Service: STARTED"). With that `Cache-Control` header and no `max-age`, you're telling the browser it can cache the page and not telling it to revalidate. It will continue to show that cached page even after Hawkular is upgraded or status changes.

`Cache-Control: no-cache` will prevent the browser from reusing a cached copy without first revalidating it with the server.

If you want to leave that page `Cache-Control: private`, we should probably switch the web console to use another endpoint for status. But I don't think `private` is best here anyway even if the web console wasn't using that URL.
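
To make the difference between these header values concrete, here is a minimal sketch of how a cache might decide whether it can reuse a stored response without contacting the server. It is a deliberately simplified model, not a browser's actual algorithm: `may_reuse_without_revalidation` is a hypothetical function, and the "no explicit freshness" branch assumes the browser falls back to heuristic caching and keeps serving the stored copy.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control header value into {directive: argument-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def may_reuse_without_revalidation(cache_control: str, age_seconds: int) -> bool:
    """Simplified model of whether a stored response can be served as-is."""
    directives = parse_cache_control(cache_control)
    # no-store: nothing is kept; no-cache: stored, but must be revalidated first.
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if max_age is not None:
        return age_seconds < int(max_age)
    # Assumption: with no explicit lifetime the browser applies a heuristic
    # and may keep reusing the cached page.
    return True


if __name__ == "__main__":
    for header in ("private", "no-cache", "private, max-age=86400"):
        print(header, "->", may_reuse_without_revalidation(header, age_seconds=3600))
```

Under this model, `private` alone lets an hour-old copy be served without any request reaching Hawkular, which matches the behavior described above.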

Let me know what you'd like to do.

Comment 5 Samuel Padgett 2017-09-17 21:08:49 UTC
The 200 response is misleading. The request is not even sent to the server since the browser has it in its cache.

Comment 6 Matt Wringe 2017-09-17 21:36:44 UTC
(In reply to Samuel Padgett from comment #5)
> The 200 response is misleading. The request is not even sent to the server
> since the browser has it in its cache.

Right, so if it's not even going to the server, why is the console saying it's in error?

Comment 7 Samuel Padgett 2017-09-17 23:20:05 UTC
Are the CORS headers missing? This would look like an error to the console.

Not sure how CORS is handled with browser caching.

Comment 8 Samuel Padgett 2017-09-17 23:29:37 UTC
I wonder if you visited the /metrics page outside of the console, and the page got cached without the CORS headers. Then the requests from the console always fail because the cached version is missing `Access-Control-Allow-Origin`.

This would also explain why it works in incognito mode since it wouldn't use the cached version.
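
The hypothesis above can be sketched with a toy model. Everything here is an assumption made for illustration: `NaiveBrowserCache` keys entries by URL only, and `server_without_vary` stands in for a server that adds `Access-Control-Allow-Origin` only when the request carries an `Origin` header. Neither is taken from the Hawkular or browser source.

```python
from typing import Callable, Dict, Optional

Headers = Dict[str, str]

METRICS_URL = "https://metrics.pro-us-east-1.openshift.com/hawkular/metrics"
CONSOLE_ORIGIN = "https://console.pro-us-east-1.openshift.com"


class NaiveBrowserCache:
    """Toy cache keyed by URL only, ignoring the requesting origin."""

    def __init__(self) -> None:
        self._entries: Dict[str, Headers] = {}

    def fetch(self, url: str, origin: Optional[str],
              server: Callable[[Optional[str]], Headers]) -> Headers:
        # Only contact the server on a cache miss.
        if url not in self._entries:
            self._entries[url] = server(origin)
        return self._entries[url]


def server_without_vary(origin: Optional[str]) -> Headers:
    """Hypothetical server: CORS header only when an Origin is sent."""
    headers = {"Cache-Control": "private"}
    if origin is not None:
        headers["Access-Control-Allow-Origin"] = origin
    return headers


def cors_allows(headers: Headers, origin: str) -> bool:
    """Simplified CORS check on the response headers."""
    return headers.get("Access-Control-Allow-Origin") in ("*", origin)


if __name__ == "__main__":
    cache = NaiveBrowserCache()
    cache.fetch(METRICS_URL, None, server_without_vary)  # direct visit, no Origin
    cached = cache.fetch(METRICS_URL, CONSOLE_ORIGIN, server_without_vary)
    print("console allowed (warm cache):", cors_allows(cached, CONSOLE_ORIGIN))

    fresh = NaiveBrowserCache().fetch(METRICS_URL, CONSOLE_ORIGIN, server_without_vary)
    print("console allowed (empty cache):", cors_allows(fresh, CONSOLE_ORIGIN))
```

In this model the warm cache returns the copy stored during the direct visit, so the console's request fails the CORS check, while an empty cache (as in incognito mode) succeeds.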

Comment 9 Matt Wringe 2017-09-18 15:54:21 UTC
@jtakori: Can you please take a look at this? It looks like we might not be using the correct caching headers here.

Comment 10 Joel Takvorian 2017-09-19 07:48:30 UTC
I don't see any Cache-Control header set to private in hawkular code or config, and when running the server locally (incognito mode or not), there's no such header set in response (trying on current release/0.27 branch, which doesn't differ from 0.27.4.Final in this aspect).

So I'm wondering, could it be some intermediate layer or filter that manipulates the cache control / some networking configuration in openshift?

Comment 11 Samuel Padgett 2017-09-19 12:28:54 UTC
It's possible. But if Hawkular is not setting any Cache-Control header on that response, it means the page is cacheable and we have the same problem. So regardless, it should be setting something like `Cache-Control: no-cache`.

Comment 12 Joel Takvorian 2017-09-19 13:13:16 UTC
PR opened in upstream hawkular-metrics

Comment 13 Joel Takvorian 2017-09-19 14:18:12 UTC
Actually, I am not totally convinced that the hawkular server should ask to disable client caching on every path. It makes sense on a path like "/metrics/status", but not on all pages, in particular not on "/metrics", which just serves static HTML (with xhr).

To me the issue is more related to the combination of CORS and browser caching. I'm trying to investigate that further. But anyway, the upstream PR is on its way if it's an acceptable workaround.

Comment 14 Samuel Padgett 2017-09-19 14:21:32 UTC
I wasn't expecting every page to disable caching.

/metrics doesn't seem like a static page since it has a status message?

If there is a real status endpoint, I would like to switch the console to use that.

Comment 15 Joel Takvorian 2017-09-19 14:38:25 UTC
It is actually static, but with javascript that fetches /metrics/status with xhr.

Comment 16 Matt Wringe 2017-09-19 17:58:01 UTC
(In reply to Joel Takvorian from comment #13)
> Actually, I am not totally convinced that hawkular server should ask to
> disable client caching on every path. It makes sense on path like
> "/metrics/status", but not on all pages,

Other than /metrics, is there any other page for which you think we shouldn't disable caching?

> in particular not on "/metrics"
> which just serves static HTML (with xhr).

Even static HTML can be updated

Comment 17 Joel Takvorian 2017-09-20 08:32:18 UTC
PR upstream: https://github.com/hawkular/hawkular-metrics/pull/887

Static HTML can be updated, but I think it's fine to cache it and maybe set a max-age value. This is what I've done in the PR: "private, max-age=86400" for static HTML, and "no-cache" for everything else.

Note that I've found another problem that is probably what originally caused the CORS issue: the CORS headers were not being set in the response on the "/metrics" endpoint, which serves HTML. It's also fixed in my patch.
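
For readers who want the policy in one place, the following is a minimal sketch of the header rules described in this comment. It is not the code from the PR (which is in the hawkular-metrics repository); `response_headers`, `STATIC_HTML_PATHS`, and the allowed-origin list are hypothetical names, the way static HTML is detected is an assumption, and the `Vary: Origin` header is an addition of this sketch rather than something stated above.

```python
from typing import Dict, FrozenSet, Optional

# Assumption: the paths that serve the static HTML page.
STATIC_HTML_PATHS = frozenset({"/hawkular/metrics", "/hawkular/metrics/"})


def response_headers(path: str, origin: Optional[str],
                     allowed_origins: FrozenSet[str]) -> Dict[str, str]:
    """Build caching and CORS response headers for a request (illustrative)."""
    headers: Dict[str, str] = {}

    # Caching: static HTML may be cached for a day; everything else is revalidated.
    if path in STATIC_HTML_PATHS:
        headers["Cache-Control"] = "private, max-age=86400"
    else:
        headers["Cache-Control"] = "no-cache"

    # CORS: set the header on every endpoint, including the HTML page.
    if origin is not None and origin in allowed_origins:
        headers["Access-Control-Allow-Origin"] = origin
        # Not from the comment above: tells caches the response depends on Origin.
        headers["Vary"] = "Origin"
    return headers


if __name__ == "__main__":
    console = "https://console.pro-us-east-1.openshift.com"
    allowed = frozenset({console})
    print(response_headers("/hawkular/metrics", console, allowed))
    print(response_headers("/hawkular/metrics/status", console, allowed))
```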

Comment 18 Joel Takvorian 2017-09-21 07:21:41 UTC
[related JIRA: https://issues.jboss.org/browse/HWKMETRICS-736]

Comment 19 Matt Wringe 2017-10-17 14:08:16 UTC
This should already be fixed in our 3.7 release.

Comment 20 Junqi Zhao 2017-10-18 03:56:17 UTC
Metrics route can be accessed and metrics diagrams are shown in web console.

env:
OpenShift Master:  v3.6.173.0.21 (online version 3.6.0.45.1)
Kubernetes Master: v1.6.1+5115d708d7

Images:
metrics-cassandra:v3.6.173.0.7
metrics-hawkular-metrics:v3.6.173.0.7
metrics-heapster:v3.6.173.0.7

