Bug 1394338

Summary: [ded-int-gcp] WebSocket timeouts from Web Console and not metrics
Product: OpenShift Container Platform
Reporter: Steve Speicher <sspeiche>
Component: Hawkular
Assignee: Abhishek Gupta <abhgupta>
Status: CLOSED NOTABUG
QA Contact: Peng Li <penli>
Severity: medium
Docs Contact:
Priority: unspecified
Version: unspecified
CC: aos-bugs, bingli, jforrest, mwoodson, mwringe, zhezli
Target Milestone: ---
Keywords: TestBlocker
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-15 20:13:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Steve Speicher 2016-11-11 18:10:04 UTC
Description of problem:
In Web Console:
Metrics won't load.
UI won't update.


Version-Release number of selected component (if applicable):
https://console.ded-int-gcp.openshift.com

I'm running on Mac OSX 10.12.1, Chrome 4.0.2840.71 (64-bit)


How reproducible:
Log in to the web console and deploy any app. I used nodejs-mongodb-example.

Actual results:
Get Service Interrupted error.

Expected results:
See metrics, graphs and UI update without having to hit refresh.

Additional info (Chrome console log output):

XMLHttpRequest cannot load https://metrics.ded-int-gcp.openshift.com/hawkular/metrics/counters/data?st…%3D.%2A%5Cbname%3Amongodb%5Cb%29.%2A%24&bucketDuration=60000ms&start=-15mn. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'https://console.ded-int-gcp.openshift.com' is therefore not allowed access. The response had HTTP status code 400.
2 https://metrics.ded-int-gcp.openshift.com/hawkular/metrics/counters/data?st…%3D.%2A%5Cbname%3Amongodb%5Cb%29.%2A%24&bucketDuration=60000ms&start=-15mn Failed to load resource: the server responded with a status of 400 (Bad Request)
overview:1 XMLHttpRequest cannot load https://metrics.ded-int-gcp.openshift.com/hawkular/metrics/counters/data?st…%3D.%2A%5Cbname%3Amongodb%5Cb%29.%2A%24&bucketDuration=60000ms&start=-15mn. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'https://console.ded-int-gcp.openshift.com' is therefore not allowed access. The response had HTTP status code 400.
scripts.js:414 WebSocket connection to 'wss://api.ded-int-gcp.openshift.com/api/v1/namespaces/sspeiche-1/services?w…ceVersion=2539930&access_token=2ZH5v6bGJvAA-23RqhI_rd9wxhgDiNUTIinZq_mHf1U' failed: WebSocket opening handshake was canceled
c @ scripts.js:414
scripts.js:414 WebSocket connection to 'wss://api.ded-int-gcp.openshift.com/api/v1/namespaces/sspeiche-1/pods?watch…ceVersion=2539930&access_token=2ZH5v6bGJvAA-23RqhI_rd9wxhgDiNUTIinZq_mHf1U' failed: WebSocket opening handshake was canceled
c @ scripts.js:414

Comment 1 Ben Bennett 2016-11-11 18:17:59 UTC
I think the metrics need to be configured to add that header field.  It's something that comes in the response, not something that we do in the router.

Comment 2 Matt Woodson 2016-11-14 20:16:16 UTC
Are we hitting this issue?

https://code.google.com/p/googleappengine/issues/detail?id=11570

Comment 3 Jessica Forrester 2016-11-14 21:50:56 UTC
There seem to be multiple things going on here.

I can't recreate the WebSocket issues; they are working fine for me.

For the metrics requests, we are getting the CORS headers returned on the OPTIONS preflight requests to metrics, but on the subsequent GET requests the headers are not returned.  @mwringe any thoughts here?
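
To make the symptom above concrete, here is a minimal sketch (not taken from the console code) of the origin check a browser applies to each response. The function name allows_origin and the sample header values are assumptions made up for illustration; only the origin URL comes from the console log in the description.

```python
def allows_origin(headers, origin):
    """Return True if these response headers would let `origin` read the response.

    `headers` is a plain dict of response headers. Header names are matched
    case-insensitively. Simplification: a wildcard "*" is accepted here, even
    though browsers reject it for requests that carry credentials.
    """
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    allowed = lowered.get("access-control-allow-origin")
    return allowed is not None and allowed in ("*", origin)


# Hypothetical header sets mirroring the report: the preflight response
# carries the CORS header, the follow-up GET response does not.
origin = "https://console.ded-int-gcp.openshift.com"
preflight_headers = {
    "Access-Control-Allow-Origin": origin,
    "Access-Control-Allow-Methods": "GET, OPTIONS",
}
get_headers = {"Content-Type": "application/json"}

print("preflight allowed:", allows_origin(preflight_headers, origin))
print("GET allowed:", allows_origin(get_headers, origin))
```

With headers like these the preflight passes and the GET is blocked, which matches the "No 'Access-Control-Allow-Origin' header is present" message in the console log.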

Comment 4 Matt Wringe 2016-11-14 22:00:31 UTC
Usually when we see "No 'Access-Control-Allow-Origin' header is present on the requested resource" there is something wrong with the setup.

We have seen this before when the metrics url in the master-config.yaml does not include "/hawkular/metrics" at the end. This would cause the console to request an endpoint which doesn't belong to Hawkular Metrics and hence does not return the proper CORS header.

From the looks of the request though, you are including "/hawkular/metrics".

Are you able to access https://metrics.ded-int-gcp.openshift.com/hawkular/metrics/status in a browser?

What version of OpenShift and what version of Hawkular Metrics are you using?
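
The two setup checks in this comment can be sketched roughly as follows. This is only an illustration under assumptions: the helper names has_metrics_suffix and status_url are hypothetical, and the code only inspects and builds URL strings; it does not contact the server.

```python
from urllib.parse import urlsplit

# Path the metrics URL in master-config.yaml is expected to end with.
EXPECTED_SUFFIX = "/hawkular/metrics"


def has_metrics_suffix(metrics_public_url):
    """Return True if the URL path ends with /hawkular/metrics (trailing slash ignored)."""
    path = urlsplit(metrics_public_url).path.rstrip("/")
    return path.endswith(EXPECTED_SUFFIX)


def status_url(metrics_public_url):
    """Build the status URL to open in a browser from the configured metrics URL."""
    return metrics_public_url.rstrip("/") + "/status"


configured = "https://metrics.ded-int-gcp.openshift.com/hawkular/metrics"
print(has_metrics_suffix(configured))
print(status_url(configured))
```

A URL without the suffix, such as the bare host name, fails the first check; that is the misconfiguration described above.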

Comment 5 Matt Woodson 2016-11-14 22:00:44 UTC
Please ignore my comment #2.  That issue is a bug in GCP when using managed VMs.  We are not using those; we are using regular VMs with network load balancing, which does support WebSockets according to their documentation.

If we were seeing the issue, I think that it would also show up in console tasks, and Jessica does confirm they work.

Sorry for the confusion.

Comment 6 Matt Woodson 2016-11-14 22:02:52 UTC
Some info for Matt Wringe, in reply to comment #4:

# oc version
oc v3.3.1.3
kubernetes v1.3.0+52492b4
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://internal.api.ded-int-gcp.openshift.com
openshift v3.3.1.3
kubernetes v1.3.0+52492b4


# grep metrics /etc/origin/master/master-config.yaml
  metricsPublicURL: https://metrics.ded-int-gcp.openshift.com/hawkular/metrics

Comment 7 Matt Woodson 2016-11-14 22:38:32 UTC
Matt Wringe helped identify that the version of metrics running on ded-int-gcp was old.  I did a re-install of metrics, and it is now confirmed working.

Steve, I will let you handle this bug.

Comment 8 Steve Speicher 2016-11-15 20:13:32 UTC
No longer a bug. It is localized to my browser (a WebSocket issue) and, as far as we can tell, is not a load balancer issue. Three or more other people have verified that it works fine on Chrome, and it works fine for me on Firefox.