Bug 1468350

Summary: Although metrics comes up, it is not working
Product: OpenShift Container Platform
Reporter: Robert Baron <robbaron>
Component: Hawkular
Assignee: Matt Wringe <mwringe>
Status: CLOSED DUPLICATE
QA Contact: Junqi Zhao <juzhao>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.5.0
CC: aos-bugs, jtakvori, robbaron
Target Milestone: ---
Keywords: UpcomingRelease
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-07-11 13:39:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments (description; flags):
Additional information for "Although metrics comes up, it is not working"; none
cloud forms screen shot; none
Hawkular metrics seems to be up.; none
screen shot of the metrics tab in open shift interface.; none

Description Robert Baron 2017-07-06 19:14:28 UTC
Created attachment 1295028 [details]
Additional information for "Although metrics comes up, it is not working"

Description of problem:

Running the Ansible playbooks to enable metrics starts all of the pods. However, although the pods start without errors, metrics are not being reported in a way that either the OpenShift interface or CloudForms can display.


Version-Release number of selected component (if applicable):


How reproducible:

Happens all of the time on my instance of OpenShift.


Steps to Reproduce:
1.  If metrics are enabled, disable using the following:
    ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-metrics.yml \
       -e openshift_metrics_install_metrics=False

2.  Enable metrics using the following:
    ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-metrics.yml \
       -e openshift_metrics_install_metrics=True \
       -e openshift_metrics_hawkular_hostname=hawkular-metrics.apps.osh.massopen.cloud

Actual results:

Although the pods seem to come up cleanly, and some information is known about the systems, none of the node-oriented metrics are available in either the OpenShift interface or CloudForms.

Expected results:

The metrics should be visible in both OpenShift and CloudForms.


see attached file for:
 
    - the logs for the metric components (Hawkular Metrics, Cassandra, Heapster)

    - the output of 'oc get pods -o yaml'

    - the output of 'nodetool status' run from the Cassandra pod(s)


Note: the following item is not attached yet, because the metrics settings were passed on the command line instead of through an inventory file:

    - the metric section of the inventory file used by openshift ansible

Comment 1 Matt Wringe 2017-07-06 19:33:10 UTC
Just a note: in the future you will need to attach each piece of information separately. We have had problems in the past where people attach a single 50k-line text file, which is not easy to parse.

> none of the node oriented metrics are available in either the openshift interface or CloudForms.

Do the pod level metrics show up in the console? Metrics about the machines that are running OpenShift do not appear in the OpenShift Console, only in CloudForms.

Can you please verify that the pod level metrics are appearing? If they are not, what are you seeing in the console?

Comment 2 Robert Baron 2017-07-06 19:49:50 UTC
Created attachment 1295039 [details]
cloud forms screen shot

This is to show which metrics are appearing and which ones are not.

Comment 3 Robert Baron 2017-07-06 19:53:18 UTC
Created attachment 1295042 [details]
Hawkular metrics seems to be up.

This is just the page at the URL that is presented when metrics don't appear in the interface.

Comment 4 Robert Baron 2017-07-06 20:08:47 UTC
Created attachment 1295047 [details]
screen shot of the metrics tab in open shift interface.

This is a screen shot of the metrics tab in the open shift interface

Comment 5 Robert Baron 2017-07-06 20:12:32 UTC
Matt,

I have added the following screenshots to clarify what it is that I am seeing:

    1) Cloud Forms container overview page
    2) Metrics tab in openshift interface.
    3) The page that is linked when the metrics do not come up in the open shift interface.

Is this sufficient?

Comment 6 Matt Wringe 2017-07-07 18:24:55 UTC
We need to get metrics showing up in the OpenShift console before worrying about things not appearing in CloudForms.

From the screenshot of the metrics tab in the browser, it looks like the console is eating the error message. 

Is it possible for you to get the actual error message coming from your browser? Also, if you click the link shown after "Metrics are not available", does that work?

In Chrome you can get the error message by right-clicking and selecting 'Inspect', then going to the 'Network' tab and refreshing the OpenShift console page.

You should then see the network call to Hawkular Metrics which should display what the error is that Hawkular Metrics is returning.
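
The same request can also be tried outside the browser, which avoids the console hiding the response. The following is only a minimal sketch: `hawkular_get` is a hypothetical helper name, `curl` is assumed to be installed, and `-k` is an assumption that the route presents a certificate the client does not trust (it skips verification, so it is suitable for debugging only). The `Hawkular-Tenant` header carries the OpenShift project name, and the bearer token would come from `oc whoami -t`.

```shell
# Query Hawkular Metrics directly so the HTTP status and error body are visible.
# Arguments (all supplied by the caller):
#   $1 = Hawkular Metrics hostname
#   $2 = tenant (the OpenShift project name)
#   $3 = bearer token, e.g. the output of `oc whoami -t`
#   $4 = request path below /hawkular/metrics/, e.g. "status"
hawkular_get() {
    # -k skips TLS verification (assumption: untrusted router certificate).
    # -w appends the HTTP status code after the response body.
    curl -k -sS -w '\nHTTP %{http_code}\n' \
        -H "Authorization: Bearer $3" \
        -H "Hawkular-Tenant: $2" \
        "https://$1/hawkular/metrics/$4"
}

# Example call (the project name is a placeholder to fill in):
#   hawkular_get hawkular-metrics.apps.osh.massopen.cloud <project> "$(oc whoami -t)" status
```

Starting with the `status` path and then moving to one of the failing `gauges/.../data` paths copied from the browser would show whether the 500 comes from Hawkular Metrics itself.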

Comment 7 Robert Baron 2017-07-07 21:39:09 UTC
If I click on the link after "Metrics are not available" I get the page that reports that Hawkular-metrics are up.

The error message within the open shift web console is:

XMLHttpRequest cannot load https://hawkular-metrics.apps.osh.massopen.cloud/hawkular/metrics/gauges/po…-fa163eacf4ff%2Fnetwork%2Ftx_rate/data?bucketDuration=120000ms&start=-60mn. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'https://128.31.22.40:8443' is therefore not allowed access. The response had HTTP status code 500.

and: 

hawkular-metrics.apps.osh.massopen.cloud/hawkular/metrics/gauges/pod%2F25e9…-fa163eacf4ff%2Fnetwork%2Ftx_rate/data?bucketDuration=120000ms&start=-60mn Failed to load resource: the server responded with a status of 500 (Kubernetes client request failure)

Comment 8 Joel Takvorian 2017-07-10 08:21:45 UTC
Robert, can you tell if you get a different result when using another browser? Personally, I see similar symptoms with a recent version of Chrome, whereas I get the metrics with Firefox; this is related to the way the certificates are generated. Google tightened security in Chrome 58 by dropping support for matching on the certificate Common Name (CN) (https://www.thesslstore.com/blog/security-changes-in-chrome-58/).

Though the problem here might be different, it's worth checking.

Comment 9 Matt Wringe 2017-07-10 16:01:58 UTC
The error you are seeing may be related to a problem we have encountered when there are multiple certificates in the ca.crt file.

You can confirm if this is the error by running:

oc exec -it hawkular-metrics-1nsdq cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
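
Rather than reading the output of the command above by eye, one option is to count the PEM headers it contains. This is a minimal sketch, not an official check: `count_certs` is a hypothetical helper name, and a result greater than 1 would indicate that the file bundles multiple certificates.

```shell
# Count PEM certificate blocks in the given file(s), or in stdin when no
# file is given. Prints the number of "BEGIN CERTIFICATE" header lines.
count_certs() {
    grep -c -- '-----BEGIN CERTIFICATE-----' "$@"
}

# Example call, using the pod name from the command above:
#   oc exec hawkular-metrics-1nsdq -- cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt | count_certs
```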

Comment 10 Robert Baron 2017-07-10 16:50:39 UTC
Yes, there are multiple certificates.

Opera and Safari give the same results as Chrome. I haven't gotten Firefox running yet.

Comment 11 Matt Wringe 2017-07-11 13:39:18 UTC

*** This bug has been marked as a duplicate of bug 1468308 ***