Created attachment 1295028 [details]
Additional information for "Although metrics comes up, it is not working"

Description of problem:
Using the Ansible playbooks to start metrics brings up all of the pods. However, although the pods start without errors, metrics are not being reported in a form that either the OpenShift interface or CloudForms can display.

Version-Release number of selected component (if applicable):

How reproducible:
Happens every time on my instance of OpenShift.

Steps to Reproduce:
1. If metrics are enabled, disable them using the following:

   ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-metrics.yml \
       -e openshift_metrics_install_metrics=False

2. Enable metrics using the following:

   ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-metrics.yml \
       -e openshift_metrics_install_metrics=True \
       -e openshift_metrics_hawkular_hostname=hawkular-metrics.apps.osh.massopen.cloud

Actual results:
Although the pods seem to come up cleanly, and some information is known about the systems, none of the node-oriented metrics are available in either the OpenShift interface or CloudForms.

Expected results:
Want to see the metrics in OpenShift and CloudForms.

See the attached file for:
- the logs for the metric components (Hawkular Metrics, Cassandra, Heapster)
- the output of 'oc get pods -o yaml'
- the output of 'nodetool status' run from the Cassandra pod(s)
  Note: I haven't added this section yet.
- the metric section of the inventory file used by openshift ansible
  Not applicable: I am using the command line instead.
Just a note: in the future, please attach each piece of information separately. We have had problems in the past where people attach a single 50k-line text file, which is not easy to parse.

> none of the node oriented metrics are available in either the openshift interface or CloudForms.

Do the pod-level metrics show up in the console? Metrics about the machines that are running OpenShift do not appear in the OpenShift console, only in CloudForms. Can you please verify that the pod-level metrics are appearing, or describe what you are seeing in the console?
Created attachment 1295039 [details] cloud forms screen shot This is to show which metrics are appearing and which ones are not.
Created attachment 1295042 [details] Hawkular metrics seems to be up. This is just the URL that is presented when metrics don't appear in the interface
Created attachment 1295047 [details] screen shot of the metrics tab in open shift interface. This is a screen shot of the metrics tab in the open shift interface
Matt, I have added the following screenshots to clarify what it is that I am seeing: 1) Cloud Forms container overview page 2) Metrics tab in openshift interface. 3) The page that is linked when the metrics do not come up in the open shift interface. Is this sufficient?
We need to get metrics showing up in the OpenShift console before worrying about things not appearing in CloudForms. From the screenshot of the metrics tab in the browser, it looks like the console is swallowing the error message. Is it possible for you to get the actual error message coming from your browser? If you click the link specified after "Metrics are not available", does that work? In Chrome you can get the error message by right-clicking and selecting 'Inspect', then going to the 'Network' tab and refreshing the OpenShift console page. You should then see the network call to Hawkular Metrics, which should show the error that Hawkular Metrics is returning.
If I click on the link after "Metrics are not available", I get the page that reports that Hawkular Metrics is up. The error message within the OpenShift web console is:

XMLHttpRequest cannot load https://hawkular-metrics.apps.osh.massopen.cloud/hawkular/metrics/gauges/po…-fa163eacf4ff%2Fnetwork%2Ftx_rate/data?bucketDuration=120000ms&start=-60mn. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'https://128.31.22.40:8443' is therefore not allowed access. The response had HTTP status code 500.

and:

hawkular-metrics.apps.osh.massopen.cloud/hawkular/metrics/gauges/pod%2F25e9…-fa163eacf4ff%2Fnetwork%2Ftx_rate/data?bucketDuration=120000ms&start=-60mn Failed to load resource: the server responded with a status of 500 (Kubernetes client request failure)
Robert, can you tell whether you get a different result when using another browser? I see similar symptoms when I use a recent version of Chrome, whereas I get the metrics when I use Firefox. This is related to the way the certificates are generated: Google tightened security in Chrome 58 by dropping support for matching on the certificate Common Name (CN) (https://www.thesslstore.com/blog/security-changes-in-chrome-58/). Though the problem here might be different, it's worth checking.
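If it helps, one rough way to check whether the certificate served for the Hawkular Metrics route carries a Subject Alternative Name is to look at its text dump. This is only a sketch: the helper name `has_san` is made up for illustration, and it just searches the text that `openssl x509 -noout -text` prints for the SAN extension header.

```shell
# has_san: hypothetical helper (not part of OpenShift or OpenSSL).
# Reads the text dump of a certificate on stdin and succeeds (exit 0)
# if a Subject Alternative Name extension is listed, fails otherwise.
has_san() {
  grep -q 'X509v3 Subject Alternative Name'
}

# Example usage (not run here; assumes openssl is installed and the route
# is reachable from where you run it):
#   host=hawkular-metrics.apps.osh.massopen.cloud
#   openssl s_client -connect "$host:443" -servername "$host" </dev/null 2>/dev/null \
#     | openssl x509 -noout -text \
#     | has_san && echo "SAN present" || echo "no SAN found"
```

If no SAN is found, recent Chrome versions would be expected to reject the certificate even when the CN matches the hostname.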
The error you are seeing may be related to a problem we have encountered when there are multiple certificates in the ca.crt file. You can confirm whether this is the cause by running:

oc exec -it hawkular-metrics-1nsdq cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
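Rather than reading the output by eye, the certificates can be counted. A minimal sketch, assuming the file is PEM-encoded so that each certificate starts with a BEGIN CERTIFICATE line; the function name `count_pem_certs` is made up for illustration.

```shell
# count_pem_certs: hypothetical helper (not an oc or openssl command).
# Reads PEM data on stdin and prints how many certificates it contains,
# counted as the number of BEGIN CERTIFICATE marker lines.
count_pem_certs() {
  # grep -c prints the count of matching lines; it exits non-zero when the
  # count is 0, so '|| true' keeps that case from looking like a failure.
  grep -c -e '-----BEGIN CERTIFICATE-----' || true
}

# Example usage (not run here; pod name taken from the command above):
#   oc exec hawkular-metrics-1nsdq -- \
#       cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt | count_pem_certs
```

A result greater than 1 means the ca.crt file holds multiple certificates.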
Yes, there are multiple certificates. Opera and Safari give the same results as Chrome. I haven't gotten Firefox running.
*** This bug has been marked as a duplicate of bug 1468308 ***