Created attachment 1477083 [details]
grafana container logs

Description of problem:
The grafana container log for the grafana pod contains the error "Failed to find user". See the attached log file for more detail.
Also, lvl=eror looks like a typo; lvl=error may be what was intended.

# oc logs -c grafana grafana-7476cc5c4b-zw4dn
t=2018-08-20T03:23:39+0000 lvl=eror msg="Failed to find user" logger=context error="User not found"
t=2018-08-20T03:23:39+0000 lvl=eror msg="Request Completed" logger=context userId=0 orgId=0 uname= method=GET path=/api/datasources/proxy/1/api/v1/query status=500 remote_addr="119.254.120.72, 10.129.0.1" time_ms=7 size=1705 referer="https://grafana-openshift-monitoring.apps.0820-brf.qe.rhcloud.com/d/efa86fd1d0c121a26444b636a3f509a8/k8s-compute-resources-cluster?refresh=10s&orgId=1"

https://grafana-openshift-monitoring.apps.0820-brf.qe.rhcloud.com/d/efa86fd1d0c121a26444b636a3f509a8/k8s-compute-resources-cluster?refresh=10s&orgId=1
is the "K8s / Compute Resources / Cluster" dashboard. In its Headlines section, "CPU Utilisation, CPU Requests Commitment, CPU Limits Commitment, Memory Utilisation, Memory Requests Commitment, Memory Limits Commitment" are all N/A.
No instance is listed under Nodes, and the CPU/Memory/Disk data is empty; see the attached pictures.

Version-Release number of selected component (if applicable):
Cluster monitoring component images version: v3.11.0-0.17.0.0

How reproducible:
Always

Steps to Reproduce:
1. Deploy Cluster monitoring and check data in grafana UI.

Actual results:
"User not found" error in the grafana pod logs, and some metrics cannot be viewed from the grafana UI.

Expected results:
Metrics can be viewed from the grafana UI.
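The grafana log lines quoted in the description are key=value pairs with optional double-quoted values, so they can be filtered mechanically when triaging a longer log. The following is a minimal sketch, not part of the original report: the helper names parse_logfmt and failed_requests are hypothetical, and it assumes every line follows the same key=value layout as the two lines quoted above (the referer field is omitted from the sample for brevity).

```python
import shlex


def parse_logfmt(line):
    """Split one key=value log line into a dict of strings.

    shlex.split honours the double quotes, so msg="Failed to find user"
    becomes a single token.
    """
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore stray tokens that carry no '='
            fields[key] = value
    return fields


def failed_requests(lines):
    """Yield parsed "Request Completed" entries whose status is 5xx."""
    for line in lines:
        entry = parse_logfmt(line)
        if entry.get("msg") == "Request Completed" and entry.get("status", "").startswith("5"):
            yield entry


# Sample lines copied from the description (referer field omitted).
SAMPLE = [
    't=2018-08-20T03:23:39+0000 lvl=eror msg="Failed to find user" '
    'logger=context error="User not found"',
    't=2018-08-20T03:23:39+0000 lvl=eror msg="Request Completed" '
    'logger=context userId=0 orgId=0 uname= method=GET '
    'path=/api/datasources/proxy/1/api/v1/query status=500 '
    'remote_addr="119.254.120.72, 10.129.0.1" time_ms=7 size=1705',
]

if __name__ == "__main__":
    for entry in failed_requests(SAMPLE):
        print(entry["method"], entry["path"], entry["status"])
```

Filtering on the "Request Completed" entries separates the failing datasource proxy queries from the "Failed to find user" messages, which are logged separately.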
Created attachment 1477085 [details] grafana-proxy container logs
Created attachment 1477086 [details] no instance listed under Nodes and CPU/Memory/Dish data is empty
Created attachment 1477087 [details] data in Headlines part are all N/A
The "user not found" message appears simply because we are not using the Grafana backend for authentication. Instead, we tell Grafana to accept whatever user is proxied to it through the OpenShift oauth-proxy, so it is expected that Grafana does not find the user in its own backend.

I have identified the cause of the missing metrics in Grafana: the dashboards were built against node-exporter v0.15.2, but we are deploying v0.16.0, which has a number of breaking changes to its metrics. I'll take this to the team and we will figure out what to do.
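To illustrate the kind of breakage involved: node-exporter v0.16.0 renamed many metrics to carry unit suffixes, so a dashboard query written against the old names returns no data. The sketch below rewrites old names inside a query expression. It is an illustration only, not the fix that was actually merged; the rename map is partial and written from memory of the v0.16.0 changelog, so check it against the changelog before relying on it, and the function name migrate_expr is hypothetical.

```python
import re

# Partial old -> new map (assumption: recalled from the node-exporter
# v0.16.0 changelog; verify each entry before use).
RENAMES = {
    "node_cpu": "node_cpu_seconds_total",
    "node_memory_MemTotal": "node_memory_MemTotal_bytes",
    "node_memory_MemFree": "node_memory_MemFree_bytes",
    "node_filesystem_size": "node_filesystem_size_bytes",
    "node_filesystem_avail": "node_filesystem_avail_bytes",
    "node_network_receive_bytes": "node_network_receive_bytes_total",
    "node_network_transmit_bytes": "node_network_transmit_bytes_total",
}

# Match a metric name only when it is not part of a longer identifier.
# Colons are excluded too, so recording-rule names are left alone.
_NAME_CHARS = r"[A-Za-z0-9_:]"
_PATTERN = re.compile(
    r"(?<!" + _NAME_CHARS + r")("
    + "|".join(re.escape(name) for name in sorted(RENAMES, key=len, reverse=True))
    + r")(?!" + _NAME_CHARS + r")"
)


def migrate_expr(expr):
    """Return expr with every old metric name replaced by its new name."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], expr)


if __name__ == "__main__":
    print(migrate_expr('sum(rate(node_cpu{mode!="idle"}[1m]))'))
```

Because the pattern refuses to match inside a longer identifier, running the rewrite twice leaves an already migrated expression unchanged.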
We just merged a number of pull requests that should fix most of these problems. We also noticed some incorrect behavior in the filesystem graphs; a fix for that is already in the works, but I would suggest creating a new issue for it.
https://github.com/coreos/prometheus-operator/releases/tag/v0.20.0
Please change to ON_QA; most of the issues are fixed, and new Bug 1622387 has been filed.
Created attachment 1478849 [details] cluster headlines show has data
Created attachment 1478850 [details] metrics data under "Nodes"
cluster monitoring version: v3.11.0-0.22.0.0

# openshift version
openshift v3.11.0-0.22.0
What exactly are you referring to with "cluster headlines show has data"? That in the table there are not always numbers displayed? That is correct behavior as a lot of OpenShift components do not have resource requests and limits configured resulting in those blank fields. The metric display error in the nodes dashboard is being fixed.
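As a rough illustration of why those fields stay blank: when no container in a pod declares a resource request, there is nothing to sum for that pod, so the corresponding cell has no value rather than a zero. The sketch below models this on plain dictionaries shaped like a pod spec. The function names are hypothetical, and the sketch is not how the dashboards themselves compute the value.

```python
def parse_cpu(quantity):
    """Convert a Kubernetes CPU quantity such as "100m" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


def pod_cpu_request(pod):
    """Sum the CPU requests of a pod in millicores.

    Returns None, not 0, when no container declares a request, which
    corresponds to a blank cell in the table.
    """
    total = None
    for container in pod["spec"]["containers"]:
        request = container.get("resources", {}).get("requests", {}).get("cpu")
        if request is not None:
            total = (total or 0) + parse_cpu(request)
    return total


if __name__ == "__main__":
    # Hypothetical pods, for illustration only.
    with_requests = {"spec": {"containers": [
        {"name": "a", "resources": {"requests": {"cpu": "100m"}}},
        {"name": "b", "resources": {"requests": {"cpu": "0.5"}}},
    ]}}
    without_requests = {"spec": {"containers": [{"name": "c"}]}}
    print(pod_cpu_request(with_requests))
    print(pod_cpu_request(without_requests))
```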
Per Comment 9 through Comment 11, setting to VERIFIED.
(In reply to Frederic Branczyk from comment #12)
> What exactly are you referring to with "cluster headlines show has data"?
> That in the table there are not always numbers displayed? That is correct
> behavior as a lot of OpenShift components do not have resource requests and
> limits configured resulting in those blank fields.
>
> The metric display error in the nodes dashboard is being fixed.

See the picture "data in Headlines part are all N/A": the data there was N/A, and it is now fixed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2652