Bug 1619132 - "User not found" error in grafana pod logs, and some metrics could not be viewed from grafana UI
Summary: "User not found" error in grafana pod logs, and some metrics could not be vie...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.11.0
Assignee: Frederic Branczyk
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-08-20 07:33 UTC by Junqi Zhao
Modified: 2018-10-11 07:25 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-11 07:25:20 UTC
Target Upstream Version:
Embargoed:


Attachments
grafana container logs (27.78 KB, text/plain)
2018-08-20 07:33 UTC, Junqi Zhao
grafana-proxy container logs (2.90 KB, text/plain)
2018-08-20 07:34 UTC, Junqi Zhao
no instance listed under Nodes and CPU/Memory/Disk data is empty (86.60 KB, image/png)
2018-08-20 07:35 UTC, Junqi Zhao
data in Headlines part are all N/A (142.52 KB, image/png)
2018-08-20 07:36 UTC, Junqi Zhao
cluster headlines show has data (139.05 KB, image/png)
2018-08-27 02:43 UTC, Junqi Zhao
metrics data under "Nodes" (202.62 KB, image/png)
2018-08-27 02:44 UTC, Junqi Zhao


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:2652 0 None None None 2018-10-11 07:25:45 UTC

Internal Links: 1679504

Description Junqi Zhao 2018-08-20 07:33:48 UTC
Created attachment 1477083 [details]
grafana container logs

Description of problem:
I checked the grafana container log for the grafana pod and found the error "Failed to find user". For more information, see the attached log file.

BTW, lvl=eror looks like a typo; it should probably be lvl=error.

# oc logs -c grafana grafana-7476cc5c4b-zw4dn
t=2018-08-20T03:23:39+0000 lvl=eror msg="Failed to find user" logger=context error="User not found"
t=2018-08-20T03:23:39+0000 lvl=eror msg="Request Completed" logger=context userId=0 orgId=0 uname= method=GET path=/api/datasources/proxy/1/api/v1/query status=500 remote_addr="119.254.120.72, 10.129.0.1" time_ms=7 size=1705 referer="https://grafana-openshift-monitoring.apps.0820-brf.qe.rhcloud.com/d/efa86fd1d0c121a26444b636a3f509a8/k8s-compute-resources-cluster?refresh=10s&orgId=1"
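
The log lines above are plain key=value pairs, so they can be filtered mechanically when triaging a longer log. The following is a minimal sketch in Python, assuming the lines keep the shape shown above; the function names `parse_logfmt` and `server_errors` are hypothetical helpers and not part of Grafana or `oc`.

```python
import re

# One key=value pair: the value is either double-quoted or a run of
# non-space characters (possibly empty, as in "uname=").
_PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S*)')


def parse_logfmt(line):
    """Return the key=value pairs of one log line as a dict of strings."""
    fields = {}
    for key, value in _PAIR.findall(line):
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]  # drop the surrounding quotes
        fields[key] = value
    return fields


def server_errors(lines):
    """Yield parsed entries whose HTTP status field is 5xx."""
    for line in lines:
        fields = parse_logfmt(line)
        if fields.get("status", "").startswith("5"):
            yield fields
```

For example, feeding the saved output of `oc logs -c grafana <pod>` line by line into `server_errors` and printing each entry's `path` and `status` would list only the failing requests, such as the `/api/datasources/proxy/1/api/v1/query` call above.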


https://grafana-openshift-monitoring.apps.0820-brf.qe.rhcloud.com/d/efa86fd1d0c121a26444b636a3f509a8/k8s-compute-resources-cluster?refresh=10s&orgId=1
is the "K8s / Compute Resources / Cluster" dashboard. In its Headlines section, all six panels show N/A:
"CPU Utilisation, CPU Requests Commitment, CPU Limits Commitment, Memory Utilisation, Memory Requests Commitment, Memory Limits Commitment".

No instance is listed under Nodes, and the CPU/Memory/Disk data is empty; see the attached pictures.

Version-Release number of selected component (if applicable):
Cluster monitoring component images version: v3.11.0-0.17.0.0


How reproducible:
Always

Steps to Reproduce:
1. Deploy Cluster monitoring and check data in grafana UI.

Actual results:
"User not found" error in grafana pod logs and some metrics could not be viewed from grafana UI

Expected results:
All metrics can be viewed from the grafana UI

Additional info:

Comment 1 Junqi Zhao 2018-08-20 07:34:31 UTC
Created attachment 1477085 [details]
grafana-proxy container logs

Comment 2 Junqi Zhao 2018-08-20 07:35:10 UTC
Created attachment 1477086 [details]
no instance listed under Nodes and CPU/Memory/Disk data is empty

Comment 3 Junqi Zhao 2018-08-20 07:36:18 UTC
Created attachment 1477087 [details]
data in Headlines part are all N/A

Comment 5 Frederic Branczyk 2018-08-20 10:31:33 UTC
The "User not found" message is expected. We do not use the Grafana backend for authentication; we just tell Grafana to use whatever user is proxied to it through the OpenShift oauth-proxy, so Grafana not finding that user is normal.

I have identified the cause of the missing metrics in Grafana: the dashboards were built against node-exporter v0.15.2, but we are deploying v0.16.0, which has a number of breaking changes to its metrics. I'll take this to the team and we will figure out what to do.
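
To illustrate the kind of breakage involved: node-exporter v0.16.0 renamed many metrics to carry unit suffixes, so dashboard queries written for v0.15.x return no data. The sketch below rewrites old metric names in a PromQL expression. It is only an illustration under stated assumptions, not the fix that was merged for this bug: the `RENAMES` table is a small, partial sample recalled from the v0.16.0 release notes and should be checked against the upstream changelog, and `migrate_expr` is a hypothetical helper name.

```python
import re

# Partial sample of v0.15.x -> v0.16.0 renames (assumption: verify each
# entry against the node-exporter changelog before relying on it).
RENAMES = {
    "node_cpu": "node_cpu_seconds_total",
    "node_memory_MemTotal": "node_memory_MemTotal_bytes",
    "node_memory_MemFree": "node_memory_MemFree_bytes",
    "node_filesystem_size": "node_filesystem_size_bytes",
    "node_filesystem_avail": "node_filesystem_avail_bytes",
    "node_disk_bytes_read": "node_disk_read_bytes_total",
    "node_disk_bytes_written": "node_disk_written_bytes_total",
}

# Match an old name only when it is a complete metric name, i.e. not
# preceded or followed by another name character (letters, digits, "_", ":").
_OLD_NAME = re.compile(
    r"(?<![\w:])("
    + "|".join(re.escape(name) for name in sorted(RENAMES, key=len, reverse=True))
    + r")(?![\w:])"
)


def migrate_expr(expr):
    """Return expr with known v0.15.x metric names replaced by v0.16.0 names."""
    return _OLD_NAME.sub(lambda m: RENAMES[m.group(1)], expr)
```

Because of the lookarounds, an expression that already uses the new names passes through unchanged, so the function can be applied to a dashboard more than once. It does not parse PromQL, so a metric name appearing inside a label value string would also be rewritten.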

Comment 6 Frederic Branczyk 2018-08-23 09:53:14 UTC
We just merged a number of pull requests that should fix most of these problems. We also noticed some incorrect behavior in the filesystem graphs; a fix for that is already in the works, but I would suggest creating a new issue for it.

Comment 8 Junqi Zhao 2018-08-27 02:42:00 UTC
Please change to ON_QA. Most of the issues are fixed, and new Bug 1622387 has been filed.

Comment 9 Junqi Zhao 2018-08-27 02:43:58 UTC
Created attachment 1478849 [details]
cluster headlines show has data

Comment 10 Junqi Zhao 2018-08-27 02:44:58 UTC
Created attachment 1478850 [details]
metrics data under "Nodes"

Comment 11 Junqi Zhao 2018-08-27 02:45:19 UTC
cluster monitoring version: v3.11.0-0.22.0.0
# openshift version
openshift v3.11.0-0.22.0

Comment 12 Frederic Branczyk 2018-08-27 13:40:26 UTC
What exactly are you referring to with "cluster headlines show has data"? That in the table there are not always numbers displayed? That is correct behavior as a lot of OpenShift components do not have resource requests and limits configured resulting in those blank fields.
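
As a rough way to see which workloads would produce such blank fields, the sketch below lists the containers in a pod object that have no resource requests or no limits. It assumes the pod is available as a Python dict in the shape of `oc get pod <name> -o json` output; `containers_missing_resources` is a hypothetical helper name, and the snippet is an illustration rather than anything the dashboards themselves run.

```python
from typing import List


def containers_missing_resources(pod: dict) -> List[str]:
    """Return names of containers that lack resource requests or limits."""
    missing = []
    for container in pod.get("spec", {}).get("containers", []):
        resources = container.get("resources") or {}
        # An absent or empty "requests"/"limits" map means nothing is configured.
        if not resources.get("requests") or not resources.get("limits"):
            missing.append(container["name"])
    return missing
```

A pod for which this returns a non-empty list has at least one container with no configured value to show in the corresponding requests or limits columns.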

The metric display error in the nodes dashboard is being fixed.

Comment 13 Junqi Zhao 2018-08-28 00:20:03 UTC
Per Comment 9 - Comment 11, set to VERIFIED

Comment 14 Junqi Zhao 2018-08-28 00:22:43 UTC
(In reply to Frederic Branczyk from comment #12)
> What exactly are you referring to with "cluster headlines show has data"?
> That in the table there are not always numbers displayed? That is correct
> behavior as a lot of OpenShift components do not have resource requests and
> limits configured resulting in those blank fields.
> 
> The metric display error in the nodes dashboard is being fixed.

See the picture "data in Headlines part are all N/A": the data was N/A before, and it is fixed now.

Comment 16 errata-xmlrpc 2018-10-11 07:25:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652

