Bug 2043098

Summary: Grafana Dash no longer works with error Templating Template variable service failed
Product: OpenShift Container Platform
Reporter: jhusta <jhusta>
Component: Monitoring
Assignee: Simon Pasquier <spasquie>
Status: CLOSED DUPLICATE
QA Contact: Junqi Zhao <juzhao>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.10
CC: amuller, anpicker, aos-bugs, erooth, fleber, Holger.Wolf, krmoser
Target Milestone: ---   
Target Release: ---   
Hardware: s390x   
OS: Linux   
Last Closed: 2022-01-20 16:24:12 UTC
Type: Bug
Attachments:
Description                        Flags
grafana pod log                    none
Grafana Error Pop Up screen shot   none

Description jhusta 2022-01-20 15:51:16 UTC
Description of problem:
I was running a light mixed workload against memory, network, and I/O. I was monitoring it with Grafana, which worked with no issues at the start of the run. When I checked in the morning, I received a popup stating:
 Templating Template variable service failed.......

Looking at the logs in the Grafana pod, I see the following:


t=2022-01-20T12:43:50+0000 lvl=info msg="Database locked, sleeping then retrying" logger=sqlstore error="database is locked" retry=0
t=2022-01-20T12:43:50+0000 lvl=eror msg="Failed to login" logger=auth.proxy username=kube:admin message="failed to log in as user, specified in auth proxy header" error="user already exists" ignoreCache=false
t=2022-01-20T12:43:50+0000 lvl=eror msg="failed to log in as user, specified in auth proxy header" logger=context error="user already exists"
t=2022-01-20T12:43:50+0000 lvl=info msg="Request Completed" logger=context userId=0 orgId=0 uname= method=GET path=/api/datasources/proxy/1/api/v1/series status=407 remote_addr="10.20.116.6, 10.128.2.2" time_ms=14 size=1762 referer="https://grafana-openshift-monitoring.apps.pok-74.ocptest.pok.stglabs.ibm.com/d/cL_KOrJnz/node-exporter-use-method-node?orgId=1&refresh=30s&var-datasource=prometheus&var-cluster=&var-instance=worker-2.pok-74.ocptest.pok.stglabs.ibm.com"
t=2022-01-20T12:43:50+0000 lvl=info msg="Request Completed" logger=context userId=2 orgId=1 uname=kube:admin method=GET path=/api/datasources/proxy/1/api/v1/series status=403 remote_addr="10.20.116.6, 10.131.0.2" time_ms=35 size=86085 referer="https://grafana-openshift-monitoring.apps.pok-74.ocptest.pok.stglabs.ibm.com/d/cL_KOrJnz/node-exporter-use-method-node?orgId=1&refresh=30s&var-datasource=prometheus&var-cluster=&var-instance=worker-2.pok-74.ocptest.pok.stglabs.ibm.com"
t=2022-01-20T12:44:02+0000 lvl=eror msg="Dashboard not found" logger=context userId=2 orgId=1 uname=kube:admin error="Dashboard not found" remote_addr="10.20.116.6, 10.131.0.2"
t=2022-01-20T12:44:02+0000 lvl=info msg="Request Completed" logger=context userId=2 orgId=1 uname=kube:admin method=GET path=/api/dashboards/uid/cL_KOrJnz status=404 remote_addr="10.20.116.6, 10.131.0.2" time_ms=1 size=33 referer="https://grafana-openshift-monitoring.apps.pok-74.ocptest.pok.stglabs.ibm.com/d/cL_KOrJnz/node-exporter-use-method-node?orgId=1&refresh=30s&var-datasource=prometheus&var-cluster=&var-instance=worker-2.pok-74.ocptest.pok.stglabs.ibm.com"
.
.
.
t=2022-01-20T12:48:06+0000 lvl=info msg="Request Completed" logger=context userId=2 orgId=1 uname=kube:admin method=GET path=/api/datasources/proxy/1/api/v1/query_range status=403 remote_addr="10.20.116.6, 10.128.2.2" time_ms=2 size=86211 referer="https://grafana-openshift-monitoring.apps.pok-74.ocptest.pok.stglabs.ibm.com/d/0TCpKC1nz/node-exporter-use-method-node?orgId=1&refresh=30s"
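
To triage a longer copy of this log, the key=value lines above can be tallied by log level and HTTP status. The following is a minimal sketch, not part of the original report: the helper names parse_logfmt and summarize are hypothetical, and it assumes every line follows the key=value layout shown in the excerpt, with double quotes around values that contain spaces.

```python
import shlex
import sys
from collections import Counter


def parse_logfmt(line):
    """Split one key=value log line into a dict (hypothetical helper)."""
    try:
        # shlex keeps quoted values such as msg="Request Completed" together.
        tokens = shlex.split(line)
    except ValueError:
        # Unbalanced quotes: treat the line as unparseable.
        return {}
    fields = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def summarize(lines):
    """Count log levels and the HTTP statuses of completed requests."""
    levels, statuses = Counter(), Counter()
    for line in lines:
        fields = parse_logfmt(line)
        if not fields:
            continue  # skips blank lines and the "." elision markers
        levels[fields.get("lvl", "?")] += 1
        if fields.get("msg") == "Request Completed":
            statuses[fields.get("status", "?")] += 1
    return levels, statuses


if __name__ == "__main__":
    # Reads the pod log from standard input.
    levels, statuses = summarize(sys.stdin)
    print("levels:", dict(levels))
    print("statuses:", dict(statuses))
```

Saved under an assumed name such as tally.py, the script can be fed the attached pod log on standard input. A high count of 403 and 407 statuses next to the "user already exists" errors would match the symptoms described above.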



Version-Release number of selected component (if applicable):
Client Version: 4.10.0-0.nightly-s390x-2022-01-17-171822
Server Version: 4.10.0-0.nightly-s390x-2022-01-17-171822
Kubernetes Version: v1.23.0+60f5a1c


How reproducible:
I am not sure what triggered the issue. In my case, I had a mixed workload running overnight. I checked another of my systems that has nothing running, and there I can reach the Grafana dashboard with no issues.


Steps to Reproduce:
1.
2.
3.

Actual results:
The Grafana dashboard no longer works.


Expected results:


Additional info:
I will attach the Grafana pod log and a screenshot of the Grafana error popup.

Comment 1 jhusta 2022-01-20 15:53:36 UTC
Created attachment 1852226
grafana pod log

Comment 2 jhusta 2022-01-20 15:57:38 UTC
Created attachment 1852227
Grafana Error Pop Up screen shot

Comment 3 Simon Pasquier 2022-01-20 16:24:12 UTC

*** This bug has been marked as a duplicate of bug 2037891 ***