Description of problem:
No data in the Grafana dashboard.
Version-Release number of selected component (if applicable):
- Prometheus installation completed successfully
- All pods are running
- Prometheus targets are available
<snip sample output>
- All pods are running with no pod restarts.
[root@master-0 ~]# oc get pods
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 3/3 Running 13 10d
alertmanager-main-1 3/3 Running 0 6d
alertmanager-main-2 3/3 Running 0 6d
cluster-monitoring-operator-6f5fbd6f8b-qxg5c 1/1 Running 6 10d
grafana-857fc848bf-lljwm 2/2 Running 7 10d
kube-state-metrics-75c4d6dc-ffs7j 3/3 Running 8 10d
node-exporter-896gs 2/2 Running 0 18h
node-exporter-jd8rx 2/2 Running 4 10d
node-exporter-qxfhr 2/2 Running 2 10d
node-exporter-t92fz 2/2 Running 0 1d
node-exporter-xmqf2 0/2 Completed 0 10d
node-exporter-zbqlv 2/2 Running 3 10d
prometheus-k8s-0 4/4 Running 1 6d
prometheus-k8s-1 0/4 Unknown 1 10d
prometheus-operator-7855c8646b-659w6 0/1 Unknown 0 10d
prometheus-operator-7855c8646b-dxhnz 1/1 Running 5 6d
Steps to Reproduce:
I tried to reproduce this but could not. I spun up a v3.11 OpenShift cluster and was able to access the Grafana dashboard and see all the data.
Can you provide additional information? Cluster events and logs from the Grafana container would be useful. How long was the cluster up and running? Is anything else failing besides Grafana?
Also, I noticed the report says all pods are running with no pod restarts, but the output of the `oc get pods` command shows that multiple pods have restarted. Can you clarify that? Thanks!
Another question: the report says the Prometheus installation was done completely. Does this mean the installation was performed manually?
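For reference, the requested events and logs could be gathered with commands along these lines (pod name taken from the `oc get pods` output above; the sidecar container name `grafana-proxy` is an assumption and may differ):

```shell
# Recent events in the monitoring namespace, oldest first
oc -n openshift-monitoring get events --sort-by=.metadata.creationTimestamp

# Logs from the grafana container of the Grafana pod
oc -n openshift-monitoring logs grafana-857fc848bf-lljwm -c grafana

# Logs from the oauth-proxy sidecar (container name assumed)
oc -n openshift-monitoring logs grafana-857fc848bf-lljwm -c grafana-proxy
```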
Thanks for the access to the logs. While looking at the Grafana logs, we noticed that Grafana itself seems to be having problems with a locked database table. We suggest deleting the Grafana pod and seeing whether the problem occurs again. (There are many upstream issues open about this on Grafana itself.) If data is still not visible in Grafana after the pod is deleted and restarted, can you provide the logs and events again? Thank you.
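The suggested restart amounts to something like the following (pod name taken from the `oc get pods` output above; the deployment will recreate the pod, and the `app=grafana` label used to watch it is an assumption):

```shell
# Delete the Grafana pod; its deployment recreates it with a fresh container
oc -n openshift-monitoring delete pod grafana-857fc848bf-lljwm

# Watch until the replacement pod is Running (label selector assumed)
oc -n openshift-monitoring get pods -l app=grafana -w
```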
FYI: We could not reproduce this on a new cluster.
The pod has been deleted and the issue still persists.
In fact, the logs provided are from after the pod was restarted.
Can you double-check that you can see the data in both the Alertmanager and Prometheus dashboards, and that the only place you are not seeing it is Grafana? That should help us pin down the problem better.
Also, we noticed a lot of "unauthorised access" errors in the Grafana proxy logs; were these just failed login attempts? I believe you haven't yet provided the events and other information we asked for above, which would make this easier to debug.
Can you make sure that Grafana can reach Prometheus? Exec into the grafana container and try to reach https://prometheus-k8s.openshift-monitoring.svc:9091. The ping binary does not seem to be available, but curl should work.
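Concretely, the connectivity check could look like this (pod and container names taken from the output above). Since port 9091 sits behind the oauth-proxy, an unauthenticated request will likely return 401/403 rather than 200, but receiving any HTTP status code at all would confirm that Grafana can reach the Prometheus service:

```shell
# Probe the Prometheus service endpoint from inside the grafana container.
# -k skips TLS verification; -w prints only the HTTP status code.
oc -n openshift-monitoring exec grafana-857fc848bf-lljwm -c grafana -- \
  curl -k -s -o /dev/null -w '%{http_code}\n' \
  https://prometheus-k8s.openshift-monitoring.svc:9091
```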
If the above works, the other thing we can try is increasing the log verbosity.
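In plain Grafana the verbosity is raised via the [log] section of grafana.ini; in a cluster-monitoring deployment the config is managed, so this fragment is only a sketch of the setting that would need to be applied through the managed configuration:

```
[log]
level = debug
```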
Created attachment 1605901 [details]
UI screenshots of grafana and Prometheus target UI - grafana cluster-ui