Bug 2250829

Summary: [cee/sd][ceph-dashboard] Graphs in Grafana Dashboard are not showing consistent line graphs after upgrading from RHCS 4 to 5.
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Aashish sharma <aasharma>
Component: Ceph-Dashboard Assignee: Aashish sharma <aasharma>
Status: CLOSED CURRENTRELEASE QA Contact: Vinayak Papnoi <vpapnoi>
Severity: medium Docs Contact: Anjana Suparna Sriram <asriram>
Priority: unspecified    
Version: 7.0 CC: aasharma, asriram, ceph-eng-bugs, cephqe-warriors, kdreyer, milverma, mobisht, nia, pegonzal, saraut, sostapov, tserlin, vdas, vereddy
Target Milestone: ---   
Target Release: 7.1z4   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Known Issue
Doc Text:
.Some metrics are displayed as null leading to blank spaces in graphs
Some metrics on the Ceph dashboard are shown as null, which leads to blank spaces in the graphs, because a metric is not initialized until it has a value. As a workaround, edit the Grafana panel in which the issue is present. From the _Edit_ menu, click _Migrate_ and select _Connect Nulls_. Choose _Always_ and the issue is resolved.
Story Points: ---
Clone Of: 2228128 Environment:
Last Closed: 2025-04-10 09:23:19 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Aashish sharma 2023-11-21 11:03:59 UTC
+++ This bug was initially created as a clone of Bug #2228128 +++

Description of problem:
After upgrading the cluster from RHCS 4 to 5 (16.2.10-172), some of the graphs on the Grafana dashboard are not consistent (a mix of point and line patterns).

See the attachments of non-consistent graphs.

Version-Release number of selected component (if applicable):
RHCS 5 (16.2.10-172)

How reproducible:
Customer specific

Actual results:
Graphs are not consistent

Expected results:
Graphs should be consistent

--- Additional comment from Milind on 2023-08-01 13:10:58 UTC ---

Additional Information:
------------------------

Customer: North-West University
Account#: 614125
Case#: 03368113

Our analysis:
- The Prometheus logs are filled with warning messages, but we are not sure whether these are relevant to the issue.
~~~
Jun 23 21:04:03 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T19:04:03.712Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 21:04:43 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T19:04:43.626Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 21:16:13 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T19:16:13.575Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 21:36:33 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T19:36:33.618Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 22:00:33 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T20:00:33.657Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 22:04:43 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T20:04:43.621Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 22:08:36 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T20:08:36.947Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 22:34:33 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T20:34:33.880Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 22:56:53 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T20:56:53.855Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
~~~

- We tried redeploying Grafana and Prometheus, but it did not help.
- After that, we explored the Grafana graphs and saw that a graph can be displayed in 4 types of pattern:
  - line, bar, points, stack
- The issue is seen only when the graphs are in the lines view; the graphs display fine in the other views.
- Also, this issue is seen only in a few graphs, not all.

I am attaching some graphs and the Prometheus logs from the customer dashboard.

--- Additional comment from Milind on 2023-08-01 13:12:40 UTC ---



--- Additional comment from Milind on 2023-08-01 13:13:14 UTC ---



--- Additional comment from Milind on 2023-08-01 13:13:41 UTC ---



--- Additional comment from Milind on 2023-08-01 13:15:01 UTC ---



--- Additional comment from Milind on 2023-08-01 13:16:07 UTC ---



--- Additional comment from Milind on 2023-08-04 18:29:20 UTC ---

Hi Team,

Can we have an update on this BZ?

Regards,
Milind

--- Additional comment from Aashish sharma on 2023-08-07 04:49:37 UTC ---

Hi Milind,

I am working on this one and will update soon.

Thanks
Aashish

--- Additional comment from Milind on 2023-08-15 09:48:39 UTC ---

Sure thanks for the update.

--- Additional comment from Milind on 2023-08-18 08:16:33 UTC ---

Hi Aashish,

Were you able to find the cause of this and a resolution?
The customer is waiting for the update.

Regards,
Milind Verma

--- Additional comment from Aashish sharma on 2023-08-22 05:10:24 UTC ---

Hi Milind,

Apologies for the delay. We were in the middle of a dev freeze. I will update the BZ asap.

Thanks
Aashish

--- Additional comment from Milind on 2023-08-22 09:57:07 UTC ---

okay thanks Aashish

--- Additional comment from Aashish sharma on 2023-08-24 08:16:42 UTC ---



--- Additional comment from Aashish sharma on 2023-08-24 08:20:41 UTC ---

Hi Milind,

I have attached a video recording of some steps that the customer can try in grafana to get rid of the above issue. Please let me know if the issue persists after these steps.

Thanks
Aashish

--- Additional comment from Milind on 2023-08-24 14:21:36 UTC ---

Hi Aashish,

the video recording worked for the customer and the issue is resolved.
Can we have the RCA for this issue?

Regards,
Milind

--- Additional comment from Aashish sharma on 2023-09-04 05:01:35 UTC ---

Hi Milind,

Thank you for confirming that the issue got resolved. In RHCS 4, we used to initialise the value of a metric with 0, whether or not the metric had a value. From RHCS 5 onwards, we do not initialise a metric until it has a value. This behavior also needs to be handled in the Grafana JSON files, which have a 'nullPointMode' setting; its value needs to be 'null as zero' so that a value of 0 is assumed when the metric has no data. I will raise a PR to permanently fix this issue.
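
For illustration only, here is a minimal sketch of the kind of JSON change described above. It assumes the affected dashboards use the legacy Grafana `graph` panel type, where `nullPointMode` can be `null`, `connected`, or `null as zero`. The function name, the sample dashboard, and the panel titles are hypothetical and are not taken from the actual dashboards or from the PR.

```python
import copy
import json


def set_null_point_mode(dashboard, mode="null as zero"):
    """Return a copy of `dashboard` with `nullPointMode` set on every graph panel."""
    patched = copy.deepcopy(dashboard)

    def visit(panels):
        for panel in panels:
            # Only the legacy "graph" panel type is touched (assumption).
            if panel.get("type") == "graph":
                panel["nullPointMode"] = mode
            # Row panels may nest further panels.
            visit(panel.get("panels", []))

    visit(patched.get("panels", []))
    # Older dashboard schemas keep their panels under "rows".
    for row in patched.get("rows", []):
        visit(row.get("panels", []))
    return patched


# Hypothetical dashboard fragment, not a real Ceph dashboard.
sample_dashboard = {
    "title": "hypothetical-dashboard",
    "panels": [
        {"type": "graph", "title": "Throughput", "nullPointMode": "null"},
        {"type": "row", "panels": [{"type": "graph", "title": "IOPS"}]},
        {"type": "table", "title": "Hosts"},
    ],
}

patched = set_null_point_mode(sample_dashboard)
print(json.dumps(patched, indent=2))
```

The sketch returns a patched copy instead of modifying its input, so the original JSON can be kept for comparison. Panels of other types are left unchanged.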

Thanks
Aashish

Comment 5 Scott Ostapovicz 2024-04-01 12:12:10 UTC
Retargeting this to 8.0, as per Aashish.