Bug 2250826 - [GSS]Graphs in Grafana Dashboard are not showing consistent line graphs after upgrading from RHCS 4 to 5. [NEEDINFO]
Summary: [GSS]Graphs in Grafana Dashboard are not showing consistent line graphs after...
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Dashboard
Version: 6.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 8.1
Assignee: Aashish sharma
QA Contact: Vinayak Papnoi
Docs Contact: Disha Walvekar
URL:
Whiteboard:
Depends On: 2228128
Blocks: 2261930
Reported: 2023-11-21 10:59 UTC by Aashish sharma
Modified: 2025-04-17 09:30 UTC (History)
CC List: 15 users

Fixed In Version: ceph-19.2.1-53.el9cp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2228128
Environment:
Last Closed:
Embargoed:
vpapnoi: needinfo? (nia)


Links
System ID Private Priority Status Summary Last Updated
Github ceph ceph pull 54540 0 None Merged quincy: mgr/dashboard: Consider null values as zero in grafana panels 2023-11-21 10:59:03 UTC
Red Hat Issue Tracker RHCEPH-7938 0 None None None 2023-11-21 11:00:00 UTC
Red Hat Issue Tracker RHCSDASH-1183 0 None None None 2023-11-21 11:00:05 UTC

Description Aashish sharma 2023-11-21 10:59:03 UTC
+++ This bug was initially created as a clone of Bug #2228128 +++

Description of problem:
After upgrading the cluster from RHCS 4 to 5 (16.2.10-172), some of the graphs on the Grafana dashboard are not consistent (a mix of point and line patterns).

See the attachments of non-consistent graphs.

Version-Release number of selected component (if applicable):
RHCS 5 (16.2.10-172)

How reproducible:
Customer specific

Actual results:
Graphs are not consistent

Expected results:
Graphs should be consistent

--- Additional comment from Milind on 2023-08-01 13:10:58 UTC ---

Additional Information:
------------------------

Customer: North-West University
Account#: 614125
Case#: 03368113

Our analysis:
- The Prometheus logs are filled with warning messages, but we are not sure whether these are relevant to the issue.
~~~
Jun 23 21:04:03 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T19:04:03.712Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 21:04:43 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T19:04:43.626Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 21:16:13 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T19:16:13.575Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 21:36:33 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T19:36:33.618Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 22:00:33 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T20:00:33.657Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 22:04:43 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T20:04:43.621Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 22:08:36 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T20:08:36.947Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 22:34:33 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T20:34:33.880Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
Jun 23 22:56:53 v-rhcs-arc-admin ceph-ec9f8b4a-c1d4-4dc0-96fc-2c5214843800-alertmanager-v-rhcs-arc-admin[2791159]: level=warn ts=2023-06-23T20:56:53.855Z caller=notify.go:674 component=dispatcher receiver=ceph-dashboard integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://v-rhcs-arc-mon01.nwu.ac.za:8443/api/prometheus_receiver\": EOF"
~~~

- We tried redeploying Grafana and Prometheus, but it did not help.
- After that we explored the Grafana graphs and saw that a graph can be displayed in four patterns:
  - line, bar, points, stack
- The issue is seen only when a graph is switched to the lines view; the other views work fine.
- Also, this issue is seen in only a few graphs, not all.

I am attaching some graphs and the Prometheus logs from the customer dashboard.

--- Additional comment from Milind on 2023-08-01 13:12:40 UTC ---



--- Additional comment from Milind on 2023-08-01 13:13:14 UTC ---



--- Additional comment from Milind on 2023-08-01 13:13:41 UTC ---



--- Additional comment from Milind on 2023-08-01 13:15:01 UTC ---



--- Additional comment from Milind on 2023-08-01 13:16:07 UTC ---



--- Additional comment from Milind on 2023-08-04 18:29:20 UTC ---

Hi Team,

Can we have an update on this BZ?

Regards,
Milind

--- Additional comment from Aashish sharma on 2023-08-07 04:49:37 UTC ---

Hi Milind,

I am working on this one and will update soon.

Thanks
Aashish

--- Additional comment from Milind on 2023-08-15 09:48:39 UTC ---

Sure thanks for the update.

--- Additional comment from Milind on 2023-08-18 08:16:33 UTC ---

Hi Aashish,

Were you able to find the cause of this and a resolution?
The customer is waiting for the update.

Regards,
Milind Verma

--- Additional comment from Aashish sharma on 2023-08-22 05:10:24 UTC ---

Hi Milind,

Apologies for the delay. We were in the middle of a dev freeze. I will update the BZ ASAP.

Thanks
Aashish

--- Additional comment from Milind on 2023-08-22 09:57:07 UTC ---

okay thanks Aashish

--- Additional comment from Aashish sharma on 2023-08-24 08:16:42 UTC ---



--- Additional comment from Aashish sharma on 2023-08-24 08:20:41 UTC ---

Hi Milind,

I have attached a video recording of some steps that the customer can try in grafana to get rid of the above issue. Please let me know if the issue persists after these steps.

Thanks
Aashish

--- Additional comment from Milind on 2023-08-24 14:21:36 UTC ---

Hi Aashish,

The steps in the video recording worked for the customer and the issue is resolved.
Can we have the RCA for this issue?

Regards,
Milind

--- Additional comment from Aashish sharma on 2023-09-04 05:01:35 UTC ---

Hi Milind,

Thank you for confirming that the issue got resolved. In RHCS 4, we used to initialise the value of a metric with 0, whether or not the metric had a value. From RHCS 5 onwards, we do not initialise a metric until it has a value. This behavior also needs to be handled in the Grafana JSONs, which have a 'nullPointMode' setting; its value needs to be 'null as zero' so that a panel assumes the value 0 when no data is present in the metric. I will raise a PR to permanently fix this issue.

Thanks
Aashish
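
Editor's note: the 'nullPointMode' change described in the comment above could be applied to an exported dashboard JSON roughly as follows. This is a minimal sketch, not the change made in the linked pull request. It assumes the dashboard uses legacy graph panels (panel type "graph"), that panels sit under a top-level "panels" list, nested "panels" lists, or an older "rows" list, and the function name set_null_as_zero is hypothetical.

```python
import copy
import json

NULL_AS_ZERO = "null as zero"


def set_null_as_zero(dashboard):
    """Return (patched_copy, changed_count) for a dashboard dict.

    Assumption: legacy graph panels are identified by type == "graph"
    and carry their null handling in the "nullPointMode" key.
    """
    patched = copy.deepcopy(dashboard)  # leave the caller's dict untouched

    def visit(panels):
        changed = 0
        for panel in panels:
            if panel.get("type") == "graph" and panel.get("nullPointMode") != NULL_AS_ZERO:
                panel["nullPointMode"] = NULL_AS_ZERO
                changed += 1
            # Row panels may nest further panels.
            changed += visit(panel.get("panels", []))
        return changed

    total = visit(patched.get("panels", []))
    # Older dashboard layouts keep panels under "rows".
    for row in patched.get("rows", []):
        total += visit(row.get("panels", []))
    return patched, total


if __name__ == "__main__":
    # Hypothetical, hand-written dashboard fragment for illustration only.
    sample = {
        "panels": [
            {"type": "graph", "title": "example", "nullPointMode": "null"},
            {"type": "singlestat", "title": "untouched"},
        ]
    }
    patched, changed = set_null_as_zero(sample)
    print(changed)
    print(json.dumps(patched, indent=2))
```

The function works on a copy, so the original export stays available for comparison before the patched JSON is imported back into Grafana.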

Comment 7 Scott Ostapovicz 2024-03-19 11:52:07 UTC
Missed the z5 gate.  Retargeting to z6.

