Graphs on the Alerting, Metrics and Dashboards pages, in both the Dev and Admin perspectives, are all rendered by the `QueryBrowser` component. The render time of this component has become slower since the 4.5 release.
I tested with the Performance tab of Chrome developer tools. Your fix is in build 4.6.0-0.nightly-2020-09-21-230455.

1. Installed an OCP cluster with payload 4.6.0-0.nightly-2020-09-21-093308, which does not include your fix.
2. Installed an OCP cluster with payload 4.6.0-0.nightly-2020-09-23-022756, which includes your fix.
3. Ran the example query and collected performance data; saw no performance improvement:
   sort_desc(sum(sum_over_time(ALERTS{alertstate="firing"}[24h])) by (alertname))
4. Ran a query that returns a large amount of data and collected performance data; filed product bug https://bugzilla.redhat.com/show_bug.cgi?id=1880698:
   cluster_quantile:apiserver_request_duration_seconds:histogram_quantile

I collected the following data five times for each environment. I expected the Painting time or the Rendering time to improve, but saw no improvement in any of the timings:

  38 ms  Loading
3204 ms  Scripting
 422 ms  Rendering
  24 ms  Painting
 561 ms  System
4622 ms  Idle
8871 ms  Total
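As a rough aid for reading the summary above, the sketch below checks that the categories add up to the reported total and shows how the non-idle time is split. The variable and function names are made up for this sketch, and leaving Idle out of the shares is an assumption about what is worth comparing, not something the Performance tab does itself.

```python
# Performance tab summary quoted above, in milliseconds.
breakdown_ms = {
    "Loading": 38,
    "Scripting": 3204,
    "Rendering": 422,
    "Painting": 24,
    "System": 561,
    "Idle": 4622,
}
reported_total_ms = 8871


def active_shares(breakdown):
    """Return each non-idle category's share of non-idle time, in percent."""
    active = {name: ms for name, ms in breakdown.items() if name != "Idle"}
    active_total = sum(active.values())
    return {name: 100.0 * ms / active_total for name, ms in active.items()}


# The categories should add up to the reported total.
assert sum(breakdown_ms.values()) == reported_total_ms

shares = active_shares(breakdown_ms)
for name, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:<10} {pct:5.1f}%")
```

In this one sample Scripting is by far the largest non-idle category, so it may be the more sensitive number to compare between builds than Painting or Rendering.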
Full test result https://files.slack.com/files-pri/T027F3GAJ-F01BZTTSYJC/image.png
The following bug is not caused by the current fix: https://bugzilla.redhat.com/show_bug.cgi?id=1880698. I am reopening this bug because my test shows no performance improvement.
I will run the performance test again on two clusters that have the same data series.
Launched Chrome in incognito mode and collected performance data with Chrome developer tools. The test results are below; they show that performance improves with the bug's fix.

sort_desc(sum(sum_over_time(ALERTS{alertstate="firing"}[24h])) by (alertname))

Fix is not in (2 time series):
3204 ms Scripting
3160 ms Scripting
3007 ms Scripting
2863 ms Scripting
2806 ms Scripting
Total: 15040 ms

Fix is in (2 time series):
2044 ms Scripting
2224 ms Scripting
2314 ms Scripting
2359 ms Scripting
2468 ms Scripting
Total: 11409 ms

topk(5, cluster_quantile:apiserver_request_duration_seconds:histogram_quantile)

Fix is not in (634 time series):
13479 ms Scripting
4164 ms Scripting
11556 ms Scripting
5993 ms Scripting
5756 ms Scripting
Total: 40948 ms

Fix is in (742 time series):
8457 ms Scripting
9596 ms Scripting
3134 ms Scripting
3856 ms Scripting
8953 ms Scripting
Total: 33996 ms
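To make the comparison above easier to read, here is a minimal sketch that summarizes the five Scripting samples per configuration and reports the drop in the mean. The dictionary keys and function names are chosen for this sketch only; the numbers are copied from the measurements above.

```python
from statistics import mean

# Scripting times in milliseconds, copied from the measurements above.
# The keys are labels chosen for this sketch.
scripting_ms = {
    "ALERTS": {
        "without_fix": [3204, 3160, 3007, 2863, 2806],
        "with_fix": [2044, 2224, 2314, 2359, 2468],
    },
    "topk": {
        "without_fix": [13479, 4164, 11556, 5993, 5756],
        "with_fix": [8457, 9596, 3134, 3856, 8953],
    },
}


def summarize(samples):
    """Return (mean, min, max) for a list of timings."""
    return mean(samples), min(samples), max(samples)


def mean_reduction_pct(before, after):
    """Percentage drop in mean timing from `before` to `after`."""
    return 100.0 * (mean(before) - mean(after)) / mean(before)


for query, runs in scripting_ms.items():
    for label, samples in runs.items():
        avg, low, high = summarize(samples)
        print(f"{query:<7} {label:<12} mean={avg:7.1f} ms  range={low}-{high} ms")
    drop = mean_reduction_pct(runs["without_fix"], runs["with_fix"])
    print(f"{query:<7} mean Scripting time drops by {drop:.1f}%")
```

The topk samples vary widely from run to run (4164 ms to 13479 ms without the fix), and the two clusters returned different numbers of time series (634 and 742), so five samples probably give only a rough estimate for that query.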
We have a follow-on fix that improves performance a bit more. Moving back to assigned.
Tested with payload 4.6.0-0.nightly-2020-09-28-171716.

sort_desc(sum(sum_over_time(ALERTS{alertstate="firing"}[24h])) by (alertname)) (5 time series):
2062 ms Scripting
2117 ms Scripting
2187 ms Scripting
2251 ms Scripting
1756 ms Scripting
Total: 10373 ms

cluster_quantile:apiserver_request_duration_seconds:histogram_quantile (858 time series):
2161 ms Scripting
1550 ms Scripting
2761 ms Scripting
2350 ms Scripting
2587 ms Scripting
Total: 11409 ms
The results show that performance improves greatly when the query returns a large amount of data.
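The conclusion above can be checked against the five-run Scripting totals reported in this bug. The sketch below is only a rough comparison under stated assumptions: the build labels and function name are made up for illustration, the number of returned time series differed between runs (2 and 5 for the ALERTS query; 634, 742 and 858 for the large query), and the earlier large-query runs wrapped the metric in topk(5, ...) while the last run did not.

```python
# Total Scripting time over five runs, in milliseconds, taken from the
# measurements reported in this bug. Build labels are chosen for this sketch.
totals_ms = {
    "ALERTS query": {
        "no fix": 15040,
        "first fix": 11409,
        "follow-on fix": 10373,
    },
    "cluster_quantile query": {
        "no fix": 40948,
        "first fix": 33996,
        "follow-on fix": 11409,
    },
}


def reduction_vs_baseline(totals, baseline="no fix"):
    """Return the percentage reduction of each build's total vs. the baseline."""
    base = totals[baseline]
    return {
        build: 100.0 * (base - total) / base
        for build, total in totals.items()
        if build != baseline
    }


reductions = {query: reduction_vs_baseline(t) for query, t in totals_ms.items()}
for query, by_build in reductions.items():
    for build, pct in by_build.items():
        print(f"{query:<24} {build:<14} {pct:5.1f}% less Scripting time")
```

With these totals, the follow-on fix accounts for most of the improvement on the large query, while the small ALERTS query changes much less between the first and the follow-on fix.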
*** Bug 1795401 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196