Bug 2060329

Summary: Detect unsupported amount of workloads before rendering a lazy or crashing topology
Product: OpenShift Container Platform
Reporter: Christoph Jerolimov <cjerolim>
Component: Dev Console
Assignee: Sahil Budhwar <sbudhwar>
Status: CLOSED ERRATA
QA Contact: spathak <spathak>
Severity: high
Docs Contact: Olivia Payne <opayne>
Priority: high
Version: 4.7
CC: cbremble, nmukherj, sbudhwar, vismishr
Target Milestone: ---
Target Release: 4.11.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: A topology with more than 100 nodes crashed or lagged on load.
Consequence: The topology became unusable.
Fix: If the topology has more than 100 nodes, a new page is shown with the description "We noticed that it is taking a long time to visualize your application Topology. You can use Search to find specific resources or click Continue to keep waiting." This lets the user either continue with the topology view or view the resources in the list view.
Result: The user can load resources in the list view, and the page loads without crashes or lag.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-08-10 10:51:53 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2087065

Description Christoph Jerolimov 2022-03-03 10:53:55 UTC
Description of problem:
As a user, I was blocked from using the developer perspective after switching into a namespace with a lot of workloads (Deployments, Pods, etc.).

This is a follow-up to https://bugzilla.redhat.com/show_bug.cgi?id=2006395

We recommend the following safety precautions against a slow or crashing topology, even if we continue to work on performance improvements that allow more workloads to be rendered.

At the moment we expect that a topology with about 100 nodes can be displayed. This may also depend on the node types, the browser used, the computing power of the PC, and how often the workload conditions change.


Recommended safety guard:
The topology graph (maybe the list as well) should check how many nodes are fetched and will be rendered.

1. We need to evaluate if we could make this decision based on the shown graph nodes and edges or the number of underlying resources.

For example, is it required to count each Pod in a Deployment or not?

2. Based on a threshold (~ 100?) the topology graph should skip the rendering.

3. We should show a 'warning page' instead, which explains that the topology cannot handle X nodes at the moment.

4. This page could have an option to "Show topology anyway" so that users who don't have issues here can still use the topology.
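
The guard described in steps 1 to 4 could be sketched roughly as follows. This is only an illustration, not the actual console implementation: the names TopologyModel, RenderDecision, decideTopologyRender and DEFAULT_NODE_THRESHOLD are hypothetical, the threshold of 100 is the tentative value from step 2, and the sketch assumes the count is taken over graph nodes (the open question from step 1) rather than over the underlying resources.

```typescript
// Hypothetical types and names for illustration; not taken from the console source.
interface TopologyModel {
  nodes: { id: string }[];
  edges: { source: string; target: string }[];
}

type RenderDecision =
  | { kind: 'render' }
  | { kind: 'warn'; nodeCount: number; message: string };

// Assumed default; the report only suggests "~ 100?" and leaves the value open.
const DEFAULT_NODE_THRESHOLD = 100;

// Decide whether to render the graph or show the warning page instead.
// Assumption: only graph nodes are counted, not Pods or other underlying resources.
function decideTopologyRender(
  model: TopologyModel,
  userConfirmed: boolean = false,
  threshold: number = DEFAULT_NODE_THRESHOLD,
): RenderDecision {
  const nodeCount = model.nodes.length;

  // Step 4: a user who chose "Show topology anyway" bypasses the guard.
  if (userConfirmed || nodeCount <= threshold) {
    return { kind: 'render' };
  }

  // Steps 2 and 3: skip rendering and explain why.
  return {
    kind: 'warn',
    nodeCount,
    message: `The topology cannot handle ${nodeCount} nodes at the moment.`,
  };
}
```

A caller would render the graph for a 'render' decision and the warning page for a 'warn' decision, then call the function again with userConfirmed set to true once the user opts in.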

Comment 4 Christoph Jerolimov 2022-05-23 08:46:18 UTC
Verified on 4.11.0-0.nightly-2022-05-20-213928

Comment 6 errata-xmlrpc 2022-08-10 10:51:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069