Bug 2074236 - TALO not keeping up with large scale SNO ZTP deployment case
Summary: TALO not keeping up with large scale SNO ZTP deployment case
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Telco Edge
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: jun
QA Contact: yliu1
URL:
Whiteboard:
Depends On:
Blocks: 2075134
 
Reported: 2022-04-11 20:12 UTC by jun
Modified: 2022-08-26 16:43 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2075134
Environment:
Last Closed: 2022-08-26 16:43:57 UTC
Target Upstream Version:
Embargoed:


Links
System: Github
ID: openshift-kni cluster-group-upgrades-operator pull 159
Private: 0
Priority: None
Status: open
Summary: Bug 2074236: Performance fixes for ZTP scale test
Last Updated: 2022-04-13 14:25:12 UTC

Description jun 2022-04-11 20:12:23 UTC
Description of problem:
TALO could not keep up with the volume of reconcile events in large-scale ZTP SNO tests. As the number of installed SNOs increases, TALO takes longer and longer to get back to a particular SNO and advance its upgrade sequence. Eventually the backlog grows so large that TALO starts giving up on SNOs that have exceeded the 4 hour limit.
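
For illustration only, the scaling effect described above can be sketched with a toy model. It assumes a single serial worker that visits every active SNO in turn, which is a simplification and not a description of TALO's actual scheduling. The names `per_reconcile_s` and `visits_needed` and the numbers used are hypothetical, not measurements from the scale test.

```python
# Toy model (assumption): one serial worker cycles through all active SNOs,
# so the time to return to a given SNO grows linearly with the cluster count.

TIMEOUT_S = 4 * 60 * 60  # the 4 hour limit mentioned in the description


def revisit_interval_s(active_clusters: int, per_reconcile_s: float) -> float:
    """Seconds until the worker gets back to the same SNO."""
    return active_clusters * per_reconcile_s


def time_to_finish_s(active_clusters: int, per_reconcile_s: float,
                     visits_needed: int) -> float:
    """Seconds for one SNO to complete if it needs `visits_needed` reconciles."""
    return visits_needed * revisit_interval_s(active_clusters, per_reconcile_s)


def max_clusters_within_timeout(per_reconcile_s: float, visits_needed: int,
                                timeout_s: float = TIMEOUT_S) -> int:
    """Largest active cluster count for which an SNO still finishes in time."""
    return int(timeout_s // (per_reconcile_s * visits_needed))


# Hypothetical inputs: 2 s per reconcile, 20 reconciles per SNO.
print(time_to_finish_s(100, 2.0, 20))               # 4000.0 s, within the limit
print(time_to_finish_s(500, 2.0, 20) > TIMEOUT_S)   # True, past the limit
print(max_clusters_within_timeout(2.0, 20))         # 360
```

Under these assumed inputs the per-SNO completion time crosses the 4 hour limit once more than 360 SNOs are active at the same time. Real behavior depends on requeue intervals, concurrency, and API latency, none of which this model captures.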

Version-Release number of selected component (if applicable):


How reproducible:
100%


Steps to Reproduce:
1. Run a ZTP SNO deployment test at 50 clusters per hour or 100 clusters per hour

Actual results:
See description

Expected results:
TALO should sustain at least 50 clusters per hour, and ideally 100 or more.

Additional info:

Comment 2 jun 2022-04-13 17:22:49 UTC
Changed status to VERIFIED to unblock backporting.
