Bug 1884464

Summary: CVO marks an upgrade as failed when an operator takes more than 20 minutes to roll out
Product: OpenShift Container Platform
Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Cluster Version Operator
Assignee: Scott Dodson <sdodson>
Status: CLOSED DEFERRED
QA Contact: Johnny Liu <jialiu>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.5
CC: aos-bugs, jokerman, wking
Target Milestone: ---
Keywords: Reopened
Target Release: 4.5.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-10-25 16:01:46 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1884334
Bug Blocks:

Description OpenShift BugZilla Robot 2020-10-02 04:34:37 UTC
+++ This bug was initially created as a clone of Bug #1884334 +++

This bug was initially created as a copy of Bug #1862524

I am copying this bug because: 



Currently the CVO marks an upgrade as failed whenever an operator takes longer than 20 minutes to roll out. On clusters of any size it is very common for operators that run daemonsets on all hosts, in particular the MCO, network, and DNS operators, to take more than 20 minutes to roll out. By raising this threshold to 40 minutes we'll significantly reduce the noise so we can focus on upgrades which have real problems.

There's follow-up work to make more significant implementation changes here, but we'll push those out more slowly.

https://issues.redhat.com/browse/OTA-247
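
To illustrate the effect of the threshold change described above, here is a minimal sketch. It is not the CVO's actual implementation: the function name, the constants, and the sample duration are hypothetical and only model the "fail if the rollout takes longer than N minutes" rule from the description.

```python
from datetime import timedelta

# Thresholds taken from the description: 20 minutes today, 40 minutes proposed.
# The constant names are hypothetical and do not come from the CVO source.
CURRENT_THRESHOLD = timedelta(minutes=20)
PROPOSED_THRESHOLD = timedelta(minutes=40)


def rollout_marked_failed(elapsed: timedelta, threshold: timedelta) -> bool:
    """Return True when a rollout has run longer than the threshold.

    This models only the timing rule from the description, not any other
    condition the CVO may check.
    """
    return elapsed > threshold


# A made-up rollout duration for a daemonset-backed operator on a large cluster.
elapsed = timedelta(minutes=32)

print(rollout_marked_failed(elapsed, CURRENT_THRESHOLD))   # flagged under the 20-minute rule
print(rollout_marked_failed(elapsed, PROPOSED_THRESHOLD))  # not flagged under the 40-minute rule
```

Under these assumptions, a slow but healthy rollout of between 20 and 40 minutes stops being reported as a failure, while a rollout that exceeds 40 minutes is still flagged.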

Comment 1 W. Trevor King 2020-10-04 02:28:20 UTC
Waiting on QE to verify the 4.6 bug.

Comment 2 W. Trevor King 2020-10-25 15:55:50 UTC
Bug 1884334 is now blocking on a refactor, so closing this for now.  We'll revisit backports once we have a complete fix.