Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1862524

Summary:	CVO marks an upgrade as failed when an operator takes more than 10 minutes to roll out
Product:	OpenShift Container Platform	Reporter:	Scott Dodson <sdodson>
Component:	Cluster Version Operator	Assignee:	W. Trevor King <wking>
Status:	CLOSED ERRATA	QA Contact:	Johnny Liu <jialiu>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	4.4	CC:	aos-bugs, jokerman
Target Milestone:	---
Target Release:	4.6.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	No Doc Update
Doc Text:		Story Points:	---
Clone Of:
Clones:	1866480		Environment:
Last Closed:	2020-10-27 16:21:54 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1866480

Description Scott Dodson 2020-07-31 15:56:18 UTC

Currently the CVO marks an upgrade as failed whenever an operator takes longer than 10 minutes to roll out. On clusters of any size, operators that run DaemonSets on all hosts, in particular the MCO, network, and DNS operators, very commonly take more than 10 minutes to roll out. By moving this limit to 20 minutes we'll significantly reduce the noise so we can focus on upgrades that have real problems.
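
As a rough illustration of the effect of the change (not the CVO's actual implementation; the names `OLD_LIMIT`, `NEW_LIMIT`, and `rollout_timed_out` are hypothetical), a rollout that takes 14 minutes would be reported as failed under a 10-minute limit but not under a 20-minute one:

```python
from datetime import datetime, timedelta

# Hypothetical constants for illustration; the real CVO code is not shown in this bug.
OLD_LIMIT = timedelta(minutes=10)
NEW_LIMIT = timedelta(minutes=20)


def rollout_timed_out(started: datetime, now: datetime, limit: timedelta) -> bool:
    """Return True when a rollout has been running for longer than the limit."""
    return now - started > limit


# Example: an operator whose DaemonSet takes 14 minutes to roll out.
start = datetime(2020, 7, 31, 12, 0, 0)
after_14_min = start + timedelta(minutes=14)

print(rollout_timed_out(start, after_14_min, OLD_LIMIT))  # flagged under the old limit
print(rollout_timed_out(start, after_14_min, NEW_LIMIT))  # not flagged under the new limit
```

The comparison is strict here because the summary says "more than 10 minutes"; whether the real check is strict is an assumption.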

There's follow-up work to make more significant implementation changes here, but we'll push those out more slowly:

https://issues.redhat.com/browse/OTA-247

Comment 3 Johnny Liu 2020-08-05 14:28:05 UTC

Set up a 4.5 cluster with 3 masters + 9 workers, then triggered an upgrade to 4.6.0-0.nightly-2020-08-04-035157.

Checked the status once every 5 minutes; everything is working well.


08-05 20:50:01 The cluster will be updated to 4.6.0-0.nightly-2020-08-04-035157
08-05 20:50:01 Updating to release image registry.svc.ci.openshift.org/ocp/release:4.6.0-0.nightly-2020-08-04-035157
08-05 20:55:02 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: 30% complete Progress: True Available: True
08-05 21:00:03 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: 79% complete Progress: True Available: True
08-05 21:05:03 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: 79% complete Progress: True Available: True
08-05 21:10:13 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator network has not yet successfully rolled out Progress: True Available: True
08-05 21:15:14 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator network has not yet successfully rolled out Progress: True Available: True
08-05 21:20:15 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator monitoring is degraded Progress: True Available: True
08-05 21:25:15 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator monitoring is degraded Progress: True Available: True
08-05 21:30:16 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator monitoring is degraded Progress: True Available: True
08-05 21:35:20 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator monitoring is degraded Progress: True Available: True
08-05 21:40:22 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator monitoring is degraded Progress: True Available: True
08-05 21:45:22 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator monitoring is degraded Progress: True Available: True
08-05 21:50:23 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: 79% complete Progress: True Available: True
08-05 21:55:24 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator network has not yet successfully rolled out Progress: True Available: True
08-05 22:00:25 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: 84% complete Progress: True Available: True
08-05 22:05:25 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: 84% complete Progress: True Available: True
08-05 22:10:27 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator machine-config has not yet successfully rolled out Progress: True Available: True
08-05 22:15:27 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: 28% complete Progress: True Available: True
08-05 22:20:28 Status: Cluster version is 4.6.0-0.nightly-2020-08-04-035157 Progress: False Available: True
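
For anyone who wants to summarize a log like the one above, here is a minimal sketch. It assumes the line shape seen in this comment ("MM-DD HH:MM:SS Status: <message> Progress: ... Available: ...") and assumes the year 2020, since the timestamps omit it; `parse_status_line` and `summarize` are hypothetical helper names, not part of any OpenShift tool.

```python
import re
from datetime import datetime

# Assumed line shape, copied from the log in this comment.
LINE_RE = re.compile(
    r"^(\d\d-\d\d \d\d:\d\d:\d\d) Status: (.*?) "
    r"Progress: (True|False) Available: (True|False)$"
)


def parse_status_line(line, year=2020):
    """Parse one status line; return None for lines that do not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp, message, progressing, available = match.groups()
    # The log omits the year, so it is supplied as an assumption.
    when = datetime.strptime(f"{year}-{stamp}", "%Y-%m-%d %H:%M:%S")
    return when, message, progressing == "True", available == "True"


def summarize(lines):
    """Count samples and 'Unable to apply' samples, and measure the time spanned."""
    parsed = [p for p in map(parse_status_line, lines) if p is not None]
    unable = sum(1 for _, msg, _, _ in parsed if msg.startswith("Unable to apply"))
    elapsed = parsed[-1][0] - parsed[0][0] if parsed else None
    return {"samples": len(parsed), "unable_to_apply": unable, "elapsed": elapsed}


sample = [
    "08-05 20:50:01 The cluster will be updated to 4.6.0-0.nightly-2020-08-04-035157",
    "08-05 20:55:02 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: 30% complete Progress: True Available: True",
    "08-05 21:10:13 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator network has not yet successfully rolled out Progress: True Available: True",
    "08-05 22:20:28 Status: Cluster version is 4.6.0-0.nightly-2020-08-04-035157 Progress: False Available: True",
]
print(summarize(sample))
```

Lines without a "Status:" field, such as the first two lines of the log, are skipped rather than treated as errors.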

Comment 4 W. Trevor King 2020-08-05 23:03:46 UTC

I don't think we need doc text for this temporary bandaid.  We can add doc text when we raise the limit to infinity ;).

Comment 6 errata-xmlrpc 2020-10-27 16:21:54 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196