Currently the CVO marks an upgrade as failed whenever an operator takes longer than 10 minutes to roll out. On clusters of any size it is very common for operators that run daemonsets on all hosts, in particular the MCO, network, and DNS operators, to take more than 10 minutes to roll out. Raising the limit to 20 minutes should significantly reduce the noise so we can focus on upgrades that have real problems. There is follow-up work to make more significant implementation changes here, but we'll push those out more slowly: https://issues.redhat.com/browse/OTA-247
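To illustrate the effect of the change, here is a minimal sketch of the timeout decision. It is not the CVO's code (the CVO is written in Go), and the names `OLD_LIMIT`, `NEW_LIMIT`, and `rollout_timed_out` are hypothetical; only the 10-minute and 20-minute values come from the description above.

```python
from datetime import timedelta

# Limits taken from the description; the names are hypothetical.
OLD_LIMIT = timedelta(minutes=10)
NEW_LIMIT = timedelta(minutes=20)


def rollout_timed_out(elapsed: timedelta, limit: timedelta) -> bool:
    """Return True if an operator rollout has run longer than the limit."""
    return elapsed > limit


# A daemonset-heavy operator that needs 15 minutes to roll out:
elapsed = timedelta(minutes=15)
print(rollout_timed_out(elapsed, OLD_LIMIT))  # flagged under the 10-minute limit
print(rollout_timed_out(elapsed, NEW_LIMIT))  # not flagged under the 20-minute limit
```

A rollout that takes between 10 and 20 minutes is the case this change stops reporting as a failure; anything slower would still be flagged.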
Set up a 4.5 cluster with 3 masters + 9 workers and triggered an upgrade to 4.6.0-0.nightly-2020-08-04-035157. Checked the status once every 5 minutes; everything worked well.

08-05 20:50:01 The cluster will be updated to 4.6.0-0.nightly-2020-08-04-035157
08-05 20:50:01 Updating to release image registry.svc.ci.openshift.org/ocp/release:4.6.0-0.nightly-2020-08-04-035157
08-05 20:55:02 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: 30% complete Progress: True Available: True
08-05 21:00:03 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: 79% complete Progress: True Available: True
08-05 21:05:03 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: 79% complete Progress: True Available: True
08-05 21:10:13 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator network has not yet successfully rolled out Progress: True Available: True
08-05 21:15:14 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator network has not yet successfully rolled out Progress: True Available: True
08-05 21:20:15 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator monitoring is degraded Progress: True Available: True
08-05 21:25:15 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator monitoring is degraded Progress: True Available: True
08-05 21:30:16 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator monitoring is degraded Progress: True Available: True
08-05 21:35:20 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator monitoring is degraded Progress: True Available: True
08-05 21:40:22 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator monitoring is degraded Progress: True Available: True
08-05 21:45:22 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator monitoring is degraded Progress: True Available: True
08-05 21:50:23 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: 79% complete Progress: True Available: True
08-05 21:55:24 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator network has not yet successfully rolled out Progress: True Available: True
08-05 22:00:25 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: 84% complete Progress: True Available: True
08-05 22:05:25 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: 84% complete Progress: True Available: True
08-05 22:10:27 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: the cluster operator machine-config has not yet successfully rolled out Progress: True Available: True
08-05 22:15:27 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: 28% complete Progress: True Available: True
08-05 22:20:28 Status: Cluster version is 4.6.0-0.nightly-2020-08-04-035157 Progress: False Available: True
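For anyone who wants to summarize a polled log like the one above, the following is a rough sketch that counts how many polls landed in each state. The entry format (timestamp, `Status:`, `Progress:`, `Available:`) is inferred from the paste; the function names and the three-entry sample are illustrative assumptions, not part of any OpenShift tooling.

```python
import re
from collections import Counter

# Entry layout inferred from the pasted log:
# "<MM-DD HH:MM:SS> Status: <text> Progress: <bool> Available: <bool>"
ENTRY = re.compile(
    r"^(\d{2}-\d{2} \d{2}:\d{2}:\d{2}) Status: (.*?) "
    r"Progress: (\w+) Available: (\w+)$"
)


def classify(status: str) -> str:
    """Map a status message to a coarse state label (hypothetical labels)."""
    if status.startswith("Unable to apply"):
        return "unable"
    if status.startswith("Working towards"):
        return "working"
    if status.startswith("Cluster version is"):
        return "done"
    return "other"


def tally(lines):
    """Count polls per state, skipping lines that are not status entries."""
    counts = Counter()
    for line in lines:
        match = ENTRY.match(line.strip())
        if match:
            counts[classify(match.group(2))] += 1
    return counts


# Three entries copied from the log above, one per state.
SAMPLE = [
    "08-05 21:05:03 Status: Working towards 4.6.0-0.nightly-2020-08-04-035157: "
    "79% complete Progress: True Available: True",
    "08-05 21:10:13 Status: Unable to apply 4.6.0-0.nightly-2020-08-04-035157: "
    "the cluster operator network has not yet successfully rolled out "
    "Progress: True Available: True",
    "08-05 22:20:28 Status: Cluster version is 4.6.0-0.nightly-2020-08-04-035157 "
    "Progress: False Available: True",
]

print(tally(SAMPLE))
```

Feeding the full log through `tally` gives a quick view of how often the upgrade reported "Unable to apply" even though it eventually completed.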
I don't think we need doc text for this temporary bandaid. We can add doc text when we raise the limit to infinity ;).
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196