Current upgrade runs take about 40-60 minutes, even when only a few changes are being applied. This is a long period of disruption, and each misbehaving operator needs its own bug. On average, each component other than the control plane and the MCD should take less than 5 minutes.

Here is a run that took ~70 minutes: https://openshift-gce-devel.appspot.com/build/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/895

The monitor log for that run contains many errors, which likely delayed the normal rollout. The reported errors need to be triaged by the individual teams and filed as bugs linked to this issue. Please work with team leads to ensure they investigate runs like this and subdivide the work.
Known bugs related to this overall theme: https://bugzilla.redhat.com/show_bug.cgi?id=1702414 https://bugzilla.redhat.com/show_bug.cgi?id=1702390 Assign to the group lead who currently holds build-cop responsibilities, leaving the component set to Upgrades.
Closed because the immediate problems are resolved. I'm still gathering information on how long each component is expected to take, and working out how to measure how long a component actually took to complete the upgrade.
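
As a rough sketch of the measurement, assuming per-component start and end timestamps can be extracted from a run: the record format, component names, and timestamps below are all hypothetical, and nothing here reflects the real monitor log format. Only the 5-minute budget and the control plane / MCD exemption come from the description of this bug.

```python
from datetime import datetime, timedelta

# Hypothetical records: (component, upgrade start, upgrade end), ISO 8601.
# Names and timestamps are illustrative only, not taken from a real run.
EVENTS = [
    ("component-a", "2019-01-01T10:00:00", "2019-01-01T10:03:30"),
    ("component-b", "2019-01-01T10:03:30", "2019-01-01T10:15:00"),
    ("control-plane", "2019-01-01T10:15:00", "2019-01-01T10:35:00"),
    ("mcd", "2019-01-01T10:35:00", "2019-01-01T11:05:00"),
]

# Budget from the bug description: less than 5 minutes per component,
# with the control plane and the MCD exempt. The exempt names are assumed.
BUDGET = timedelta(minutes=5)
EXEMPT = {"control-plane", "mcd"}


def component_durations(events):
    """Map each component name to the time its upgrade took."""
    return {
        name: datetime.fromisoformat(end) - datetime.fromisoformat(start)
        for name, start, end in events
    }


def over_budget(durations, budget=BUDGET, exempt=EXEMPT):
    """Return the non-exempt components that took the budget or longer."""
    return sorted(
        name
        for name, took in durations.items()
        if name not in exempt and took >= budget
    )


durations = component_durations(EVENTS)
slow = over_budget(durations)

for name, took in durations.items():
    flag = "SLOW" if name in slow else "ok"
    print(f"{name}: {took} [{flag}]")
```

Each component flagged here would be a candidate for its own bug against the owning team.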
https://github.com/sjenning/oschart can help identify where the slowdowns are.