[sig-cluster-lifecycle] cluster upgrade should complete in 75m (105m on AWS)

is failing frequently in CI, see:

  https://sippy.ci.openshift.org/sippy-ng/tests/4.8/analysis?test=%5Bsig-cluster-lifecycle%5D%20cluster%20upgrade%20should%20complete%20in%2075m%20(105m%20on%20AWS)

For example [1] blocked a 4.8 CI release:

  : [sig-cluster-lifecycle] cluster upgrade should complete in 75m (105m on AWS) 1h16m50s
    upgrade to registry.build03.ci.openshift.org/ci-op-ktnnb79c/release@sha256:326a14fdab07111e77882fdc34f26bed95f2254d3d8868faccd61cfe49f36017 took too long: 76.84 minutes

Common for several minor bumps:

  $ w3m -dump -cols 200 'https://search.ci.openshift.org/?maxAge=96h&type=junit&name=release-master-ci&search=cluster+upgrade+should+complete+in' | grep 'failures match' | grep -v rehearse | sort -V
  periodic-ci-openshift-release-master-ci-4.8-e2e-gcp-upgrade (all) - 37 runs, 46% failed, 6% of failures match = 3% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-azure-ovn-upgrade (all) - 4 runs, 100% failed, 100% of failures match = 100% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-azure-upgrade (all) - 4 runs, 75% failed, 33% of failures match = 25% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-gcp-ovn-upgrade (all) - 2 runs, 100% failed, 100% of failures match = 100% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-gcp-upgrade (all) - 37 runs, 70% failed, 58% of failures match = 41% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-openstack-upgrade (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-ovirt-upgrade (all) - 16 runs, 69% failed, 18% of failures match = 13% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-vsphere-upgrade (all) - 9 runs, 33% failed, 67% of failures match = 22% impact
  periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-azure-ovn-upgrade (all) - 4 runs, 100% failed, 25% of failures match = 25% impact
  periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-gcp-ovn-upgrade (all) - 2 runs, 50% failed, 100% of failures match = 50% impact
  periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-gcp-upgrade (all) - 4 runs, 50% failed, 100% of failures match = 50% impact
  periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-openstack-upgrade (all) - 4 runs, 100% failed, 50% of failures match = 50% impact
  periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-ovirt-upgrade (all) - 16 runs, 50% failed, 25% of failures match = 13% impact
  periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-azure-upgrade (all) - 281 runs, 97% failed, 0% of failures match = 0% impact
  periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-openstack-upgrade (all) - 2 runs, 100% failed, 50% of failures match = 50% impact

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-gcp-upgrade/1461391694523011072
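Several of the jobs in that listing have only 1 to 4 runs, so their 100% or 50% impact figures say little on their own. A rough way to rank the jobs while setting aside the tiny samples is sketched below. It is not part of the CI tooling: the function names and the min_runs threshold are made up for illustration, and the regular expression assumes the summary line format shown in the search output. The affected-run count is only approximate because the reported percentages are already rounded.

```python
import re

# One summary line as printed by the search page, for example:
# "<job> (all) - 37 runs, 46% failed, 6% of failures match = 3% impact"
LINE_RE = re.compile(
    r"^(?P<job>\S+) \(all\) - (?P<runs>\d+) runs, (?P<failed>\d+)% failed, "
    r"(?P<match>\d+)% of failures match = (?P<impact>\d+)% impact$"
)


def parse_summary(text):
    """Turn the grep output into a list of dicts, skipping unrelated lines."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        runs = int(m["runs"])
        impact = int(m["impact"])
        rows.append({
            "job": m["job"],
            "runs": runs,
            "impact": impact,
            # Approximate: impact is a rounded percentage of all runs.
            "affected_runs": round(runs * impact / 100),
        })
    return rows


def rank(rows, min_runs=10):
    """Keep jobs with at least min_runs runs (hypothetical cutoff), highest impact first."""
    kept = [r for r in rows if r["runs"] >= min_runs]
    return sorted(kept, key=lambda r: r["impact"], reverse=True)


SAMPLE = """\
periodic-ci-openshift-release-master-ci-4.8-e2e-gcp-upgrade (all) - 37 runs, 46% failed, 6% of failures match = 3% impact
periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-azure-ovn-upgrade (all) - 4 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-gcp-upgrade (all) - 37 runs, 70% failed, 58% of failures match = 41% impact
"""

for row in rank(parse_summary(SAMPLE)):
    print(row["job"], row["impact"], row["affected_runs"])
```

With the three sample lines, the 4-run azure-ovn job is filtered out and the 4.7 to 4.8 gcp job ranks first, at roughly 15 of 37 runs affected.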
It was discussed that we'd only loosen these tolerances in 4.9, as the problem doesn't seem to be present in 4.10 upgrades. If it becomes an issue there, we'd prefer to fix whatever regression increased the upgrade duration rather than amend the tests.