Bug 2024766 - [sig-cluster-lifecycle] cluster upgrade should complete in 75m: minor updates timeout
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Test Framework
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: Devan Goodwin
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 2025722
Reported: 2021-11-18 22:39 UTC by W. Trevor King
Modified: 2021-11-28 19:33 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2025722
Environment:
Last Closed: 2021-11-28 19:31:58 UTC
Target Upstream Version:



Description W. Trevor King 2021-11-18 22:39:36 UTC
[sig-cluster-lifecycle] cluster upgrade should complete in 75m (105m on AWS)

is failing frequently in CI, see:
https://sippy.ci.openshift.org/sippy-ng/tests/4.8/analysis?test=%5Bsig-cluster-lifecycle%5D%20cluster%20upgrade%20should%20complete%20in%2075m%20(105m%20on%20AWS)

For example [1] blocked a 4.8 CI release:

  : [sig-cluster-lifecycle] cluster upgrade should complete in 75m (105m on AWS)	1h16m50s
  upgrade to registry.build03.ci.openshift.org/ci-op-ktnnb79c/release@sha256:326a14fdab07111e77882fdc34f26bed95f2254d3d8868faccd61cfe49f36017 took too long: 76.84 minutes
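
The check behind the excerpt above can be sketched as follows. This is a minimal illustration, not the test's actual implementation: the 75m and 105m limits come from the test name, while the helper names and the platform strings ("aws", "gcp") are hypothetical. The real test may apply further conditions.

```python
import re

# Limits taken from the test name: 75 minutes, 105 minutes on AWS.
# The constant and function names below are hypothetical.
DEFAULT_LIMIT_MIN = 75.0
AWS_LIMIT_MIN = 105.0


def parse_duration_minutes(text):
    """Convert a duration string such as '1h16m50s' to minutes."""
    pattern = r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+(?:\.\d+)?)s)?"
    match = re.fullmatch(pattern, text)
    if not text or match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 60 + int(minutes or 0) + float(seconds or 0) / 60


def exceeds_limit(duration_min, platform):
    """Return True when the upgrade took longer than the platform's limit."""
    limit = AWS_LIMIT_MIN if platform == "aws" else DEFAULT_LIMIT_MIN
    return duration_min > limit


if __name__ == "__main__":
    took = parse_duration_minutes("1h16m50s")
    print(f"{took:.2f} minutes, over the GCP limit: {exceeds_limit(took, 'gcp')}")
```

Under these assumptions, the GCP run in [1] exceeds the 75-minute limit by just under two minutes, and the same duration would have passed under the 105-minute AWS limit.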

The failure is common across several minor-version upgrade jobs:

  $ w3m -dump -cols 200 'https://search.ci.openshift.org/?maxAge=96h&type=junit&name=release-master-ci&search=cluster+upgrade+should+complete+in' | grep 'failures match' | grep -v rehearse | sort -V
  periodic-ci-openshift-release-master-ci-4.8-e2e-gcp-upgrade (all) - 37 runs, 46% failed, 6% of failures match = 3% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-azure-ovn-upgrade (all) - 4 runs, 100% failed, 100% of failures match = 100% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-azure-upgrade (all) - 4 runs, 75% failed, 33% of failures match = 25% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-gcp-ovn-upgrade (all) - 2 runs, 100% failed, 100% of failures match = 100% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-gcp-upgrade (all) - 37 runs, 70% failed, 58% of failures match = 41% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-openstack-upgrade (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-ovirt-upgrade (all) - 16 runs, 69% failed, 18% of failures match = 13% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-vsphere-upgrade (all) - 9 runs, 33% failed, 67% of failures match = 22% impact
  periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-azure-ovn-upgrade (all) - 4 runs, 100% failed, 25% of failures match = 25% impact
  periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-gcp-ovn-upgrade (all) - 2 runs, 50% failed, 100% of failures match = 50% impact
  periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-gcp-upgrade (all) - 4 runs, 50% failed, 100% of failures match = 50% impact
  periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-openstack-upgrade (all) - 4 runs, 100% failed, 50% of failures match = 50% impact
  periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-ovirt-upgrade (all) - 16 runs, 50% failed, 25% of failures match = 13% impact
  periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-azure-upgrade (all) - 281 runs, 97% failed, 0% of failures match = 0% impact
  periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-openstack-upgrade (all) - 2 runs, 100% failed, 50% of failures match = 50% impact
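
In the summary lines above, the impact figure appears to be the share of all runs that both failed and matched the search, roughly the failed percentage multiplied by the match percentage. The sketch below parses one such line and recomputes that product. The regular expression and function names are illustrative assumptions, not part of any search.ci tooling. Because the printed percentages are already rounded, the recomputed value can differ from the reported one by about a point.

```python
import re

# Hypothetical pattern for the summary lines printed in this report.
LINE_RE = re.compile(
    r"^(?P<job>\S+) \(all\) - (?P<runs>\d+) runs, "
    r"(?P<failed>\d+)% failed, "
    r"(?P<match>\d+)% of failures match = (?P<impact>\d+)% impact$"
)


def parse_line(line):
    """Split one summary line into the job name and its integer fields."""
    found = LINE_RE.match(line.strip())
    if found is None:
        raise ValueError(f"unrecognized summary line: {line!r}")
    fields = found.groupdict()
    return {
        "job": fields["job"],
        "runs": int(fields["runs"]),
        "failed": int(fields["failed"]),
        "match": int(fields["match"]),
        "impact": int(fields["impact"]),
    }


def approx_impact(failed_pct, match_pct):
    """Approximate impact as failed% x match%, both given in percent."""
    return failed_pct * match_pct / 100


if __name__ == "__main__":
    sample = (
        "periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-"
        "e2e-gcp-upgrade (all) - 37 runs, 70% failed, 58% of failures match "
        "= 41% impact"
    )
    row = parse_line(sample)
    print(row["job"], approx_impact(row["failed"], row["match"]), row["impact"])
```

For the 4.7 to 4.8 GCP job, 70% failed and 58% matching gives about 40.6%, consistent with the reported 41% impact.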

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-gcp-upgrade/1461391694523011072

Comment 1 Scott Dodson 2021-11-28 19:32:51 UTC
It was discussed that we'd only loosen these tolerances in 4.9, as the problem doesn't seem to be present in 4.10 upgrades. If it becomes an issue there, we'd prefer to fix whatever regression increased the upgrade duration rather than amend the tests.

