Bug 1970975 - 4.7 -> 4.8 upgrades on AWS take longer than expected
Summary: 4.7 -> 4.8 upgrades on AWS take longer than expected
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Vadim Rutkovsky
QA Contact: Ke Wang
URL:
Whiteboard: tag-ci
Depends On: 1974237
Blocks:
Reported: 2021-06-11 15:14 UTC by Vadim Rutkovsky
Modified: 2021-07-27 23:12 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 23:12:40 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Github openshift origin pull 26219 0 None closed Bug 1970975: upgrade test: expect upgrades to take longer on AWS 2021-07-15 01:19:08 UTC
Github openshift origin pull 26252 0 None closed Bug 1970975: upgrade: extend upgrade duration to 105mins on AWS 2021-07-15 01:19:12 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 23:12:57 UTC

Description Vadim Rutkovsky 2021-06-11 15:14:19 UTC
Due to https://bugzilla.redhat.com/show_bug.cgi?id=1943804, kube-apiserver rollouts on AWS take longer. As a result, the upgrade doesn't fit into the expected 75 minutes.

The test should be updated to expect upgrades to take less than 90 minutes on AWS.
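
The proposed change amounts to a platform-dependent time limit. A minimal sketch of that idea follows; the function and constant names are hypothetical and this is not the actual openshift/origin test code. Only the 75 and 90 minute figures come from the description above, and treating the limit as a strict "less than" is an assumption based on its wording.

```python
from datetime import timedelta

# Figures from the bug description; all names here are hypothetical.
DEFAULT_LIMIT = timedelta(minutes=75)
AWS_LIMIT = timedelta(minutes=90)


def upgrade_duration_limit(platform: str) -> timedelta:
    """Return the maximum expected upgrade duration for a platform."""
    if platform.lower() == "aws":
        return AWS_LIMIT
    return DEFAULT_LIMIT


def upgrade_within_limit(platform: str, duration: timedelta) -> bool:
    """True if the upgrade finished in less than the platform's limit."""
    return duration < upgrade_duration_limit(platform)


if __name__ == "__main__":
    # An 80 minute upgrade passes on AWS but fails elsewhere.
    print(upgrade_within_limit("aws", timedelta(minutes=80)))
    print(upgrade_within_limit("gcp", timedelta(minutes=80)))
```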

Comment 2 Ke Wang 2021-06-15 09:42:51 UTC
After the PR was merged, I checked the e2e upgrade tests on AWS:

$ w3m -dump -cols 200 'https://search.ci.openshift.org/?search=sig-cluster-lifecycle.*cluster+upgrade+should+complete+in+&maxAge=96h&context=3&type=junit&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job' | grep '4.7-e2e-aws-ovn-upgrade' | grep 'failures match' | sort

periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-ovn-upgrade (all) - 53 runs, 100% failed, 4% of failures match = 4% impact
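
As a rough cross-check, the summary line can be turned into an approximate count of matching failures. The sketch below assumes the line keeps the "N runs, X% failed, Y% of failures match" shape shown above; the function name is hypothetical, and because the percentages are rounded by the search tool the result is only an estimate.

```python
import re

# Matches the summary format seen in the search output above.
SUMMARY_RE = re.compile(
    r"(\d+) runs, (\d+)% failed, (\d+)% of failures match = (\d+)% impact"
)


def estimate_matching_failures(line: str) -> int:
    """Estimate how many runs failed with the searched-for test failure."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError("line does not look like a search summary")
    runs, failed_pct, match_pct, _impact = map(int, m.groups())
    failed_runs = runs * failed_pct / 100
    return round(failed_runs * match_pct / 100)


line = (
    "periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-"
    "e2e-aws-ovn-upgrade (all) - 53 runs, 100% failed, "
    "4% of failures match = 4% impact"
)
print(estimate_matching_failures(line))  # 53 * 1.00 * 0.04 = 2.12, rounds to 2
```

The estimate of 2 is consistent with the two failing runs quoted in this comment.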

The two failed upgrade tests are pasted below; the upgrade still took longer than 90 minutes.
#1403580382435086336	junit	3 days ago	

Jun 12 07:07:14.959 - 41ms  E ns/openshift-image-registry route/test-disruption Route is not responding to GET requests on reused connections
Jun 12 07:07:15.001 I ns/openshift-image-registry route/test-disruption Route started responding to GET requests on reused connections
# [sig-cluster-lifecycle] cluster upgrade should complete in 75m (90m on AWS)
upgrade to registry.build01.ci.openshift.org/ci-op-9cc1f0dl/release@sha256:71d4e745782dbf0aeee19fd64cb6f38bbb5bfe76cbc233c3d086f498b4ecc8b7 took too long: 91.83 minutes

#1403539606275624960	junit	3 days ago	

Jun 12 04:36:35.649 - 49ms  E ns/openshift-image-registry route/test-disruption Route is not responding to GET requests on reused connections
Jun 12 04:36:35.698 I ns/openshift-image-registry route/test-disruption Route started responding to GET requests on reused connections
# [sig-cluster-lifecycle] cluster upgrade should complete in 75m (90m on AWS)
upgrade to registry.build01.ci.openshift.org/ci-op-nxmkqszc/release@sha256:924ca15bc52230c0cd2485aa0307396b06283cc3403a518684f4d5d5e7ae67e8 took too long: 91.83 minutes
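
Both failures report 91.83 minutes, just over the 90 minute AWS limit. The sketch below pulls the duration out of a message of this form and compares it with 90 minutes and with the 105 minute limit discussed later in this bug. The regular expression and function name are assumptions for illustration, and the sample message is shortened with a placeholder image reference.

```python
import re

# Assumed message shape: "... took too long: <number> minutes"
MESSAGE_RE = re.compile(r"took too long: (\d+(?:\.\d+)?) minutes")


def reported_minutes(message: str):
    """Return the reported upgrade duration in minutes, or None if absent."""
    m = MESSAGE_RE.search(message)
    return float(m.group(1)) if m else None


# Placeholder image reference; only the trailing duration matters here.
sample = "upgrade to <release image> took too long: 91.83 minutes"
minutes = reported_minutes(sample)
print(minutes > 90)   # True: exceeds the 90 minute AWS limit
print(minutes > 105)  # False: would fit within a 105 minute limit
```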

Comment 3 Ke Wang 2021-06-15 10:41:43 UTC
The PR doesn't fix the issue of upgrades on AWS taking too long, so I am assigning the bug back.

Comment 4 Vadim Rutkovsky 2021-06-15 10:52:00 UTC
Created https://github.com/openshift/origin/pull/26230 to bump this to 105m in 4.9; will backport to 4.8 once it's approved.

Comment 6 Ke Wang 2021-07-15 01:22:56 UTC
$ w3m -dump -cols 200 'https://search.ci.openshift.org/?search=sig-cluster-lifecycle.*cluster+upgrade+should+complete+in+&maxAge=168h&context=3&type=junit&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job' | grep '4.7-e2e-aws-ovn-upgrade' | grep 'failures match' | sort
 No results found

No AWS e2e upgrade test failures matching this check occurred in the past 7 days, so I am moving the bug to VERIFIED.

Comment 9 errata-xmlrpc 2021-07-27 23:12:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

