Bug 1970975

Summary: 4.7 -> 4.8 upgrades on AWS take longer than expected
Product: OpenShift Container Platform Reporter: Vadim Rutkovsky <vrutkovs>
Component: kube-apiserver Assignee: Vadim Rutkovsky <vrutkovs>
Status: CLOSED ERRATA QA Contact: Ke Wang <kewang>
Severity: high Docs Contact:
Priority: high    
Version: 4.8 CC: aos-bugs, mfojtik, vrutkovs, xxia
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: tag-ci
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-07-27 23:12:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1974237    
Bug Blocks:    

Description Vadim Rutkovsky 2021-06-11 15:14:19 UTC
Due to https://bugzilla.redhat.com/show_bug.cgi?id=1943804, kube-apiserver rollouts on AWS take longer. As a result, the upgrade doesn't fit into the expected 75 minutes.

The test should be updated to expect the upgrade to take less than 90 minutes on AWS.

Comment 2 Ke Wang 2021-06-15 09:42:51 UTC
After the PR was merged, I checked the e2e upgrade tests on AWS:

$ w3m -dump -cols 200 'https://search.ci.openshift.org/?search=sig-cluster-lifecycle.*cluster+upgrade+should+complete+in+&maxAge=96h&context=3&type=junit&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job' | grep '4.7-e2e-aws-ovn-upgrade' | grep 'failures match' | sort

periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-ovn-upgrade (all) - 53 runs, 100% failed, 4% of failures match = 4% impact

- I pasted the two failed upgrade tests below; it seems the upgrade still took longer than 90 minutes.
#1403580382435086336	junit	3 days ago	

Jun 12 07:07:14.959 - 41ms  E ns/openshift-image-registry route/test-disruption Route is not responding to GET requests on reused connections
Jun 12 07:07:15.001 I ns/openshift-image-registry route/test-disruption Route started responding to GET requests on reused connections
# [sig-cluster-lifecycle] cluster upgrade should complete in 75m (90m on AWS)
upgrade to registry.build01.ci.openshift.org/ci-op-9cc1f0dl/release@sha256:71d4e745782dbf0aeee19fd64cb6f38bbb5bfe76cbc233c3d086f498b4ecc8b7 took too long: 91.83 minutes

#1403539606275624960	junit	3 days ago	

Jun 12 04:36:35.649 - 49ms  E ns/openshift-image-registry route/test-disruption Route is not responding to GET requests on reused connections
Jun 12 04:36:35.698 I ns/openshift-image-registry route/test-disruption Route started responding to GET requests on reused connections
# [sig-cluster-lifecycle] cluster upgrade should complete in 75m (90m on AWS)
upgrade to registry.build01.ci.openshift.org/ci-op-nxmkqszc/release@sha256:924ca15bc52230c0cd2485aa0307396b06283cc3403a518684f4d5d5e7ae67e8 took too long: 91.83 minutes
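
The limit check that these failures trip can be sketched as follows. This is only an illustration, not the actual test code in openshift/origin: the 75 and 90 minute limits are taken from the test name, the 91.83 minute duration from the failures above, and the helper names `upgrade_limit_minutes` and `upgrade_within_limit`, as well as the exact comparison, are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print the upgrade time limit (in minutes) for a platform.
# The values 75 and 90 come from the test name "75m (90m on AWS)".
upgrade_limit_minutes() {
    case "$1" in
        aws) echo 90 ;;
        *)   echo 75 ;;
    esac
}

# Hypothetical helper: exit 0 if the upgrade finished within the limit.
# awk does the comparison because durations such as 91.83 are not integers.
upgrade_within_limit() {
    platform=$1
    minutes=$2
    limit=$(upgrade_limit_minutes "$platform")
    awk -v m="$minutes" -v l="$limit" 'BEGIN { exit !(m <= l) }'
}

# Duration reported by both failed runs above.
if upgrade_within_limit aws 91.83; then
    echo "ok"
else
    echo "took too long: 91.83 minutes (limit $(upgrade_limit_minutes aws))"
fi
```

With a 90 minute limit, a 91.83 minute upgrade still fails, which matches the two runs pasted above.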

Comment 3 Ke Wang 2021-06-15 10:41:43 UTC
The PR doesn't fix the issue that the upgrade on AWS takes too long, so I am assigning the bug back.

Comment 4 Vadim Rutkovsky 2021-06-15 10:52:00 UTC
Created https://github.com/openshift/origin/pull/26230 to bump this to 105m in 4.9; will backport to 4.8 once it's approved.

Comment 6 Ke Wang 2021-07-15 01:22:56 UTC
$ w3m -dump -cols 200 'https://search.ci.openshift.org/?search=sig-cluster-lifecycle.*cluster+upgrade+should+complete+in+&maxAge=168h&context=3&type=junit&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job' | grep '4.7-e2e-aws-ovn-upgrade' | grep 'failures match' | sort
 No results found

So no matching failures of the AWS e2e upgrade test occurred in the past 7 days; moving the bug to VERIFIED.
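
For repeated checks, the grep steps of the pipeline used in this bug could be wrapped in a small filter. The sketch below is an assumption-laden illustration: `job_impact` is a hypothetical helper, and it assumes the summary line keeps the "N% of failures match = N% impact" shape seen in comment 2. It reads the dumped search output on stdin and prints the impact percentage for a job, or "none" when no summary line matches.

```shell
#!/bin/sh
# Hypothetical helper: read search output on stdin and print the impact
# percentage for the job named in $1, or "none" if no summary line matches.
job_impact() {
    awk -v job="$1" '
        index($0, job) && /failures match/ {
            if (match($0, /= [0-9]+% impact/)) {
                s = substr($0, RSTART + 2, RLENGTH - 2)  # e.g. "4% impact"
                sub(/%.*/, "", s)                        # keep only the number
                print s
                found = 1
                exit
            }
        }
        END { if (!found) print "none" }
    '
}

# Summary line from comment 2.
printf '%s\n' 'periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-ovn-upgrade (all) - 53 runs, 100% failed, 4% of failures match = 4% impact' |
    job_impact '4.7-e2e-aws-ovn-upgrade'

# Output from this comment.
printf '%s\n' ' No results found' | job_impact '4.7-e2e-aws-ovn-upgrade'
```

The first call prints the impact from comment 2 and the second prints "none", the condition used here to move the bug to VERIFIED.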

Comment 9 errata-xmlrpc 2021-07-27 23:12:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438