Bug 2047833

Summary: [sig-cluster-lifecycle] cluster upgrade should complete in 75.00 minutes - Inconsistency and consistent failure of the test in upgrade jobs of s390x
Product: OpenShift Container Platform Reporter: Lakshmi Ravichandran <lakshmi.ravichandran1>
Component: Multi-Arch Assignee: Deep Mistry <dmistry>
Multi-Arch sub component: IBM P / Z QA Contact: Douglas Slavens <dslavens>
Status: CLOSED DUPLICATE Docs Contact:
Severity: medium    
Priority: low CC: aos-bugs, danili, dorzel, eparis, mfojtik, sttts, wking, xxia
Version: 4.10   
Target Milestone: ---   
Target Release: ---   
Hardware: s390x   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-04-18 19:33:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2049750    
Bug Blocks:    

Description Lakshmi Ravichandran 2022-01-28 16:32:11 UTC
Description of problem:

In 4.9 to 4.10 upgrade jobs:
-----------------------------------------

The time limit applied to the upgrade is 75.00 minutes, enforced by the test:

'[sig-cluster-lifecycle] cluster upgrade should complete in 75.00 minutes'
This test fails consistently because upgrades take longer than that on the s390x platform.

https://search.ci.openshift.org/?search=cluster+upgrade+should+complete+in+75&maxAge=336h&context=1&type=junit&name=s390x&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job

In 4.7 -> 4.8 and 4.8 -> 4.9 upgrade jobs:
-----------------------------------------

The time limit applied to the upgrade is 90.00 minutes, enforced by the test:

'[sig-cluster-lifecycle] cluster upgrade should complete in 90.00 minutes'
https://search.ci.openshift.org/?search=cluster+upgrade+should+complete+in+90&maxAge=336h&context=1&type=junit&name=s390x&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job

----------------------------------------------------------------------------------

* The time limit is inconsistent across upgrade paths: 75 minutes for 4.9 -> 4.10, but 90 minutes for 4.7 -> 4.8 and 4.8 -> 4.9.

* The test fails consistently on the s390x platform because the upgrade jobs exceed both time limits. The logs indicate that, most of the time, upgrades take more than 94 minutes to complete, so the test fails consistently across versions.

----------------------------------------------------------------------------------
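
The two observations above can be illustrated with a small sketch. It is not the code of the origin test; the table and the function name overrun_minutes are hypothetical, and the 94-minute duration is only the approximate figure quoted in this report. The limits are the ones named in the failing test titles.

```python
# Time limits in minutes, taken from the test names quoted in this bug.
# The dictionary and function names are illustrative, not part of any test suite.
LIMITS_MIN = {
    ("4.7", "4.8"): 90.0,
    ("4.8", "4.9"): 90.0,
    ("4.9", "4.10"): 75.0,
}


def overrun_minutes(from_version, to_version, observed_min):
    """Return by how many minutes an upgrade exceeded its limit (0.0 if within)."""
    limit = LIMITS_MIN[(from_version, to_version)]
    return max(0.0, observed_min - limit)


# 94.0 is the approximate duration reported for s390x upgrades in this bug.
for (src, dst), limit in LIMITS_MIN.items():
    over = overrun_minutes(src, dst, 94.0)
    print(f"{src} -> {dst}: limit {limit:.2f} min, over by {over:.2f} min")
```

With a duration of 94 minutes, every upgrade path in the table exceeds its limit, which matches the consistent failures described here.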


Version-Release number of selected component (if applicable):
4.10, 4.9, 4.8


How reproducible:
Consistently; see the search.ci links given above.

Steps to Reproduce:
1. Review the jobs listed in the search.ci links above.
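
As a convenience for reproducing the search, the query can be rebuilt from its parameters. This is a minimal sketch that only reassembles the URL already given in this report; the helper name search_url is hypothetical, and the parameter values are copied from the link above rather than from any documented API.

```python
from urllib.parse import urlencode

BASE = "https://search.ci.openshift.org/"


def search_url(limit_minutes):
    """Build the search.ci query used in this bug for a given time limit."""
    params = {
        "search": f"cluster upgrade should complete in {limit_minutes}",
        "maxAge": "336h",
        "context": "1",
        "type": "junit",
        "name": "s390x",
        "excludeName": "",
        "maxMatches": "5",
        "maxBytes": "20971520",
        "groupBy": "job",
    }
    # urlencode encodes spaces as '+', matching the links in this report.
    return BASE + "?" + urlencode(params)


print(search_url(75))
print(search_url(90))
```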

Actual results:
The tests '[sig-cluster-lifecycle] cluster upgrade should complete in 75.00 minutes' and '[sig-cluster-lifecycle] cluster upgrade should complete in 90.00 minutes' fail.

Expected results:
The time limit for upgrade jobs on the s390x platform should be increased and applied consistently across versions.

Additional info:

Comment 2 Dan Li 2022-02-08 19:08:45 UTC
Setting Blocker- for this bug, for the following reasons:
 
1. Per Axel's test result in Comment 6, the instability/slowness is not observed under every condition.
2. Similar 4.10 upgrade performance bugs (e.g. BZ 2034367 and BZ 2047828) do not seem to block 4.10 GA.
3. Another associated bug on instability during upgrade, BZ 2049750, is set as Blocker-.

Comment 3 Jeremy Poulin 2022-02-09 17:47:34 UTC
I am going to pass this on to Deep, since the investigation is on the CI side.
Feel free to reassign within that space.

Comment 4 Dan Li 2022-02-14 18:28:43 UTC
Hi Deep, do you know if this bug will be resolved before the end of the current sprint (Feb 19th)? If not, can we set "reviewed-in-sprint"?

Comment 5 Deep Mistry 2022-02-14 18:50:18 UTC
This requires more investigation.

Comment 6 Dan Li 2022-03-07 14:32:44 UTC
Setting the "reviewed-in-sprint" flag, as Deep is OOTO and this bug is unlikely to be resolved before the end of the current sprint. This bug is set as Blocker-.

Comment 7 Dan Li 2022-03-28 17:34:51 UTC
Since Deep is OOTO this week and this bug is not a blocker, I am keeping the "reviewed-in-sprint+" flag, as it is unlikely that this bug will be resolved this week.

Comment 8 Dan Li 2022-04-18 14:26:54 UTC
Hi Deep, will this bug be resolved before the end of the current sprint (April 23rd)? If not, can we set the "reviewed-in-sprint" flag?

Comment 9 Dan Li 2022-04-18 19:33:40 UTC
Closing this bug as a duplicate of BZ 2049750; the discussion of a potential fix will take place on that bug.

*** This bug has been marked as a duplicate of bug 2049750 ***