Bug 1824150

Summary: [4.3] e2e-aws-scaleup-rhel7 constantly failing
Product: OpenShift Container Platform
Reporter: Russell Teague <rteague>
Component: Installer
Assignee: Russell Teague <rteague>
Installer sub component: openshift-ansible
QA Contact: Johnny Liu <jialiu>
Status: CLOSED CURRENTRELEASE
Docs Contact:
Severity: high
Priority: unspecified
CC: jialiu, kgarriso, rteague
Version: 4.4
Keywords: Reopened
Target Milestone: ---
Target Release: 4.3.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1823916
Environment:
Last Closed: 2020-05-28 12:54:47 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1779811, 1823916, 1824300
Bug Blocks: 1816959

Description Russell Teague 2020-04-15 12:51:48 UTC
+++ This bug was initially created as a clone of Bug #1823916 +++

+++ This bug was initially created as a clone of Bug #1779811 +++

This bug was initially created as a copy of Bug #1766792

I am copying this bug because: 
The job is still not stable in master and has failed ~70 times in a row. The last time it passed was on 11/21/19. I am really unsure of the value of this job.


Description of problem:
This job seems to be basically broken. Looking at the history, it has passed only 17 of the last 168 runs. Can we eliminate these tests until the job is actually able to run correctly? It doesn't seem like an efficient use of our resources right now.

For ref: https://prow.svc.ci.openshift.org/job-history/origin-ci-test/pr-logs/directory/pull-ci-openshift-machine-config-operator-master-e2e-aws-scaleup-rhel7?buildId=


How reproducible:
Look at most of its CI runs.

Actual results:
It always fails

Expected results:
It should generally be passing unless there is a reason for it to fail.

--- Additional comment from errata-xmlrpc on 2020-02-28 05:12:29 UTC ---

This bug has been added to advisory RHBA-2020:51809 by OpenShift Release Team Bot (ocp-build/buildvm.openshift.eng.bos.redhat.com)

--- Additional comment from errata-xmlrpc on 2020-02-28 05:12:30 UTC ---

Bug report changed to ON_QA status by Errata System.
A QE request has been submitted for advisory RHBA-2020:51809-01
https://errata.devel.redhat.com/advisory/51809

--- Additional comment from Gaoyun Pei on 2020-03-09 09:41:21 UTC ---

No issue found in QE's testing of RHEL scale-up in 4.5. Hi Russell, could you help check on the CI job? Thanks.

--- Additional comment from Russell Teague on 2020-03-10 13:07:02 UTC ---

The changes in the linked GitHub issue fixed problems with how RHEL nodes were provisioned in CI, which was causing e2e tests to fail. This bug does not need QE review.

--- Additional comment from Kirsten Garrison on 2020-04-14 18:56:24 UTC ---

This job is still failing:
https://prow.svc.ci.openshift.org/job-history/origin-ci-test/pr-logs/directory/pull-ci-openshift-machine-config-operator-master-e2e-aws-scaleup-rhel7?buildId=

--- Additional comment from Kirsten Garrison on 2020-04-14 19:00:46 UTC ---

Something is going on in this job; it has been failing consistently across releases. If it doesn't work reliably, why is there a CI job?

See prior report as well: https://bugzilla.redhat.com/show_bug.cgi?id=1766792

--- Additional comment from Kirsten Garrison on 2020-04-14 19:08:06 UTC ---

This is also failing on 4.4: https://prow.svc.ci.openshift.org/job-history/origin-ci-test/pr-logs/directory/pull-ci-openshift-machine-config-operator-release-4.4-e2e-aws-scaleup-rhel7?buildId=

--- Additional comment from Kirsten Garrison on 2020-04-14 15:20:16 EDT ---

A possible cause is described here: https://bugzilla.redhat.com/show_bug.cgi?id=1820717

--- Additional comment from Kirsten Garrison on 2020-04-14 15:21:40 EDT ---

4.4 runs are permafailing: https://prow.svc.ci.openshift.org/job-history/origin-ci-test/pr-logs/directory/pull-ci-openshift-machine-config-operator-release-4.4-e2e-aws-scaleup-rhel7?buildId=

Comment 1 Russell Teague 2020-05-28 12:54:47 UTC
Closing this because the 4.3-specific bugs have been addressed.