Bug 1823916

Summary: [4.4] e2e-aws-scaleup-rhel7 constantly failing
Product: OpenShift Container Platform Reporter: Kirsten Garrison <kgarriso>
Component: Installer Assignee: Russell Teague <rteague>
Installer sub component: openshift-ansible QA Contact: Johnny Liu <jialiu>
Status: CLOSED CURRENTRELEASE Docs Contact:
Severity: high    
Priority: unspecified CC: rteague
Version: 4.4 Keywords: Reopened
Target Milestone: ---   
Target Release: 4.4.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1779811
: 1824150 (view as bug list) Environment:
Last Closed: 2020-05-28 12:55:06 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1779811, 1824298    
Bug Blocks: 1824150    

Description Kirsten Garrison 2020-04-14 19:19:00 UTC
+++ This bug was initially created as a clone of Bug #1779811 +++

This bug was initially created as a copy of Bug #1766792

I am copying this bug because: 
The job is still not stable in master and has failed ~70 times in a row. The last time it passed was on 11/21/19. I am really unsure of the value of this job.


Description of problem:
This job seems to be basically broken. Looking at the history, it has passed only 17 times in the last 168 runs. Can we eliminate these tests until the job is actually able to run correctly? It doesn't seem like an efficient use of our resources right now.

For ref: https://prow.svc.ci.openshift.org/job-history/origin-ci-test/pr-logs/directory/pull-ci-openshift-machine-config-operator-master-e2e-aws-scaleup-rhel7?buildId=
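For scale, the run counts quoted in the description work out to a pass rate of roughly 10%. A minimal sketch of that arithmetic follows; `pass_rate` is a hypothetical helper written for illustration and is not part of Prow or any CI tooling.

```python
def pass_rate(passed: int, total: int) -> float:
    """Return the fraction of runs that passed (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return passed / total


# Figures taken from the description: 17 passes in the last 168 runs.
rate = pass_rate(17, 168)
print(f"{rate:.1%}")  # prints "10.1%"
```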


How reproducible:
Look at most of its CI runs.

Actual results:
It always fails

Expected results:
It should generally be passing unless there is a reason for it to fail.

--- Additional comment from errata-xmlrpc on 2020-02-28 05:12:29 UTC ---

This bug has been added to advisory RHBA-2020:51809 by OpenShift Release Team Bot (ocp-build/buildvm.openshift.eng.bos.redhat.com)

--- Additional comment from errata-xmlrpc on 2020-02-28 05:12:30 UTC ---

Bug report changed to ON_QA status by Errata System.
A QE request has been submitted for advisory RHBA-2020:51809-01
https://errata.devel.redhat.com/advisory/51809

--- Additional comment from Gaoyun Pei on 2020-03-09 09:41:21 UTC ---

No issues found in QE's testing of RHEL scale-up in 4.5. Hi Russell, could you help check the CI job? Thanks.

--- Additional comment from Russell Teague on 2020-03-10 13:07:02 UTC ---

The changes in the linked GitHub issue were made to fix issues with how RHEL nodes were provisioned in CI, which was causing e2e tests to fail. This bug does not need QE review.

--- Additional comment from Kirsten Garrison on 2020-04-14 18:56:24 UTC ---

This job is still failing:
https://prow.svc.ci.openshift.org/job-history/origin-ci-test/pr-logs/directory/pull-ci-openshift-machine-config-operator-master-e2e-aws-scaleup-rhel7?buildId=

--- Additional comment from Kirsten Garrison on 2020-04-14 19:00:46 UTC ---

Something is going on with this job; it has been failing consistently across releases. If it doesn't work reliably, why is there a CI job?

See prior report as well: https://bugzilla.redhat.com/show_bug.cgi?id=1766792

--- Additional comment from Kirsten Garrison on 2020-04-14 19:08:06 UTC ---

This is also failing on 4.4: https://prow.svc.ci.openshift.org/job-history/origin-ci-test/pr-logs/directory/pull-ci-openshift-machine-config-operator-release-4.4-e2e-aws-scaleup-rhel7?buildId=

Comment 1 Kirsten Garrison 2020-04-14 19:20:16 UTC
A possible cause is described here: https://bugzilla.redhat.com/show_bug.cgi?id=1820717

Comment 3 Russell Teague 2020-05-28 12:55:06 UTC
Closing this because the 4.4-specific bugs have been addressed.