Bug 1705319

Summary: Jenkins slave pod tests failing (random uuid and /etc/passwd update)
Product: OpenShift Container Platform Reporter: Adam Kaplan <adam.kaplan>
Component: Containers Assignee: Mrunal Patel <mpatel>
Status: CLOSED ERRATA QA Contact: Mike Fiedler <mifiedle>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 4.1.0 CC: aos-bugs, bparees, dwalsh, gmontero, jokerman, mifiedle, mmccomas, pweil, sponnaga, wzheng
Target Milestone: --- Keywords: BetaBlocker
Target Release: 4.1.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-06-04 10:48:19 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1685458    

Description Adam Kaplan 2019-05-02 01:58:40 UTC
Description of problem:
Beginning at approximately 2:00 PM EDT on May 1, 2019, the e2e-aws-builds job started failing consistently while running the Jenkins pipeline tests.


Version-Release number of selected component (if applicable): 4.1.0-0.ci-2019-05-01-175409


How reproducible: Always


Additional info: https://openshift-gce-devel.appspot.com/builds/origin-ci-test/pr-logs/directory/pull-ci-openshift-origin-master-e2e-aws-builds

Comment 1 Gabe Montero 2019-05-02 13:33:59 UTC
Ben has PR https://github.com/openshift/jenkins/pull/846 up in case the openshift/jenkins image *has* to react to the recent cri-o changes.

If those changes are reverted, this bug will either get reassigned there or be closed as a dup of whatever bug they have that drives that work.

Comment 2 Gabe Montero 2019-05-02 17:03:02 UTC
Per the discussions in https://coreos.slack.com/archives/CBBJY9GSX/p1556809989144600 and the results of Ben's https://github.com/openshift/jenkins/pull/846, where the file permission changes prevent us from working around this, the folks from the runtime team need to adjust crio per various directions from Clayton and others.

That way, crio won't manipulate the /etc/passwd file in a way that, among other things, affects the setting of $HOME, which results in the jenkins remoting code attempting to create files off of "/", i.e. the root dir.
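
To illustrate the failure mode described above, here is a minimal sketch. It assumes a made-up passwd entry for an arbitrary container uid whose home field is "/"; the uid, the entry text, the helper names and the ".jenkins" subdirectory are all hypothetical and are not taken from crio or from the jenkins remoting code. It only shows how a work directory derived from the home directory ends up under the root dir.

```python
import os.path

# Hypothetical passwd-format content: the second entry stands in for a
# runtime-added record for a random container uid with home set to "/".
SAMPLE_PASSWD = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "1000620000:x:1000620000:0:container user:/:/sbin/nologin\n"
)


def home_for_uid(passwd_text, uid):
    """Return the home directory field for uid, or None if uid is absent."""
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # passwd format: name:password:uid:gid:gecos:home:shell
        if len(fields) >= 7 and fields[2] == str(uid):
            return fields[5]
    return None


def work_dir(home, subdir=".jenkins"):
    """Build a per-user work directory under home (subdir is illustrative)."""
    return os.path.join(home, subdir)


if __name__ == "__main__":
    home = home_for_uid(SAMPLE_PASSWD, 1000620000)
    # With home "/", the work directory is created directly off the root dir,
    # which an unprivileged container user normally cannot write to.
    print(home, work_dir(home))
```

With a home such as /home/jenkins the same helper yields /home/jenkins/.jenkins, which is why the value placed in /etc/passwd matters here.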

Sending this bug to that team

Comment 3 Sudha Ponnaganti 2019-05-02 20:50:24 UTC
@mpatel.redhat.com - Mrunal - Wanted to check if you have a fix coming up. We would like to get this into tonight's or tomorrow's build.

Comment 4 Mrunal Patel 2019-05-02 21:31:03 UTC
We just tagged CRI-O 1.13.8 with a fix. It is being built and will be put into the RHCOS pipeline.

Comment 6 Mike Fiedler 2019-05-03 12:32:43 UTC
Not yet available in an AMI used by the installer in a nightly build.

Comment 7 Shawn Hurley 2019-05-03 12:50:06 UTC
*** Bug 1705779 has been marked as a duplicate of this bug. ***

Comment 8 Gabe Montero 2019-05-03 13:16:42 UTC
The jenkins extended tests are passing again @Mike after failing consistently since the original /etc/passwd change dropped.
We've seen this across multiple PRs.

So it would seem the change is in fact in, at least from our perspective.

Also, my team, as the originators, is fine with just closing this out, unless you want to run some regression outside of our test case.

Comment 9 Mike Fiedler 2019-05-03 14:38:34 UTC
Ignore comment 6; the latest build has it. Verified on 4.1.0-0.nightly-2019-05-03-093152 by successfully running Jenkins pipeline builds.

Also verified by testing the ASB install failure scenario from the duplicate bug https://bugzilla.redhat.com/show_bug.cgi?id=1705779

Comment 11 errata-xmlrpc 2019-06-04 10:48:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758