Bug 1614493

Summary: [3.10] Binary builds with 'large' input hang and never complete
Product: OpenShift Container Platform Reporter: Ben Parees <bparees>
Component: Build Assignee: Ben Parees <bparees>
Status: CLOSED ERRATA QA Contact: wewang <wewang>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 3.10.0 CC: aos-bugs, wkulhane, wzheng
Target Milestone: ---   
Target Release: 3.10.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: A race condition when piping output from a tar stream extraction.
Consequence: Binary builds with large numbers of files could hang indefinitely.
Fix: Reverted the tar streaming logic to use a previous mechanism which does not have a race condition.
Result: Binary builds with large numbers of files complete normally.
Story Points: ---
Clone Of: 1614347 Environment:
Last Closed: 2018-08-31 06:18:11 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1614347    
Bug Blocks:    

Description Ben Parees 2018-08-09 18:11:25 UTC
+++ This bug was initially created as a clone of Bug #1614347 +++

Description of problem:
As of 3.10, binary builds with input larger than about 10 MB (an empirical threshold; the actual limit may be somewhat higher) fail: the build hangs and never completes.

Version-Release number of selected component (if applicable):
3.10.14

How reproducible:
Every time

Steps to Reproduce:
1. git clone https://github.com/wkulhanek/rhte-app
2. cd rhte-app
3. npm install # this will create 28MB of dependencies in the directory
4. oc new-build nodejs --binary=true --name=test
5. oc start-build test --from-dir=.
6. The command hangs and never completes.
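
A minimal sketch for detecting the hang in step 5 without waiting indefinitely, assuming the GNU coreutils "timeout" command is available on the client machine. The helper name command_hangs and the 300-second limit used below are illustrative assumptions, not part of the original report.

```shell
# Hypothetical helper: succeeds (exit 0) if the given command is still
# running when the time limit expires, i.e. it appears to hang.
command_hangs() {
    limit=$1
    shift
    rc=0
    # GNU timeout exits with status 124 when it has to stop the command.
    timeout "$limit" "$@" || rc=$?
    [ "$rc" -eq 124 ]
}
```

For example, from the rhte-app directory: command_hangs 300 oc start-build test --from-dir=. && echo "start-build appears to hang"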


Additional info:
Already discussed with Ben Parees on the openshift-sme mailing list. This also happens with httpd as the builder image and a static website as input: a site that is 9 MB in size works, while one that is 55 MB fails.
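
To check whether an input directory falls in the range reported as problematic, its raw size can be measured before starting the build. This is a rough sketch only: the helper names and the 10 MB threshold are assumptions taken from the empirical figure in the description, the raw file size only approximates what is actually uploaded, and the Doc Text attributes the hang to large numbers of files, so size is at best a proxy.

```shell
# Assumed threshold: about 10 MB, the empirical figure from this report.
THRESHOLD_BYTES=$((10 * 1024 * 1024))

# Hypothetical helper: total size in bytes of all regular files under a directory.
input_size_bytes() {
    # Concatenate every regular file and count the bytes. Portable, but it
    # reads all file data. tr strips padding that some wc versions add.
    bytes=$(find "${1:-.}" -type f -exec cat {} + | wc -c | tr -d '[:space:]')
    echo "$bytes"
}

# Succeeds if the directory is larger than the assumed threshold.
exceeds_threshold() {
    [ "$(input_size_bytes "$1")" -gt "$THRESHOLD_BYTES" ]
}
```

For example: exceeds_threshold . && echo "input is above the size at which hangs were reported"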

--- Additional comment from Ben Parees on 2018-08-09 14:10:50 EDT ---

https://github.com/openshift/origin/pull/20592

Comment 1 Ben Parees 2018-08-09 18:59:28 UTC
https://github.com/openshift/ose/pull/1385

Comment 3 wewang 2018-08-21 06:05:35 UTC
Verified in OpenShift version v3.10.28, and wrote a related scenario test case: OCP-20435

steps:
1. git clone https://github.com/wkulhanek/rhte-app
2. cd rhte-app
3. npm install # this will create 28MB of dependencies in the directory
4. oc new-build nodejs --binary=true --name=test
5. oc start-build test --from-dir=.
Uploading directory "." as binary input for the build ...
.
Uploading finished
build.build.openshift.io/test-1 started
$ oc get  builds
NAME      TYPE      FROM             STATUS     STARTED              DURATION
test-1    Source    Binary@44246d4   Complete   About a minute ago   15s
[wewang@wen-local rhte-app]$ oc logs build/test-1
Receiving source from STDIN as archive ...
---> Installing application source ...
---> Installing all dependencies
npm notice created a lockfile as package-lock.json. You should commit this file.
up to date in 0.567s
---> Building in production mode
---> Pruning the development dependencies
up to date in 0.604s
/opt/app-root/src/.npm is not a mountpoint
---> Cleaning the npm cache /opt/app-root/src/.npm
/tmp is not a mountpoint
---> Cleaning the /tmp/npm-*

Pushing image docker-registry.default.svc:5000/wewang/test:latest ...
Pushed 0/6 layers, 14% complete
Pushed 1/6 layers, 25% complete
Pushed 2/6 layers, 33% complete
Push successful
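
The status check in the verification above can be scripted. The following is a small sketch, not part of the test case OCP-20435: the helper name build_status is hypothetical, and it assumes the table layout shown in the "oc get builds" output above, where the first three columns contain no spaces.

```shell
# Hypothetical helper: read "oc get builds" table output on stdin and
# print the STATUS column for the named build.
build_status() {
    # Fields are whitespace-separated: NAME TYPE FROM STATUS ...
    awk -v name="$1" '$1 == name { print $4 }'
}
```

For example: oc get builds | build_status test-1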

Comment 5 errata-xmlrpc 2018-08-31 06:18:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2376