Description of problem:
I have two OpenShift clusters with minor version differences (3.2.1.1 vs. 3.2.1.15). Building images in OpenShift for the redhat-helloworld-msa sample via binary build fails for some services with the error message "numeric overflow in sparse archive member", which suggests that tar inside the docker builder has a problem extracting files with long filenames.

Version-Release number of selected component (if applicable):
3.2.1.15

How reproducible:
100%

Steps to Reproduce:
1. Set up an OpenShift 3.2.1.15 cluster
2. Deploy redhat-msa according to the following Ansible script: https://github.com/wrichter/hailstorm/blob/master/ansible/roles/layerX_openshift_demo_redhatmsa_on_devclient/tasks/instantiate_microservices.yml

Actual results:
$ oc start-build -n helloworld-msa frontend --from-dir=git/frontend --follow
Uploading "git/frontend" at commit "HEAD" as binary input for the build ...
Uploading directory "git/frontend" as binary input for the build ...
frontend-12
I0919 11:56:22.005054 1 builder.go:57] Master version "v3.2.1.15", Builder version "v3.2.1.15"
I0919 11:56:22.009819 1 builder.go:145] Running build with cgroup limits: api.CGroupLimits{MemoryLimitBytes:92233720368547, CPUShares:2, CPUPeriod:100000, CPUQuota:-1, MemorySwap:92233720368547}
I0919 11:56:22.011043 1 source.go:180] Receiving source from STDIN as archive ...
I0919 11:56:32.260161 1 source.go:188] Extracting...
tar: inherits_browser.js: numeric overflow in sparse archive member
tar: README.md: numeric overflow in sparse archive member
tar: package.json: numeric overflow in sparse archive member
tar: bootstrap-datepicker.standalone.css: numeric overflow in sparse archive member
tar: bootstrap-datepicker.standalone.css.map: numeric overflow in sparse archive member
tar: bootstrap-datepicker.standalone.min.css: numeric overflow in sparse archive member
tar: bootstrap-datepicker.standalone.min.css.map: numeric overflow in sparse archive member
tar: bootstrap-datepicker3.standalone.css: numeric overflow in sparse archive member
[…]

Expected results:
$ oc start-build -n helloworld-msa frontend --from-dir=git/frontend --follow
Uploading "git/frontend" at commit "HEAD" as binary input for the build ...
Uploading directory "git/frontend" as binary input for the build ...
frontend-2
I0919 12:50:00.737801 1 builder.go:57] Master version "v3.2.1.1-1-g33fa4ea", Builder version "v3.2.1.1-1-g33fa4ea"
I0919 12:50:00.740431 1 builder.go:145] Running build with cgroup limits: api.CGroupLimits{MemoryLimitBytes:92233720368547, CPUShares:2, CPUPeriod:100000, CPUQuota:-1, MemorySwap:92233720368547}
I0919 12:50:00.741371 1 source.go:180] Receiving source from STDIN as archive ...
Step 1 : FROM registry.access.redhat.com/openshift3/nodejs-010-rhel7
 ---> 72baa90ae334
Step 2 : ADD . /opt/app-root/src/
 ---> 76850c08acad
Removing intermediate container 4650a9370b84
Step 3 : EXPOSE 8080
 ---> Running in 63bdff3c4db3
 ---> fbe8b1933c9c
Removing intermediate container 63bdff3c4db3
Step 4 : ENV OS_SUBDOMAIN 'rhel-cdk.10.1.2.2.xip.io' OS_PROJECT 'helloworld-msa'
 ---> Running in 4e7f7df74e6e
 ---> efa892976545
Removing intermediate container 4e7f7df74e6e
Step 5 : CMD OLACHAINURL=${OLACHAINURL:-"http://ola-${OS_PROJECT}.${OS_SUBDOMAIN}/api/ola-chaining"} HOLAURL=${HOLAURL:-"http://hola-${OS_PROJECT}.${OS_SUBDOMAIN}/api/hola"} BONJOURURL=${BONJOURURL:-"http://bonjour-${OS_PROJECT}.${OS_SUBDOMAIN}/api/bonjour"} ALOHAURL=${ALOHAURL:-"http://aloha-${OS_PROJECT}.${OS_SUBDOMAIN}/api/aloha"} OLAURL=${OLAURL:-"http://ola-${OS_PROJECT}.${OS_SUBDOMAIN}/api/ola"} APIGATEWAYURL=${APIGATEWAYURL:-"http://api-gateway-${OS_PROJECT}.${OS_SUBDOMAIN}/api"} HYSTRIXDASHBOARDURL=${HYSTRIXDASHBOARDURL:-"http://hystrix-dashboard-${OS_PROJECT}.${OS_SUBDOMAIN}"} ZIPKINQUERYURL=${ZIPKINQUERYURL:-"http://zipkin-query-${OS_PROJECT}.${OS_SUBDOMAIN}"} && sed -i.orig services.json -e 's|OLACHAINURL|'"$OLACHAINURL"'|' -e 's|HOLAURL|'"$HOLAURL"'|' -e 's|BONJOURURL|'"$BONJOURURL"'|' -e 's|ALOHAURL|'"$ALOHAURL"'|' -e 's|OLAURL|'"$OLAURL"'|' -e 's|APIGATEWAYURL|'"$APIGATEWAYURL"'|' && sed -i.orig index.html -e 's|HYSTRIXDASHBOARDURL|'"$HYSTRIXDASHBOARDURL"'|' -e 's|ZIPKINQUERYURL|'"$ZIPKINQUERYURL"'|' && /bin/bash -c 'npm start'
 ---> Running in 1010bdbd3d93
 ---> b977ed41d6c7
Removing intermediate container 1010bdbd3d93
Step 6 : ENV "OPENSHIFT_BUILD_NAME" "frontend-2" "OPENSHIFT_BUILD_NAMESPACE" "helloworld-msa"
 ---> Running in 22994b18ae79
 ---> 2c5b4ed1262f
Removing intermediate container 22994b18ae79
Step 7 : LABEL "io.openshift.build.commit.author" " \u003c\u003e"
 ---> Running in 37fab6056aa0
 ---> ec6b44d2e1ca
Removing intermediate container 37fab6056aa0
Successfully built ec6b44d2e1ca
I0919 12:50:37.318712 1 docker.go:118] Pushing image 172.30.168.121:5000/helloworld-msa/frontend:latest ...
I0919 12:51:56.448922 1 docker.go:122] Push successful
-sh-4.2$

Additional info:
A workaround is to pin the image version to 3.2.1.1 by modifying /etc/origin/master/master-config.yaml on all nodes and setting:
imageConfig.format: openshift3/ose-${component}:v3.2.1.1

I can provide access to the environments where this is reproducible in the Red Hat VPN.
This change went in somewhere between 3.2.1.1 and 3.2.1.15 and is almost certainly the cause of this bug: https://github.com/openshift/origin/commit/4beb260c271fb505eb6e6b6362c9567c57728643#diff-986cfe99352ecdaac7e8c0d4e47bdbd4 I'm CCing Joel Smith, who made the change, to see if he can shed some light on what might be going wrong or how we can address it. It seems that the introduction of the pipe to tar may have made the flow less tolerant of long names (tar wasn't involved before).
Also can you tell us more about what filename length appears to cause issues?
Created attachment 1203369
stderr log showing what the docker builder container actually complains about

Created attachment 1203371
stdout log showing what files have been added to the tar archive by the oc client

If you check the stderr.log you can see that the first file it complains about is inherits_browser.js. The stdout.log shows this file's path (in the tar) to be:
node_modules/express/node_modules/send/node_modules/http-errors/node_modules/inherits/inherits_browser.js
which is about 105 characters.
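For reference, here is a minimal Python sketch that checks the length of that path and round-trips it through the three tar formats Python's standard tarfile module can write. It only illustrates that a 105-character path no longer fits in the 100-byte name field of a classic tar header, so the writer has to use a prefix field or an extension record. The helper name round_trip is made up for this example, and the sketch says nothing about how bsdtar or GNU tar in the builder image behave.

```python
import io
import tarfile

# Path taken from the stdout.log described above.
PATH = ("node_modules/express/node_modules/send/node_modules/"
        "http-errors/node_modules/inherits/inherits_browser.js")


def round_trip(path, fmt):
    """Write one empty member named `path` in tar format `fmt`, then read its name back."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        tar.addfile(tarfile.TarInfo(name=path))
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return tar.getnames()[0]


if __name__ == "__main__":
    print("path length:", len(PATH))
    for label, fmt in (("ustar", tarfile.USTAR_FORMAT),
                       ("gnu", tarfile.GNU_FORMAT),
                       ("pax", tarfile.PAX_FORMAT)):
        print(label, "round-trips:", round_trip(PATH, fmt) == PATH)
```

Which format the oc client actually emits is not established in this report, so treat the three formats here as candidates to compare against, not as a statement about the real stream.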
I doubt that the long filenames by themselves are the only issue. Perhaps it's an issue with another tar format, but I had no problem with paths much longer than 105 chars in my testing. I also tried sparse files with long filenames and that worked too. I tried with old and new libarchive and couldn't reproduce with the tar streams I created. Wolfram, you updated OpenShift, but did you also update libarchive? If so, it may be that we have introduced a bug there in fixing the various issues for CVE-2016-5418. On RHEL, libarchive-3.1.2-10 has all the fixes. Ben, is there a way we could get a copy of the STDIN stream that is eventually fed to bsdtar so that we can debug it? I think it's fine to revert https://github.com/openshift/origin/commit/4beb260c271fb505eb6e6b6362c9567c57728643#diff-986cfe99352ecdaac7e8c0d4e47bdbd4 given that there is now a fixed libarchive, but I'd first like to understand the bug and make sure that reverting will actually fix the issue (which it wouldn't if new libarchive was the source of this new problem).
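If a copy of that STDIN stream can be captured, something like the following Python sketch could help narrow down which members are unusual. It reads a tar stream sequentially and lists members whose path is longer than 100 characters or that Python's tarfile reports as sparse. The function name flag_members and the idea of passing the captured file as a command-line argument are assumptions for illustration; how to capture the stream from the builder is not covered here.

```python
import sys
import tarfile

NAME_FIELD_LIMIT = 100  # size of the name field in a classic tar header


def flag_members(stream, limit=NAME_FIELD_LIMIT):
    """Return (name, name length, is_sparse) for members with long paths or sparse data."""
    flagged = []
    # "r|*" reads the archive as a forward-only stream, so it also works on pipes.
    with tarfile.open(fileobj=stream, mode="r|*") as tar:
        for member in tar:
            if len(member.name) > limit or member.issparse():
                flagged.append((member.name, len(member.name), member.issparse()))
    return flagged


if __name__ == "__main__":
    # Hypothetical usage: python flag_members.py captured-build-input.tar
    with open(sys.argv[1], "rb") as captured:
        for name, length, sparse in flag_members(captured):
            print(length, "sparse" if sparse else "regular", name)
```

This only shows how Python's parser interprets the stream; a member that looks fine here could still be read differently by libarchive or GNU tar.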
Clarification: I did NOT update openshift, rather I have two fresh installs based on content from two different satellites which have synced & created their content views a couple of weeks apart, hence the version differences.
Wolfram: thanks for the clarification. Could you also check the version of the libarchive RPM installed on your OpenShift compute nodes on each of the two deployments?
I'm afraid I'm not the infrastructure expert:
[root@ose3-node1 ~]# rpm -q libarchive
package libarchive is not installed
[root@ose3-node1 ~]#
The package versions will be:
bsdtar-3.1.2-7.el7.x86_64
tar-1.26-29.el7.x86_64
libarchive-3.1.2-7.el7.x86_64
I'm working on getting them updated to the newer packages and intend to revert the change in question.
Verified on:
openshift v3.3.0.33
kubernetes v1.3.0+52492b4
etcd 2.3.0+git

Steps:
1. Create an application
$ oc new-app https://raw.githubusercontent.com/openshift/origin/master/examples/sample-app/application-template-dockerbuild.json
2. Clone the Git repo
$ git clone https://github.com/openshift/ruby-hello-world
3. Start a build from the directory
$ oc start-build ruby-sample-build --from-dir=./ruby-hello-world/ --follow

Actual result: the build succeeds, with no errors in the build log.

Has anyone reproduced the error in the build log shown in attachment 1203369 with openshift v3.2.1.15?
No, I don't have an easy way to recreate it, but I suggest you try building a directory that has deeply nested files with long paths; see comment 4 above.
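One way to produce such a directory is sketched below in Python. It creates a chain of nested directories under a root and drops a small file at the bottom, so the relative path ends up well over 100 characters. The function name make_deep_tree, the segment name and the default depth are all made up for this example; the resulting directory would still need a Dockerfile before it could be passed to oc start-build --from-dir.

```python
import os
import tempfile


def make_deep_tree(root, depth=6, segment="node_modules/some-package",
                   filename="index.js"):
    """Create `depth` nested copies of `segment` under `root` with one file at the bottom.

    Returns the file's path relative to `root`.
    """
    current = root
    for _ in range(depth):
        current = os.path.join(current, *segment.split("/"))
    os.makedirs(current, exist_ok=True)
    path = os.path.join(current, filename)
    with open(path, "w") as handle:
        handle.write("// placeholder file for a long-path test\n")
    return os.path.relpath(path, root)


if __name__ == "__main__":
    root = tempfile.mkdtemp(prefix="long-path-test-")
    relative = make_deep_tree(root)
    print(root)
    print(len(relative), relative)
```

Since long paths alone did not reproduce the failure in the testing described earlier in this thread, a tree like this is a starting point for experiments, not a guaranteed reproducer.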
This is the project which was used to trigger the output in comment 4: https://github.com/redhat-helloworld-msa/frontend
Triggered 10 builds on 3.2.1.15; some failed:
$ oc get build
NAME          TYPE     FROM             STATUS     STARTED              DURATION
frontend-1    Docker   Git@5a9f311      Complete   13 minutes ago       2m40s
frontend-10   Docker   Binary@5a9f311   Failed     About a minute ago   25s
frontend-2    Docker   Binary@5a9f311   Failed     12 minutes ago       8s
frontend-3    Docker   Binary@5a9f311   Complete   12 minutes ago       3m19s
frontend-4    Docker   Binary@5a9f311   Complete   9 minutes ago        2m42s
frontend-5    Docker   Binary@5a9f311   Complete   8 minutes ago        3m19s
frontend-6    Docker   Binary@5a9f311   Complete   7 minutes ago        3m22s
frontend-7    Docker   Binary@5a9f311   Failed     4 minutes ago        3s
frontend-8    Docker   Binary@5a9f311   Complete   4 minutes ago        2m58s
frontend-9    Docker   Binary@5a9f311   Complete   4 minutes ago        3m3s

$ oc logs -f build/frontend-7
I0929 23:48:05.089673 1 builder.go:57] Master version "v3.2.1.15", Builder version "v3.2.1.15"
I0929 23:48:05.609275 1 builder.go:145] Running build with cgroup limits: api.CGroupLimits{MemoryLimitBytes:92233720368547, CPUShares:2, CPUPeriod:100000, CPUQuota:-1, MemorySwap:92233720368547}
I0929 23:48:05.630622 1 source.go:180] Receiving source from STDIN as archive ...
I0929 23:48:06.706585 1 source.go:188] Extracting...
bsdtar: (null)
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
F0929 23:48:06.706635 1 builder.go:204] Error: build error: unable to extract binary build input, must be a zip, tar, or gzipped tar, or specified as a file: exit status 2

Triggered 10 builds on 3.3.0.33; all builds succeeded:
[root@ip-172-18-15-96 frontend]# oc get build -n haowang1
NAME          TYPE     FROM             STATUS     STARTED          DURATION
frontend-1    Docker   Git@5a9f311      Complete   26 minutes ago   2m39s
frontend-10   Docker   Binary@5a9f311   Complete   2 minutes ago    2m7s
frontend-2    Docker   Binary@5a9f311   Complete   24 minutes ago   2m44s
frontend-3    Docker   Binary@5a9f311   Complete   21 minutes ago   2m34s
frontend-4    Docker   Binary@5a9f311   Complete   18 minutes ago   2m25s
frontend-5    Docker   Binary@5a9f311   Complete   16 minutes ago   2m38s
frontend-6    Docker   Binary@5a9f311   Complete   13 minutes ago   2m45s
frontend-7    Docker   Binary@5a9f311   Complete   10 minutes ago   2m59s
frontend-8    Docker   Binary@5a9f311   Complete   7 minutes ago    2m47s
frontend-9    Docker   Binary@5a9f311   Complete   4 minutes ago    2m18s
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1988