+++ This bug was initially created as a clone of Bug #1707941 +++
Description of problem:
in a dockerfile,
COPY . .
is failing in some cases.
Steps to Reproduce:
1. git clone git:operator-framework/helm.git
2. buildah bud .
3. see failure:
error building at STEP "COPY . .": error copying "/home/bparees/git/gocode/src/github.com/openshift/helm/pkg/chartutil/testdata/joonix/charts/frobnitz" to "/home/bparees/.local/share/containers/storage/vfs/dir/a95fa17f13262c63706f22e35a8c0186a522bff0df57c97028c88867df39bd02/go/src/k8s.io/helm": Can't copy a directory
4. docker build .
5. see success
Actual results:
buildah bud fails, docker build succeeds.
Expected results:
Both should succeed.
This is a blocker for OCP4.1 because ocp image builds are experiencing the same failure. buildah is just an easy reproducer.
There are also similar looking cases of COPY that seem to work fine:
git clone git:openshift/elasticsearch-operator.git
buildah bud .
succeeds despite doing pretty much the same COPY operation: https://github.com/openshift/elasticsearch-operator/blob/master/Dockerfile#L3
--- Additional comment from Ben Parees on 2019-05-08 18:46:40 UTC ---
full list of github repos i'm seeing this issue with:
I am also seeing a slightly different issue on these repos, but the overall effect is the same in that docker builds them fine, buildah fails:
operator-framework/operator-registry fails with:
STEP 13: RUN mkdir /registry
STEP 14: WORKDIR /registry
STEP 15: COPY --from=builder /go/src/github.com/operator-framework/operator-registry/bin/initializer /bin/initializer
STEP 16: COPY --from=builder /go/src/github.com/operator-framework/operator-registry/bin/registry-server /bin/registry-server
STEP 17: COPY --from=builder /go/src/github.com/operator-framework/operator-registry/bin/configmap-server /bin/configmap-server
STEP 18: COPY --from=builder /go/src/github.com/operator-framework/operator-registry/bin/appregistry-server /bin/appregistry-server
STEP 19: COPY --from=builder /go/bin/grpc_health_probe /bin/grpc_health_probe
STEP 20: RUN chgrp -R 0 /registry && chgrp -R 0 /dev && chmod -R g+rwx /registry && chmod -R g+rwx /dev
chgrp: changing group of '/dev/urandom': Permission denied
chgrp: changing group of '/dev/zero': Permission denied
chgrp: changing group of '/dev/tty': Permission denied
chgrp: changing group of '/dev/full': Permission denied
chgrp: changing group of '/dev/random': Permission denied
chgrp: changing group of '/dev/null': Permission denied
error building at STEP "RUN chgrp -R 0 /registry && chgrp -R 0 /dev && chmod -R g+rwx /registry && chmod -R g+rwx /dev": error while running runtime: exit status 1
ERRO exit status 1
openshift/cluster-api-provider-azure fails with:
STEP 1: FROM registry.svc.ci.openshift.org/openshift/release:golang-1.10 AS builder
STEP 2: WORKDIR /go/src/sigs.k8s.io/cluster-api-provider-azure
STEP 3: COPY pkg/ pkg/
STEP 4: COPY cmd/ cmd/
STEP 5: COPY vendor/ vendor/
error building at STEP "COPY vendor/ vendor/": error copying "/home/bparees/git/gocode/src/github.com/openshift/cluster-api-provider-azure/vendor/k8s.io/kubernetes/.bazelrc" to "/home/bparees/.local/share/containers/storage/vfs/dir/b2e6a7668c62fa0e1d9ac68cb38bf1bf367131424c88cfaef259cf7861a8b264/go/src/sigs.k8s.io/cluster-api-provider-azure/vendor": stat /home/bparees/git/gocode/src/github.com/openshift/cluster-api-provider-azure/vendor/k8s.io/kubernetes/.bazelrc: no such file or directory
ERRO exit status 1
--- Additional comment from Nalin Dahyabhai on 2019-05-08 19:43:06 UTC ---
It looks like the handling of .dockerignore files has difficulty with symbolic links (and probably other non-directory, non-regular items).
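One way to check whether a given build context contains the kind of entries described here is to list everything that is neither a regular file nor a directory. This is a minimal sketch, not part of buildah; the function name `list_unusual_entries` is made up for illustration, and it relies only on standard `find` type tests.

```shell
# Hypothetical helper (not part of buildah): list entries in a build
# context that are neither regular files nor directories, for example
# symlinks, FIFOs, sockets, or device nodes.
list_unusual_entries() {
    context_dir="${1:-.}"
    # find does not follow symlinks by default, so a symlink is reported
    # as a symlink even when it points at a regular file or directory.
    find "$context_dir" ! -type f ! -type d -print
}
```

Running `list_unusual_entries .` from the top of a cloned repo prints one path per line; an empty result suggests the context has no such entries.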
--- Additional comment from chris alfonso on 2019-05-08 20:22:05 UTC ---
Based upon your investigation, I'd like to move this to 4.2 as we wouldn't hold the GA release for this fix.
--- Additional comment from Ben Parees on 2019-05-08 22:45:17 UTC ---
Just to clarify the impact of this bug, based on my understanding from Nalin:
if you have an image build context directory containing:
1) a .dockerignore
2) a symlink (or other "unusual" file type)
and then you do a
COPY . /somedir
in your dockerfile.
Then your build will fail. It does not matter if the .dockerignore references the symlink or not.
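The conditions above can be sketched as a small script that builds such a context from scratch. This is only an illustration: the function name `make_repro_context`, the file names, and the `FROM scratch` base image are assumptions, not taken from any of the affected repos.

```shell
# Hypothetical helper: create a minimal build context with the two
# ingredients described above. All names here are illustrative.
make_repro_context() {
    context_dir="$1"
    mkdir -p "$context_dir/subdir"
    # A .dockerignore that does not reference the symlink at all.
    printf 'README.md\n' > "$context_dir/.dockerignore"
    printf 'placeholder\n' > "$context_dir/README.md"
    # The "unusual" entry: a symlink to a directory inside the context.
    ln -s subdir "$context_dir/symlink"
    # Base image is an assumption; scratch avoids pulling anything.
    printf 'FROM scratch\nCOPY . /somedir\n' > "$context_dir/Dockerfile"
}
```

After `make_repro_context /tmp/repro`, running `buildah bud /tmp/repro` and `docker build /tmp/repro` should let you compare the two builders on the same context. On an affected buildah the first would be expected to fail as described, though I have not confirmed this exact layout against every version.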
For the RUN issue, we should split it out into a separate (4.1.z+4.2.0) targeted bug as it's an unrelated issue and less severe in terms of likely users impacted.
--- Additional comment from Nalin Dahyabhai on 2019-05-09 15:30:30 UTC ---
https://github.com/containers/buildah/pull/1583 should fix the issues with symbolic links.
--- Additional comment from Nalin Dahyabhai on 2019-05-13 14:15:17 UTC ---
https://github.com/openshift/builder/pull/72 should merge the fix into the builder.
release-4.1 PR: https://github.com/openshift/builder/pull/73
Verified on the image build side in version:
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-05-16-223922   True        False         5h58m   Cluster version is 4.1.0-0.nightly-2019-05-16-223922
1. Create a new build from a repo whose directory contains a symlink and a .dockerignore file.
$ oc new-build https://github.com/wewang58/dockerignore2
2. The build completes:
[wewang@Desktop dockerignore2]$ oc get builds
NAME              TYPE     FROM          STATUS     STARTED          DURATION
dockerignore2-1   Docker   Git@831c29a   Complete   23 seconds ago   18s
[wewang@Desktop dockerignore2]$ ls -al
drwxrwxr-x. 4 wewang wewang 4096 May 17 16:31 .
drwx------. 39 wewang wewang 20480 May 17 16:31 ..
-rw-rw-r--. 1 wewang wewang 22 May 17 16:31 Dockerfile
-rw-rw-r--. 1 wewang wewang 10 May 17 16:29 .dockerignore
drwxrwxr-x. 8 wewang wewang 4096 May 17 16:32 .git
-rw-rw-r--. 1 wewang wewang 16 May 16 16:53 README.md
drwxrwxr-x. 3 wewang wewang 4096 May 17 10:52 subdir
lrwxrwxrwx. 1 wewang wewang 6 May 17 10:23 symlink -> subdir
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.