Bug 1810184

Summary: [4.3] Components using globs in Dockerfile COPY commands may break on OCP 4
Product: OpenShift Container Platform Reporter: Adam Kaplan <adam.kaplan>
Component: Release Assignee: Ben Parees <bparees>
Status: CLOSED ERRATA QA Contact: Ke Wang <kewang>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 4.1.z CC: aos-bugs, bparees, jokerman, kewang, lszaszki, wsun
Target Milestone: --- Flags: adam.kaplan: needinfo-
Target Release: 4.3.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1810182
: 1810185 Environment:
Last Closed: 2020-04-14 16:18:53 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1810182    
Bug Blocks: 1810185    

Description Adam Kaplan 2020-03-04 16:45:46 UTC
This is a tracking bug for components that may not be able to immediately migrate their CI to 4.x clusters due to a behavior skew between imagebuilder and buildah.

Docker has a longstanding bug when using globs in COPY directives (e.g. COPY foo/dir/* /tmp/dir/). If the source directory includes subdirectories, these subdirectories are not recreated in the destination [1][2]. Buildah, which is used to run builds in OCP 4.x, inherited this bug to maintain compatibility with Docker.

Imagebuilder - which is used to run multistage Dockerfile builds on OCP 3.11 - does not have this bug. Teams migrating their CI jobs from our 3.11 cluster to 4.x clusters may encounter failures if their build relies on glob copies that preserve the subdirectory structure.
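
To make the skew concrete, here is a toy Python model of the two behaviors described above. It is not the implementation of Docker, buildah, or imagebuilder; the function name `glob_copy`, its parameters, and the sample paths are hypothetical, and the "flattening" rule is an assumption based on the linked moby issues (a matched subdirectory contributes its contents, not itself).

```python
from pathlib import PurePosixPath


def glob_copy(files, pattern_dir, dest, preserve_subdirs):
    """Model `COPY <pattern_dir>/* <dest>/` over a list of source file paths.

    preserve_subdirs=True models the imagebuilder behavior described above;
    preserve_subdirs=False models the Docker/buildah behavior.
    """
    copied = []
    src_root = PurePosixPath(pattern_dir)
    for f in files:
        path = PurePosixPath(f)
        try:
            rel = path.relative_to(src_root)
        except ValueError:
            continue  # file is not under the globbed directory
        if preserve_subdirs or len(rel.parts) == 1:
            # Top-level files keep their name; with preserve_subdirs the
            # matched subdirectory name is kept as well.
            target = PurePosixPath(dest) / rel
        else:
            # Assumed Docker/buildah rule: the matched subdirectory itself
            # is dropped and only its contents land in the destination.
            target = PurePosixPath(dest) / PurePosixPath(*rel.parts[1:])
        copied.append(str(target))
    return sorted(copied)


sample = ["foo/dir/top.yaml", "foo/dir/config/a.yaml", "foo/other/x"]
flattened = glob_copy(sample, "foo/dir", "/tmp/dir", preserve_subdirs=False)
preserved = glob_copy(sample, "foo/dir", "/tmp/dir", preserve_subdirs=True)
```

A build that later expects /tmp/dir/config/a.yaml works under the second model but fails under the first, which is the failure mode this bug tracks.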

The following repos (producing images with the referenced Dockerfiles) may be impacted:

openshift/cluster-etcd-operator › Dockerfile.rhel7
openshift/cluster-image-registry-operator › Dockerfile
openshift/cluster-kube-apiserver-operator › Dockerfile.rhel7
openshift/cluster-kube-controller-manager-operator › Dockerfile.rhel7
openshift/cluster-kube-scheduler-operator › Dockerfile.rhel7
openshift/cluster-logging-operator › Dockerfile
openshift/cluster-nfd-operator › Dockerfile.rhel7
openshift/cluster-samples-operator › Dockerfile.rhel7
openshift/loki › fluentd/fluent-plugin-grafana-loki/Dockerfile
openshift/must-gather › Dockerfile
openshift/must-gather › Dockerfile.rhel7
openshift/ocp-release-operator-sdk › ci/dockerfiles/ansible-e2e-hybrid.Dockerfile
openshift/ocp-release-operator-sdk › ci/dockerfiles/ansible.Dockerfile
openshift/ocs-operator › must-gather/Dockerfile
openshift/origin-aggregated-logging › curator/Dockerfile
openshift/origin-aggregated-logging › curator/Dockerfile.centos7
openshift/prometheus-operator › scripts/tooling/Dockerfile
openshift/router › images/router/haproxy/Dockerfile
openshift/router › images/router/haproxy/Dockerfile.rhel
openshift/router › images/router/nginx/Dockerfile
openshift/router › images/router/nginx/Dockerfile.rhel
openshift/svt › networking/synthetic/stac-s2i-builder-image/Dockerfile
openshift/windows-machine-config-operator › build/Dockerfile


Solution:

When possible, use simple directory copies with a trailing slash in the destination instead of globs. This will copy the directory and its subdirectories to the destination.

Example:

```
COPY foo/dir/ /tmp/dir/
```

copies the contents of “dir”, including its subdirectories, to /tmp/dir. This may require you to update the source file structure so that the source and destination directories align.

Actions:

1. Review the Dockerfiles used to build your images if your Dockerfile is referenced above.
2. Test your Dockerfile build with a current version of Docker or buildah. If your build fails with Docker/buildah, replace glob usage with simple directory copies.
3. Submit a PR with your changes, referencing the appropriate BZ for the version you are targeting your PR against.
4. Cherry-pick your PRs back to the earliest version in which the referenced Dockerfile was used to produce an image. Retitle each PR with the appropriate BZ and mention Ben Parees (@bparees) in your pull request.
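
As a possible aid for the review in step 1, the following is a minimal Python sketch that flags COPY/ADD lines whose sources contain glob characters. The function name `find_glob_copies` and the sample Dockerfile text are hypothetical. The sketch assumes the shell form of the instruction on a single line: it skips the JSON-array form and does not join line continuations.

```python
import re

# Characters that make a COPY/ADD source a glob pattern.
GLOB_CHARS = re.compile(r"[*?\[]")


def find_glob_copies(dockerfile_text):
    """Return (line_number, line) pairs for COPY/ADD lines with glob sources."""
    hits = []
    for lineno, raw in enumerate(dockerfile_text.splitlines(), start=1):
        line = raw.strip()
        parts = line.split()
        if len(parts) < 3 or parts[0].upper() not in ("COPY", "ADD"):
            continue
        # Drop flags such as --from=builder; the last token is the destination.
        args = [p for p in parts[1:] if not p.startswith("--")]
        if args and args[0].startswith("["):
            continue  # JSON-array form is not handled in this sketch
        sources = args[:-1]
        if any(GLOB_CHARS.search(s) for s in sources):
            hits.append((lineno, line))
    return hits


example = (
    "FROM scratch\n"
    "COPY foo/dir/* /tmp/dir/\n"
    "COPY --from=builder /src/manifests /manifests/\n"
    "ADD bin/?tool /usr/bin/\n"
)
hits = find_glob_copies(example)
```

A flagged line is only a candidate for review; whether it breaks depends on whether the globbed directory actually contains subdirectories.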


Additional Info:

[1] https://github.com/moby/moby/issues/29211
[2] https://github.com/moby/moby/issues/15858

Comment 4 Ben Parees 2020-03-16 12:59:20 UTC
Sorry Wei Sun, this is not actually done yet; the automation moved it, but it's not ready.

In any case, I do not think QE will need to verify it. It's not a change to shipping code; it's something we need to do to ensure the images build correctly in our CI system.

Comment 8 Ke Wang 2020-04-08 03:50:57 UTC
Verified with OCP build 4.3.0-0.nightly-2020-04-07-141343. Per the PR https://github.com/openshift/cluster-kube-apiserver-operator/pull/816, we need to check that the related manifest files are deployed to the specified location in the openshift-kube-apiserver-operator pod. See the following verification:

$ oc rsh -n openshift-kube-apiserver-operator kube-apiserver-operator-6d9db77446-42lq7

sh-4.2# ls /usr/share/bootkube/manifests/*
/usr/share/bootkube/manifests/bootstrap-manifests:
kube-apiserver-pod.yaml

/usr/share/bootkube/manifests/config:
bootstrap-config-overrides.yaml  config-overrides.yaml

/usr/share/bootkube/manifests/manifests:
00_openshift-kube-apiserver-ns.yaml	      configmap-csr-controller-ca.yaml		    secret-loadbalancer-serving-signer.yaml
00_openshift-kube-apiserver-operator-ns.yaml  configmap-sa-token-signing-certs.yaml	    secret-localhost-serving-signer.yaml
cluster-role-binding-kube-apiserver.yaml      secret-aggregator-client-signer.yaml	    secret-service-network-serving-signer.yaml
cluster-role-kube-apiserver.yaml	      secret-control-plane-client-signer.yaml
configmap-admin-kubeconfig-client-ca.yaml     secret-kube-apiserver-to-kubelet-signer.yaml

The required manifests are found.

Comment 10 errata-xmlrpc 2020-04-14 16:18:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1393