Bug 1810182

Summary: [4.4] Components using globs in Dockerfile COPY commands may break on OCP 4
Product: OpenShift Container Platform Reporter: Adam Kaplan <adam.kaplan>
Component: Release Assignee: Ben Parees <bparees>
Status: CLOSED ERRATA QA Contact: Ke Wang <kewang>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 4.1.z CC: aos-bugs, bparees, jokerman, lszaszki, sbatsche, wsun
Target Milestone: ---   
Target Release: 4.4.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: 1810181
: 1810184 (view as bug list) Environment:
Last Closed: 2020-05-13 20:05:24 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On: 1810181    
Bug Blocks: 1810184    

Description Adam Kaplan 2020-03-04 16:43:13 UTC
+++ This bug was initially created as a clone of Bug #1810181 +++

This is a tracking bug for components that may not be able to immediately migrate their CI to 4.x clusters due to a behavior skew between imagebuilder and buildah.

Docker has a longstanding bug with globs in COPY directives (e.g. COPY foo/dir/* /tmp/dir/). If the source directory contains subdirectories, those subdirectories are not recreated in the destination [1][2]. Buildah, which is used to run builds in OCP 4.x, inherited this bug to maintain compatibility with Docker.

Imagebuilder - which is used to run multistage Dockerfile builds on OCP 3.11 - does not have this bug. Teams migrating their CI jobs from our 3.11 cluster to 4.x clusters may encounter failures if their build relies on glob copies that preserve the subdirectory structure.
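
To see the skew locally, a throwaway build context along the following lines can help. This is only a sketch: the function name, the file names, and the busybox base image are assumptions chosen for illustration and are not taken from any of the affected repos.

```shell
#!/usr/bin/env bash
# Sketch: create a minimal build context that exercises a glob COPY.
# make_glob_context, the file names, and the busybox base image are
# illustrative assumptions, not part of any affected repo.
make_glob_context() {
  local ctx=$1
  mkdir -p "$ctx/foo/dir/sub"
  echo top    > "$ctx/foo/dir/top.txt"
  echo nested > "$ctx/foo/dir/sub/nested.txt"
  # The quoted heredoc writes the Dockerfile verbatim (no shell expansion).
  cat > "$ctx/Dockerfile" <<'EOF'
FROM busybox
COPY foo/dir/* /tmp/dir/
RUN find /tmp/dir
EOF
}

# Example use (not run here):
#   ctx=$(mktemp -d) && make_glob_context "$ctx"
#   docker build --no-cache "$ctx"    # or: buildah bud "$ctx"
```

Going by the behavior described in [1][2], a Docker or buildah build of this context would be expected to leave no /tmp/dir/sub directory in the image, whereas an imagebuilder build would keep it. Depending on the builder, plain progress output may be needed to see what the RUN step prints.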

The following repos (producing images with the referenced Dockerfiles) may be impacted:

openshift/cluster-etcd-operator › Dockerfile.rhel7
openshift/cluster-image-registry-operator › Dockerfile
openshift/cluster-kube-apiserver-operator › Dockerfile.rhel7
openshift/cluster-kube-controller-manager-operator › Dockerfile.rhel7
openshift/cluster-kube-scheduler-operator › Dockerfile.rhel7
openshift/cluster-logging-operator › Dockerfile
openshift/cluster-nfd-operator › Dockerfile.rhel7
openshift/cluster-samples-operator › Dockerfile.rhel7
openshift/loki › fluentd/fluent-plugin-grafana-loki/Dockerfile
openshift/must-gather › Dockerfile
openshift/must-gather › Dockerfile.rhel7
openshift/ocp-release-operator-sdk › ci/dockerfiles/ansible-e2e-hybrid.Dockerfile
openshift/ocp-release-operator-sdk › ci/dockerfiles/ansible.Dockerfile
openshift/ocs-operator › must-gather/Dockerfile
openshift/origin-aggregated-logging › curator/Dockerfile
openshift/origin-aggregated-logging › curator/Dockerfile.centos7
openshift/prometheus-operator › scripts/tooling/Dockerfile
openshift/router › images/router/haproxy/Dockerfile
openshift/router › images/router/haproxy/Dockerfile.rhel
openshift/router › images/router/nginx/Dockerfile
openshift/router › images/router/nginx/Dockerfile.rhel
openshift/svt › networking/synthetic/stac-s2i-builder-image/Dockerfile
openshift/windows-machine-config-operator › build/Dockerfile


Solution:

When possible, use simple directory copies with a trailing slash in the destination instead of globs. This copies the directory's contents, including its subdirectories, to the destination.

Example:

```
COPY foo/dir /tmp/dir/
```

will copy the contents of “dir”, subdirectories included, to /tmp/dir. This may require you to update the source file structure so that the source and destination directories align.

Actions:

1. If your Dockerfile is referenced above, review the Dockerfiles used to build your images.
2. Test your Dockerfile build with a current version of Docker or buildah. If your build fails with Docker/buildah, replace glob usage with simple directory copies.
3. Submit a PR with your changes, referencing the appropriate BZ for the version you are targeting your PR against.
4. Cherry-pick your PRs back to the earliest version in which the referenced Dockerfile was used to produce an image. Retitle each cherry-picked PR with the appropriate BZ and mention Ben Parees (@bparees) in your pull request.
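
As a starting point for the review in step 1, a small grep wrapper can list candidate lines. This is a rough sketch rather than a complete Dockerfile parser: the function name is hypothetical, it only looks for `*` and `?` (bracket patterns are skipped so that JSON-form COPY lines are not flagged), and it does not follow instructions continued across lines with a backslash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print COPY/ADD lines that contain a '*' or '?' glob.
# Output format is file:line:content. Exit status is grep's, so it is
# non-zero when no such line is found.
find_glob_copies() {
  # -n: line numbers, -H: always print the file name,
  # -i: Dockerfile instructions are case-insensitive, -E: extended regex
  grep -nHiE '^[[:space:]]*(COPY|ADD)[[:space:]].*[*?]' "$@"
}

# Example use (not run here):
#   find_glob_copies Dockerfile Dockerfile.rhel7
```

A match only means the line deserves a look. Whether the build actually depends on preserved subdirectories still has to be checked by building with Docker or buildah, as in step 2.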


Additional Info:

[1] https://github.com/moby/moby/issues/29211
[2] https://github.com/moby/moby/issues/15858

Comment 5 Ke Wang 2020-04-24 08:29:15 UTC
Verified with OCP build 4.4.0-0.nightly-2020-04-23-224300. Per PR https://github.com/openshift/cluster-kube-apiserver-operator/pull/808, we need to check that the related manifest files are deployed to the specified location in the openshift-kube-apiserver-operator pod. Verification:

$  oc rsh -n openshift-kube-apiserver-operator kube-apiserver-operator-76d44fcccb-8jxd6
sh-4.2# ls /usr/share/bootkube/manifests/*
/usr/share/bootkube/manifests/bootstrap-manifests:
kube-apiserver-pod.yaml

/usr/share/bootkube/manifests/config:
bootstrap-config-overrides.yaml

/usr/share/bootkube/manifests/manifests:
00_openshift-kube-apiserver-ns.yaml	      configmap-csr-controller-ca.yaml		      secret-kube-apiserver-to-kubelet-signer.yaml
00_openshift-kube-apiserver-operator-ns.yaml  configmap-kubelet-bootstrap-kubeconfig-ca.yaml  secret-loadbalancer-serving-signer.yaml
cluster-role-binding-kube-apiserver.yaml      configmap-sa-token-signing-certs.yaml	      secret-localhost-serving-signer.yaml
cluster-role-kube-apiserver.yaml	      secret-aggregator-client-signer.yaml	      secret-service-network-serving-signer.yaml
configmap-admin-kubeconfig-client-ca.yaml     secret-control-plane-client-signer.yaml

The required manifests are found.
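
For anyone repeating this kind of check on another component, the manual listing could be scripted roughly as below. This is a sketch that assumes bash is available in the container. The function name is hypothetical, and the subdirectory names in the usage comment are simply the ones shown in the listing above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if every named subdirectory exists
# under ROOT and contains at least one entry.
check_subdirs() {
  local root=$1
  shift
  local sub status=0
  for sub in "$@"; do
    if [ -d "$root/$sub" ] && [ -n "$(ls -A "$root/$sub")" ]; then
      echo "ok: $root/$sub"
    else
      echo "missing or empty: $root/$sub" >&2
      status=1
    fi
  done
  return "$status"
}

# Example use inside the pod (not run here):
#   check_subdirs /usr/share/bootkube/manifests bootstrap-manifests config manifests
```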

Comment 7 Luke Meyer 2020-05-13 20:05:24 UTC
This should have been closed with the 4.4 GA release.