Created attachment 1690453 [details]
Test script for reproducing bug and implementing work around

## Description of problem

When a BuildConfig has a postCommit script defined, and the OpenShift cluster is configured to [whitelist specific registries](https://docs.openshift.com/container-platform/4.4/openshift_images/image-configuration.html), all builds with a postCommit will fail during the COMMIT step.

## Version-Release number of selected component (if applicable)

OpenShift Container Platform 4.2, 4.3, & 4.4

## How reproducible

Always

## Steps to reproduce

1. Add a whitelist of allowed registries.

```bash
oc patch image.config.openshift.io/cluster --type=merge -p '
spec:
  registrySources:
    allowedRegistries:
    - image-registry.openshift-image-registry.svc:5000
    - registry.access.redhat.com
    - registry.redhat.io
    - registry.connect.redhat.com
    - quay.io
    - docker.io
'
```

2. Deploy an application in OpenShift.
3. Add a postCommit to the application's build.

```bash
oc patch bc/${APP_NAME} --type=merge -p '
spec:
  postCommit:
    script: echo "This is a test"
'
```

4. Start a build.

```bash
oc start-build ${APP_NAME} --build-loglevel=5 --wait --follow
```

5. Wait for the build to fail.

## Actual results

Build fails with the following error:

```text
...
STEP 9: CMD /usr/libexec/s2i/run
Getting image source signatures
Copying blob sha256:35b7a5c4e1b4a84fb05d9c6658572c2b7a9925a270e8f7860c0ae30671c0a57c
Copying blob sha256:eddcd8d2986daee57d8cd75add7ff3c998e668857847e0f2b3c3d3b7e02a3ab6
Copying blob sha256:f0f97bb39344256e639831d65c0c9db84aca2e9b0f1507f267b7cc128068fff0
Copying blob sha256:5a9c62a939b5a7eb752536378f00381f42c8cb293a026b29fa4a9384e56da6af
Copying blob sha256:72beca8812421a68c0ac833a371148e35043be85ad138b67cbce72602b92f4cc
Copying blob sha256:2aebf74dd0b4cfd3bd9b653dcae05a5c1ebd08fd27a6ea36f7a560fac9b9a5fe
Copying config sha256:1875230d5230a5d11979d5c7cac7ffbe115cbeb83b7a904ffca219ceda8db918
Writing manifest to image destination
Storing signatures
1875230d5230a5d11979d5c7cac7ffbe115cbeb83b7a904ffca219ceda8db918
STEP 10: FROM 1875230d5230a5d11979d5c7cac7ffbe115cbeb83b7a904ffca219ceda8db918
STEP 11: RUN /bin/sh -ic 'echo "This is a test"'
sh: no job control in this shell
This is a test
STEP 12: FROM 1875230d5230a5d11979d5c7cac7ffbe115cbeb83b7a904ffca219ceda8db918
STEP 13: COMMIT temp.builder.openshift.io/test-postcommit/python-3:e6712cd2
F0520 23:25:37.890850 1 helpers.go:114] error: build error: error copying image "1875230d5230a5d11979d5c7cac7ffbe115cbeb83b7a904ffca219ceda8db918": Source image rejected: Running image containers-storage:[overlay@/var/lib/containers/storage+/var/run/containers/storage]@1875230d5230a5d11979d5c7cac7ffbe115cbeb83b7a904ffca219ceda8db918 is rejected by policy.
```

## Expected results

Build should succeed.

## Additional info

1. When a container registry whitelist is not configured, builds with a postCommit succeed.
2. When a container registry whitelist is configured, builds without a postCommit succeed.
3. Patching a compute node's `/etc/containers/policy.json` does not fix this issue.
4. This [GitHub issue](https://github.com/openshift/builder/issues/71) is related but does not resolve the issue in this BZ.
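For anyone scripting the reproduction, the failure signature in the error under "Actual results" can be matched in a saved build log. The following is a minimal sketch, not part of the attached test script: the helper name `is_policy_rejection` and the log file path are hypothetical, and it assumes the log was saved to a file first (for example with `oc logs`).

```bash
#!/bin/sh
# Sketch only: `is_policy_rejection` is a hypothetical helper name.
# It exits 0 when the given build log contains the containers-storage
# policy rejection quoted in this report, and non-zero otherwise.
is_policy_rejection() {
  # $1: path to a saved build log
  grep -q 'Source image rejected: Running image containers-storage:.*is rejected by policy' "$1"
}
```

A CI job could call `is_policy_rejection build.log` after a failed build to distinguish this bug from unrelated build failures.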
## Root cause

When a build pod runs, it uses its own containers policy located at `/etc/containers/policy.json` inside the container image.

```json
{
  "default": [
    {
      "type": "insecureAcceptAnything"
    }
  ]
}
```

[*Source*](https://github.com/openshift/builder/blob/bb6e41a1e23a61e070274f778d86d4211bdb41ff/imagecontent/policy.json)

If a container registry whitelist is configured, however, the policy.json is overridden by a ConfigMap mounted at `/var/run/configs/openshift.io/build-system/policy.json`. This ConfigMap, `${APP_NAME}-${BUILD_NUMBER}-sys-config`, is generated by OpenShift before a build starts. The issue is that OpenShift does not include `containers-storage` in the whitelisted transports, so images read from local container storage are rejected by the default policy.

## Possible Fix

I believe the `openshift-controller-manager` contains the [broken code](https://github.com/openshift/openshift-controller-manager/blob/9d0118b20168324d21efba6ff7c244730abbd855/pkg/build/controller/build/build_controller.go#L2159). When it creates the transports, it needs to include a `containers-storage` entry.

```golang
policyObj.Transports = map[string]signature.PolicyTransportScopes{
	"atomic":             transportScopes,
	"docker":             transportScopes,
	"containers-storage": TODO, // add entry here
}
```

## Work around

As stated in BZ 1758014, users can define their own security policy for builds.

1. Create a policy.json file that includes `containers-storage` in the whitelist.

```json
{
  "default": [
    {
      "type": "reject"
    }
  ],
  "transports": {
    "atomic": { ... },
    "docker": { ... },
    "containers-storage": {
      "": [
        {
          "type": "insecureAcceptAnything"
        }
      ]
    }
  }
}
```

2. Create the ConfigMap `${APP_NAME}-${NEXT_BUILD_NUMBER}-sys-config` which includes the custom policy.json file.
3. Create the ConfigMaps `${APP_NAME}-${NEXT_BUILD_NUMBER}-ca` and `${APP_NAME}-${NEXT_BUILD_NUMBER}-global-ca`. These can be copied from previous builds.
   **NOTE:** If you don't create the CA ConfigMaps, the build will fail because the missing ConfigMaps cannot be mounted in the build pod.

4. Start the build.
5. Wait for the build to succeed.

The issue with this work around is that it must be executed prior to every build. I recommend using CI/CD to automate this process.
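Step 1 of the work around can be scripted. The following is a minimal sketch, not the attached test script: the helper name `add_containers_storage` and the file paths are hypothetical. It assumes you already have a copy of the generated policy.json, that its `transports` object is non-empty, and that the opening brace sits on the same line as the `"transports"` key.

```bash
#!/bin/sh
# Sketch only: `add_containers_storage` is a hypothetical helper name.
# It copies a policy.json and adds a "containers-storage" transport that
# accepts any image, matching the entry shown in step 1 of the work around.
# Assumptions: "transports" is non-empty, its opening brace is on the same
# line as the key, and the output path differs from the input path.
add_containers_storage() {
  # $1: existing policy.json, $2: path for the patched copy
  entry='"containers-storage": { "": [ { "type": "insecureAcceptAnything" } ] },'
  if grep -q '"containers-storage"' "$1"; then
    # Already present: copy unchanged so the helper is safe to re-run.
    cp "$1" "$2"
  else
    # Insert the new entry directly after the opening brace of "transports".
    sed "s|\"transports\"[[:space:]]*:[[:space:]]*{|& ${entry}|" "$1" > "$2"
  fi
}
```

The patched file could then be used for step 2, for example with `oc create configmap "${APP_NAME}-${NEXT_BUILD_NUMBER}-sys-config" --from-file=policy.json=patched-policy.json`, where `patched-policy.json` is a placeholder path. Whether the ConfigMap needs further keys beyond policy.json is not covered by this sketch.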
Still waiting for an available 4.6 nightly build payload to verify this.
Verified in version: 4.6.0-0.nightly-2020-06-07-065515

Steps:

1. Create the app: `oc new-app openshift/ruby~https://github.com/openshift/ruby-hello-world`
2. Add a whitelist of allowed registries.
3. Add a postCommit to the application's build.

```bash
oc patch bc/ruby-hello-world --type=merge -p '
spec:
  postCommit:
    script: echo "This is a test"
'
```

4. Start a build; the build completes.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196
@adam.kaplan Hi Adam, this issue has been observed again in 4.8 through 4.10 (refer to 03317106). Should we re-open this BZ or open a new one? Thanks, Anand
Hi Anand, Please open a new BZ and link the associated case. The original root cause of this issue was verified by QE in 4.6. Thank You, Adam