Bug 1934177
Summary: | knative-camel-operator CreateContainerError "container_linux.go:366: starting container process caused: chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied" | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Marek Schmidt <maschmid> |
Component: | Node | Assignee: | Peter Hunt <pehunt> |
Node sub component: | CRI-O | QA Contact: | Weinan Liu <weinliu> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | medium | ||
Priority: | medium | CC: | afield, aos-bugs, dwalsh, jokerman, ngirard, pehunt, swasthan, tsweeney |
Version: | 4.7 | Keywords: | Reopened |
Target Milestone: | --- | ||
Target Release: | 4.8.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | runc-1.0.0-84.rhaos4.6.git7116f03 | Doc Type: | Bug Fix |
Doc Text: | Cause: runc changed the order in which it sets up the workdir of a container. Consequence: Container creation errors occurred if the workdir wasn't owned by the user running runc. Fix: runc was updated to attempt the chdir to the workdir multiple times, in case one attempt does not work. Result: Container creation succeeds regardless of whether the workdir is owned by the container user or the user running runc. |
Last Closed: | 2021-07-27 22:49:01 UTC | Type: | Bug |
Description
Marek Schmidt
2021-03-02 16:28:51 UTC
The image used by the knative-camel-operator is gcr.io/knative-releases/knative.dev/eventing-camel/cmd/controller@sha256:874b498fc53ee5060c4f897c3fdf193a457d7c51c6ae6acc336d57518e848882

Specifically, it seems the regression is between 4.6.16 (which also works) and 4.6.17 (on which it fails with CreateContainerError).

Ah, it seems you've hit the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=1915397

Make sure the WORKDIR is accessible by the user the container runs as.

The image is an upstream image based on gcr.io/distroless/static:nonroot

This affects any image based on gcr.io/distroless/static:nonroot that doesn't modify WORKDIR, e.g.

oc new-app quay.io/maschmid/helloworld:latest

which is just

FROM gcr.io/distroless/static:nonroot
ADD hello_world /hello_world
CMD ["/hello_world"]

This image works on 4.6.15, but doesn't on 4.7.0.

Running `id=$(podman pull -1 gcr.io/distroless/static:nonroot)` and then `podman inspect $id` returns:

[{
  ...
  "Config": {
    "User": "65532",
    "Env": [
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
      "SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt"
    ],
    "WorkingDir": "/home/nonroot"
  },
  ...
}]

So the WORKDIR is actually modified, just not by your Dockerfile. I suspect that `oc new-app` is running this container as a random uid, and that uid is not 65532. I would recommend running as that user, or doing something similar to what CNV did to work around this issue:

```
RUN chgrp -R 0 /home/nonroot && \
    chmod -R g=u /home/nonroot
```

for whatever group your container ends up running as.

`podman pull -q gcr.io/distroless/static:nonroot` should be the first command.

Does the workaround work for you?

As we don't have direct control over the image, we're trying to work around this by setting "runAsUser: 65532" in the operator: https://github.com/operator-framework/community-operators/pull/3262

That PR merged. Is that workaround sufficient? Can we close this?
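
To make the uid mismatch discussed in the comments above concrete, here is a minimal Python sketch of the classic POSIX owner/group/other check for entering a directory. It is a simplified model for an unprivileged process, not runc's or the kernel's code: it ignores ACLs, SELinux, and capabilities. The ownership and mode assumed for /home/nonroot (65532:65532, 0700), the uid 1000620000, and the helper name `can_chdir` are illustrative assumptions, not values taken from this bug.

```python
import stat


def can_chdir(dir_uid, dir_gid, dir_mode, proc_uid, proc_gids):
    """Simplified POSIX check: may an unprivileged process enter a directory?

    Only the first matching class (owner, then group, then other) is consulted.
    """
    if proc_uid == dir_uid:
        return bool(dir_mode & stat.S_IXUSR)
    if dir_gid in proc_gids:
        return bool(dir_mode & stat.S_IXGRP)
    return bool(dir_mode & stat.S_IXOTH)


# Assumed state of the working directory in the base image (hypothetical values).
workdir = {"dir_uid": 65532, "dir_gid": 65532, "dir_mode": 0o700}

# The image's own user can enter its working directory.
image_user_ok = can_chdir(**workdir, proc_uid=65532, proc_gids={65532})

# A hypothetical arbitrary uid that only belongs to group 0 cannot.
random_uid_ok = can_chdir(**workdir, proc_uid=1000620000, proc_gids={0})

# After `chgrp -R 0` and `chmod -R g=u`, group 0 mirrors the owner's bits (0o770).
patched = {"dir_uid": 65532, "dir_gid": 0, "dir_mode": 0o770}
patched_ok = can_chdir(**patched, proc_uid=1000620000, proc_gids={0})

print(image_user_ok, random_uid_ok, patched_ok)  # prints: True False True
```

Under these assumptions, both suggestions in the thread resolve the mismatch: running as uid 65532 matches the owner class, while the chgrp/chmod change grants the same access through the group class.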
Specifically for the knative-camel-operator, the issue is fixed by our "workaround". I'd leave it up to you whether you want to track the general problem of making images based on gcr.io/distroless/static:nonroot "just work" on OpenShift as they did before 4.6.17. (I'd consider this a serious regression, as this behavior can break applications when upgrading to a new OCP micro release, but I understand that it was an unfortunate trade-off that had to be made to fix a different regression vs OCP 3.x.)

Yeah, I deem this to be an unfortunate trade-off. Since this behavior is more correct, the regression must be allowed to happen.

I've had a change of heart. I believe we can fix this case because it *was* previously valid. I've attached the PR. If it is accepted by upstream, I will backport it to 4.5+.

*** Bug 1944312 has been marked as a duplicate of this bug. ***

I have worked around the issue and submitted the patch to 4.5-4.8.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438
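
The approach described in the Doc Text (attempting the chdir more than once) can be sketched roughly as follows. This is a minimal Python illustration, not runc's actual Go code: `enter_workdir`, `FakeProcess`, and the uids and paths used are hypothetical, and the `chdir` and `switch_user` callables are injected so the logic can be exercised without root privileges.

```python
def enter_workdir(cwd, chdir, switch_user):
    """Try to chdir before switching to the container user; on a permission
    error, retry once after the switch. Any other error propagates at once."""
    pending = bool(cwd)
    if pending:
        try:
            chdir(cwd)
            pending = False
        except PermissionError:
            pass  # The container user may still have access; retry below.
    switch_user()
    if pending:
        chdir(cwd)  # A second failure is reported to the caller.


class FakeProcess:
    """Toy stand-in for a process whose access depends on its current uid."""

    def __init__(self, uid, allowed_uids):
        self.uid = uid
        self.allowed_uids = allowed_uids
        self.cwd = "/"

    def chdir(self, path):
        if self.uid not in self.allowed_uids:
            raise PermissionError(path)
        self.cwd = path


# Case 1: only the initial user (uid 0 here) can enter the directory.
p1 = FakeProcess(uid=0, allowed_uids={0})
enter_workdir("/home/nonroot", p1.chdir, lambda: setattr(p1, "uid", 1000620000))

# Case 2: only the container user can enter it (hypothetical path).
p2 = FakeProcess(uid=0, allowed_uids={65532})
enter_workdir("/data", p2.chdir, lambda: setattr(p2, "uid", 65532))

print(p1.cwd, p2.cwd)  # prints: /home/nonroot /data
```

Trying both orders means the chdir succeeds if either the initial user or the container user can enter the directory, which matches the Result stated in the Doc Text.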