Bug 1944312
| Summary: | Upgrade to 4.6.20 causes pod to stop working due to permission denied | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Neil Girard <ngirard> |
| Component: | Node | Assignee: | Peter Hunt <pehunt> |
| Node sub component: | CRI-O | QA Contact: | MinLi <minmli> |
| Status: | CLOSED DUPLICATE | Docs Contact: | |
| Severity: | urgent | | |
| Priority: | high | CC: | aos-bugs, bsmitley, nagrawal |
| Version: | 4.6 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.8.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-04-06 17:54:59 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Hello, after trying different versions of OCP, it seems this issue was introduced in 4.6.17.

Ah, it seems you've hit the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=1915397. Make sure the WORKDIR is accessible by the user the container runs as. Here's what CNV did to fix it:
```
RUN chgrp -R 0 /home/nonroot && \
    chmod -R g=u /home/nonroot
```
Do the same for whatever group your container ends up running as.

The image itself doesn't have chgrp, but I was able to work around it by doing:
```
apiVersion: v1
kind: Pod
metadata:
  name: foo
spec:
  containers:
  - name: foo
    image: hasheddan/crossplane:nocreds
    workingDir: /tmp
```
This works because the command is run from PATH anyway, so it does not depend on the working directory.
Is this acceptable?
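
To illustrate why making the WORKDIR group-accessible (or pointing workingDir at /tmp) avoids the chdir failure, here is a minimal Python sketch of the classic Unix directory search check. It is a simplified model, not what runc or the kernel actually executes: it ignores ACLs, capabilities, SELinux, and parent directories. The UIDs, GIDs, and modes used in the example are hypothetical and were not taken from the image in this bug.

```python
import stat


def can_chdir(dir_mode, dir_uid, dir_gid, uid, gids):
    """Return True if a process (uid, group set gids) may chdir into a directory.

    Simplified model: root bypasses the check; otherwise exactly one of the
    owner/group/other classes applies and its execute (search) bit must be set.
    """
    if uid == 0:
        return True
    if uid == dir_uid:
        return bool(dir_mode & stat.S_IXUSR)
    if dir_gid in gids:
        return bool(dir_mode & stat.S_IXGRP)
    return bool(dir_mode & stat.S_IXOTH)


def chmod_g_equals_u(dir_mode):
    """Copy the owner permission bits onto the group bits, like `chmod g=u`."""
    user_bits = (dir_mode >> 6) & 0o7
    return (dir_mode & ~0o070) | (user_bits << 3)


# Hypothetical values for illustration only.
IMAGE_UID = 65532          # assumed owner of /home/nonroot in the image
ARBITRARY_UID = 1000620000 # assumed UID the container is started with
RUNTIME_GIDS = {0}         # assumed group set of the running container

# Before the fix: directory is 0700 and owned by the image user and group.
before = can_chdir(0o700, IMAGE_UID, IMAGE_UID, ARBITRARY_UID, RUNTIME_GIDS)

# After `chgrp -R 0` and `chmod -R g=u`: group 0 gets the owner's bits.
fixed_mode = chmod_g_equals_u(0o700)
after = can_chdir(fixed_mode, IMAGE_UID, 0, ARBITRARY_UID, RUNTIME_GIDS)

# A world-accessible directory such as /tmp (mode 1777) passes either way.
tmp_ok = can_chdir(0o1777, 0, 0, ARBITRARY_UID, RUNTIME_GIDS)
```

Under these assumptions `before` is False while `after` and `tmp_ok` are True, which matches the two workarounds discussed above.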
Hi Peter, I was able to work around it by changing the runAsUser as mentioned here (https://bugzilla.redhat.com/show_bug.cgi?id=1934177#c9). I did not try changing the workingDir. I'll have to try that and ask the customer. Thanks!

I am marking this as a dup of an older version of this bug, which I have reopened. I have also submitted a fix for this case to upstream runc. Let's see how the maintainers feel about it.

*** This bug has been marked as a duplicate of bug 1934177 ***
Description of problem:
Customer upgraded from 4.6.16 to 4.6.20 and a container stopped being able to run due to:

Error: container create failed: time="2021-03-18T20:30:44Z" level=error msg="container_linux.go:366: starting container process caused: chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied"

From what I am able to determine, the pod is being assigned a different security context setting for the seLinuxOptions level. In the case where the pod starts, it is being assigned:

s0:c24,c19

In the bad case I am seeing:

s0:c24,c14

In both cases, the anyuid SCC is assigned. I am also currently trying to find which exact 4.6 release may have introduced the change that is causing this issue.

The pod definition I am using to recreate the issue is:

    apiVersion: v1
    kind: Pod
    metadata:
      name: foo
    spec:
      containers:
      - name: foo
        image: hasheddan/crossplane:nocreds

Version-Release number of selected component (if applicable):
4.6.20

How reproducible:
Always

Steps to Reproduce:
1. Create the pod definition above.
2. Watch the pod attempt to start and fail.

Actual results:
Pod unable to start due to permission denied.

Expected results:

Additional info:
There are inspect outputs for 3 projects with the issue from the customer's cluster.
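
For triage across many pods, the failing working directory can be pulled out of the error string quoted in the description. The following is a rough Python sketch: the regular expression is written against the single message shown in this report, and the function name `parse_chdir_error` is hypothetical. Other runc versions may word the message differently.

```python
import re

# Matches the runc message quoted in this report; the quotes around the path
# may or may not be backslash-escaped depending on how the log was captured.
CHDIR_ERROR = re.compile(
    r'chdir to cwd \(\\?"(?P<cwd>[^"\\]+)\\?"\) '
    r'set in config\.json failed: (?P<reason>.+?)"?$'
)


def parse_chdir_error(message):
    """Return (cwd, reason) if the message is a chdir failure, else None."""
    match = CHDIR_ERROR.search(message)
    if match is None:
        return None
    return match.group("cwd"), match.group("reason")


# The error line from this bug, kept as a raw string so the escapes survive.
SAMPLE = (
    r'time="2021-03-18T20:30:44Z" level=error '
    r'msg="container_linux.go:366: starting container process caused: '
    r'chdir to cwd (\"/home/nonroot\") set in config.json failed: '
    r'permission denied"'
)

result = parse_chdir_error(SAMPLE)
```

With the sample above, `result` holds the directory `/home/nonroot` and the reason `permission denied`. Unrelated messages yield `None`.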