Bug 1944312 - Upgrade to 4.6.20 causes pod to stop working due to permission denied
Keywords:
Status: CLOSED DUPLICATE of bug 1934177
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: 4.8.0
Assignee: Peter Hunt
QA Contact: MinLi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-03-29 18:09 UTC by Neil Girard
Modified: 2021-04-06 17:54 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-04-06 17:54:59 UTC
Target Upstream Version:
Embargoed:


Description Neil Girard 2021-03-29 18:09:53 UTC
Description of problem:
Customer upgraded from 4.6.16 to 4.6.20 and a container stopped being able to run due to:

Error: container create failed: time="2021-03-18T20:30:44Z" level=error msg="container_linux.go:366: starting container process caused: chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied"

From what I am able to determine, the pod is being assigned a different seLinuxOptions level in its security context.  In the case where the pod starts successfully, it is assigned:

s0:c24,c19

In the failing case I am seeing:

s0:c24,c14

In both cases, the anyuid SCC is assigned.

I am also currently trying to find exactly which 4.6 release introduced the change that is causing this issue.

The pod definition I am using to recreate the issue is:

apiVersion: v1
kind: Pod
metadata:
  name: foo
spec:
  containers:
  - name: foo
    image: hasheddan/crossplane:nocreds

Version-Release number of selected component (if applicable):
4.6.20

How reproducible:
Always

Steps to Reproduce:
1. Create pod definition above
2. Watch pod attempt to start and fail.

Actual results:
Pod unable to start due to permission denied.

Expected results:


Additional info:

Inspect output is available for 3 affected projects from the customer's cluster.

Comment 1 Neil Girard 2021-03-30 12:01:35 UTC
Hello, after trying different versions of OCP, it seems this issue was introduced in 4.6.17.

Comment 3 Peter Hunt 2021-03-31 15:36:46 UTC
Ah, it seems you've hit the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=1915397

Make sure the WORKDIR is accessible by the user the container runs as.

Here's what CNV did to fix it:
 
```
RUN chgrp -R 0 /home/nonroot && \
    chmod -R g=u /home/nonroot
```
(substituting whatever group your container ends up running as for group 0)
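
To illustrate why the chgrp/chmod change helps, here is a minimal sketch of the classic POSIX mode-bit check for entering a directory. It is not how runc or the kernel is implemented, and it ignores root, capabilities, ACLs, and SELinux. The function names, the owner IDs (65532), the process uid (1000), and the starting mode (0700) are all hypothetical values chosen for the example, not taken from the image in this bug.

```python
import stat


def can_chdir(dir_mode, dir_uid, dir_gid, proc_uid, proc_gids):
    """Return True if an unprivileged process may chdir into a directory.

    Models only the owner/group/other search (execute) bits.
    Root, capabilities, ACLs and SELinux are deliberately ignored.
    """
    if proc_uid == dir_uid:
        return bool(dir_mode & stat.S_IXUSR)
    if dir_gid in proc_gids:
        return bool(dir_mode & stat.S_IXGRP)
    return bool(dir_mode & stat.S_IXOTH)


def chmod_g_equals_u(dir_mode):
    """Mimic `chmod g=u`: copy the owner permission bits onto the group bits."""
    owner_bits = dir_mode & 0o700
    return (dir_mode & ~0o070) | (owner_bits >> 3)


# Hypothetical starting point: directory owned by 65532:65532, mode 0700.
# The container process runs as uid 1000 with group 0.
before = can_chdir(0o700, 65532, 65532, proc_uid=1000, proc_gids={0})

# After `chgrp -R 0` and `chmod -R g=u`: group is 0 and group bits match owner bits.
fixed_mode = chmod_g_equals_u(0o700)
after = can_chdir(fixed_mode, 65532, 0, proc_uid=1000, proc_gids={0})

print(before, after, oct(fixed_mode))
```

Under these assumptions the directory is not enterable before the change and is enterable afterwards, because the process shares the directory's group and the group bits now mirror the owner's.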

Comment 5 Peter Hunt 2021-03-31 19:19:02 UTC
The image itself doesn't have chgrp, but I was able to work around it by doing:
```
apiVersion: v1
kind: Pod
metadata:
  name: foo
spec:
  containers:
  - name: foo
    image: hasheddan/crossplane:nocreds
    workingDir: /tmp
```

This works because the command is run from PATH anyway.


Is this acceptable?

Comment 6 Neil Girard 2021-04-01 19:26:23 UTC
Hi Peter, I was able to work around it by changing the runAsUser as mentioned here (https://bugzilla.redhat.com/show_bug.cgi?id=1934177#c9). I did not try changing the workingDir.  I'll have to try that and ask the customer.  Thanks!

Comment 9 Peter Hunt 2021-04-06 17:54:59 UTC
I am marking this as a dup of an older version of this bug, which I have reopened.
I have also submitted a fix for this case to upstream runc. Let's see how the maintainers feel about it.

*** This bug has been marked as a duplicate of bug 1934177 ***
