1944312 – Upgrade to 4.6.20 causes pod to stop working due to permission denied

Summary: Upgrade to 4.6.20 causes pod to stop working due to permission denied

Keywords:
Status:	CLOSED DUPLICATE of bug 1934177
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Node
Sub Component:
Version:	4.6
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	urgent
Target Milestone:	---
Target Release:	4.8.0
Assignee:	Peter Hunt
QA Contact:	MinLi
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2021-03-29 18:09 UTC by Neil Girard
Modified:	2024-06-14 01:03 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-04-06 17:54:59 UTC
Target Upstream Version:
Embargoed:

Description Neil Girard 2021-03-29 18:09:53 UTC

Description of problem:
Customer upgraded from 4.6.16 to 4.6.20 and a container could no longer start due to:

Error: container create failed: time="2021-03-18T20:30:44Z" level=error msg="container_linux.go:366: starting container process caused: chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied"

From what I am able to determine, the pod is being assigned a different security context setting for the seLinuxOptions level. In the case where the pod starts, it is assigned:

s0:c24,c19

In the failing case I am seeing:

s0:c24,c14

In both cases, the anyuid SCC is assigned.
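
For comparison purposes, the level discussed above would typically show up in the pod's security context when the pod is dumped (for example with `oc get pod foo -o yaml`). The fragment below is only a sketch of where the field lives. It uses the level from the working case, and the surrounding layout is an assumption rather than output captured from the customer's cluster:

```yaml
# Illustrative fragment, not captured output.
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c24,c19"   # working case; the failing case showed s0:c24,c14
```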

I am also currently trying to find exactly which 4.6 release introduced the change that is causing this issue.

The pod definition I am using to recreate the issue is:

apiVersion: v1
kind: Pod
metadata:
  name: foo
spec:
  containers:
  - name: foo
    image: hasheddan/crossplane:nocreds

Version-Release number of selected component (if applicable):
4.6.20

How reproducible:
Always

Steps to Reproduce:
1. Create a pod from the definition above.
2. Watch the pod attempt to start and fail.

Actual results:
Pod unable to start due to permission denied.

Expected results:


Additional info:

There is inspect output for 3 projects with the issue from the customer's cluster.

Comment 1 Neil Girard 2021-03-30 12:01:35 UTC

Hello, after trying different versions of OCP, it seems this issue was introduced in 4.6.17.

Comment 3 Peter Hunt 2021-03-31 15:36:46 UTC

Ah, it seems you've hit the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=1915397

Make sure the WORKDIR is accessible by the user the container runs as.

Here's what CNV did to fix it:

```
RUN chgrp -R 0 /home/nonroot && \
    chmod -R g=u /home/nonroot
```
Substitute whatever group your container ends up running as.

Comment 5 Peter Hunt 2021-03-31 19:19:02 UTC

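The image itself doesn't have chgrp, but I was able to work around it by doing:
```
apiVersion: v1
kind: Pod
metadata:
  name: foo
spec:
  containers:
  - name: foo
    image: hasheddan/crossplane:nocreds
    # Override the image's working directory (/home/nonroot), which the
    # container process could not chdir into.
    workingDir: /tmp
```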

This should be fine, since the command is run from PATH anyway.

Is this acceptable?

Comment 6 Neil Girard 2021-04-01 19:26:23 UTC

Hi Peter, I was able to work around it by changing the runAsUser as mentioned here (https://bugzilla.redhat.com/show_bug.cgi?id=1934177#c9). I did not try changing the workingDir. I'll have to try that and ask the customer. Thanks!
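
For reference, a runAsUser override on the reproducer pod might look roughly like the sketch below. The UID `1000` is a hypothetical placeholder, not a value taken from this bug; the actual value should come from the linked comment and must be one that can access the image's working directory:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: foo
spec:
  containers:
  - name: foo
    image: hasheddan/crossplane:nocreds
    securityContext:
      # ASSUMPTION: placeholder UID for illustration only; replace it with
      # the value recommended in bug 1934177 comment 9.
      runAsUser: 1000
```

Because the pod is admitted under the anyuid SCC, an explicit UID like this should be accepted, but that has not been verified here.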

Comment 9 Peter Hunt 2021-04-06 17:54:59 UTC

I am marking this as a duplicate of an older report of this bug, which I have reopened.
I have also submitted a fix for this case to upstream runc. Let's see how the maintainers feel about it.

*** This bug has been marked as a duplicate of bug 1934177 ***
