Bug 1850148 - Executing mkdir commands inside pods results in `Permission denied`
Summary: Executing mkdir commands inside pods results in `Permission denied`
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: unclassified
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: OCS 4.5.0
Assignee: Michael Adam
QA Contact: Ben Eli
URL:
Whiteboard:
Depends On:
Blocks: 1850484
 
Reported: 2020-06-23 15:28 UTC by Ben Eli
Modified: 2020-09-23 09:04 UTC
CC List: 17 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-15 16:33:04 UTC
Embargoed:


Attachments

Description Ben Eli 2020-06-23 15:28:16 UTC
Description of problem:
As part of the automation performed in OCS-CI, we test the creation of an NGINX application pod.
However, with the latest version of OCP 4.5 the pod gets stuck in `CrashLoopBackOff`, even though it worked fine with the previous version.

Logs from the pod:
oc logs pod-test-rbd-a4f97c822dc84ea188eff99cc3c99672
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: Can not modify /etc/nginx/conf.d/default.conf (read-only file system?), exiting
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2020/06/23 15:15:59 [warn] 1#1: the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
nginx: [warn] the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
2020/06/23 15:15:59 [emerg] 1#1: mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)
nginx: [emerg] mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)

Another pod we try to create is the awscli_pod, using this YAML:
apiVersion: v1
kind: Pod
metadata:
  name: awscli
  namespace: default
spec:
  containers:
    - name: awscli
      image: amazon/aws-cli:2.0.13
      # Override the default `aws` entrypoint in order to 
      # allow the pod to run continuously and act as a relay
      command: ['/bin/sh']
      stdin: true
      tty: true

The pod is created successfully, but when we try to run `mkdir` inside it, it again fails with `Permission denied`. Just like the NGINX image, this always worked fine until now.

Since the NGINX pod is part of our deployment testing, all of our jobs that depend on deployment verification (deployments, PR verifications) cannot run, because the pod never reaches a `Ready` state.

Version-Release number of selected component (if applicable):
OCP 4.5.0-0.nightly-2020-06-23-020504
OCS 4.5.0-460.ci

How reproducible:
100%

Steps to Reproduce:
1. Deploy a cluster with the OCP and OCS versions described above
2. Try to create a new directory inside any pod by using `mkdir`


Actual results:
mkdir: cannot create directory <dir>: Permission denied

Expected results:
The directory is created successfully

Additional info:

Comment 1 Peter Hunt 2020-06-23 15:36:14 UTC
what is the directory you're trying to create, and what's the pod yaml for the nginx pod that's failing?

Comment 2 Ben Eli 2020-06-23 15:43:32 UTC
In the case of NGINX - 
/var/cache/nginx/client_temp

In the case of awscli_pod - 
/cert/

Both worked fine in the previous 4.5 versions.


NGINX pod YAML:
---
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
  namespace: default
spec:
  containers:
   - name: web-server
     image: nginx
     volumeMounts:
       - name: mypvc
         mountPath: /var/lib/www/html
  volumes:
   - name: mypvc
     persistentVolumeClaim:
       claimName: pvc
       readOnly: false

Comment 3 Ben Eli 2020-06-23 16:09:10 UTC
I just noticed something - 
`oc rsh` in OCP 4.5.0-0.nightly-2020-06-11-183238
sh-4.2# echo $(whoami)
root

`oc rsh awscli-relay-pod` in OCP 4.5.0-0.nightly-2020-06-23-075004
sh-4.2$ echo $(whoami)
1000570000

Somewhere along the way, the default user changed from root to... another one.
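For reference, OpenShift records a pre-allocated UID range per namespace in an annotation; when a pod is not allowed to run as root, it is typically assigned the first UID from that range. A minimal sketch to check it, assuming the namespace from this BZ:

oc get namespace openshift-storage -o yaml | grep sa.scc
# expected to show something like (values differ per cluster):
#   openshift.io/sa.scc.uid-range: 1000570000/10000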

Comment 4 Vasu Kulkarni 2020-06-23 18:04:07 UTC
Do you have the full Jenkins logs for this run?

Comment 6 Petr Balogh 2020-06-24 11:46:55 UTC
Is this really a change in OCP, or in the nginx image? If it is a change in OCP 4.5, can we see a link to the RFE/documentation or the change PR with the details, to get a better understanding of the change?

Is it still possible to run rsh on specific pods as the root user via some configuration?

If so, can you please mention it here?

Thanks

Comment 7 Petr Balogh 2020-06-24 11:53:06 UTC
https://github.com/red-hat-storage/ocs-ci/blob/master/ocs_ci/templates/app-pods/rhel-7_7.yaml#L15-L17

Will this help us if we set it on pods which are supposed to run as root?

    securityContext:
      privileged: true
      runAsUser: 0

Comment 8 Ben Eli 2020-06-24 12:10:27 UTC
The change also happens with a pinned AWSCLI image - so I do not think it's image-dependent.
I would also love a link to the RFE/docs regarding the change, if it is indeed in OCP.

I tested the securityContext attribute by adding it to our AWSCLI pod's YAML, and I did gain root access when I rsh'd into the pod, so it seems to work.
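For reference, a minimal sketch of how the snippet from comment 7 might look when applied to our awscli pod at container level; it assumes the pod's service account is allowed to use an SCC that permits privileged containers (e.g. the privileged SCC), otherwise admission will reject or override it:

apiVersion: v1
kind: Pod
metadata:
  name: awscli
  namespace: openshift-storage
spec:
  containers:
    - name: awscli
      image: amazon/aws-cli:2.0.13
      command: ['/bin/sh']
      stdin: true
      tty: true
      # Only admitted if the pod's service account may use a privileged SCC.
      securityContext:
        privileged: true
        runAsUser: 0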

Comment 9 Peter Hunt 2020-06-24 14:07:23 UTC
Ultimately, this does not really seem like an issue with the node or container component.

There is a working theory that the default user changed; does anyone know which component that would be?

Comment 10 Urvashi Mohnani 2020-06-24 18:21:21 UTC
Hi Ben,

With the following versions:

Server Version: 4.5.0-0.nightly-2020-06-23-020504
Kubernetes Version: v1.18.3+c44581d
crio version 1.18.2-15.dev.rhaos4.5.git7c4494f.el8

I don't see this issue when I am logged into oc as system:admin. I tried running both the nginx pod and the awscli pod, and both of them started with the root user.

➜  ~ oc whoami
system:admin

awscli pod:
apiVersion: v1
kind: Pod
metadata:
  name: awscli
  namespace: default
spec:
  containers:
    - name: awscli
      image: amazon/aws-cli:2.0.13
      # Override the default `aws` entrypoint in order to 
      # allow the pod to run continuously and act as a relay
      command: ['/bin/sh']
      stdin: true
      tty: true

➜  ~ oc rsh awscli
sh-4.2# whoami
root
sh-4.2# mkdir foo
sh-4.2# ls -al
total 0
drwxr-xr-x. 1 root root 29 Jun 24 18:02 .
drwxr-xr-x. 1 root root 51 Jun 24 17:46 ..
drwxr-xr-x. 2 root root  6 Jun 24 17:50 blah
drwxr-xr-x. 2 root root  6 Jun 24 18:02 foo

nginx pod:
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
  namespace: default
spec:
  containers:
   - name: web-server
     image: nginx

➜  ~ oc rsh demo-pod
# whoami
root
# mkdir foo
# ls
bin   dev		   docker-entrypoint.sh  foo   lib    media  opt   root  sbin  sys  usr
boot  docker-entrypoint.d  etc			 home  lib64  mnt    proc  run	 srv   tmp  var

When I logged into oc as a regular user, my pod was no longer running as root:

➜  ~ oc whoami
newton

awscli pod yaml is the same as above:
➜  ~ oc rsh awscli
sh-4.2$ whoami
1000580000
sh-4.2$ mkdir foo
mkdir: cannot create directory 'foo': Permission denied

I would say check which user you are logging into your cluster as. It is possible that it changed from system:admin to a regular user, and that is why you are hitting this issue. I looked through the CRI-O code and nothing regarding the setup of the user has changed on the CRI-O end.

Comment 11 Ben Eli 2020-06-24 19:51:06 UTC
Hi Urvashi - thank you for looking into this!

Forgive me; I was imprecise. There is one important difference between the YAML I shared here and the YAML I actually use - the namespace.
When I used the YAML that created the pod under the `default` namespace, I indeed received a root shell on the pod.
However, when I created the pod under the `openshift-storage` namespace, I was greeted with the unprivileged shell once more -

meridian@metropolis:~$ oc whoami
kube:admin
meridian@metropolis:~$ oc rsh awscli
sh-4.2$ whoami
1000580000
sh-4.2$ mkdir foo
mkdir: cannot create directory 'foo': Permission denied

I used the exact same YAML as the one you shared - just replaced `default` with `openshift-storage`.
This leads me to think that the issue is related to creating pods specifically in the `openshift-storage` namespace.

Comment 12 Urvashi Mohnani 2020-06-24 20:02:29 UTC
Hmm, that is weird. Does the `openshift-storage` namespace come with the cluster at install, or is it one you created? I have a cluster that I started with cluster-bot and don't see such a namespace.
These are the only namespaces I have with "storage" in the name:
➜  ~ oc projects | grep storage
    openshift-cluster-storage-operator
    openshift-kube-storage-version-migrator
    openshift-kube-storage-version-migrator-operator

I created the awscli pod in those namespaces and the user was root.

Comment 14 Urvashi Mohnani 2020-06-25 15:18:08 UTC
Hi Ben,

So I followed the steps in the link you pasted above and deployed OpenShift Container Storage on my cluster. After that, I started the awscli pod in the `openshift-storage` namespace and saw that it started with root as the user. I am unable to reproduce this - is there something specific about the setup that I am missing?

Server Version: 4.5.0-0.nightly-2020-06-23-020504
Kubernetes Version: v1.18.3+c44581d
crio version 1.18.2-15.dev.rhaos4.5.git7c4494f.el8

➜  ~ oc whoami
kube:admin

➜  ~ oc get pods --namespace openshift-storage
NAME                                                              READY   STATUS      RESTARTS   AGE
aws-s3-provisioner-5b855f74b7-wx8rv                               1/1     Running     0          18m
awscli                                                            1/1     Running     0          6m36s
csi-cephfsplugin-2q9qv                                            3/3     Running     0          15m
csi-cephfsplugin-jbmcq                                            3/3     Running     0          15m
csi-cephfsplugin-provisioner-849485f449-5246v                     5/5     Running     0          15m
csi-cephfsplugin-provisioner-849485f449-z4qs6                     5/5     Running     0          15m
csi-cephfsplugin-sxdnm                                            3/3     Running     0          15m
csi-rbdplugin-2dhzh                                               3/3     Running     0          15m
csi-rbdplugin-jgl6p                                               3/3     Running     0          15m
csi-rbdplugin-provisioner-5794d4754b-t9ll2                        5/5     Running     0          15m
csi-rbdplugin-provisioner-5794d4754b-tlpwb                        5/5     Running     0          15m
csi-rbdplugin-ztckb                                               3/3     Running     0          15m
lib-bucket-provisioner-5f54b79d57-pb7ck                           1/1     Running     0          18m
noobaa-core-0                                                     0/1     Pending     0          11m
noobaa-db-0                                                       0/1     Pending     0          11m
noobaa-operator-6d94f8f586-vm89h                                  1/1     Running     0          18m
ocs-operator-748bcc894b-zkx2n                                     0/1     Running     0          18m
rook-ceph-crashcollector-ip-10-0-132-191-846884f68d-vbnsd         1/1     Running     0          13m
rook-ceph-crashcollector-ip-10-0-186-1-5cb9585885-2csmg           1/1     Running     0          13m
rook-ceph-crashcollector-ip-10-0-235-89-747584d594-4wdhd          1/1     Running     0          13m
rook-ceph-drain-canary-c9a36249ce89c25e4e6db452c99d9126-74m42w5   1/1     Running     0          11m
rook-ceph-drain-canary-ip-10-0-186-1.us-west-1.compute.intj7tj6   1/1     Running     0          11m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-758d4d786cwwd   0/1     Pending     0          11m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-5b8f498ft2rdn   0/1     Pending     0          11m
rook-ceph-mgr-a-68b489c75-4hk4b                                   1/1     Running     0          12m
rook-ceph-mon-a-89cb68f77-cnbs2                                   1/1     Running     0          13m
rook-ceph-mon-b-6fccf77984-khjjf                                  1/1     Running     0          13m
rook-ceph-mon-c-64d99fcb9f-vwrkr                                  1/1     Running     0          13m
rook-ceph-operator-55cb4fb89f-fj4bt                               1/1     Running     0          18m
rook-ceph-osd-0-566878d969-9plq8                                  0/1     Pending     0          12m
rook-ceph-osd-1-654ffddbb6-nfmfr                                  1/1     Running     0          11m
rook-ceph-osd-2-6b876ff58f-5dmpq                                  1/1     Running     0          11m
rook-ceph-osd-prepare-ocs-deviceset-0-0-8trh2-9b5fr               0/1     Completed   0          12m
rook-ceph-osd-prepare-ocs-deviceset-1-0-k59jg-6g9f9               0/1     Completed   0          12m
rook-ceph-osd-prepare-ocs-deviceset-2-0-zcqrt-27krw               0/1     Completed   0          12m

➜  ~ oc rsh awscli
sh-4.2# whoami
root

➜  ~ oc get storagecluster
NAME                 AGE   PHASE         CREATED AT             VERSION
ocs-storagecluster   17m   Progressing   2020-06-25T14:59:04Z   4.4.0
➜  ~ oc describe storagecluster ocs-storagecluster
Name:         ocs-storagecluster
Namespace:    openshift-storage
Labels:       <none>
Annotations:  <none>
API Version:  ocs.openshift.io/v1
Kind:         StorageCluster
Metadata:
  Creation Timestamp:  2020-06-25T14:59:04Z
  Finalizers:
    storagecluster.ocs.openshift.io
  Generation:  2
  Managed Fields:
    API Version:  ocs.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:storageDeviceSets:
      f:status:
        f:conditions:
    Manager:         ocs-operator
    Operation:       Update
    Time:            2020-06-25T15:16:54Z
  Resource Version:  42991
  Self Link:         /apis/ocs.openshift.io/v1/namespaces/openshift-storage/storageclusters/ocs-storagecluster
  UID:               f62ad713-7480-4647-a43e-f9b0e533a1cc
Spec:
  Storage Device Sets:
    Config:
    Count:  1
    Data PVC Template:
      Metadata:
        Creation Timestamp:  <nil>
      Spec:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage:         2Ti
        Storage Class Name:  gp2
        Volume Mode:         Block
      Status:
    Name:  ocs-deviceset
    Placement:
    Portable:  true
    Replica:   3
    Resources:
  Version:  4.4.0
Status:
  Ceph Block Pools Created:  true
  Ceph Filesystems Created:  true
  Conditions:
    Last Heartbeat Time:   2020-06-25T15:16:54Z
    Last Transition Time:  2020-06-25T14:59:07Z
    Message:               Reconcile completed successfully
    Reason:                ReconcileCompleted
    Status:                True
    Type:                  ReconcileComplete
    Last Heartbeat Time:   2020-06-25T15:00:02Z
    Last Transition Time:  2020-06-25T14:59:06Z
    Message:               CephCluster resource is not reporting status
    Reason:                CephClusterStatus
    Status:                False
    Type:                  Available
    Last Heartbeat Time:   2020-06-25T15:16:54Z
    Last Transition Time:  2020-06-25T14:59:06Z
    Message:               Waiting on Nooba instance to finish initialization
    Reason:                NoobaaInitializing
    Status:                True
    Type:                  Progressing
    Last Heartbeat Time:   2020-06-25T14:59:06Z
    Last Transition Time:  2020-06-25T14:59:05Z
    Message:               Reconcile completed successfully
    Reason:                ReconcileCompleted
    Status:                False
    Type:                  Degraded
    Last Heartbeat Time:   2020-06-25T15:03:03Z
    Last Transition Time:  2020-06-25T14:59:06Z
    Message:               CephCluster is creating: 
    Reason:                ClusterStateCreating
    Status:                False
    Type:                  Upgradeable
  Failure Domain:          rack
  Node Topologies:
    Labels:
      failure-domain.beta.kubernetes.io/region:
        us-west-1
      failure-domain.beta.kubernetes.io/zone:
        us-west-1a
        us-west-1b
      topology.rook.io/rack:
        rack0
        rack1
        rack2
  Phase:  Progressing
  Related Objects:
    API Version:            ceph.rook.io/v1
    Kind:                   CephCluster
    Name:                   ocs-storagecluster-cephcluster
    Namespace:              openshift-storage
    Resource Version:       42501
    UID:                    fb34e698-7a4f-4c92-90fc-6a0a0bb883ba
    API Version:            noobaa.io/v1alpha1
    Kind:                   NooBaa
    Name:                   noobaa
    Namespace:              openshift-storage
    Resource Version:       42750
    UID:                    59cb292d-f118-49cc-a8e0-2d63f4fc167e
  Storage Classes Created:  true
Events:                     <none>

Comment 15 Urvashi Mohnani 2020-06-25 16:42:39 UTC
So I got a cluster from Ben where I was finally able to reproduce the issue. Here is what I found:

When the pod is run in the `openshift-storage` namespace, it has the `openshift.io/scc: noobaa` annotation assigned to it.

➜  ~ oc describe pod awscli
Name:         awscli
Namespace:    openshift-storage
Priority:     0
Node:         ip-10-0-191-171.us-east-2.compute.internal/10.0.191.171
Start Time:   Thu, 25 Jun 2020 12:12:19 -0400
Labels:       <none>
Annotations:  k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.129.2.71"
                    ],
                    "default": true,
                    "dns": {}
                }]
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.129.2.71"
                    ],
                    "default": true,
                    "dns": {}
                }]
              openshift.io/scc: noobaa
Status:       Running
IP:           10.129.2.71
...

Looking at that scc, we see that it drops the following capabilities: KILL,MKNOD,SETUID,SETGID

➜  ~ oc describe scc noobaa
Name:						noobaa
Priority:					11
Access:						
  Users:					system:serviceaccount:openshift-storage:noobaa
  Groups:					<none>
Settings:					
  Allow Privileged:				false
  Allow Privilege Escalation:			true
  Default Add Capabilities:			<none>
  Required Drop Capabilities:			KILL,MKNOD,SETUID,SETGID
  Allowed Capabilities:				<none>
  Allowed Seccomp Profiles:			<none>
  Allowed Volume Types:				configMap,downwardAPI,emptyDir,persistentVolumeClaim,projected,secret
  Allowed Flexvolumes:				<all>
  Allowed Unsafe Sysctls:			<none>
  Forbidden Sysctls:				<none>
  Allow Host Network:				false
  Allow Host Ports:				false
  Allow Host PID:				false
...

And that is why the pod ends up with a non-root user.

When the same pod is run in the default namespace, there is no SCC restricting the capabilities the pod runs with, so it keeps the SETUID and SETGID capabilities, which allow it to run with uid/gid 0/0.

SCC work is done by the apiserver folks, and it is possible that something changed on that end. CRI-O is doing its part of assigning the correct uid and gid to the pods. Re-assigning this to the apiserver folks.
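For anyone debugging something similar, a quick sketch of how to confirm which SCC admission picked for a pod and what that SCC enforces, using the pod and SCC names from this BZ:

# Which SCC did admission assign to the pod?
oc get pod awscli -n openshift-storage -o yaml | grep 'openshift.io/scc'

# What does that SCC drop or force?
oc describe scc noobaa | grep -E 'Priority|Required Drop Capabilities|Run As User Strategy'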

Comment 16 Standa Laznicka 2020-06-26 08:36:26 UTC
Anything goes in the `default` namespace as admission is turned off there. Do not deploy anything there, ever.

That being said, if
```
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
  namespace: default
spec:
  containers:
   - name: web-server
     image: nginx
     volumeMounts:
       - name: mypvc
         mountPath: /var/lib/www/html
  volumes:
   - name: mypvc
     persistentVolumeClaim:
       claimName: pvc
       readOnly: false
```

is your full spec, it is quite vague given the security requirements you appear to have raised throughout this BZ. If you want a writable root filesystem and to run the payload as root, you need to specify that in the `securityContext` of your pod/container. If you don't, any custom SCC will bork your deployment (or, perhaps, even SCCs that are part of the platform now).
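For illustration, a minimal sketch of the same nginx pod with an explicit container-level securityContext; it assumes the pod's service account is allowed to use an SCC that permits runAsUser 0 (e.g. anyuid or privileged), otherwise admission will reject it:

```
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
  namespace: openshift-storage
spec:
  containers:
   - name: web-server
     image: nginx
     securityContext:
       runAsUser: 0                  # run the payload as root
       readOnlyRootFilesystem: false # nginx writes to /etc/nginx/conf.d and /var/cache/nginx
     volumeMounts:
       - name: mypvc
         mountPath: /var/lib/www/html
  volumes:
   - name: mypvc
     persistentVolumeClaim:
       claimName: pvc
       readOnly: false
```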

Comment 17 Ben Eli 2020-06-28 06:59:51 UTC
Urvashi, Standa - thank you very much for looking into the issue.
The YAML uses the "default" namespace, but it's being dynamically templated as part of OCS-CI prior to its application. 
We do not use the `default` namespace for anything AFAIK. It is used only as a placeholder.

We already tried setting `securityContext` for several of our pods, but I am not yet sure it works properly in the case of Deployments, StatefulSets, Jobs, and so on.
Still looking into it.
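In case it helps, a minimal sketch of where the securityContext goes for a Deployment: on the pod template (spec.template.spec and its containers), not on the top-level spec. The service account name here is hypothetical and would need to be allowed to use a permissive SCC:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: awscli-deploy
  namespace: openshift-storage
spec:
  replicas: 1
  selector:
    matchLabels:
      app: awscli
  template:
    metadata:
      labels:
        app: awscli
    spec:
      serviceAccountName: ocs-ci-privileged-sa   # hypothetical SA allowed to use anyuid/privileged
      containers:
        - name: awscli
          image: amazon/aws-cli:2.0.13
          command: ['/bin/sh']
          stdin: true
          tty: true
          securityContext:
            runAsUser: 0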

Regardless, if I understand correctly, the issue actually seems to stem from NooBaa annotating pods that are not NooBaa-related with `openshift.io/scc: noobaa`.
I contacted the NooBaa team and am waiting for their response. 

Thanks again.

Comment 18 Ben Eli 2020-06-28 08:22:46 UTC
According to the NooBaa team, this happens because their SCC was set to have a higher priority than the anyuid one.
We're currently looking into the issue.

Once we verify it as the cause, I'll mark this as NOTABUG.
Thanks a lot for all the help with analyzing this issue!
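For context, when several SCCs are available to a pod's service account, the highest-priority one is preferred; the describe output in comment 15 shows noobaa at priority 11, while anyuid normally carries priority 10. A quick sketch to compare them:

oc get scc -o custom-columns=NAME:.metadata.name,PRIORITY:.priority,RUNASUSER:.runAsUser.type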

Comment 19 Ben Eli 2020-06-28 10:30:41 UTC
NooBaa BZ - BZ1851697

Comment 20 Standa Laznicka 2020-06-29 08:42:12 UTC
It _might_ still be possible to avoid some of these issues with a proper security context being set. I believe this BZ could be a great opportunity to check out https://kubernetes.io/docs/tasks/configure-pod-container/security-context/, even if you end up NOTABUGging this BZ.

Comment 21 Ben Eli 2020-06-29 11:27:21 UTC
Thank you, Standa - I agree. 
I elaborated in the attached BZ that we have already begun our efforts to use security contexts where needed; it'll just take a while to test that everything works.
I have already noticed that we're experiencing problems with creating Deployments, Jobs and Workloads with security contexts, so I still need to research more and understand how this works in order to use it properly.
Until the proper use of security contexts is in place, we'll rely on the SCC change being reverted.

Comment 24 Michael Adam 2020-07-10 10:28:13 UTC
(In reply to Ben Eli from comment #18)
> According to the NooBaa team, this happens because their SCC was set to have
> a higher priority than the anyuid one.
> We're currently looking into the issue.
> 
> Once we'll verify it as the cause, I'll mark this as NOTABUG.
> Thanks a lot for all the help with analyzing this issue!

(In reply to Ben Eli from comment #19)
> NooBaa BZ - BZ1851697

Can we simply (ack and) move this one to ON_QA since the noobaa BZ is ON_QA now?
And fail QA if this was NOT the root cause?

Comment 27 Michael Adam 2020-07-10 11:22:39 UTC
Moving to ON_QA based on my suggestion from #24 and discussion with Karthick.

Comment 28 Ben Eli 2020-07-13 10:47:22 UTC
BZ1851697 has been verified, and consequently this bug is also fixed.

Verified.

