Description of problem:

As part of the automation performed in OCS-CI, we test the creation of an NGINX application pod. However, it seems like the latest version of OCP 4.5 results in the pod being stuck in `CrashLoopBackOff`, even though it was fine in the previous version.

Logs from the pod:

```
oc logs pod-test-rbd-a4f97c822dc84ea188eff99cc3c99672
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: Can not modify /etc/nginx/conf.d/default.conf (read-only file system?), exiting
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2020/06/23 15:15:59 [warn] 1#1: the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
nginx: [warn] the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
2020/06/23 15:15:59 [emerg] 1#1: mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)
nginx: [emerg] mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)
```

Another pod we try to create is the awscli_pod, using this YAML:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: awscli
  namespace: default
spec:
  containers:
    - name: awscli
      image: amazon/aws-cli:2.0.13
      # Override the default `aws` entrypoint in order to
      # allow the pod to run continuously and act as a relay
      command: ['/bin/sh']
      stdin: true
      tty: true
```

The pod is created successfully, but when we try to use `mkdir` inside the pod, it fails with `Permission denied` once more. Just like the NGINX image, it always used to work fine up until now.
Since the NGINX pod is part of our deployment testing, all our jobs depending on deployment verification (deployments, PR verifications) cannot execute, because the pod never enters a `Ready` state.

Version-Release number of selected component (if applicable):
OCP 4.5.0-0.nightly-2020-06-23-020504
OCS 4.5.0-460.ci

How reproducible:
100%

Steps to Reproduce:
1. Deploy a cluster with the OCP and OCS versions described above
2. Try to create a new directory inside any pod by using `mkdir`

Actual results:
mkdir: cannot create directory <dir>: Permission denied

Expected results:
The directory is created successfully

Additional info:
What is the directory you're trying to create, and what's the pod YAML for the nginx pod that's failing?
In the case of NGINX - /var/cache/nginx/client_temp
In the case of awscli_pod - /cert/

Both worked fine in the previous 4.5 versions.

NGINX pod YAML:

```yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
  namespace: default
spec:
  containers:
    - name: web-server
      image: nginx
      volumeMounts:
        - name: mypvc
          mountPath: /var/lib/www/html
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: pvc
        readOnly: false
```
I just noticed something - `oc rsh` in OCP 4.5.0-0.nightly-2020-06-11-183238:

```
sh-4.2# echo $(whoami)
root
```

`oc rsh awscli-relay-pod` in OCP 4.5.0-0.nightly-2020-06-23-075004:

```
sh-4.2$ echo $(whoami)
1000570000
```

Somewhere along the way, the default user changed from root to... another one.
Do you have the full jenkins logs for this run?
Is this really a change in OCP, or in the nginx image? If it is a change in OCP 4.5, can we see a link to the RFE/documentation or the change PR, to get a better understanding of the change? Is it still possible to run rsh on some specific pods as the root user via some configuration? If so, can you please mention it here? Thanks
https://github.com/red-hat-storage/ocs-ci/blob/master/ocs_ci/templates/app-pods/rhel-7_7.yaml#L15-L17

Will this help us if we set it on pods which are supposed to run as root?

```yaml
securityContext:
  privileged: true
  runAsUser: 0
```
The change is also happening with a pinned AWSCLI image - so I do not think it's image-dependent. I would also love a link to the RFE/docs regarding the change, if it is indeed in OCP. I tested the securityContext attribute by adding it to our AWSCLI pod's YAML, and I did gain root access when I rsh'd into the pod, so it seems to work.
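For context, this is roughly what the tested change looks like when placed in the pod spec. A minimal sketch, assuming the awscli pod from the problem description; note that requesting `runAsUser: 0` only takes effect if the pod's service account is allowed to use an SCC (such as `anyuid` or `privileged`) that permits running as root:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: awscli
  namespace: default
spec:
  containers:
    - name: awscli
      image: amazon/aws-cli:2.0.13
      command: ['/bin/sh']
      stdin: true
      tty: true
      # Explicitly request root instead of relying on the
      # namespace's default SCC behavior to grant it
      securityContext:
        runAsUser: 0
```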
Ultimately, this does not really seem like an issue with the node or container component. There is a working theory that the default user changed - does anyone know which component that would be?
Hi Ben,

With the following versions:

Server Version: 4.5.0-0.nightly-2020-06-23-020504
Kubernetes Version: v1.18.3+c44581d
crio version 1.18.2-15.dev.rhaos4.5.git7c4494f.el8

I don't see this issue when I am logged into oc as system:admin. I tried running both the nginx pod and awscli pod and both of them started with root as the user.

```
➜ ~ oc whoami
system:admin
```

awscli pod:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: awscli
  namespace: default
spec:
  containers:
    - name: awscli
      image: amazon/aws-cli:2.0.13
      # Override the default `aws` entrypoint in order to
      # allow the pod to run continuously and act as a relay
      command: ['/bin/sh']
      stdin: true
      tty: true
```

```
➜ ~ oc rsh awscli
sh-4.2# whoami
root
sh-4.2# mkdir foo
sh-4.2# ls -al
total 0
drwxr-xr-x. 1 root root 29 Jun 24 18:02 .
drwxr-xr-x. 1 root root 51 Jun 24 17:46 ..
drwxr-xr-x. 2 root root  6 Jun 24 17:50 blah
drwxr-xr-x. 2 root root  6 Jun 24 18:02 foo
```

nginx pod:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
  namespace: default
spec:
  containers:
    - name: web-server
      image: nginx
```

```
➜ ~ oc rsh demo-pod
# whoami
root
# mkdir foo
# ls
bin   dev                  docker-entrypoint.sh  foo   lib    media  opt   root  sbin  sys  usr
boot  docker-entrypoint.d  etc                   home  lib64  mnt    proc  run   srv   tmp  var
```

When I logged into oc as a regular user, my pod was no longer running as root:

```
➜ ~ oc whoami
newton
```

awscli pod yaml is the same as above:

```
➜ ~ oc rsh awscli
sh-4.2$ whoami
1000580000
sh-4.2$ mkdir foo
mkdir: cannot create directory 'foo': Permission denied
```

I would say check what user you are logging into your cluster as. It is possible that it changed from system:admin to a regular user and that is why you are hitting this issue. I looked through the cri-o code and nothing regarding the setup of the user has changed on the cri-o end.
Hi Urvashi - thank you for looking into this!

Forgive me; I was imprecise. There is one important difference between the YAML I shared here and the YAML I actually use - the namespace. When I used the YAML that created the pod under the `default` namespace, I indeed received a root shell on the pod. However, when I created the pod under the `openshift-storage` namespace, I was greeted with the unprivileged shell once more:

```
meridian@metropolis:~$ oc whoami
kube:admin
meridian@metropolis:~$ oc rsh awscli
sh-4.2$ whoami
1000580000
sh-4.2$ mkdir foo
mkdir: cannot create directory 'foo': Permission denied
```

I used the exact same YAML as the one you shared - just replaced `default` with `openshift-storage`. This leads me to think that the issue is related to creating pods in the specific `openshift-storage` namespace.
Hmm, that is weird. Does the `openshift-storage` namespace come with the cluster at install, or is it one you created? I have a cluster that I started with cluster-bot and don't see such a namespace. These are the only namespaces I have with "storage" in the name:

```
➜ ~ oc projects | grep storage
openshift-cluster-storage-operator
openshift-kube-storage-version-migrator
openshift-kube-storage-version-migrator-operator
```

I created the awscli pod in those namespaces and the user was root.
It's a namespace that should be created by the user prior to installation of OCS on the cluster - https://access.redhat.com/documentation/en-us/red_hat_openshift_container_storage/4.4/html/deploying_openshift_container_storage/deploying-openshift-container-storage#installing-openshift-container-storage-operator-using-the-operator-hub_rhocs
Hi Ben,

So I followed the steps in the link you pasted above and deployed OpenShift Container Storage on my cluster. After that, I started the awscli pod in the `openshift-storage` namespace and see that it started with root as the user. I am unable to reproduce - is there something specific about the setup that I am missing?

Server Version: 4.5.0-0.nightly-2020-06-23-020504
Kubernetes Version: v1.18.3+c44581d
crio version 1.18.2-15.dev.rhaos4.5.git7c4494f.el8

```
➜ ~ oc whoami
kube:admin
➜ ~ oc get pods --namespace openshift-storage
NAME                                                              READY   STATUS      RESTARTS   AGE
aws-s3-provisioner-5b855f74b7-wx8rv                               1/1     Running     0          18m
awscli                                                            1/1     Running     0          6m36s
csi-cephfsplugin-2q9qv                                            3/3     Running     0          15m
csi-cephfsplugin-jbmcq                                            3/3     Running     0          15m
csi-cephfsplugin-provisioner-849485f449-5246v                     5/5     Running     0          15m
csi-cephfsplugin-provisioner-849485f449-z4qs6                     5/5     Running     0          15m
csi-cephfsplugin-sxdnm                                            3/3     Running     0          15m
csi-rbdplugin-2dhzh                                               3/3     Running     0          15m
csi-rbdplugin-jgl6p                                               3/3     Running     0          15m
csi-rbdplugin-provisioner-5794d4754b-t9ll2                        5/5     Running     0          15m
csi-rbdplugin-provisioner-5794d4754b-tlpwb                        5/5     Running     0          15m
csi-rbdplugin-ztckb                                               3/3     Running     0          15m
lib-bucket-provisioner-5f54b79d57-pb7ck                           1/1     Running     0          18m
noobaa-core-0                                                     0/1     Pending     0          11m
noobaa-db-0                                                       0/1     Pending     0          11m
noobaa-operator-6d94f8f586-vm89h                                  1/1     Running     0          18m
ocs-operator-748bcc894b-zkx2n                                     0/1     Running     0          18m
rook-ceph-crashcollector-ip-10-0-132-191-846884f68d-vbnsd         1/1     Running     0          13m
rook-ceph-crashcollector-ip-10-0-186-1-5cb9585885-2csmg           1/1     Running     0          13m
rook-ceph-crashcollector-ip-10-0-235-89-747584d594-4wdhd          1/1     Running     0          13m
rook-ceph-drain-canary-c9a36249ce89c25e4e6db452c99d9126-74m42w5   1/1     Running     0          11m
rook-ceph-drain-canary-ip-10-0-186-1.us-west-1.compute.intj7tj6   1/1     Running     0          11m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-758d4d786cwwd   0/1     Pending     0          11m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-5b8f498ft2rdn   0/1     Pending     0          11m
rook-ceph-mgr-a-68b489c75-4hk4b                                   1/1     Running     0          12m
rook-ceph-mon-a-89cb68f77-cnbs2                                   1/1     Running     0          13m
rook-ceph-mon-b-6fccf77984-khjjf                                  1/1     Running     0          13m
rook-ceph-mon-c-64d99fcb9f-vwrkr                                  1/1     Running     0          13m
rook-ceph-operator-55cb4fb89f-fj4bt                               1/1     Running     0          18m
rook-ceph-osd-0-566878d969-9plq8                                  0/1     Pending     0          12m
rook-ceph-osd-1-654ffddbb6-nfmfr                                  1/1     Running     0          11m
rook-ceph-osd-2-6b876ff58f-5dmpq                                  1/1     Running     0          11m
rook-ceph-osd-prepare-ocs-deviceset-0-0-8trh2-9b5fr               0/1     Completed   0          12m
rook-ceph-osd-prepare-ocs-deviceset-1-0-k59jg-6g9f9               0/1     Completed   0          12m
rook-ceph-osd-prepare-ocs-deviceset-2-0-zcqrt-27krw               0/1     Completed   0          12m
➜ ~ oc rsh awscli
sh-4.2# whoami
root
➜ ~ oc get storagecluster
NAME                 AGE   PHASE         CREATED AT             VERSION
ocs-storagecluster   17m   Progressing   2020-06-25T14:59:04Z   4.4.0
➜ ~ oc describe storagecluster ocs-storagecluster
Name:         ocs-storagecluster
Namespace:    openshift-storage
Labels:       <none>
Annotations:  <none>
API Version:  ocs.openshift.io/v1
Kind:         StorageCluster
Metadata:
  Creation Timestamp:  2020-06-25T14:59:04Z
  Finalizers:
    storagecluster.ocs.openshift.io
  Generation:  2
  Managed Fields:
    API Version:  ocs.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:storageDeviceSets:
      f:status:
        f:conditions:
    Manager:         ocs-operator
    Operation:       Update
    Time:            2020-06-25T15:16:54Z
  Resource Version:  42991
  Self Link:         /apis/ocs.openshift.io/v1/namespaces/openshift-storage/storageclusters/ocs-storagecluster
  UID:               f62ad713-7480-4647-a43e-f9b0e533a1cc
Spec:
  Storage Device Sets:
    Config:
    Count:  1
    Data PVC Template:
      Metadata:
        Creation Timestamp:  <nil>
      Spec:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage:         2Ti
        Storage Class Name:  gp2
        Volume Mode:         Block
      Status:
    Name:       ocs-deviceset
    Placement:
    Portable:   true
    Replica:    3
    Resources:
  Version:  4.4.0
Status:
  Ceph Block Pools Created:  true
  Ceph Filesystems Created:  true
  Conditions:
    Last Heartbeat Time:   2020-06-25T15:16:54Z
    Last Transition Time:  2020-06-25T14:59:07Z
    Message:               Reconcile completed successfully
    Reason:                ReconcileCompleted
    Status:                True
    Type:                  ReconcileComplete
    Last Heartbeat Time:   2020-06-25T15:00:02Z
    Last Transition Time:  2020-06-25T14:59:06Z
    Message:               CephCluster resource is not reporting status
    Reason:                CephClusterStatus
    Status:                False
    Type:                  Available
    Last Heartbeat Time:   2020-06-25T15:16:54Z
    Last Transition Time:  2020-06-25T14:59:06Z
    Message:               Waiting on Nooba instance to finish initialization
    Reason:                NoobaaInitializing
    Status:                True
    Type:                  Progressing
    Last Heartbeat Time:   2020-06-25T14:59:06Z
    Last Transition Time:  2020-06-25T14:59:05Z
    Message:               Reconcile completed successfully
    Reason:                ReconcileCompleted
    Status:                False
    Type:                  Degraded
    Last Heartbeat Time:   2020-06-25T15:03:03Z
    Last Transition Time:  2020-06-25T14:59:06Z
    Message:               CephCluster is creating:
    Reason:                ClusterStateCreating
    Status:                False
    Type:                  Upgradeable
  Failure Domain:  rack
  Node Topologies:
    Labels:
      failure-domain.beta.kubernetes.io/region:
        us-west-1
      failure-domain.beta.kubernetes.io/zone:
        us-west-1a
        us-west-1b
      topology.rook.io/rack:
        rack0
        rack1
        rack2
  Phase:  Progressing
  Related Objects:
    API Version:       ceph.rook.io/v1
    Kind:              CephCluster
    Name:              ocs-storagecluster-cephcluster
    Namespace:         openshift-storage
    Resource Version:  42501
    UID:               fb34e698-7a4f-4c92-90fc-6a0a0bb883ba
    API Version:       noobaa.io/v1alpha1
    Kind:              NooBaa
    Name:              noobaa
    Namespace:         openshift-storage
    Resource Version:  42750
    UID:               59cb292d-f118-49cc-a8e0-2d63f4fc167e
  Storage Classes Created:  true
Events:  <none>
```
So I got a cluster from Ben where I was finally able to reproduce the issue. Here is what I found:

When the pod is run in the `openshift-storage` namespace, it has the `openshift.io/scc: noobaa` annotation assigned to it.

```
➜ ~ oc describe pod awscli
Name:         awscli
Namespace:    openshift-storage
Priority:     0
Node:         ip-10-0-191-171.us-east-2.compute.internal/10.0.191.171
Start Time:   Thu, 25 Jun 2020 12:12:19 -0400
Labels:       <none>
Annotations:  k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.129.2.71"
                    ],
                    "default": true,
                    "dns": {}
                }]
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.129.2.71"
                    ],
                    "default": true,
                    "dns": {}
                }]
              openshift.io/scc: noobaa
Status:       Running
IP:           10.129.2.71
...
```

Looking at that SCC, we see that it drops the following capabilities: KILL, MKNOD, SETUID, SETGID

```
➜ ~ oc describe scc noobaa
Name:      noobaa
Priority:  11
Access:
  Users:   system:serviceaccount:openshift-storage:noobaa
  Groups:  <none>
Settings:
  Allow Privileged:            false
  Allow Privilege Escalation:  true
  Default Add Capabilities:    <none>
  Required Drop Capabilities:  KILL,MKNOD,SETUID,SETGID
  Allowed Capabilities:        <none>
  Allowed Seccomp Profiles:    <none>
  Allowed Volume Types:        configMap,downwardAPI,emptyDir,persistentVolumeClaim,projected,secret
  Allowed Flexvolumes:         <all>
  Allowed Unsafe Sysctls:      <none>
  Forbidden Sysctls:           <none>
  Allow Host Network:          false
  Allow Host Ports:            false
  Allow Host PID:              false
  ...
```

And that is why the pod ends up with a non-root user. When the same pod is run in the default namespace, there is no SCC restricting the capabilities the pod runs with, and it has the SETUID and SETGID capabilities, which allow it to run with uid/gid 0/0.

SCC work is done by the apiserver folks, and it is possible that something changed on that end. CRI-O is doing its part of assigning the correct uid and gid to the pods. Re-assigning this to the apiserver folks.
Anything goes in the `default` namespace, as admission is turned off there. Do not deploy anything there, ever.

That being said, if

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
  namespace: default
spec:
  containers:
    - name: web-server
      image: nginx
      volumeMounts:
        - name: mypvc
          mountPath: /var/lib/www/html
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: pvc
        readOnly: false
```

is your full spec, that's quite vague for the security requirements that you appear to have raised throughout this BZ. If you want a writable root filesystem and to run the payload as root, you need to specify that in the `securityContext` of your pod/container. If you don't, any custom SCC will bork your deployment (or, perhaps, even SCCs that are part of the platform now).
Urvashi, Standa - thank you very much for looking into the issue. The YAML uses the "default" namespace, but it's being dynamically templated as part of OCS-CI prior to its application. We do not use the `default` namespace for anything AFAIK. It is used only as a placeholder. We already tried setting `securityContext` for several of our pods, but I am not yet sure it works properly in the case of Deployments, StatefulSets, Jobs, and so on. Still looking into it. Regardless, if I understand correctly, it seems like the issue is actually stemming from NooBaa annotating pods that are not NooBaa-related with `openshift.io/scc: noobaa`. I contacted the NooBaa team and am waiting for their response. Thanks again.
According to the NooBaa team, this happens because their SCC was set to have a higher priority than the anyuid one. We're currently looking into the issue.

Once we verify it as the cause, I'll mark this as NOTABUG. Thanks a lot for all the help with analyzing this issue!
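For anyone following along: SCC admission matches a pod against the SCCs its service account can use and picks the one with the highest `priority` (higher wins; `anyuid` ships with priority 10, and an unset priority is treated as lowest). A hypothetical sketch of how a custom SCC's priority interacts with that ordering - this is for illustration only, not the actual NooBaa SCC definition:

```yaml
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: example-scc          # hypothetical name for illustration
# Priority 11 outranks anyuid (10), so this SCC is selected
# first for any pod whose service account is allowed to use it
priority: 11
runAsUser:
  type: MustRunAsRange       # forces a namespace UID range, not root
requiredDropCapabilities:
  - KILL
  - MKNOD
  - SETUID
  - SETGID
allowPrivilegedContainer: false
```

Under such an SCC, a pod that does not explicitly pin its user in `securityContext` gets admitted with a UID from the namespace's assigned range (e.g. 1000580000) instead of root, which matches the behavior observed in this BZ.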
NooBaa BZ - BZ1851697
It _might_ still be possible to avoid some of these issues with a proper security context being set. I believe this BZ could be a great opportunity to check out https://kubernetes.io/docs/tasks/configure-pod-container/security-context/, even if you end up NOTABUGging this BZ.
Thank you, Standa - I agree. I elaborated in the attached BZ that we have already begun our efforts to use security contexts where needed; it'll just take a while to test that everything works. I already noticed that we're experiencing problems with creating Deployments, Jobs, and Workloads with security contexts - so I still need to research more and understand how this works, in order to use it properly. Until the proper use of security contexts is applied, we'll rely on the SCC change being reverted.
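Regarding the trouble with Deployments: one common pitfall is that the `securityContext` belongs in the pod template (`spec.template.spec`), not at the top level of the Deployment spec. A minimal sketch under that assumption - the names and label values here are placeholders, not taken from OCS-CI templates:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment   # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      # Pod-level settings (e.g. fsGroup) go here...
      securityContext:
        fsGroup: 0
      containers:
        - name: web-server
          image: nginx
          # ...while per-container settings like runAsUser
          # can also be set on each container
          securityContext:
            runAsUser: 0
```

The same pattern applies to Jobs and StatefulSets, since they all wrap the same pod template; as above, the requested UID still has to be permitted by an SCC available to the workload's service account.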
(In reply to Ben Eli from comment #18) > According to the NooBaa team, this happens because their SCC was set to have > a higher priority than the anyuid one. > We're currently looking into the issue. > > Once we'll verify it as the cause, I'll mark this as NOTABUG. > Thanks a lot for all the help with analyzing this issue! (In reply to Ben Eli from comment #19) > NooBaa BZ - BZ1851697 Can we simply (ack and) move this one to ON_QA since the noobaa BZ is ON_QA now? And fail QA if this was NOT the root cause?
Moving to ON_QA based on my suggestion from #24 and discussion with Karthick.
BZ1851697 verified, and subsequently, it means this bug was also fixed. Verified.