Description of problem:
When mounting azure-file on a RHEL 7.8 node, attempts to read or write under the mounted directory fail with "Permission denied". From the node itself, writing to the mounted directory works.

Version-Release number of selected component (if applicable):
[wduan@MINT 01_general]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-05-25-232952   True        False         3h10m
Cluster version is 4.5.0-0.nightly-2020-05-25-232952

How reproducible:
Always

Steps to Reproduce:
1. Create an RWX PVC with azure-file
2. Create a DaemonSet using the PVC
3. Try to list the folder and touch a file under the mounted directory in a pod

[wduan@MINT 01_general]$ oc get pod
NAME         READY   STATUS    RESTARTS   AGE
dpod-7v665   1/1     Running   0          62m
dpod-hbkdr   1/1     Running   0          62m
dpod-x2gsh   1/1     Running   0          62m
[wduan@MINT 01_general]$ oc get pvc
NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS    AGE
mypvc-rwx   Bound    pvc-1c9c1dda-de21-4c1e-8f1f-e8d6cb253ba1   2Gi        RWX            azurefile-ocp   73m
[wduan@MINT 01_general]$ oc rsh dpod-7v665
sh-4.4$ ls -lZ /mnt
total 0
drwxrwxrwx. 2 1500 1500 system_u:object_r:cifs_t:s0 0 May 26 05:27 storage
sh-4.4$ ps -Z
LABEL                                      PID   TTY     TIME       CMD
system_u:system_r:container_t:s0:c19,c24   57    pts/0   00:00:00   sh
system_u:system_r:container_t:s0:c19,c24   63    pts/0   00:00:00   ps
sh-4.4$ ls /mnt/storage/
ls: cannot open directory '/mnt/storage/': Permission denied
sh-4.4$ touch /mnt/storage/testfile
touch: cannot touch '/mnt/storage/testfile': Permission denied

Actual results:
The mounted directory has no read/write access.

Expected results:
The mounted directory should have read/write access.

Master Log:

Node Log (of failed PODs):
sh-4.2# more /etc/redhat-release
Red Hat Enterprise Linux Server release 7.8 (Maipo)

PV Dump:
[wduan@MINT 01_general]$ oc get pv pvc-1c9c1dda-de21-4c1e-8f1f-e8d6cb253ba1 -o yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    kubernetes.io/createdby: azure-file-dynamic-provisioner
    pv.kubernetes.io/bound-by-controller: "yes"
    pv.kubernetes.io/provisioned-by: kubernetes.io/azure-file
  creationTimestamp: "2020-05-26T05:27:23Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-1c9c1dda-de21-4c1e-8f1f-e8d6cb253ba1
  resourceVersion: "71436"
  selfLink: /api/v1/persistentvolumes/pvc-1c9c1dda-de21-4c1e-8f1f-e8d6cb253ba1
  uid: 25d5a3de-da86-46f8-a45c-7a3f33ae627f
spec:
  accessModes:
  - ReadWriteMany
  azureFile:
    secretName: azure-storage-account-fbcea863d20444b57822569-secret
    secretNamespace: wduan
    shareName: wduan0526-vt4fq-dynami-pvc-1c9c1dda-de21-4c1e-8f1f-e8d6cb253ba1
  capacity:
    storage: 2Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: mypvc-rwx
    namespace: wduan
    resourceVersion: "71427"
    uid: 1c9c1dda-de21-4c1e-8f1f-e8d6cb253ba1
  mountOptions:
  - uid=1500
  - gid=1500
  - mfsymlinks
  persistentVolumeReclaimPolicy: Delete
  storageClassName: azurefile-ocp
  volumeMode: Filesystem
status:
  phase: Bound
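For context, the kubelet mounts the azure-file share on the node over CIFS, applying the mountOptions above; the effective mount is roughly the following (illustrative sketch; the storage account, credentials, and pod UID are placeholders):

# executed by kubelet on the worker node, approximately:
mount -t cifs \
  //<storage-account>.file.core.windows.net/wduan0526-vt4fq-dynami-pvc-1c9c1dda-de21-4c1e-8f1f-e8d6cb253ba1 \
  /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~azure-file/pvc-1c9c1dda-de21-4c1e-8f1f-e8d6cb253ba1 \
  -o username=<storage-account>,password=<account-key>,uid=1500,gid=1500,mfsymlinks
# SELinux labels CIFS mounts cifs_t; container_t processes can only
# read/write cifs_t files when the virt_use_samba boolean is enabled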
PVC Dump:
[wduan@MINT 01_general]$ oc get pvc mypvc-rwx -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/azure-file
    volume.kubernetes.io/selected-node: wduan0526-vt4fq-rhel-2
  creationTimestamp: "2020-05-26T05:27:20Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: mypvc-rwx
  namespace: wduan
  resourceVersion: "71438"
  selfLink: /api/v1/namespaces/wduan/persistentvolumeclaims/mypvc-rwx
  uid: 1c9c1dda-de21-4c1e-8f1f-e8d6cb253ba1
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
  storageClassName: azurefile-ocp
  volumeMode: Filesystem
  volumeName: pvc-1c9c1dda-de21-4c1e-8f1f-e8d6cb253ba1
status:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 2Gi
  phase: Bound

StorageClass Dump (if StorageClass used by PV/PVC):
[wduan@MINT 01_general]$ oc get sc azurefile-ocp -o yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2020-05-26T04:26:01Z"
  name: azurefile-ocp
  resourceVersion: "51126"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/azurefile-ocp
  uid: 342dfbb6-8aec-46d8-85f8-7ac07b0709dd
mountOptions:
- uid=1500
- gid=1500
- mfsymlinks
parameters:
  skuName: Standard_LRS
provisioner: kubernetes.io/azure-file
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
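For completeness, the reproducer from the steps above can be approximated with the following manifests (sketch: the container image and command are assumptions; names, sizes, and the mount path match the outputs above):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc-rwx
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
  storageClassName: azurefile-ocp
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: dpod
spec:
  selector:
    matchLabels:
      app: dpod
  template:
    metadata:
      labels:
        app: dpod
    spec:
      containers:
      - name: dpod
        image: registry.access.redhat.com/ubi8/ubi   # image is an assumption
        command: ["sleep", "infinity"]               # keep the pod running
        volumeMounts:
        - name: storage
          mountPath: /mnt/storage
      volumes:
      - name: storage
        persistentVolumeClaim:
          claimName: mypvc-rwx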
Additional info:
The cluster has 3 masters (RHCOS) and 3 workers (RHEL 7.8). When we assign the pods to the master nodes, mount/read/write works well.

[wduan@MINT 01_general]$ oc get node -o wide
NAME                       STATUS   ROLES    AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                        KERNEL-VERSION                CONTAINER-RUNTIME
wduan0526-vt4fq-master-0   Ready    master   4h22m   v1.18.2   10.0.0.6      <none>        Red Hat Enterprise Linux CoreOS 45.81.202005252026-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64   cri-o://1.18.1-1.dev.rhaos4.5.git60ac541.el8
wduan0526-vt4fq-master-1   Ready    master   4h22m   v1.18.2   10.0.0.8      <none>        Red Hat Enterprise Linux CoreOS 45.81.202005252026-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64   cri-o://1.18.1-1.dev.rhaos4.5.git60ac541.el8
wduan0526-vt4fq-master-2   Ready    master   4h23m   v1.18.2   10.0.0.7      <none>        Red Hat Enterprise Linux CoreOS 45.81.202005252026-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64   cri-o://1.18.1-1.dev.rhaos4.5.git60ac541.el8
wduan0526-vt4fq-rhel-0     Ready    worker   3h14m   v1.18.2   10.0.1.6      <none>        Red Hat Enterprise Linux Server 7.8 (Maipo)                     3.10.0-1127.8.2.el7.x86_64    cri-o://1.18.1-1.dev.rhaos4.5.git60ac541.el7
wduan0526-vt4fq-rhel-1     Ready    worker   3h14m   v1.18.2   10.0.1.7      <none>        Red Hat Enterprise Linux Server 7.8 (Maipo)                     3.10.0-1127.8.2.el7.x86_64    cri-o://1.18.1-1.dev.rhaos4.5.git60ac541.el7
wduan0526-vt4fq-rhel-2     Ready    worker   3h14m   v1.18.2   10.0.1.8      <none>        Red Hat Enterprise Linux Server 7.8 (Maipo)                     3.10.0-1127.8.2.el7.x86_64    cri-o://1.18.1-1.dev.rhaos4.5.git60ac541.el7

[wduan@MINT 01_general]$ oc get pod -o wide
NAME         READY   STATUS    RESTARTS   AGE     IP            NODE                       NOMINATED NODE   READINESS GATES
dpod-4b4nr   1/1     Running   0          103s    10.129.2.28   wduan0526-vt4fq-rhel-0     <none>           <none>
dpod-7bdnj   1/1     Running   0          3m10s   10.129.0.56   wduan0526-vt4fq-master-2   <none>           <none>
dpod-c2jt6   1/1     Running   0          3m10s   10.130.0.37   wduan0526-vt4fq-master-1   <none>           <none>
dpod-cfv85   1/1     Running   0          3m10s   10.128.0.37   wduan0526-vt4fq-master-0   <none>           <none>
dpod-ghs2d   1/1     Running   0          2m      10.130.2.5    wduan0526-vt4fq-rhel-1     <none>           <none>
dpod-td2jx   1/1     Running   0          2m10s   10.131.2.6    wduan0526-vt4fq-rhel-2     <none>           <none>

[wduan@MINT 01_general]$ oc rsh dpod-cfv85
sh-4.4$ touch /mnt/storage/aaa
sh-4.4$ ls /mnt/storage
aaa  test
sh-4.4$ ls -lZ /mnt
total 0
drwxrwxrwx. 2 1500 1500 system_u:object_r:cifs_t:s0 0 May 26 05:27 storage
sh-4.4$ ps -Z
LABEL                                      PID   TTY     TIME       CMD
system_u:system_r:container_t:s0:c19,c24   16    pts/0   00:00:00   sh
system_u:system_r:container_t:s0:c19,c24   28    pts/0   00:00:00   ps
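To check whether the two node types differ at the SELinux level, the boolean can be read directly on each node (illustrative; node names taken from the listing above):

$ oc debug node/wduan0526-vt4fq-master-0 -- chroot /host getsebool virt_use_samba
$ oc debug node/wduan0526-vt4fq-rhel-2 -- chroot /host getsebool virt_use_samba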
The reason is that the masters have the SELinux boolean virt_use_samba set to "on", while the RHEL 7.8 nodes have it "off".
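On an affected RHEL node the denial is recorded by the audit subsystem; one way to confirm the diagnosis (illustrative):

# run on the RHEL 7.8 worker: look for AVC denials whose source context is
# container_t and whose target context is cifs_t
ausearch -m avc -ts recent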
Simple fix:
setsebool -P virt_use_samba 1

That needs to be added to the documentation; however, I wasn't able to find any documentation saying that we actually support OCP with RHEL 7 nodes.
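To apply the workaround on every RHEL worker in one pass, a loop such as the following works (sketch; the SSH user is an assumption):

# persistently enable the boolean on each RHEL 7 worker
for node in wduan0526-vt4fq-rhel-0 wduan0526-vt4fq-rhel-1 wduan0526-vt4fq-rhel-2; do
  ssh cloud-user@"$node" 'sudo setsebool -P virt_use_samba 1'
done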
Found it: https://docs.openshift.com/container-platform/4.4/machine_management/adding-rhel-compute.html

I compared all filesystem-related SELinux booleans between RHCOS 8.x and RHEL 7.8. These should be set on RHEL 7.x via scaleup.yml to be on par with RHEL 8:
virt_use_samba "on"
container_use_cephfs "on"
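In openshift-ansible terms the change would look roughly like this (sketch inferred from the scale-up output in the verification below, not the verbatim upstream tasks):

# openshift_node role, approximately:
- name: Setting sebool virt_use_samba
  seboolean:
    name: virt_use_samba
    state: true
    persistent: true

- name: Setting sebool container_use_cephfs
  seboolean:
    name: container_use_cephfs
    state: true
    persistent: true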
Verified this bug with openshift-ansible-4.5.0-202005271957.git.1.af03ff7.el7.noarch.rpm.

During RHEL worker scale-up:

TASK [openshift_node : Setting sebool virt_use_samba] **************************
Friday 29 May 2020  19:17:06 +0800 (0:00:01.422)       0:05:25.936 ************
changed: [10.0.1.7] => {"changed": true, "name": "virt_use_samba", "persistent": true, "state": true}
changed: [10.0.1.6] => {"changed": true, "name": "virt_use_samba", "persistent": true, "state": true}

TASK [openshift_node : Setting sebool container_use_cephfs] ********************
Friday 29 May 2020  19:17:07 +0800 (0:00:01.151)       0:05:27.087 ************
changed: [10.0.1.7] => {"changed": true, "name": "container_use_cephfs", "persistent": true, "state": true}
changed: [10.0.1.6] => {"changed": true, "name": "container_use_cephfs", "persistent": true, "state": true}

Check the related SELinux booleans on the RHEL worker:

[root@gpei-455-9jxvj-rhel-0 cloud-user]# getsebool virt_use_samba
virt_use_samba --> on
[root@gpei-455-9jxvj-rhel-0 cloud-user]# getsebool container_use_cephfs
container_use_cephfs --> on
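After scale-up, a functional re-check from inside a pod on a RHEL worker would be (illustrative; pod name taken from the reporter's earlier listing):

$ oc rsh dpod-td2jx              # pod scheduled on wduan0526-vt4fq-rhel-2
sh-4.4$ ls /mnt/storage          # should list without "Permission denied"
sh-4.4$ touch /mnt/storage/testfile && echo write OK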
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409