Description of problem: After OKD has updated Fedora to `selinux-policy-34.9-1.fc34` two Kubernetes tests started failing: [sig-storage] In-tree Volumes [Driver: hostPath] [Testpattern: Inline-volume (default fs)] volumes should store data [Suite:openshift/conformance/parallel] [Suite:k8s] [sig-storage] In-tree Volumes [Driver: hostPathSymlink] [Testpattern: Inline-volume (default fs)] volumes should store data [Suite:openshift/conformance/parallel] [Suite:k8s] These tests are mounting files to `/opt` in container - and cannot be read anymore. If I understood correctly this is caused by https://github.com/fedora-selinux/selinux-policy/pull/748. selinux-policy-34.8-1.fc34 package works as expected Version-Release number of selected component (if applicable): selinux-policy-34.9-1.fc34 How reproducible: Always
These are the denials listed in the PR: Jun 22 14:51:56.069000 ip-10-0-160-48 audit[31152]: AVC avc: denied { open } for pid=31152 comm="cat" path="/opt/0/index.html" dev="tmpfs" ino=2073 scontext=system_u:system_r:container_t:s0:c0,c1 tcontext=system_u:object_r:tmp_t:s0 tclass=file permissive=0 Jun 22 14:51:56.104682 ip-10-0-160-48 kernel: audit: type=1400 audit(1624373516.069:1083): avc: denied { open } for pid=31152 comm="cat" path="/opt/0/index.html" dev="tmpfs" ino=2073 scontext=system_u:system_r:container_t:s0:c0,c1 tcontext=system_u:object_r:tmp_t:s0 tclass=file permissive=0 Jun 22 15:06:37.766000 ip-10-0-160-48 audit[199350]: AVC avc: denied { open } for pid=199350 comm="cat" path="/opt/0/index.html" dev="tmpfs" ino=5221 scontext=system_u:system_r:container_t:s0:c0,c1 tcontext=system_u:object_r:tmp_t:s0 tclass=file permissive=0 Jun 22 15:13:55.330000 ip-10-0-160-48 audit[258541]: AVC avc: denied { open } for pid=258541 comm="cat" path="/opt/0/index.html" dev="tmpfs" ino=2073 scontext=system_u:system_r:container_t:s0:c0,c1 tcontext=system_u:object_r:tmp_t:s0 tclass=file permissive=0 Jun 22 15:14:05.359000 ip-10-0-160-48 audit[258856]: AVC avc: denied { open } for pid=258856 comm="cat" path="/opt/0/index.html" dev="tmpfs" ino=6533 scontext=system_u:system_r:container_t:s0:c0,c1 tcontext=system_u:object_r:tmp_t:s0 tclass=file permissive=0 Jun 22 15:14:05.395091 ip-10-0-160-48 kernel: audit: type=1400 audit(1624374845.359:8579): avc: denied { open } for pid=258856 comm="cat" path="/opt/0/index.html" dev="tmpfs" ino=6533 scontext=system_u:system_r:container_t:s0:c0,c1 tcontext=system_u:object_r:tmp_t:s0 tclass=file permissive=0 Do you have any additional information? Apart from test, what is the use case? Currently it does not seem to be likely we revert the reverted commit unless there was a strong justification. In that case the proper component to address the problem would be container-selinux, using files_read_all_files() or more targeted interface, e. g. files_read_generic_tmp_files().
(In reply to Zdenek Pytela from comment #1) > Do you have any additional information? Apart from test, what is the use > case? I didn't hit it in the wild, so I don't know the implications of that. Unfortunately we can't release a version which doesn't pass these tests and collect feedback about which functionality was impacted. > In that case the proper component to > address the problem would be container-selinux, using files_read_all_files() > or more targeted interface, e. g. files_read_generic_tmp_files(). So container-selinux needs to be adjusted as well? Could we revert this change, have container-selinux update their code and re-introduce this commit?
An open on a file indicates that it was not inherited.
(In reply to Vadim Rutkovsky from comment #2) > (In reply to Zdenek Pytela from comment #1) > > Do you have any additional information? Apart from test, what is the use > > case? > > I didn't hit it in the wild, so I don't know the implications of that. > Unfortunately we can't release a version which doesn't pass these tests and > collect feedback about which functionality was impacted. > > > In that case the proper component to > > address the problem would be container-selinux, using files_read_all_files() > > or more targeted interface, e. g. files_read_generic_tmp_files(). > > So container-selinux needs to be adjusted as well? Could we revert this > change, have container-selinux update their code and re-introduce this > commit? That sounds overcomplicated. If it is to be addressed, then container-selinux is the right component as it contains the container_t type. Reassigning for assessing by container-selinux team.
Basically "/opt/0/index.html" are mislabeled who is creating these files? They should be labeled with something that can be read by containers.
(In reply to Zdenek Pytela from comment #4) > That sounds overcomplicated. Its not overcomplicated, its a way of fixing regression in selinux-policy instead of passing it to container-selinux team. (In reply to Daniel Walsh from comment #5) > Basically "/opt/0/index.html" are mislabeled who is creating these files? Test itself is creating these in the container: https://github.com/kubernetes/kubernetes/blob/release-1.20/test/e2e/framework/volume/fixtures.go#L532
If you are volume mounting content into a container then the content needs to be labeled with container_file_t, chcon -R -t container_file_t /opt/0
Why would it not be labeled with `container_file_t` if we're creating the file in the container? I don't understand why it gets `tmp_t` instead Jan, Fabio, could you help me resolving this upstream?
The file created will get the label of the parent directory, since the parent directory is being volume mounted into the container, it must be labeled tmp_t. If the parent directory were labeled container_file_t, then the container will be allowed to read/write there. A confined container would not be allowed to create this file, so I guess this was created by and unconfiend container. This is about labeling the volume you are sticking into a container correctly. The SELinux policy does not allow a container_t process to files files with a tmp_t type, That is what the AVC is telling you.
@dwalsh I am thinking of changing the Level of the container here https://github.com/kubernetes/kubernetes/blob/58dbff6510967d81bc3fab0fd930d56023acb737/test/e2e/framework/volume/fixtures.go#L375 Could this work, without need to revert the commit in SELinux ?
I am not sure of all of the changes,but yes this would force the container to always have that SELinux label (level)
@dwalsh @jsafrane @vrutkovs I have reproduced the problem and below are my findings and possible solutions :- - The two test cases are related to "In-tree plugin kubernetes.io/host-path". Basically this driver creates a file on the "Host" within the **/tmp** , to be mounted as a "Volume" within a Pod. - Each test case contains `two` different Pods (i.e. hostpath-injector and hostpath-client). - hostpath-injector pod creates a file "/opt/0/index.html" and writes some content in it. It also reads the content to verify, "successfully". - hostpath-client pod tries to read the file to verify its content. However, it fails with " can't open '/opt/0/index.html': Permission denied ". - This happens only with SELinux enabled, since it does not allow two different processes/pods to read a file written within "/tmp". - By default, "In-tree plugin kubernetes.io/host-path" uses "/tmp" on the host machine. - Both Pods are being created by the same function, and the function contains SELinuxOptions. It is used only when the host enables SELinux https://github.com/kubernetes/kubernetes/blob/30b5da2e751c17febcdd6d54686823e4a02c9fb7/test/e2e/framework/volume/fixtures.go#L375 I think if we use correct SELinux label, which allows both processes/pods read files within "/tmp" on the host. Please correct me if I am wrong
I think this is related to https://github.com/kubernetes/kubernetes/issues/84585. Basically, the test creates 2 pods and tries to share data between them, which is something SELinux doesn't like. I believe the right way to fix this (as I pointed in the issue above) is to prepare a special directory in the host with the proper SELinux label and use that for the test. However, traditionally the way we fix this kind of problem is running the containers as privileged.
A better solution would be to run the containers with the same MCS Label, or to label the content with system_u:object_r:container_file_t:s0