Created attachment 1146654
Code to reproduce bug

Description of problem:
Unix sockets can be used to transfer file descriptors across processes. Because containers share the same container state, this is also possible for processes running in different containers. However, SELinux blocks the file descriptor transfer (which I view as a bug) while otherwise allowing regular communication on Unix sockets across containers. File descriptors can be transferred if one of the two containers has the same IPC namespace as the host, if both containers are in the same IPC namespace, or if SELinux is off.

Version-Release number of selected component (if applicable):
Tested on CentOS-based Atomic running Docker 1.9.1. Also tested on Fedora 23 with the same Docker version.

How reproducible:
Always reproduced

Steps to Reproduce:
Use the attached code and create a Docker container using the provided Dockerfile. Here is a transcript. The message "printing to new stdout" is a printf done by the remote server after it receives the client's stdout fd.

bash-4.2# docker build -t test .
...
bash-4.2# mkdir myvol
bash-4.2# docker run -it --rm --name server -v myvol:/tmp test /server

In another terminal...

bash-4.2# setenforce 0
bash-4.2# docker run -it --rm --name client -v myvol:/tmp test /client
sending message now
printing to new stdout
bash-4.2# setenforce 1
bash-4.2# docker run -it --rm --name client -v myvol:/tmp test /client
sending message now
bash-4.2# docker run -it --rm --name client -v myvol:/tmp --ipc host test /client
sending message now
printing to new stdout
bash-4.2# docker run -it --rm --name client -v myvol:/tmp --ipc container:server test /client
sending message now
printing to new stdout

Actual results:
Results shown above in the transcript

Expected results:
With setenforce 1 and no --ipc option, we should still see a "printing to new stdout" message.

Additional info:
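For readers without access to the attachment, here is a minimal sketch of the mechanism the report relies on: passing a file descriptor over a Unix socket with an SCM_RIGHTS ancillary message. This is not the attached reproducer. The helper names (send_fd, recv_fd, demo) are made up for illustration, and both ends run in one process over a socketpair, so no SELinux label mismatch is involved here.

```python
import array
import os
import socket


def send_fd(sock, fd):
    """Send one file descriptor as SCM_RIGHTS ancillary data."""
    fds = array.array("i", [fd])
    # At least one byte of ordinary data must accompany the ancillary message.
    sock.sendmsg([b"x"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock):
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing bytes that do not form a whole int.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
            if len(fds) > 0:
                return fds[0]
    raise RuntimeError("no file descriptor received")


def demo():
    """Pass the write end of a pipe across a socketpair and write through it."""
    client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    try:
        send_fd(client, write_end)   # "client" hands over its descriptor
        new_fd = recv_fd(server)     # "server" gets a new descriptor number
        os.write(new_fd, b"printing to new stdout\n")
        os.close(new_fd)
        return os.read(read_end, 100)
    finally:
        os.close(write_end)
        os.close(read_end)
        client.close()
        server.close()


if __name__ == "__main__":
    print(demo().decode(), end="")
```

In the scenario reported above, the two ends live in different containers, and it is the receiving side's use of the transferred descriptor that SELinux denies.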
typo here: "Because containers share the same container state" => "Because containers share the same kernel state"
When you share the same IPC namespace, your containers share the same SELinux label. So to make this work, you need to have both containers share the same MCS label. Something like this will work:

docker run -it --rm --security-opt label:level:s0:c1000,c1001 --name server -v myvol:/tmp test /server
docker run -it --rm --security-opt label:level:s0:c1000,c1001 --name client -v myvol:/tmp test /client

When a socket is created by a process, it is automatically assigned the same label as the process creating it. If that socket is passed to another process, the receiving process's label must be allowed access to the socket's label. So if I have a process labeled system_u:system_r:svirt_lxc_net_t:s0:c1,c2 and I pass that socket to a process labeled system_u:system_r:svirt_lxc_net_t:s0:c4,c5, SELinux will block the access.
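The label comparison described above can be sketched roughly as follows. This is a simplified model and not the real policy engine: it assumes the MCS constraint reduces to "the source's category set must include the target's", ignores type enforcement rules entirely, and does not handle level ranges such as s0-s0:c0.c1023. The function names parse_level and dominates are hypothetical.

```python
def parse_level(context):
    """Split an SELinux context into (sensitivity, frozenset of category numbers).

    Expects user:role:type:sensitivity[:categories], e.g.
    system_u:system_r:svirt_lxc_net_t:s0:c1,c2
    """
    parts = context.split(":", 4)
    sensitivity = parts[3]
    categories = set()
    if len(parts) == 5 and parts[4]:
        for item in parts[4].split(","):
            if "." in item:
                # Category range such as c0.c3
                low, high = item.split(".")
                categories.update(range(int(low[1:]), int(high[1:]) + 1))
            else:
                categories.add(int(item[1:]))
    return sensitivity, frozenset(categories)


def dominates(source, target):
    """Assumed MCS check: same sensitivity, source categories include target's."""
    s_sens, s_cats = parse_level(source)
    t_sens, t_cats = parse_level(target)
    return s_sens == t_sens and s_cats >= t_cats


if __name__ == "__main__":
    a = "system_u:system_r:svirt_lxc_net_t:s0:c1,c2"
    b = "system_u:system_r:svirt_lxc_net_t:s0:c4,c5"
    print(dominates(a, b))  # different category pairs
    print(dominates(a, a))  # identical labels
```

Under this model, two containers started with the same level (s0:c1000,c1001 in the commands above) pass the check, while containers with different randomly assigned category pairs do not.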
Wrote a blog covering this. http://danwalsh.livejournal.com/74421.html
Thanks, this solution works well. However, the two containers now have no isolation from each other. I do not know much about SELinux, so I have a couple of questions:

1. Why can regular communication happen on sockets (created as in the original report, in shared volumes) while only the transfer of file descriptors is restricted?
2. When a volume is mounted with the :z directive, the socket created inside it does not have MCS labels (the label, as shown by ls -lZ on the host, is only system_u:object_r:svirt_sandbox_file_t:s0), and yet the file descriptors cannot be transferred. Why should this be the case?
Thanks for the comment and the blog post. I didn't realize there was new activity until I posted my comment. Feel free to close the bug again, but I would learn a lot from answers to my questions. Also, is there a way we could enable passing of file descriptors on a particular socket while keeping the two containers isolated otherwise? I will be happy to have this discussion in the comments on the blog post, if you prefer.
Could you attach the AVCs that you received? And were you able to get two containers to talk over a shared socket? That should probably be blocked.
Here is the example that illustrates it all. The first is a typescript from a terminal running the server. The second is the one with the client, interspersed with disabling/enabling of SELinux. The "hello world" message is always printed on the server (even when SELinux is enabled); this message is sent from the client. The message "printing to new stdout" is written by the server on the new fd. If the fd transfer succeeds, the message is printed on the client; otherwise it is printed on the server. The AVC received is shown below the client terminal output. The path "/3" probably corresponds to fd 3.

[root@localhost code]# docker run -it --rm --name server -v /vagrant/code/myvol:/tmp:z test /server
hello world
remote fd number 1
new fd number 3
hello world
remote fd number 1
new fd number 0
printing to new stdout

[root@localhost code]# setenforce 0
[root@localhost code]# docker run -it --rm --name client -v /vagrant/code/myvol:/tmp test /client
sending message now
printing to new stdout
[root@localhost code]# setenforce 1
[root@localhost code]# docker run -it --rm --name client -v /vagrant/code/myvol:/tmp test /client
sending message now

type=AVC msg=audit(1462820444.992:2087): avc: denied { read write } for pid=23359 comm="server" path="/3" dev="devpts" ino=6 scontext=system_u:system_r:svirt_lxc_net_t:s0:c463,c604 tcontext=system_u:object_r:svirt_sandbox_file_t:s0:c386,c667 tclass=chr_file permissive=0

Containers talking over shared sockets appears to be a common case, and I can see it elsewhere on the Web:

http://stackoverflow.com/questions/24956322/can-docker-port-forward-to-a-unix-file-socket-on-the-host-container
http://jpetazzo.github.io/2014/06/23/docker-ssh-considered-evil/#restart-my-service
http://stackoverflow.com/questions/32180589/docker-how-to-expose-a-socket-over-a-port-for-a-django-application

Servers like MySQL and Redis optionally provide services over Unix sockets, and some containers use them as such.
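To make the denial above easier to read, here is a small sketch that splits the AVC record into its fields. The parse_avc helper is hypothetical and only handles the simple key=value layout seen in this one record; it is not a general audit log parser.

```python
import re

# The AVC record quoted above, copied verbatim.
AVC = (
    'type=AVC msg=audit(1462820444.992:2087): avc: denied { read write } for '
    'pid=23359 comm="server" path="/3" dev="devpts" ino=6 '
    'scontext=system_u:system_r:svirt_lxc_net_t:s0:c463,c604 '
    'tcontext=system_u:object_r:svirt_sandbox_file_t:s0:c386,c667 '
    'tclass=chr_file permissive=0'
)


def parse_avc(line):
    """Return the key=value fields of an AVC line plus the denied permissions."""
    fields = {
        key: value.strip('"')
        for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    }
    perms = re.search(r"\{([^}]*)\}", line)
    fields["permissions"] = perms.group(1).split() if perms else []
    return fields


def categories(context):
    """Return the MCS category part of a context ('' if there is none)."""
    parts = context.split(":", 4)
    return parts[4] if len(parts) == 5 else ""


if __name__ == "__main__":
    avc = parse_avc(AVC)
    print("denied:", avc["permissions"], "on", avc["tclass"], "dev", avc["dev"])
    print("source categories:", categories(avc["scontext"]))
    print("target categories:", categories(avc["tcontext"]))
```

Read this way, the record shows the server process (categories c463,c604) being denied read and write on a character device on devpts labeled with c386,c667. The denied object is a chr_file and not a socket, which suggests that the object being refused is the transferred terminal descriptor rather than the shared socket itself.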
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
This bug appears to have been reported against 'rawhide' during the Fedora 25 development cycle. Changing version to '25'.
The best way to handle this would be to tell them to share IPC between the containers.

docker run --ipc container:CONTAINER1UUID ...

This will cause the SELinux labels to be the same, and the IPC will work.