Red Hat Bugzilla – Bug 1330141
openonload applications fail inside containers
Last modified: 2016-11-04 05:07:50 EDT
Description of problem:

Trying to run openonload (a userspace networking stack) in a container.

# docker run -d label:disable --device=/dev/onload --device=/dev/onload_epoll --net=host r7perf_onload onload netserver

type=AVC msg=audit(1461588083.314:5082): avc: denied { ioctl } for pid=13623 comm="netserver" path="onload:[tcp:1:3]" dev="onloadfs" ino=3135700 scontext=system_u:system_r:svirt_lxc_net_t:s0:c671,c870 tcontext=system_u:object_r:unlabeled_t:s0 tclass=sock_file

Disabling selinux for this container as a workaround.

# docker run -d --security-opt label:disable --device=/dev/onload --device=/dev/onload_epoll --net=host r7perf_onload netserver ...

Would we be interested in adding support for this to our policy files or would the vendor have to carry this?

#============= svirt_lxc_net_t ==============
allow svirt_lxc_net_t unlabeled_t:sock_file ioctl;
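For readers unfamiliar with how the allow rule at the end of the description relates to the AVC denial, here is a minimal sketch in plain shell of the field mapping. The function name avc_to_allow is hypothetical and this is not how audit2allow is implemented; it only illustrates that the rule is built from the scontext type, the tcontext type, the tclass, and the denied permission.

```shell
#!/bin/sh
# Hypothetical helper (illustration only, not part of any SELinux tool):
# build an allow rule from a single AVC denial line.
avc_to_allow() {
    line=$1
    # Permissions listed between the braces after "denied", e.g. "ioctl".
    perms=$(printf '%s\n' "$line" | sed -n 's/.*denied *{ *\([^}]*[^ }]\) *}.*/\1/p')
    # The type is the third colon-separated field of each context.
    stype=$(printf '%s\n' "$line" | sed -n 's/.* scontext=[^:]*:[^:]*:\([^: ]*\).*/\1/p')
    ttype=$(printf '%s\n' "$line" | sed -n 's/.* tcontext=[^:]*:[^:]*:\([^: ]*\).*/\1/p')
    tclass=$(printf '%s\n' "$line" | sed -n 's/.* tclass=\([^ ]*\).*/\1/p')
    # Several permissions must be wrapped in braces in policy syntax.
    case $perms in
        *' '*) perms="{ $perms }" ;;
    esac
    printf 'allow %s %s:%s %s;\n' "$stype" "$ttype" "$tclass" "$perms"
}

# Usage with the denial from this report.
avc_to_allow 'type=AVC msg=audit(1461588083.314:5082): avc: denied { ioctl } for pid=13623 comm="netserver" path="onload:[tcp:1:3]" dev="onloadfs" ino=3135700 scontext=system_u:system_r:svirt_lxc_net_t:s0:c671,c870 tcontext=system_u:object_r:unlabeled_t:s0 tclass=sock_file'
```

The target type in the resulting rule is unlabeled_t, which is why granting it directly is undesirable: it would apply to every unlabeled sock_file, not just those on onloadfs.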
I think your first example is wrong.

docker run -d --security-opt label:disable --device=/dev/onload --device=/dev/onload_epoll --net=host r7perf_onload onload netserver

Your example is missing the --security-opt flag? The process inside of the container is still running with a locked down svirt_lxc_net_t?

The problem here is that the device did not get a label that SELinux is allowed to write.

What version of docker are you using?
(In reply to Daniel Walsh from comment #3)
> I think your first example is wrong.
>
> docker run -d --security-opt label:disable --device=/dev/onload
> --device=/dev/onload_epoll --net=host r7perf_onload onload netserver
>
> Your example is missing the --security-opt flag? The process inside of the
> container is still running with a locked down svirt_lxc_net_t?

Sorry, first example is just a cut/paste error on my part.

> The problem here is that the device did not get a label that SELinux is
> allowed to write.
>
> What version of docker are you using?

docker-1.9.1-28.el7.x86_64
On the host, could you run:

ls -lZ /dev/onload*

And run it in a container as well, probably in permissive mode:

docker run -ti --device=/dev/onload --device=/dev/onload_epoll --net=host r7perf_onload onload ls -lZ /dev/onload*
root@bkr-hv10: ~ # ls -alZ /dev/onload*
crw-rw-rw-. root root unconfined_u:object_r:device_t:s0 /dev/onload
crw-rw-rw-. root root unconfined_u:object_r:device_t:s0 /dev/onload_epoll

root@bkr-hv10: ~ # setenforce 0 ; docker run -ti --device=/dev/onload --device=/dev/onload_epoll --net=host r7perf_onload onload ls -lZ /dev/onload* ; setenforce 1
crw-rw-rw-. root root system_u:object_r:svirt_sandbox_file_t:s0:c35,c543 /dev/onload
crw-rw-rw-. root root system_u:object_r:svirt_sandbox_file_t:s0:c35,c543 /dev/onload_epoll
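To compare the host and container listings at a glance, a small sketch like the following pulls just the SELinux type out of each ls -Z line. The function name ls_z_types is hypothetical, and the parsing assumes only that the security context is the one field with at least four colon-separated parts and that the path is the last field, as in the output shown in this report.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): read `ls -lZ` lines on stdin
# and print "<path> <selinux type>" for each entry.
ls_z_types() {
    awk '{
        for (i = 1; i <= NF; i++) {
            # A security context looks like user:role:type:level[:categories].
            if (split($i, ctx, ":") >= 4) {
                print $NF, ctx[3]
                next
            }
        }
    }'
}

# Usage on the host (inside a container, run the same pipeline there):
#   ls -lZ /dev/onload* | ls_z_types
```

Applied to the listings above, this would show device_t on the host and svirt_sandbox_file_t in the container, which suggests the device nodes themselves are relabeled as expected and the denial concerns a different object.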
So the containers stuff is doing the right thing, the problem here is these devices have no labels. So this is an SELinux issue. It looks like we need some support for labeling the "onloadfs" file system?
type=AVC msg=audit(1461588083.314:5082): avc: denied { ioctl } for pid=13623 comm="netserver" path="onload:[tcp:1:3]" dev="onloadfs" ino=3135700 scontext=system_u:system_r:svirt_lxc_net_t:s0:c671,c870 tcontext=system_u:object_r:unlabeled_t:s0 tclass=sock_file

Since these devices have labels, could this be a kernel issue? How does one get a label on this sock_file?
onloadfs is created by an out-of-tree kernel module:

https://github.com/jeremyeder/openonload/blob/af9408f6df1bb19659f70661cfeb58a981b4a053/src/driver/linux_onload/onloadfs.c

This is why I am wondering if we need to ask Solarflare to carry the labeling patch in their distribution. The other thing is that most of those customers disable SELinux, but they'd probably be receptive or at least understanding. I've cc'd Davor Frank from Solarflare on this BZ.
Add a genfscon onloadfs entry to policy?
We could try that and see if it fixes the issue. Getting an updated policy into RHEL could be difficult, but Miroslav could probably throw together a scratch build.
Looks like it should work based on the code referenced above, which calls d_instantiate(), which SELinux already hooks (security_d_instantiate -> selinux_d_instantiate) to set the incore inode security label. Obviously you'll need to allow access from svirt_lxc_net_t to whatever type is assigned to onloadfs inodes.
Sure. As long as we don't have to give access to unlabeled_t.
Added policy for defining onloadfs.

https://github.com/fedora-selinux/selinux-policy/pull/129

Once this is merged, I will add the policy to docker-selinux.
Looks like it got merged into rawhide policy. Lukas, can we get this backported to RHEL 7?
Yes. The fix will be in: selinux-policy-3.13.1-77.el7

BZ from selinux-policy side: https://bugzilla.redhat.com/show_bug.cgi?id=1342930
Looks like this should be fixed in 7.3.
Note: No idea whether or not the policy actually works as reported, but it is defined in the latest 7.3 build (snap 5):

# rpm -q selinux-policy
selinux-policy-3.13.1-100.el7.noarch

# strings /etc/selinux/targeted/policy/policy.30 | grep -i onload
onload_fs_t
onloadfs
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2634.html