Bug 1330141

Summary: openonload applications fail inside containers
Product: Red Hat Enterprise Linux 7 Reporter: Jeremy Eder <jeder>
Component: docker Assignee: Daniel Walsh <dwalsh>
Status: CLOSED ERRATA QA Contact: atomic-bugs <atomic-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.3 CC: abeausol, cevich, dfrank, dwalsh, eparis, gouyang, jarod, lsm5, lvrabec, mgrepl, mmalik, plautrba, pmoore, pvrabec, rstonehouse, sdsmall, ssekidde
Target Milestone: rc Keywords: Extras
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1342930 Environment:
Last Closed: 2016-11-04 09:07:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1342930, 1342931    

Description Jeremy Eder 2016-04-25 12:50:04 UTC
Description of problem:

Trying to run openonload (a userspace networking stack) in a container.

# docker run -d label:disable --device=/dev/onload --device=/dev/onload_epoll --net=host r7perf_onload onload netserver

type=AVC msg=audit(1461588083.314:5082): avc:  denied  { ioctl } for  pid=13623 comm="netserver" path="onload:[tcp:1:3]" dev="onloadfs" ino=3135700 scontext=system_u:system_r:svirt_lxc_net_t:s0:c671,c870 tcontext=system_u:object_r:unlabeled_t:s0 tclass=sock_file

Disabling SELinux labeling for this container works as a workaround:

# docker run -d --security-opt label:disable --device=/dev/onload --device=/dev/onload_epoll --net=host r7perf_onload netserver

...

Would we be interested in adding support for this to our policy files, or would the vendor have to carry it?

#============= svirt_lxc_net_t ==============
allow svirt_lxc_net_t unlabeled_t:sock_file ioctl;
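
For readers unfamiliar with how the rule above follows from the denial, here is a minimal shell sketch that pulls the source type, target type, class, and permission out of the AVC record quoted in this report. It is illustrative only: audit2allow is the usual tool for this, the helper name `field` is made up for the example, and the sketch assumes a record with a single permission.

```shell
#!/bin/sh
# AVC record copied verbatim from the description above.
avc='type=AVC msg=audit(1461588083.314:5082): avc:  denied  { ioctl } for  pid=13623 comm="netserver" path="onload:[tcp:1:3]" dev="onloadfs" ino=3135700 scontext=system_u:system_r:svirt_lxc_net_t:s0:c671,c870 tcontext=system_u:object_r:unlabeled_t:s0 tclass=sock_file'

# Hypothetical helper: print the value of one key=value field in the record.
field() {
    printf '%s\n' "$avc" | sed -n "s/.* $1=\([^ ]*\).*/\1/p"
}

# The SELinux type is the third colon-separated part of a context.
src_type=$(field scontext | cut -d: -f3)
tgt_type=$(field tcontext | cut -d: -f3)
class=$(field tclass)

# The denied permission sits between the braces (one permission assumed).
perm=$(printf '%s\n' "$avc" | sed -n 's/.*{ \([^}]*\) }.*/\1/p')

rule="allow $src_type $tgt_type:$class $perm;"
echo "$rule"
```

The target type here is unlabeled_t, which means the object on onloadfs never received a label of its own; the rule only restates the denial and is not a recommendation.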

Comment 3 Daniel Walsh 2016-05-03 13:18:53 UTC
I think your first example is wrong.

docker run -d --security-opt label:disable --device=/dev/onload --device=/dev/onload_epoll --net=host r7perf_onload onload netserver

Your example is missing the --security-opt flag?  The process inside of the container is still running with a locked down svirt_lxc_net_t?


The problem here is that the device did not get a label that SELinux is allowed to write.

What version of docker are you using?

Comment 4 Jeremy Eder 2016-05-03 13:22:00 UTC
(In reply to Daniel Walsh from comment #3)
> I think your first example is wrong.
> 
> docker run -d --security-opt label:disable --device=/dev/onload
> --device=/dev/onload_epoll --net=host r7perf_onload onload netserver
> 
> Your example is missing the --security-opt flag?  The process inside of the
> container is still running with a locked down svirt_lxc_net_t?

Sorry, first example is just a cut/paste error on my part.

> The problem here is that the device did not get a label that SELinux is
> allowed to write.
> 
> What version of docker are you using?

docker-1.9.1-28.el7.x86_64

Comment 5 Daniel Walsh 2016-05-03 13:26:37 UTC
On the host, could you run:

ls -lZ /dev/onload*

And run the same in a container, probably in permissive mode:

 docker run -ti --device=/dev/onload --device=/dev/onload_epoll --net=host r7perf_onload onload ls -lZ /dev/onload*

Comment 6 Jeremy Eder 2016-05-03 13:55:51 UTC
root@bkr-hv10: ~ # ls -alZ /dev/onload*
crw-rw-rw-. root root unconfined_u:object_r:device_t:s0 /dev/onload
crw-rw-rw-. root root unconfined_u:object_r:device_t:s0 /dev/onload_epoll


root@bkr-hv10: ~ # setenforce 0 ; docker run -ti --device=/dev/onload --device=/dev/onload_epoll --net=host r7perf_onload onload ls -lZ /dev/onload* ; setenforce 1
crw-rw-rw-. root root system_u:object_r:svirt_sandbox_file_t:s0:c35,c543 /dev/onload
crw-rw-rw-. root root system_u:object_r:svirt_sandbox_file_t:s0:c35,c543 /dev/onload_epoll
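
To make the comparison explicit, the following sketch extracts the SELinux type from one host line and one container line of the output above. It assumes the RHEL 7 `ls -lZ` layout shown here, where the context is the fourth whitespace-separated field; the function name `selinux_type` is invented for the example.

```shell
#!/bin/sh
# Sample lines copied from the output above (host first, then container).
host_line='crw-rw-rw-. root root unconfined_u:object_r:device_t:s0 /dev/onload'
ctr_line='crw-rw-rw-. root root system_u:object_r:svirt_sandbox_file_t:s0:c35,c543 /dev/onload'

# Hypothetical helper: field 4 is the context, and its third
# colon-separated part is the type.
selinux_type() {
    printf '%s\n' "$1" | awk '{ split($4, ctx, ":"); print ctx[3] }'
}

host_type=$(selinux_type "$host_line")
ctr_type=$(selinux_type "$ctr_line")
echo "host: $host_type  container: $ctr_type"
```

Inside the container the device nodes carry svirt_sandbox_file_t rather than unlabeled_t, so the denied object in the AVC record is not one of these two device nodes.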

Comment 7 Daniel Walsh 2016-05-03 14:21:54 UTC
So the container stuff is doing the right thing; the problem here is that these devices have no labels.  So this is an SELinux issue.  It looks like we need some support for labeling the "onloadfs" file system?

Comment 8 Daniel Walsh 2016-05-03 14:23:27 UTC
type=AVC msg=audit(1461588083.314:5082): avc:  denied  { ioctl } for  pid=13623 comm="netserver" path="onload:[tcp:1:3]" dev="onloadfs" ino=3135700 scontext=system_u:system_r:svirt_lxc_net_t:s0:c671,c870 tcontext=system_u:object_r:unlabeled_t:s0 tclass=sock_file

Since these devices have labels, could this be a kernel issue?  How does one get a label on this sock_file?

Comment 9 Jeremy Eder 2016-05-03 14:40:00 UTC
onloadfs is created by an out-of-tree kernel module:

https://github.com/jeremyeder/openonload/blob/af9408f6df1bb19659f70661cfeb58a981b4a053/src/driver/linux_onload/onloadfs.c

This is why I am wondering if we need to ask Solarflare to carry the labeling patch in their distribution.  The other thing is that most of those customers disable SELinux, but they'd probably be receptive or at least understanding.

I've cc'd Davor Frank from Solarflare on this BZ.

Comment 10 Stephen Smalley 2016-05-03 14:43:41 UTC
Add a genfscon onloadfs entry to policy?

Comment 11 Daniel Walsh 2016-05-03 15:32:47 UTC
We could try that and see if it fixes the issue.  Getting an updated policy into RHEL could be difficult, but Miroslav could probably throw together a scratch build.

Comment 12 Stephen Smalley 2016-05-03 15:38:28 UTC
Looks like it should work based on the code referenced above, which calls d_instantiate(), which SELinux already hooks (security_d_instantiate -> selinux_d_instantiate) to set the incore inode security label.  Obviously you'll need to allow access from svirt_lxc_net_t to whatever type is assigned to onloadfs inodes.

Comment 13 Daniel Walsh 2016-05-03 17:32:13 UTC
Sure.  As long as we don't have to give access to unlabeled_t.

Comment 14 Daniel Walsh 2016-06-03 18:55:40 UTC
Added policy for defining onloadfs.

https://github.com/fedora-selinux/selinux-policy/pull/129


Once this is merged, I will add the policy to docker-selinux.

Comment 15 Daniel Walsh 2016-06-07 11:42:29 UTC
Looks like it got merged into rawhide policy.

Lukas, can we get this backported to RHEL 7?

Comment 16 Lukas Vrabec 2016-06-07 11:51:38 UTC
Yes.

The fix will be in: selinux-policy-3.13.1-77.el7

BZ from selinux-policy side:
https://bugzilla.redhat.com/show_bug.cgi?id=1342930

Comment 19 Daniel Walsh 2016-09-23 12:39:37 UTC
Looks like this should be fixed in 7.3.

Comment 20 Chris Evich 2016-09-27 18:56:12 UTC
Note: I have no idea whether the policy actually works as reported, but it is defined in the latest 7.3 build (snap 5):

# rpm -q selinux-policy
selinux-policy-3.13.1-100.el7.noarch

# strings /etc/selinux/targeted/policy/policy.30 | grep -i onload
onload_fs_t
onloadfs
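
The same check can be wrapped in a small script, sketched below under a few assumptions: the policy path is the one used in the command above, the function name `policy_defines` is made up for the example, and a match only shows that the identifiers are present in the binary policy, not that the labeling works at runtime.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every identifier given after the
# policy file name appears somewhere in that file.
policy_defines() {
    policy=$1
    shift
    for name in "$@"; do
        # -a treats the binary file as text; -F matches the name literally.
        grep -aqF -- "$name" "$policy" || return 1
    done
}

# Default path taken from the comment above; override with POLICY=... if needed.
POLICY=${POLICY:-/etc/selinux/targeted/policy/policy.30}
if [ -r "$POLICY" ]; then
    if policy_defines "$POLICY" onload_fs_t onloadfs; then
        echo "onloadfs labeling is defined in $POLICY"
    else
        echo "onloadfs labeling is missing from $POLICY"
    fi
fi
```

On a system with setools installed, `seinfo --genfscon` should give a more precise view of the genfscon entries than a string search.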

Comment 21 errata-xmlrpc 2016-11-04 09:07:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2634.html