Description of problem:
When choosing the overlay2 graph driver and using SELinux in enforcing mode, pods cannot start.

Version-Release number of selected component (if applicable):
Fedora rawhide
docker-1.12
overlay2 storage driver

docker info:
Containers: 18
 Running: 0
 Paused: 0
 Stopped: 18
Images: 35
Server Version: 1.12.1
Storage Driver: overlay2
 Backing Filesystem: extfs
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
 Volume: local
 Network: bridge null host overlay
Swarm: inactive
Runtimes: oci runc
Default Runtime: oci
Security Options: seccomp selinux
Kernel Version: 4.8.0-0.rc4.git1.1.fc26.x86_64
Operating System: Fedora 26 (Rawhide)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 2
Total Memory: 3.838 GiB
Name: limelight
ID: FJCX:NNCM:NYFN:E552:IDKE:6TJU:65J2:DLVN:SYB3:JYMZ:DEAU:ARWA
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
 172.30.0.0/16
 127.0.0.0/8
Registries: docker.io (secure)

How reproducible:
At will

Steps to Reproduce:
1. Enable the overlay2 driver (in /etc/sysconfig/docker-storage):
   DOCKER_STORAGE_OPTIONS= --storage-driver=overlay2
2. Enable insecure-registry (in /etc/sysconfig/docker):
   INSECURE_REGISTRY='--insecure-registry 172.30.0.0/16'
3. Ensure SELinux is enabled (should be on by default, in /etc/sysconfig/docker):
   OPTIONS='--selinux-enabled --log-driver=journald'
4. Download the latest release binaries from https://github.com/openshift/origin/releases
5. Run "oc cluster up"

Actual results:
The main Web UI components start, but "docker ps" shows crashed containers for the registry and router pods.

When logged in as system:admin, "oc get pods --all-namespaces" shows the following:

$ oc get pods --all-namespaces
NAMESPACE   NAME                       READY     STATUS               RESTARTS   AGE
default     docker-registry-1-deploy   0/1       ContainerCannotRun   0          1m

The audit log shows:

type=AVC msg=audit(1472846505.099:618): avc: denied { entrypoint } for pid=12603 comm="exe" path="/pod" dev="overlay" ino=101594 scontext=system_u:system_r:svirt_lxc_net_t:s0:c236,c793 tcontext=system_u:object_r:docker_var_lib_t:s0 tclass=file permissive=0

or, in permissive (setenforce 0) mode:

type=AVC msg=audit(1472846701.727:662): avc: denied { entrypoint } for pid=12870 comm="exe" path="/pod" dev="overlay" ino=102520 scontext=system_u:system_r:svirt_lxc_net_t:s0:c59,c580 tcontext=system_u:object_r:docker_var_lib_t:s0 tclass=file permissive=1

Expected results:
The application should start correctly.

Additional info:
The Dockerfile of the failing pod is here: https://github.com/openshift/origin/tree/master/images/pod

You can also reproduce manually by running the following (origin-pod is on DockerHub):

$ docker run --rm -it openshift/origin-pod:v1.3.0-alpha.3
standard_init_linux.go:175: exec user process caused "permission denied"
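When triaging a denial like the ones above, the fields that matter most are the denied permission, the source context (scontext) and the target context (tcontext). The following is a minimal sketch of pulling those out of an AVC record with plain shell tools. The helper names avc_field and avc_type are made up for illustration and are not part of any audit tooling; the sample record is the enforcing-mode one from this report.

```shell
#!/bin/sh
# avc_field KEY RECORD: print the value of KEY=value from an AVC audit record.
# Hypothetical helper for illustration only.
avc_field() {
    printf '%s\n' "$2" | sed -n "s/.*[[:space:]]$1=\([^[:space:]]*\).*/\1/p"
}

# avc_type CONTEXT: print the type, i.e. the third colon-separated field
# of a context of the form user:role:type:level.
avc_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Enforcing-mode record copied from this report.
record='type=AVC msg=audit(1472846505.099:618): avc: denied { entrypoint } for pid=12603 comm="exe" path="/pod" dev="overlay" ino=101594 scontext=system_u:system_r:svirt_lxc_net_t:s0:c236,c793 tcontext=system_u:object_r:docker_var_lib_t:s0 tclass=file permissive=0'

src=$(avc_type "$(avc_field scontext "$record")")
tgt=$(avc_type "$(avc_field tcontext "$record")")
echo "source type: $src"
echo "target type: $tgt"
```

For this record the source type is svirt_lxc_net_t and the target type is docker_var_lib_t: the container process is denied entrypoint on a file that carries the label of the docker state directory.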
Dan, /pod is a simple Go binary that just waits for a signal and exits, like the Kubernetes pause container: https://github.com/openshift/origin/blob/master/images/pod/pod.go
This is working for me on Rawhide.

docker run --rm -it openshift/origin-pod:v1.3.0-alpha.3

# uname -r
4.8.0-0.rc4.git1.1.fc26.x86_64

# docker info
Containers: 52
 Running: 1
 Paused: 0
 Stopped: 51
Images: 57
Server Version: 1.12.1
Storage Driver: overlay
 Backing Filesystem: extfs
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
 Volume: local
 Network: host bridge null overlay
Swarm: inactive
Runtimes: oci runc
Default Runtime: oci
Security Options: seccomp selinux
Kernel Version: 4.8.0-0.rc4.git1.1.fc26.x86_64
Operating System: Fedora 26 (Workstation Edition)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 4
Total Memory: 7.478 GiB
Name: dhcp-10-19-62-196.boston.devel.redhat.com
ID: QCJD:BQVE:IUG3:CFBA:4EKW:A3RD:JUR2:7VOG:XWP6:2ELL:KMIY:E5JM
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://atomic-registry.usersys.redhat.com/v1/
Insecure Registries:
 atomic-registry.usersys.redhat.com:5000
 registry.access.stage.redhat.com
 atomic-registry.usersys.redhat.com
 127.0.0.0/8
Created attachment 1198718
systemctl status docker
Dan, the original report seems to be about overlay2. Can you try using overlay2 instead of overlay?
Dan and I were debugging this and I think we found the root cause. The docker SELinux policy needs to be changed to label /var/lib/docker/overlay2 as docker_share_t instead of docker_var_lib_t. Once that change is in place, this should be fixed. In the meantime, you can make this change manually on your system and see if you can make progress.

# cd /var/lib/docker/
# chcon -t docker_share_t -R overlay2
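Note that chcon changes the label on disk but does not record a rule, so a later restorecon or full relabel can revert it. Until the packaged policy carries the fix, one possible way to make the workaround stick is a local file-context rule added with semanage and applied with restorecon. The sketch below only prints the commands by default and executes them when APPLY=1 is set. The function name relabel_overlay2 and the APPLY variable are illustrative names, and the path pattern is an assumption, not something taken from the docker-selinux policy itself.

```shell
#!/bin/sh
# Sketch: persist the docker_share_t label with a local file-context rule.
# relabel_overlay2 and APPLY are hypothetical names used for illustration.
relabel_overlay2() {
    dir=${1:-/var/lib/docker/overlay2}
    for cmd in \
        "semanage fcontext -a -t docker_share_t '${dir}(/.*)?'" \
        "restorecon -R -v '${dir}'"
    do
        if [ "${APPLY:-0}" = 1 ]; then
            sh -c "$cmd"      # needs root plus the semanage and restorecon tools
        else
            echo "$cmd"       # dry run: only show what would be executed
        fi
    done
}

relabel_overlay2
```

Run as-is, this prints the two commands for review. Once a docker-selinux package with the corrected labeling is installed, the local rule should no longer be needed.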
Vivek,

Thanks! I gave it a try and it worked.

Here's the default (has permission problems):

# ls -alZ .
total 52K
drwx--x--x. 11 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep 7 13:54 .
drwxr-xr-x. 24 root root system_u:object_r:var_lib_t:s0 4.0K Sep 2 07:54 ..
drwx------. 8 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep 7 13:50 containers
drwx------. 4 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 devicemapper
drwx------. 5 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep 7 10:45 image
drwxr-x---. 3 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 network
drwx------. 3 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep 7 13:54 overlay2
drwx------. 2 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 swarm
drwx------. 2 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep 7 10:45 tmp
drwx------. 2 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 trust
drwx------. 88 root root system_u:object_r:docker_var_lib_t:s0 12K Sep 6 14:49 volumes

After running chcon and restarting, everything works:

# chcon -t docker_share_t -R /var/lib/docker/overlay2
# ls -alZ
total 52K
drwx--x--x. 11 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep 7 13:54 .
drwxr-xr-x. 24 root root system_u:object_r:var_lib_t:s0 4.0K Sep 2 07:54 ..
drwx------. 8 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep 7 13:50 containers
drwx------. 4 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 devicemapper
drwx------. 5 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep 7 10:45 image
drwxr-x---. 3 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 network
drwx------. 3 root root system_u:object_r:docker_share_t:s0 4.0K Sep 7 13:54 overlay2
drwx------. 2 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 swarm
drwx------. 2 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep 7 10:45 tmp
drwx------. 2 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 trust
drwx------. 88 root root system_u:object_r:docker_var_lib_t:s0 12K Sep 6 14:49 volumes

# docker run --rm -it openshift/origin-pod:v1.3.0-alpha.3
<process hangs, no error output - as expected>

Changing the security context back makes the problem reproducible once again:

# chcon -t docker_var_lib_t -R /var/lib/docker/overlay2
# docker run --rm -it openshift/origin-pod:v1.3.0-alpha.3
standard_init_linux.go:175: exec user process caused "permission denied"
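To confirm which label a given entry ended up with without reading the whole listing, the type can be picked out of the listing text. This is a small sketch that assumes the column layout shown in this comment (context in the fifth column, entry name last); label_of is a made-up helper name, and the sample lines are copied from the listing taken after chcon.

```shell
#!/bin/sh
# label_of NAME: read `ls -alZ`-style lines on stdin and print the SELinux
# type of the entry called NAME. Hypothetical helper; assumes the context
# is the fifth whitespace-separated column and the name is the last one.
label_of() {
    awk -v name="$1" '$NF == name { split($5, ctx, ":"); print ctx[3] }'
}

# Two lines copied from the listing taken after chcon.
listing='drwx------. 4 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 devicemapper
drwx------. 3 root root system_u:object_r:docker_share_t:s0 4.0K Sep 7 13:54 overlay2'

printf '%s\n' "$listing" | label_of overlay2
```

On a live system the input would come from "ls -alZ /var/lib/docker" rather than the sample text.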
Pushed a fix to docker-selinux on GitHub.

commit 346ed1d81aee0b85613635a041de2ed78d4ef6a4
Author: Dan Walsh <dwalsh>
Date:   Wed Sep 7 16:25:12 2016 -0400

    Label /var/lib/docker/overlay2 as docker_share_t

I have asked Lokesh to build a new docker-selinux package from master for F24, F25 and Rawhide.
It looks like this has made it into F25. I'm going to close this bug.
FYI, I opened a new bug with some detail on a new issue that is happening in F25: https://bugzilla.redhat.com/show_bug.cgi?id=1403398