Bug 1373658 - OpenShift cannot start with overlay graph driver and SELinux [NEEDINFO]
Summary: OpenShift cannot start with overlay graph driver and SELinux
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: docker
Version: rawhide
Hardware: All
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Lokesh Mandvekar
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-09-06 21:52 UTC by Jonathan Yu
Modified: 2016-12-09 23:48 UTC (History)
CC List: 16 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-09 23:26:12 UTC
Type: Bug
dwalsh: needinfo? (lsm5)


Attachments
systemctl status docker (2.82 KB, text/plain)
2016-09-07 12:45 UTC, Daniel Walsh

Description Jonathan Yu 2016-09-06 21:52:19 UTC
Description of problem:

When choosing the overlay2 graph driver and using SELinux in enforcing mode, pods cannot start.

Version-Release number of selected component (if applicable):

Fedora rawhide
docker-1.12
overlay2 storage driver

docker info:

Containers: 18
 Running: 0
 Paused: 0
 Stopped: 18
Images: 35
Server Version: 1.12.1
Storage Driver: overlay2
 Backing Filesystem: extfs
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
 Volume: local
 Network: bridge null host overlay
Swarm: inactive
Runtimes: oci runc
Default Runtime: oci
Security Options: seccomp selinux
Kernel Version: 4.8.0-0.rc4.git1.1.fc26.x86_64
Operating System: Fedora 26 (Rawhide)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 2
Total Memory: 3.838 GiB
Name: limelight
ID: FJCX:NNCM:NYFN:E552:IDKE:6TJU:65J2:DLVN:SYB3:JYMZ:DEAU:ARWA
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
 172.30.0.0/16
 127.0.0.0/8
Registries: docker.io (secure)

How reproducible: At-will

Steps to Reproduce:

1. Enable the overlay2 driver (in /etc/sysconfig/docker-storage):
DOCKER_STORAGE_OPTIONS='--storage-driver=overlay2'
2. Enable the insecure registry (in /etc/sysconfig/docker):
INSECURE_REGISTRY='--insecure-registry 172.30.0.0/16'
3. Ensure SELinux is enabled (should be on by default, in /etc/sysconfig/docker):
OPTIONS='--selinux-enabled --log-driver=journald'
4. Download the latest release binaries from https://github.com/openshift/origin/releases
5. Run "oc cluster up"
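
Before running step 5, it can help to confirm that the daemon really is in the configuration described here. The following is only a sketch: info_field and repro_ready are hypothetical helper names, and the parsing assumes the "Name: value" layout of the docker info output shown above.

```shell
#!/bin/sh
# info_field NAME: read `docker info` output on stdin and print the value
# of the first top-level "NAME: value" line (hypothetical helper).
info_field() {
    sed -n "s/^$1: //p" | head -n 1
}

# repro_ready DRIVER MODE: succeed only when the storage driver is overlay2
# and SELinux is enforcing, i.e. the combination this report is about.
repro_ready() {
    [ "$1" = "overlay2" ] && [ "$2" = "Enforcing" ]
}
```

Possible usage, assuming docker and getenforce are on the PATH:

repro_ready "$(docker info 2>/dev/null | info_field 'Storage Driver')" "$(getenforce)" && oc cluster up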

Actual results:

The main Web UI components start, but "docker ps" shows crashed containers for the registry and router pods. When logged in as system:admin, "oc get pods --all-namespaces" shows the following:

$ oc get pods --all-namespaces
NAMESPACE   NAME                       READY     STATUS               RESTARTS   AGE
default     docker-registry-1-deploy   0/1       ContainerCannotRun   0          1m

The audit logs show:

type=AVC msg=audit(1472846505.099:618): avc:  denied  { entrypoint } for  pid=12603 comm="exe" path="/pod" dev="overlay" ino=101594 scontext=system_u:system_r:svirt_lxc_net_t:s0:c236,c793 tcontext=system_u:object_r:docker_var_lib_t:s0 tclass=file permissive=0

or in permissive (setenforce 0) mode:

type=AVC msg=audit(1472846701.727:662): avc:  denied  { entrypoint } for  pid=12870 comm="exe" path="/pod" dev="overlay" ino=102520 scontext=system_u:system_r:svirt_lxc_net_t:s0:c59,c580 tcontext=system_u:object_r:docker_var_lib_t:s0 tclass=file permissive=1
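
For triage, the source and target contexts can be pulled out of denial records like the ones above with a small helper. This is a sketch only: avc_field and avc_type are hypothetical names, not part of any audit tool, and they assume the space-separated key=value layout seen in these records.

```shell
#!/bin/sh
# avc_field KEY: read AVC records on stdin and print the value of KEY
# (e.g. scontext, tcontext, tclass) for each matching record,
# with any double quotes stripped. Hypothetical helper.
avc_field() {
    key=$1
    sed -n "s/.* ${key}=\([^ ]*\).*/\1/p" | tr -d '"'
}

# avc_type CONTEXT: print the type, i.e. the third colon-separated
# field of an SELinux context such as user:role:type:level.
avc_type() {
    printf '%s\n' "$1" | cut -d: -f3
}
```

Possible usage, assuming the audit tools are installed:

ausearch -m avc -ts recent | avc_field tcontext

For the records above, the target type is docker_var_lib_t and the source type is svirt_lxc_net_t.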

Expected results:

The application should start correctly.

Additional info:

The Dockerfile of the failing pod is here: https://github.com/openshift/origin/tree/master/images/pod

You can also manually reproduce by running (origin-pod is on DockerHub):

$ docker run --rm -it openshift/origin-pod:v1.3.0-alpha.3
standard_init_linux.go:175: exec user process caused "permission denied"

Comment 1 Mrunal Patel 2016-09-06 21:54:14 UTC
Dan,
/pod is a simple Go binary that just waits for a signal and then exits, like the Kubernetes pause container - https://github.com/openshift/origin/blob/master/images/pod/pod.go

Comment 2 Daniel Walsh 2016-09-07 12:44:24 UTC
This is working for me on Rawhide.

# docker run --rm -it openshift/origin-pod:v1.3.0-alpha.3

# uname -r
4.8.0-0.rc4.git1.1.fc26.x86_64

# docker info
Containers: 52
 Running: 1
 Paused: 0
 Stopped: 51
Images: 57
Server Version: 1.12.1
Storage Driver: overlay
 Backing Filesystem: extfs
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
 Volume: local
 Network: host bridge null overlay
Swarm: inactive
Runtimes: oci runc
Default Runtime: oci
Security Options: seccomp selinux
Kernel Version: 4.8.0-0.rc4.git1.1.fc26.x86_64
Operating System: Fedora 26 (Workstation Edition)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 4
Total Memory: 7.478 GiB
Name: dhcp-10-19-62-196.boston.devel.redhat.com
ID: QCJD:BQVE:IUG3:CFBA:4EKW:A3RD:JUR2:7VOG:XWP6:2ELL:KMIY:E5JM
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://atomic-registry.usersys.redhat.com/v1/
Insecure Registries:
 atomic-registry.usersys.redhat.com:5000
 registry.access.stage.redhat.com
 atomic-registry.usersys.redhat.com
 127.0.0.0/8

Comment 3 Daniel Walsh 2016-09-07 12:45:06 UTC
Created attachment 1198718
systemctl status docker

Comment 4 Vivek Goyal 2016-09-07 13:47:18 UTC
Dan, the original report seems to be about overlay2. Can you try using overlay2 instead of overlay?

Comment 5 Vivek Goyal 2016-09-07 20:49:26 UTC
Dan and I were debugging this and I think we found the root cause. The Docker SELinux policy needs to be changed to label /var/lib/docker/overlay2 as docker_share_t instead of docker_var_lib_t. Once that change is in place, this should be fixed.

In the meantime, you can make this change manually on your system and see whether you can make progress.

# cd /var/lib/docker/
# chcon -t docker_share_t -R overlay2
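
To check that the relabel took effect, the type on the directory can be compared against the expected one. A minimal sketch, assuming GNU stat (whose %C format prints the SELinux context); type_matches and path_has_type are hypothetical helper names.

```shell
#!/bin/sh
# type_matches CONTEXT TYPE: succeed if the type (third colon-separated
# field) of the SELinux context string equals TYPE.
type_matches() {
    [ "$(printf '%s\n' "$1" | cut -d: -f3)" = "$2" ]
}

# path_has_type PATH TYPE: succeed if PATH exists and its SELinux
# context carries TYPE. Relies on GNU stat's %C format.
path_has_type() {
    ctx=$(stat -c %C "$1" 2>/dev/null) || return 1
    type_matches "$ctx" "$2"
}
```

Possible usage:

path_has_type /var/lib/docker/overlay2 docker_share_t && echo "overlay2 is labeled docker_share_t"

Note that labels set with chcon do not survive a restorecon or a full relabel, so this is a stopgap until the policy itself is updated.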

Comment 6 Jonathan Yu 2016-09-07 21:01:44 UTC
Vivek,

Thanks! I gave it a try and it worked.

Here's the default (has permission problems):

# ls -alZ .
total 52K
drwx--x--x. 11 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep  7 13:54 .
drwxr-xr-x. 24 root root system_u:object_r:var_lib_t:s0        4.0K Sep  2 07:54 ..
drwx------.  8 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep  7 13:50 containers
drwx------.  4 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 devicemapper
drwx------.  5 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep  7 10:45 image
drwxr-x---.  3 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 network
drwx------.  3 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep  7 13:54 overlay2
drwx------.  2 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 swarm
drwx------.  2 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep  7 10:45 tmp
drwx------.  2 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 trust
drwx------. 88 root root system_u:object_r:docker_var_lib_t:s0  12K Sep  6 14:49 volumes

After running chcon and restarting, everything works:

# chcon -t docker_share_t -R /var/lib/docker/overlay2
# ls -alZ
total 52K
drwx--x--x. 11 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep  7 13:54 .
drwxr-xr-x. 24 root root system_u:object_r:var_lib_t:s0        4.0K Sep  2 07:54 ..
drwx------.  8 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep  7 13:50 containers
drwx------.  4 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 devicemapper
drwx------.  5 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep  7 10:45 image
drwxr-x---.  3 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 network
drwx------.  3 root root system_u:object_r:docker_share_t:s0   4.0K Sep  7 13:54 overlay2
drwx------.  2 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 swarm
drwx------.  2 root root system_u:object_r:docker_var_lib_t:s0 4.0K Sep  7 10:45 tmp
drwx------.  2 root root system_u:object_r:docker_var_lib_t:s0 4.0K Aug 31 13:10 trust
drwx------. 88 root root system_u:object_r:docker_var_lib_t:s0  12K Sep  6 14:49 volumes
# docker run --rm -it openshift/origin-pod:v1.3.0-alpha.3
<process hangs, no error output - as expected>

Changing the security context back makes the problem reproducible once again:

# chcon -t docker_var_lib_t -R /var/lib/docker/overlay2
# docker run --rm -it openshift/origin-pod:v1.3.0-alpha.3
standard_init_linux.go:175: exec user process caused "permission denied"

Comment 7 Daniel Walsh 2016-09-07 21:26:05 UTC
Pushed a fix to docker-selinux on github.

commit 346ed1d81aee0b85613635a041de2ed78d4ef6a4
Author: Dan Walsh <dwalsh>
Date:   Wed Sep 7 16:25:12 2016 -0400

    Label /var/lib/docker/overlay2 as docker_share_t


I have asked Lokesh to build a new docker-selinux package from master for F24, F25 and Rawhide.

Comment 8 Dusty Mabe 2016-12-09 23:26:12 UTC
It looks like this has made it into F25. I'm going to close this bug.

Comment 9 Dusty Mabe 2016-12-09 23:48:55 UTC
FYI, I opened a new bug with some details on a new issue that is happening in F25: https://bugzilla.redhat.com/show_bug.cgi?id=1403398

