Bug 1375661

Summary: entrypoint script name 'init' causes debian images to mount /run as tmpfs
Product: [Fedora] Fedora
Reporter: Ulf Seltmann <ulf.seltmann>
Component: oci-systemd-hook
Assignee: Mrunal Patel <mpatel>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 24
CC: adimania, admiller, amurdaca, dwalsh, ichavero, jcajka, jchaloup, lsm5, marianne, miminar, nalin, ppyy, riek, ulf.seltmann, vbatts
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-10-18 15:08:55 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Ulf Seltmann 2016-09-13 15:44:51 UTC
When using entrypoint=/init (from the Dockerfile), /run is mounted by default as a tmpfs, which interferes with custom mounts like /var/run/mysqld, which points to /run/mysqld.

This does not happen if one overrides the entrypoint; see below for the output.

Steps to reproduce the issue:
1. git clone --single-branch -b run-creation-bug ssh://git/useltmann/docker-dev-dotdeb.git run-creation-bug
2. cd run-creation-bug; docker build -t run-creation-bug . 
3. docker run --rm -ti run-creation-bug mount | grep run

/dev/mapper/fedora-root on /run/secrets type ext4 (rw,relatime,seclabel,data=ordered)
tmpfs on /run type tmpfs (rw,nosuid,nodev,noexec,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c232,c383",size=65536k,mode=755)
Describe the results you received:
A tmpfs is mounted on /run.


docker run --rm -ti --entrypoint=mount run-creation-bug | grep run
/dev/mapper/fedora-root on /run/secrets type ext4 (rw,relatime,seclabel,data=ordered)
Here no tmpfs is mounted on /run, which is what I would personally prefer as the expected behaviour.
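
Note that `grep run` in the commands above also matches /run/secrets, so the two outputs have to be compared by eye. A minimal sketch of a stricter check follows; `has_tmpfs_on_run` is a hypothetical helper name, not part of docker or oci-systemd-hook, and it assumes the usual `DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)` layout of `mount` output seen in this report.

```shell
# Hypothetical helper: reads `mount` output on stdin and succeeds only if
# a tmpfs is mounted exactly on /run (not on /run/secrets or similar).
has_tmpfs_on_run() {
    awk '$2 == "on" && $3 == "/run" && $5 == "tmpfs" { found = 1 }
         END { exit !found }'
}

# Possible usage against the image from this report (not executed here):
#   docker run --rm run-creation-bug mount | has_tmpfs_on_run && echo "tmpfs on /run"
```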

Output of docker version:

Client:
 Version:         1.10.3
 API version:     1.22
 Package version: docker-1.10.3-50.gita612434.fc24.x86_64
 Go version:      go1.6.3
 Git commit:      a612434/1.10.3
 Built:           
 OS/Arch:         linux/amd64

Server:
 Version:         1.10.3
 API version:     1.22
 Package version: docker-1.10.3-50.gita612434.fc24.x86_64
 Go version:      go1.6.3
 Git commit:      a612434/1.10.3
 Built:           
 OS/Arch:         linux/amd64

Output of docker info:

Containers: 20
 Running: 2
 Paused: 0
 Stopped: 18
Images: 353
Server Version: 1.10.3
Storage Driver: devicemapper
 Pool Name: fedora_home-docker--pool
 Pool Blocksize: 524.3 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file: 
 Metadata file: 
 Data Space Used: 15.39 GB
 Data Space Total: 107.2 GB
 Data Space Available: 91.76 GB
 Metadata Space Used: 8.049 MB
 Metadata Space Total: 541.1 MB
 Metadata Space Available: 533 MB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Library Version: 1.02.122 (2016-04-09)
Execution Driver: native-0.2
Logging Driver: journald
Plugins: 
 Volume: local
 Network: bridge null host
Kernel Version: 4.7.2-201.fc24.x86_64
Operating System: Fedora 24 (Workstation Edition)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 8
Total Memory: 31.35 GiB
Name: snowball
ID: KI4M:XPJ4:QYQK:OPUL:SMQW:KKGQ:J3QF:IJUJ:5N2N:YCLX:CBO5:KFFB
Registries: docker.io (secure)

Additional environment details (AWS, VirtualBox, physical, etc.):
selinux is Enforcing

After renaming the entrypoint script from /init to /buggy, everything works as expected.

Comment 1 Antonio Murdaca 2016-09-13 15:52:25 UTC
Dan (Walsh) - this could be caused by oci-systemd-hook which is probably thinking that /init is systemd and ends up mounting /run as a tmpfs.

To the op: if you remove oci-systemd-hook with "rm -rf /usr/libexec/oci/hooks.d/oci-systemd-hook" does the problem persist?

Comment 2 Antonio Murdaca 2016-09-13 16:16:15 UTC
Confirmed, this is a bug with oci-systemd-hook - by uninstalling oci-systemd-hook or rm'ing the oci-systemd-hook binary under /usr/libexec/oci/hooks.d/, /run isn't mounted as a tmpfs.

Dan, what would you do here?

Comment 3 Antonio Murdaca 2016-09-13 16:25:38 UTC
I'm not really saying this is a bug though - it's probably the expected behavior with "oci-systemd-hook" and probably needs more documentation around the entrypoint name - just thinking out loud.

Comment 4 Daniel Walsh 2016-09-13 19:31:25 UTC
oci-systemd-hook looks for init or systemd as the entrypoint and makes the assumption that you are running systemd inside the container.  Is there something that is breaking or is this just you would prefer /run to not be on a tmpfs?
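
The rule described in this comment can be sketched roughly as follows. This is only an illustration of a name-based check, assuming the hook compares the program name of the container command against "init" and "systemd"; `looks_like_systemd` is a hypothetical helper, not the hook's actual code.

```shell
# Sketch only: mirrors the rule described in this comment.
# looks_like_systemd is a hypothetical helper name.
looks_like_systemd() {
    # Strip everything up to the last slash to get the program name.
    name=${1##*/}
    case "$name" in
        init|systemd) return 0 ;;
        *) return 1 ;;
    esac
}

for cmd in /init /usr/sbin/init /buggy mount; do
    if looks_like_systemd "$cmd"; then
        echo "$cmd: would be treated as systemd"
    else
        echo "$cmd: left alone"
    fi
done
```

Under this assumption, any script that happens to be called init is indistinguishable from a real systemd, which matches the reporter's observation that renaming /init to /buggy avoids the tmpfs.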

Comment 5 Ulf Seltmann 2016-09-14 08:53:38 UTC
(In reply to Daniel Walsh from comment #4)
> oci-systemd-hook looks for init or systemd as the entrypoint and makes the
> assumption that you are running systemd inside the container.  Is there
> something that is breaking or is this just you would prefer /run to not be
> on a tmpfs?

First of all, it's inconsistent across different host OSes, since at least Ubuntu does not mount /run as a tmpfs.

Second, it breaks volume mapping, e.g. of /var/run/mysqld for sharing mysqld's socket, because the systemd mount overlays it, and in Debian /var/run is a symlink to /run.

Not to mention that installing applications like apache2 creates directories under /var/run. Needless to say, these directories are missing as well if /run is mounted as a tmpfs on container startup.
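
The symlink relationship mentioned above can be reproduced without a container. The sketch below builds a throwaway directory that imitates the Debian layout; it uses a relative symlink (`../run`) so that it stays inside the temporary directory, whereas the real /var/run link in an image may be absolute.

```shell
# Build a scratch tree where var/run is a symlink to run, as in Debian.
root=$(mktemp -d)
mkdir -p "$root/run/mysqld" "$root/var"
ln -s ../run "$root/var/run"

# A socket file created through var/run/mysqld ...
touch "$root/var/run/mysqld/mysqld.sock"

# ... is actually stored under run/mysqld.
seen=$(ls "$root/run/mysqld")
echo "run/mysqld contains: $seen"

rm -rf "$root"
```

Because both paths lead to the same directory, a tmpfs mounted over /run also hides whatever the image shipped under /var/run.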

Comment 6 Daniel Walsh 2016-09-16 12:41:50 UTC
But it is consistent with the host you are running it on.  The code should handle the fact that a /run is on any kind of file system.

The volume mapping breaking is not supposed to happen. There is a patch that is supposed to copy the underlying file system on top of the tmpfs in runc.

That should be implemented in oci-register-machine. Anything that is in /run on the host image should show up on the tmpfs in the image.

If you were running systemd as pid 1, it would have populated the /run tmpfs using systemd-tmpfiles, just like it does in Fedora/CentOS/RHEL. The problem here is that you were using a non-systemd init program and confused the tool.

Bottom line: we should fix the mounting of the tmpfs so that it has the underlying content.

Comment 7 Ulf Seltmann 2016-09-19 08:55:35 UTC
(In reply to Daniel Walsh from comment #6)
> But it is consistent with the host you are running it on.  The code should
> handle the fact that a /run is on any kind of file system.
How is it consistent when a Debian image behaves differently depending on the underlying host system? Isn't Docker supposed to abstract host differences and handle all images the same regardless of where they are running?
> 
> The volume mapping breaking is not supposed to happen.  There is a patch
> that is supposed to copy the underlying file system on top of the tmpfs in
> runc.
By "copy the underlying file system on top", do you mean literally copying the files and folder structure over into the /run folder? Because that is not a satisfying solution (and it is not working at all anyway). The big advantage of bind mounts is that you can modify the content bidirectionally. Copying the content obviously creates a copy, and all changes are made to the copy.

Comment 8 Daniel Walsh 2016-09-19 11:44:46 UTC
Bind mounts should continue to work. If you volume-mount with -v /run/foobar, then this should get mounted on the tmpfs at /run/foobar.

There are always going to be differences between distributions, and versions of docker.  There is no guarantee that containers will work on every platform, although that is the goal.   I personally believe that most containers should run in --read-only mode with tmpfs mounted on /run and /tmp, to prevent a hacked version of an application from modifying the container image.  

The goal is for the Debian image not to run differently; if it does, then this is a bug. If there was a directory such as /run/mariadb or /run/httpd in the image and it does not show up on the /run tmpfs, then this is a bug.
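
For reference, the --read-only setup mentioned in this comment could look roughly like the sketch below. It assumes a Docker version that supports the `--tmpfs` option of `docker run`; `readonly_run_cmd` is a hypothetical helper and `some-image` is a placeholder, neither comes from this report. The function only prints the command so that it can be inspected before being run.

```shell
# Hypothetical helper: prints a docker command line that runs the given
# image with a read-only root file system and tmpfs mounts on /run and /tmp.
readonly_run_cmd() {
    printf '%s\n' "docker run --rm --read-only --tmpfs /run --tmpfs /tmp $1"
}

# "some-image" is a placeholder image name.
readonly_run_cmd some-image
```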

Comment 9 Ulf Seltmann 2016-09-19 13:21:05 UTC
OK. So if you need anything else from me beyond the description I provided in this issue, let me know.