When using entrypoint=/init (from the Dockerfile), /run is mounted by default as a tmpfs, which interferes with custom mounts such as /var/run/mysqld, which points to /run/mysqld. This does not happen if one overrides the entrypoint. See below for output.

Steps to reproduce the issue:
1. git clone --single-branch -b run-creation-bug ssh://git/useltmann/docker-dev-dotdeb.git run-creation-bug
2. cd run-creation-bug; docker build -t run-creation-bug .
3. docker run --rm -ti run-creation-bug mount | grep run

/dev/mapper/fedora-root on /run/secrets type ext4 (rw,relatime,seclabel,data=ordered)
tmpfs on /run type tmpfs (rw,nosuid,nodev,noexec,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c232,c383",size=65536k,mode=755)

Describe the results you received:
One can see there is a tmpfs mounted on /run.

docker run --rm -ti --entrypoint=mount run-creation-bug | grep run
/dev/mapper/fedora-root on /run/secrets type ext4 (rw,relatime,seclabel,data=ordered)

And here there is no tmpfs mounted on /run, which I personally prefer as the expected behaviour.

Output of docker version:
Client:
 Version: 1.10.3
 API version: 1.22
 Package version: docker-1.10.3-50.gita612434.fc24.x86_64
 Go version: go1.6.3
 Git commit: a612434/1.10.3
 Built:
 OS/Arch: linux/amd64

Server:
 Version: 1.10.3
 API version: 1.22
 Package version: docker-1.10.3-50.gita612434.fc24.x86_64
 Go version: go1.6.3
 Git commit: a612434/1.10.3
 Built:
 OS/Arch: linux/amd64

Output of docker info:
Containers: 20
 Running: 2
 Paused: 0
 Stopped: 18
Images: 353
Server Version: 1.10.3
Storage Driver: devicemapper
 Pool Name: fedora_home-docker--pool
 Pool Blocksize: 524.3 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file:
 Metadata file:
 Data Space Used: 15.39 GB
 Data Space Total: 107.2 GB
 Data Space Available: 91.76 GB
 Metadata Space Used: 8.049 MB
 Metadata Space Total: 541.1 MB
 Metadata Space Available: 533 MB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Library Version: 1.02.122 (2016-04-09)
Execution Driver: native-0.2
Logging Driver: journald
Plugins:
 Volume: local
 Network: bridge null host
Kernel Version: 4.7.2-201.fc24.x86_64
Operating System: Fedora 24 (Workstation Edition)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 8
Total Memory: 31.35 GiB
Name: snowball
ID: KI4M:XPJ4:QYQK:OPUL:SMQW:KKGQ:J3QF:IJUJ:5N2N:YCLX:CBO5:KFFB
Registries: docker.io (secure)

Additional environment details (AWS, VirtualBox, physical, etc.):
selinux is Enforcing

After renaming the entrypoint script from /init to /buggy, everything works as expected.
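Because `grep run` also matches /run/secrets, it may help to test for a tmpfs mounted exactly on /run. The following is only a sketch: the helper name run_is_tmpfs is hypothetical, and it assumes the usual "DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)" layout of mount output shown in the report.

```shell
# Hypothetical helper: reads `mount` output on stdin and succeeds only if
# a tmpfs is mounted exactly on /run (not on /run/secrets or similar).
run_is_tmpfs() {
    awk '$2 == "on" && $3 == "/run" && $5 == "tmpfs" { found = 1 } END { exit !found }'
}

# Sample lines copied from the report.
if run_is_tmpfs <<'EOF'
/dev/mapper/fedora-root on /run/secrets type ext4 (rw,relatime,seclabel,data=ordered)
tmpfs on /run type tmpfs (rw,nosuid,nodev,noexec,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c232,c383",size=65536k,mode=755)
EOF
then
    echo "tmpfs is mounted on /run"
else
    echo "no tmpfs on /run"
fi
```

Against a real container, the same helper could be fed from `docker run --rm --entrypoint=mount run-creation-bug | run_is_tmpfs`.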
Dan (Walsh) - this could be caused by oci-systemd-hook which is probably thinking that /init is systemd and ends up mounting /run as a tmpfs. To the op: if you remove oci-systemd-hook with "rm -rf /usr/libexec/oci/hooks.d/oci-systemd-hook" does the problem persist?
Confirmed, this is a bug with oci-systemd-hook - by uninstalling oci-systemd-hook or rm'ing the oci-systemd-hook binary under /usr/libexec/oci/hooks.d/, /run isn't mounted as a tmpfs. Dan, what would you do here?
I'm not really saying this is a bug though - it's probably the expected behavior with "oci-systemd-hook" and probably needs more documentation around the entrypoint name - just thinking out loud.
oci-systemd-hook looks for init or systemd as the entrypoint and assumes that you are running systemd inside the container. Is something actually breaking, or is it just that you would prefer /run not to be on a tmpfs?
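The check described above can be pictured as a comparison on the entrypoint's name. This is a minimal sketch only, not the hook's actual code: the function name looks_like_systemd is hypothetical, and it assumes the match is made on the final path component, which would be consistent with /init triggering the mount and /buggy not.

```shell
# Hypothetical illustration of the heuristic; not taken from oci-systemd-hook.
looks_like_systemd() {
    # Strip everything up to the last slash and compare the remaining name.
    case "${1##*/}" in
        init|systemd) return 0 ;;
        *) return 1 ;;
    esac
}

for ep in /init /usr/lib/systemd/systemd /buggy mount; do
    if looks_like_systemd "$ep"; then
        echo "$ep: assumed to be systemd, tmpfs would be mounted on /run"
    else
        echo "$ep: left alone"
    fi
done
```

Under that assumption, any script that happens to be called init is treated the same way as a real systemd binary.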
(In reply to Daniel Walsh from comment #4)
> oci-systemd-hook looks for init or systemd as the entrypoint and makes the
> assumption that you are running systemd inside the container. Is there
> something that is breaking or is this just you would prefer /run to not be
> on a tmpfs?

First of all, it is inconsistent among different host OSes, since at least Ubuntu does not mount /run as tmpfs.

Second, it breaks volume mapping, e.g. /var/run/mysqld for sharing mysqld's socket, because the systemd mount overlays it and in Debian /var/run is a symlink to /run. Not to mention that installing apps like apache2 creates directories under /var/run. Needless to say, these directories are missing as well if systemd mounts /run as tmpfs on container startup.
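The symlink point can be reproduced without Docker. The sketch below builds a throwaway directory tree (the variable name root and the relative symlink are illustrative choices, not from the report) and shows that anything created under var/run actually lives in run, which is why a tmpfs mounted over /run also hides /var/run/mysqld.

```shell
# Illustrative layout in a temporary directory; no mounts are performed.
root=$(mktemp -d)
mkdir -p "$root/run" "$root/var"

# Debian ships /var/run as a symlink to /run; a relative link models that here.
ln -s ../run "$root/var/run"

# What a package install or a volume target under /var/run would create.
mkdir "$root/var/run/mysqld"

# The directory is really stored under run/, so covering run/ covers it too.
ls "$root/run"
```

The temporary tree can be removed afterwards with `rm -rf "$root"`.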
But it is consistent with the host you are running it on. The code should handle the fact that /run can be on any kind of file system.

Breaking the volume mapping is not supposed to happen. There is a patch in runc that is supposed to copy the underlying file system on top of the tmpfs. That should be implemented in oci-register-machine. Anything that is in /run on the host image should show up on the tmpfs in the image.

If you were running systemd as pid 1, it would have populated the /run tmpfs using systemd-tmpfiles, just like it does in Fedora/CentOS/RHEL. The problem here is that you were using a non-systemd init program and confused the tool.

Bottom line: we should fix the mounting of the tmpfs so that it has the underlying content.
(In reply to Daniel Walsh from comment #6)
> But it is consistent with the host you are running it on. The code should
> handle the fact that a /run is on any kind of file system.

How is it consistent when a Debian image behaves differently depending on the underlying host system? Isn't Docker supposed to abstract host differences and handle all images the same regardless of where they are running?

>
> THe breaks the volume mapping is not supposed to happen. There is a patch
> that is supposed to copy the underlying file system ontop of the tmpfs in
> runc.

By "copy the underlying filesystem ontop", do you mean literally copying the files and folder structure over into the /run folder? Because this is not a satisfying solution (and it is not working at all). The big advantage of bind mounts is that you can modify the content bidirectionally. Copying the content creates, obviously, a copy, and all changes are made to the copy.
Bind mounts should continue to work. If you volume mount with -v /run/foobar, then this should get mounted on the tmpfs at /run/foobar.

There are always going to be differences between distributions and versions of docker. There is no guarantee that containers will work on every platform, although that is the goal. I personally believe that most containers should run in --read-only mode with tmpfs mounted on /run and /tmp, to prevent a hacked version of an application from modifying the container image.

The goal is for the Debian image not to run differently; if it does, then this is a bug. If there was a directory such as /run/mariadb or /run/httpd in the image and it does not show up on the /run tmpfs, then this is a bug.
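The --read-only setup mentioned above might be invoked roughly as follows. This is a sketch under assumptions: the wrapper name run_locked_down is hypothetical, and it relies on the --read-only and --tmpfs options of docker run, which are available in the Docker 1.10 series used in this report.

```shell
# Hypothetical wrapper: read-only root filesystem with tmpfs on /run and /tmp.
# Extra arguments (for example -v options, then the image name) are passed through.
run_locked_down() {
    docker run --rm -ti --read-only --tmpfs /run --tmpfs /tmp "$@"
}

# Example invocation, using the image built in the steps to reproduce:
#   run_locked_down run-creation-bug
```

Whether a -v mount below /run then lands on top of the tmpfs as intended is exactly the behaviour under discussion here, so it would need to be checked on the affected host.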
OK. So if you need anything else from me beyond the description I provided in this issue, let me know.