Description of problem:

After installing the moby-engine package on a fresh Fedora 30 machine, some of my Docker images (`debian:jessie` based) could no longer be built. The `apt-get update` command was hanging. I first suspected a network issue, but I was able to ping the outside world. After `strace`ing the process (as explained in [1]), I saw an error similar to the one in [2], which was solved by setting a limit on the number of open files.

Would you be interested in adding the following to the default daemon `OPTIONS` [3]?

```
--default-ulimit nofile=1024:1024
```

(cf. the docker CLI reference for the daemon [4] and for containers [5])

[1] https://stackoverflow.com/questions/20980303/docker-build-freezes-installing-packages-from-apt
[2] https://bugs.launchpad.net/ubuntu/+source/apt/+bug/1332440
[3] https://src.fedoraproject.org/rpms/moby-engine/blob/master/f/docker.sysconfig
[4] https://docs.docker.com/engine/reference/commandline/dockerd/#default-ulimit-settings
[5] https://docs.docker.com/engine/reference/commandline/run/#set-ulimits-in-container---ulimit

Kernel: 5.0.17-300.fc30.x86_64

Installed `moby-engine` package details:

Name         : moby-engine
Version      : 18.06.3
Release      : 2.ce.gitd7080c1.fc30
Architecture : x86_64
Size         : 226 M
Source       : moby-engine-18.06.3-2.ce.gitd7080c1.fc30.src.rpm
Repository   : @System
From repo    : fedora

Steps to Reproduce:
1. install the `moby-engine` package: `$ sudo dnf install -y moby-engine`
2. add yourself to the docker group: `$ sudo usermod -a -G docker <your_username>`
3. reboot (or log out and log in again)
4. pull a `debian:jessie` image: `$ docker pull debian:jessie`
5. run a container: `$ docker run --rm -it debian:jessie /bin/bash`
6. try to update the package list: `container$ apt-get update` <= hangs

Actual results: The `apt-get update` task run inside the container hangs (and consumes 100% of a CPU core).

Expected results: The task should complete normally.

Additional info: When the ulimit is set, the task performs normally.
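For anyone who wants to check whether they are affected, here is a minimal sketch for inspecting the open-files limit. The helper names `current_nofile` and `nofile_is_excessive` and the 1048576 threshold are my own assumptions for illustration, not something shipped by moby-engine or Docker.

```shell
#!/bin/sh
# Print the soft open-files limit of the current shell.
current_nofile() {
    ulimit -n
}

# Succeed (exit status 0) when a limit is "unlimited" or above a threshold.
# The default threshold of 1048576 is an arbitrary value chosen for illustration.
nofile_is_excessive() {
    limit=$1
    threshold=${2:-1048576}
    [ "$limit" = "unlimited" ] && return 0
    [ "$limit" -gt "$threshold" ]
}

# To read the limit seen inside a container, one would run for example:
#   docker run --rm debian:jessie sh -c 'ulimit -n'
# and, to try the per-container workaround from reference [5]:
#   docker run --rm --ulimit nofile=1024:1024 debian:jessie sh -c 'ulimit -n'
```

Passing the value printed inside the container to `nofile_is_excessive` gives a quick yes/no answer. The per-container `--ulimit` flag should have the same effect as the daemon-wide `--default-ulimit`, but only for that one container.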
I'd like to open a PR but it seems like forking the repo is not an easy step (although I accepted the contributor agreement). Note that the issue does not appear with a `ubuntu:18.04` image.
+1. The default ulimit value (1073741816 on my Fedora 30) makes yum unusable in centos:7, and other programs, such as Erlang-based ones (e.g. rabbitmq-server), also have problems because of the high `ulimit -n` value.
FEDORA-2019-572b06a0f7 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-572b06a0f7
moby-engine-18.09.7-4.ce.git2d0083d.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-572b06a0f7
Bug 1723106 is related to this bug.
Thanks for the consideration.

The issue still persists on F30 with these changes when installing a package with yum inside a centos:7 container, e.g.:

root@container$ yum install -y gcc

One CPU core stays at 100% usage and the process never completes (as reported by Daniele in https://bugzilla.redhat.com/show_bug.cgi?id=1715254#c1 ).
Does anybody know the reason for the high limits? Is this a mistake? Podman also raises the limits compared to those outside the container, but not nearly as much.

To the maintainer: the priority of this issue should be quite high, as it is currently very hard to run a production-grade Docker infrastructure on Fedora.
Here's the systemd config for the docker.service from docker-ce on Fedora 30 [1]:

```
cat /usr/lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always

# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3

# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s

# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity

# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes

# kill only the docker process, not all processes in the cgroup
KillMode=process

[Install]
WantedBy=multi-user.target
```

As you can see, the ExecStart is quite simple:

`ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock`

I tried the above line instead of the one provided by `moby-engine`, but docker refuses to start when running `$ sudo systemctl start docker`.

Then I reverted the `ExecStart` line to the original one provided by `moby-engine`, and tried with the OPTIONS of this commit [2]:

```
OPTIONS="--selinux-enabled \
  --log-driver=journald \
  --live-restore \
  --default-ulimit nofile=1024:1024 \
  --init-path /usr/libexec/docker/docker-init \
  --userland-proxy-path /usr/libexec/docker/docker-proxy \
"
```

(except for the `live-restore` option, which was preventing me from running docker in swarm mode), and it worked :)

I was able to run a `yum install` inside a container:

```
me@host$ docker pull centos:7.6.1810
me@host$ docker run --rm -it centos:7.6.1810 yum install -y gcc
# ...
me@host$ echo $?
0
```

So far I had been missing the `--init-path` and `--userland-proxy-path` options. I think these two have solved the issue.

Thank you for the fix :D

--------------------
[1] https://github.com/docker/for-linux/issues/600#issuecomment-515918169
[2] https://src.fedoraproject.org/rpms/moby-engine/c/b73040075e618f039def4adb0476adaba24b68bd
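As a possible alternative to putting `--default-ulimit` in `OPTIONS`, dockerd also accepts a `default-ulimits` key in its JSON configuration file. The sketch below writes such a file; the function name `write_default_ulimits` and its arguments are hypothetical, and I have not tested this against the Fedora package.

```shell
#!/bin/sh
# Write a dockerd configuration fragment that sets the default nofile ulimit.
# $1: target file, $2: soft limit (default 1024), $3: hard limit (default: soft).
# Note: this overwrites the target file, so merge by hand if one already exists.
write_default_ulimits() {
    target=$1
    soft=${2:-1024}
    hard=${3:-$soft}
    cat > "$target" <<EOF
{
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Soft": $soft,
      "Hard": $hard
    }
  }
}
EOF
}
```

The usual location would be `/etc/docker/daemon.json`, e.g. `write_default_ulimits /etc/docker/daemon.json 1024 1024` as root, followed by a restart of the docker service. As far as I know, dockerd refuses to start when the same setting is given both as a flag and in the configuration file, so this should be used instead of the `--default-ulimit` entry in `OPTIONS`, not in addition to it.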
Actually, I had another issue with the `--userland-proxy-path` option. I wasn't able to start a container with a port binding on my host:

```
starting container failed: container 66453e7d9a481dbd6a0d6c75717e86bc2c71bcc770d14139adc89716d6094808: endpoint join on GW Network failed: driver failed programming external connectivity on endpoint gateway_88e5e7ca7cb5 (bb3095e9cbd0dd23623878ea1b235a3672d3f9501cb6115d46d0a7746807976b): fork/exec /usr/libexec/docker/docker-proxy: no such file or directory
```

With the following config of the docker service:

```
services:
  varnish:
    ports:
      - published: 8080
        target: 80
        protocol: tcp
        mode: host
```

the port 8080 binding on the host could not be created. Nothing on my machine was using that port yet (`sudo lsof -i:8080` returned nothing).

Removing the `--userland-proxy-path` option fixed this issue (and I'm still able to install yum packages in the CentOS-based container).
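Since the error above is a plain `no such file or directory` on the helper binary, a quick check of the paths passed to `--init-path` and `--userland-proxy-path` may help when debugging. This is only a sketch; `check_helpers` is a hypothetical helper name of mine.

```shell
#!/bin/sh
# Report every given path that is missing or not executable.
# Returns 0 when all paths are fine, 1 otherwise.
check_helpers() {
    missing=0
    for path in "$@"; do
        if [ ! -x "$path" ]; then
            echo "missing or not executable: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For the options discussed here it would be called as `check_helpers /usr/libexec/docker/docker-init /usr/libexec/docker/docker-proxy`.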
moby-engine-18.09.7-4.ce.git2d0083d.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.