Bug 1127006 - Docker does not honor rlimit settings in systemd or on the system.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: docker
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: John Keck
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 1113141 1200876
Reported: 2014-08-05 20:46 UTC by Eric Rich
Modified: 2019-03-06 02:09 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-04-20 13:33:25 UTC



Description Eric Rich 2014-08-05 20:46:17 UTC
Description of problem:

Altering the rlimits (nproc limit) in the systemd unit file or on the system (/etc/security/limits{.conf,.d/20-nproc.conf}) has no effect on Docker or its containers.

Version-Release number of selected component (if applicable): RHEL 7

How reproducible: VERY

Steps to Reproduce:

$ cat /usr/lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.io
After=network.target

[Service]
Type=notify
EnvironmentFile=-/etc/sysconfig/docker
ExecStart=/usr/bin/docker -d --selinux-enabled
Restart=on-failure
LimitNOFILE=1048576
#LimitNPROC=1048576
LimitNPROC=5

[Install]
WantedBy=multi-user.target

$ cat /etc/security/limits.d/20-nproc.conf | grep -v "#"
docker  soft    nproc 5
*          soft    nproc     4096
root       soft    nproc     unlimited

$ cat /etc/security/limits.conf | grep -v "#"
docker          hard    nproc            5 

With this set I restart the docker process (as root) and then become a developer user (who is part of the docker group) and run the following: 

~~~
$ docker run -i -t --rm rhel7 bash                                                                                                                                                           
bash-4.2# ulimit -u
5
bash-4.2# exit
exit
^C

~~~

This shows me that the container, once started, should be limited to 5 processes. However, in my earlier tests, where I ran similar operations:

~~~
$ docker run -i -t --rm rhel7 /bin/bash 
bash-4.2# ulimit -u
1048576
bash-4.2# ulimit -u 5
bash-4.2# ulimit -u
5
bash-4.2# for i in {0..5}; do python -m SimpleHTTPServer 808$i & done
[1] 6
[2] 7
[3] 8
[4] 9
[5] 10
[6] 11
bash-4.2# Serving HTTP on 0.0.0.0 port 8085 ...
Serving HTTP on 0.0.0.0 port 8084 ...
Serving HTTP on 0.0.0.0 port 8081 ...
Serving HTTP on 0.0.0.0 port 8080 ...
Serving HTTP on 0.0.0.0 port 8082 ...
Serving HTTP on 0.0.0.0 port 8083 ...
~~~

This limit did not seem to be enforced! That is, neither a limit defined by the host nor a limit set inside the container is honored. This can be seen with the commands above as well as with the following:

$ for containers in {0..6}; do docker run -d rhel7 python -m SimpleHTTPServer 8080 ; done
f46403b0e06ec25bb6b4cb690522165b669b53b9a1a7c58f90c078d8fd71bee3
de87bdbf967f6611f42b18abdba7a484dd2908f1b4f066f44a221a033bf532f5
c05ac37b4d3456459919a9a151ecf1509f11c025197a53b152884c26b422bb2c
b2b4020d3cfcc84f017a9bccb3b42a31cbc991a46bdcda9073a66ebf8bd1b404
89dc5ee2f3d93b662e34aa5a21a078bb60fef1e0c9f1075ab68c78aef3616f98
b05bb89ff6216d08c5ef024a47b3ff99460fa490da6703ef1e7a23d19c77d381
d938e4532663c5fc594902d81e35a0c3159328adfc96f343625e0e33bda75333

$ docker ps
CONTAINER ID        IMAGE                                COMMAND                CREATED             STATUS              PORTS               NAMES
d938e4532663        registry.access.redhat.com/rhel7:0   python -m SimpleHTTP   6 seconds ago       Up 4 seconds                            trusting_colden6          
b05bb89ff621        registry.access.redhat.com/rhel7:0   python -m SimpleHTTP   7 seconds ago       Up 6 seconds                            stupefied_brattain3       
89dc5ee2f3d9        registry.access.redhat.com/rhel7:0   python -m SimpleHTTP   8 seconds ago       Up 7 seconds                            cocky_sinoussi1           
b2b4020d3cfc        registry.access.redhat.com/rhel7:0   python -m SimpleHTTP   9 seconds ago       Up 8 seconds                            condescending_rosalind7   
c05ac37b4d34        registry.access.redhat.com/rhel7:0   python -m SimpleHTTP   10 seconds ago      Up 9 seconds                            hopeful_sinoussi2         
de87bdbf967f        registry.access.redhat.com/rhel7:0   python -m SimpleHTTP   11 seconds ago      Up 10 seconds                           hopeful_sammet5           
f46403b0e06e        registry.access.redhat.com/rhel7:0   python -m SimpleHTTP   12 seconds ago      Up 11 seconds                           ecstatic_sinoussi1  
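
To separate what the unit file requests from what a running process actually received, the effective limits can be read from /proc/PID/limits. The following is a minimal sketch and not part of the original report: the helper name `nproc_limit_of` is hypothetical, and the example call inspects the current shell so that it runs on any Linux host.

```shell
#!/bin/bash
# Hypothetical helper: print the soft and hard "Max processes" limits
# (RLIMIT_NPROC) of a running process.
# Usage: nproc_limit_of PID
nproc_limit_of() {
    local pid="$1"
    # Columns in /proc/PID/limits: limit name (two words here), soft, hard, units.
    awk '/^Max processes/ { print "soft=" $3, "hard=" $4 }' "/proc/${pid}/limits"
}

# Example: inspect the current shell.
nproc_limit_of "$$"
```

To inspect the docker daemon instead, pass its PID, for example the value reported by `systemctl show -p MainPID docker.service`.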

Actual results:

Neither setting is honored.

Expected results:

rlimits set by systemd or the system should keep the docker process from creating more than 5 containers.

rlimits set in the container (inherited from systemd) should keep the container from creating more than 5 processes.

Additional info:

A question that remains unclear at this point: how can the host system limit resources both at the host layer and within each individual container, to provide effective system (container) management?

Comment 2 Trevor Jay 2014-08-06 13:56:03 UTC
Just to reiterate: Docker containers are not a security mechanism and Red Hat does not support their use for multitenancy.

  * Docker security with SELinux - http://opensource.com/business/14/7/docker-security-selinux
  * NEW DOCKERCON VIDEO: DOCKER SECURITY (RENAMED FROM DOCKER AND SELINUX) - http://blog.docker.com/2014/07/new-dockercon-video-docker-security-renamed-from-docker-and-selinux/

That said...

Again, it isn't a multitenant solution, but you *can* limit non-root users within Docker containers to a certain number of processes. The problem with this report is that it only ever uses the root user, and root users inside and outside containers are immune to these limits. However, if you edit the LimitNPROC line of /usr/lib/systemd/system/docker.service (as was done) and then run both:

    systemctl daemon-reload

and 

    service docker restart

then any non-root users within the container *will* be limited by the LimitNPROC limit.

An example:

    sudo sed -i.bak 's/LimitNPROC=.*/LimitNPROC=21/g' /usr/lib/systemd/system/docker.service
    sudo systemctl daemon-reload
    sudo service docker restart
    docker run -i -t rhel /bin/bash -c 'useradd user; su user -c "sleep 30 & slp=\$!; for i in {10..70}; do sleep 180 & done; wait \$slp"'

Results in many:

    [...]
    bash: fork: retry: Resource temporarily unavailable
    [...]

errors, just as you would hope. I should also point out that if a non-root (non-privileged) container user uses ulimit -u to reduce their process limit, they can't raise it again later, as they lack the kernel capability needed to do so (CAP_SYS_RESOURCE). Thus having a script call ulimit -u N before getting down to real business (instead of changing LimitNPROC) will also work.

Again, there are no protections from malicious code but rlimits do work as currently intended by the design of Docker.
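
The "ulimit -u N before real business" approach from this comment could be wrapped as shown below. This is a sketch under assumptions, not a tested recipe from the bug: the helper name `run_with_nproc_cap` is hypothetical, and as noted in this comment the cap only constrains non-root users.

```shell
#!/bin/bash
# Hypothetical helper: run a command with RLIMIT_NPROC capped at N.
# In bash, "ulimit -u N" sets both the soft and the hard limit, so an
# unprivileged process cannot raise it again afterwards.
# Usage: run_with_nproc_cap N COMMAND [ARGS...]
run_with_nproc_cap() {
    local cap="$1"
    shift
    # bash is invoked explicitly because "ulimit -u" is not available in every sh.
    # Inside the child, $0 holds the cap and "$@" holds the command to exec.
    bash -c 'ulimit -u "$0" && exec "$@"' "$cap" "$@"
}

# Example (not executed here): the child reports the cap it was started with.
# run_with_nproc_cap 21 bash -c 'ulimit -u'
```

Because the limit is applied in a child bash that then execs the command, the calling shell keeps its own limits.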

Comment 3 Eric Rich 2014-08-06 21:33:10 UTC
(In reply to Trevor Jay from comment #2)

I understand what you're getting at; however, this works because the container inherits the nproc limit from the docker process (set by systemd).

If I set the following: 

$ grep nobody /etc/security/limits{.conf,.d/20-nproc.conf}
/etc/security/limits.conf:nobody                hard    nproc            5 
/etc/security/limits.d/20-nproc.conf:nobody     hard    nproc            5

Then run: 

$ docker run -t -i -u nobody --rm rhel7 /bin/bash 
bash-4.2$ ulimit -u
1048576
bash-4.2$ whoami
nobody
bash-4.2$ exit
 
You can see that the nobody user's limits are not used in the context of the container. 

Is it only possible for a container to inherit the limits that the docker process has?
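
For context, /etc/security/limits.conf and limits.d are applied by pam_limits when a PAM session is opened on the host, and `docker run -u` presumably does not open one, which would be consistent with the output above. To see which host entries would apply to a user at login, a field-aware filter is more precise than `grep -v "#"`. A minimal sketch follows; the helper name `nproc_entries` is hypothetical, and `@group` entries are deliberately ignored.

```shell
#!/bin/bash
# Hypothetical helper: list the nproc entries in limits files that name
# a given user or the "*" wildcard, skipping comments and blank lines.
# Usage: nproc_entries USER FILE...
nproc_entries() {
    local user="$1"
    shift
    awk -v user="$user" '
        NF == 0 || $1 ~ /^#/ { next }    # skip blank lines and comments
        ($1 == user || $1 == "*") && $3 == "nproc" {
            print FILENAME ": " $1, $2, $3, $4
        }
    ' "$@"
}

# Example (host paths from this report, not executed here):
# nproc_entries nobody /etc/security/limits.conf /etc/security/limits.d/20-nproc.conf
```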

Comment 4 Eric Rich 2014-08-12 21:29:19 UTC
It seems my questions are really related to the following upstream items. 

https://github.com/docker/docker/pull/4469 
https://github.com/docker/docker/issues/5305


A good question raised by this bug and these upstream trackers seems to be:

How do you set limits for 100 containers (generically) but then also limit each container to 5 processes?
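
As a pointer for readers on newer Docker releases than the one in this report: `docker run` later gained `--ulimit` and `--pids-limit` options. The sketch below assumes such a release and only prints a command line for review; the helper name `container_limit_args` is hypothetical. Note that RLIMIT_NPROC is counted per user ID rather than per container, so `--pids-limit` is the option that caps each container individually.

```shell
#!/bin/bash
# Hypothetical helper: build per-container process-limit options.
# Assumes a Docker release that supports --ulimit and --pids-limit.
# Usage: container_limit_args MAX_PROCS
container_limit_args() {
    local max_procs="$1"
    # --ulimit nproc=SOFT:HARD sets RLIMIT_NPROC for the container's processes;
    # --pids-limit caps the number of tasks in the container's cgroup.
    printf -- '--ulimit nproc=%s:%s --pids-limit %s\n' \
        "$max_procs" "$max_procs" "$max_procs"
}

# Print the full command instead of executing it.
echo "docker run -d $(container_limit_args 5) rhel7 python -m SimpleHTTPServer 8080"
```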

Comment 6 Daniel Walsh 2014-09-12 17:09:08 UTC
Fixed in docker-1.2

Comment 7 Trevor Jay 2014-10-14 01:36:55 UTC
If dwalsh's fix doesn't give you the explicit behavior you need, there's also another workaround if you fall back to LXC. See:

https://groups.google.com/d/msg/docker-user/UF0GxTp3NHI/EehyMEzLqHcJ

