Description of problem:

On the latest Fedora 26 AH, it seems really easy to get the docker daemon to exhaust all the system memory by repeated `docker cp` commands until it fails with:

    Untar error on re-exec cmd: fork/exec /proc/self/exe: cannot allocate memory

This makes it impractical in testing environments where we need to routinely spin up and provision throw-away containers.

Version-Release number of selected component (if applicable):

[root@jlebon ~]# rpm-ostree status
State: idle
Deployments:
* fedora-atomic:fedora/26/x86_64/atomic-host
      Version: 26.101 (2017-08-06 21:27:14)
       Commit: f6331bcd14577e0ee43db3ba5a44e0f63f74a86e3955604c20542df0b7ad8ad6
[root@jlebon ~]# rpm -q docker
docker-1.13.1-19.git27e468e.fc26.x86_64

How reproducible:

Always.

Steps to Reproduce:

[root@jlebon ~]# cat reproducer.sh
#!/bin/bash
set -xeuo pipefail
dd if=/dev/urandom of=myfile count=500000
docker run --detach --name cnt registry.fedoraproject.org/fedora:26 sleep infinity
for i in $(seq 5); do
    docker cp myfile cnt:/var/tmp
    docker cp cnt:/var/tmp/myfile .
done
[root@jlebon ~]# sh reproducer.sh
+ dd if=/dev/urandom of=myfile count=500000
500000+0 records in
500000+0 records out
256000000 bytes (256 MB, 244 MiB) copied, 1.81003 s, 141 MB/s
+ docker run --detach --name cnt registry.fedoraproject.org/fedora:26 sleep infinity
5c2203de1f075d2198194dc4cf2d7a3e891e2973f997ae19c9fb695e0922025f
++ seq 5
+ for i in $(seq 5)
+ docker cp myfile cnt:/var/tmp
Error response from daemon: Untar error on re-exec cmd: fork/exec /proc/self/exe: cannot allocate memory
[root@jlebon ~]#

Actual results:

Crashes

Expected results:

Doesn't crash

Additional info:

Looking at the memory usage of the docker service (just a simple `watch -n 1 systemctl status docker`), it's as if it's mapping the whole file into memory and not releasing it.
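For anyone re-running the reproducer, here is a minimal sketch of sampling the daemon's resident memory from /proc instead of eyeballing `systemctl status`. The function names `rss_kb` and `watch_docker_rss` are made up for illustration, and the sketch assumes the service is named `docker`, as in the `systemctl status docker` call above.

```shell
#!/bin/bash
# Print the resident set size (VmRSS, in kB) of a process, read from /proc.
rss_kb() {
    local pid=$1
    awk '/^VmRSS:/ { print $2 }' "/proc/${pid}/status"
}

# Sample the docker daemon's RSS once per second until interrupted.
# Assumption: the unit is called "docker" and has a single main PID.
watch_docker_rss() {
    local pid
    pid=$(systemctl show -p MainPID docker)   # prints "MainPID=<n>"
    pid=${pid#MainPID=}
    while sleep 1; do
        printf '%s dockerd RSS: %s kB\n' "$(date +%T)" "$(rss_kb "$pid")"
    done
}
```

Running `watch_docker_rss` in a second terminal while `reproducer.sh` executes should show whether the figure climbs with each `docker cp`. This only covers the main daemon process, so it may not match the cgroup-wide number that `systemctl status` reports.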
Also reproduced on the latest Fedora 26 AH release:

[root@jlebon ~]# rpm-ostree status
State: idle
Deployments:
* fedora-atomic:fedora/26/x86_64/atomic-host
      Version: 26.110 (2017-08-20 18:10:09)
       Commit: 13ed0f241c9945fd5253689ccd081b5478e5841a71909020e719437bbeb74424

  fedora-atomic:fedora/26/x86_64/atomic-host
      Version: 26.101 (2017-08-06 21:27:14)
       Commit: f6331bcd14577e0ee43db3ba5a44e0f63f74a86e3955604c20542df0b7ad8ad6
I hit this specifically when trying to move PAPR[1] to Fedora 26 AH. We often spin up e.g. 8 containers at a time there where we need to `docker cp` whole git repositories simultaneously. Though as the reproducer shows, it doesn't even have to be simultaneous.

I forgot to add the docker version included in v26.110 in my previous comment:

# rpm -q docker
docker-1.13.1-21.git27e468e.fc26.x86_64

[1] https://github.com/projectatomic/papr
Same bug on RHEL side: https://bugzilla.redhat.com/show_bug.cgi?id=1489517

This keeps us from updating Cockpit's OpenShift test VM to something newer than Fedora 25 (which is EOL).
AFAICT, this is no longer an issue in the latest Fedora 27 AH release at least.

# rpm-ostree status
State: idle
Deployments:
* fedora-atomic:fedora/27/x86_64/atomic-host
      Version: 27.81 (2018-02-12 17:50:48)
       Commit: b25bde0109441817f912ece57ca1fc39efc60e6cef4a7a23ad9de51b1f36b742
 GPGSignature: Valid signature by 860E19B0AFA800A1751881A6F55E7430F5282EE4

Martin, might be worth checking if that is the case for you as well. Then we can probably close this bug.
@Jonathan: This is still easily reproducible on current Fedora 27:

# docker create --name foo docker.io/openshift/origin:v3.7.1
# docker cp foo:/usr/bin/ /tmp/oc-bin
[  186.781101] dockerd-current invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=-999
Thanks Martin. Reproduced it as well here. My original reproducer no longer triggers a crash for some reason.
(In reply to Jonathan Lebon from comment #6)
> Thanks Martin. Reproduced it as well here. My original reproducer no longer
> triggers a crash for some reason.

Jonathan, could you please re-test this with the rhel-push-plugin removed? You can remove it by deleting the line:

    --authorization-plugin=rhel-push-plugin

from docker.service (systemctl edit --full docker.service). Restart the daemon and re-check. Thanks.

This should be related to https://github.com/openshift/origin/issues/18952 as well.
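To preview that change before editing the unit by hand, something like the following could be used. This is only a sketch: `strip_authz_plugin` is a made-up helper name, and it assumes the flag sits on its own continuation line of ExecStart, which is what "removing the line" suggests.

```shell
#!/bin/bash
# Print a copy of a unit file with the rhel-push-plugin authorization
# flag removed. Assumption: the flag is on a line of its own, so deleting
# the whole line leaves the rest of ExecStart intact.
strip_authz_plugin() {
    sed '/--authorization-plugin=rhel-push-plugin/d' "$1"
}
```

The output can be diffed against the installed unit file and then pasted into `systemctl edit --full docker.service`. If the flag happens to be the last line of ExecStart, the preceding line would be left with a dangling backslash, so the result is worth checking before restarting the daemon.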
I can confirm that I cannot reproduce the issue when removing the --authorization-plugin flag.
[root@jlebon-tmp2 ~]# rpm-ostree status
State: idle; auto updates disabled
Deployments:
* ostree://fedora-atomic:fedora/27/x86_64/atomic-host
      Version: 27.93 (2018-02-25 20:49:19)
       Commit: da0bd968610aa1e29c5bb37065649407fbbfffa53e63831afdadbd34a3b05327
 GPGSignature: Valid signature by 860E19B0AFA800A1751881A6F55E7430F5282EE4
[root@jlebon-tmp2 ~]# rpm -q docker
docker-1.13.1-44.git584d391.fc27.x86_64
This message is a reminder that Fedora 26 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 26. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '26'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 26 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
My test case no longer works on Fedora 28, since `docker cp` now seems to be broken in a different way:

# docker cp foo:/usr/bin /tmp/oc-bin
invalid symlink "/tmp/oc-bin/bin/Mail" -> "../../bin/mailx"

Same result with -a or -L. So I won't bump this for now.
Fedora 26 changed to end-of-life (EOL) status on 2018-05-29. Fedora 26 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.