Bug 1618902 - docker crashes while exporting a container of several GB
Summary: docker crashes while exporting a container of several GB
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: docker
Version: 28
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Daniel Walsh
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1619176
 
Reported: 2018-08-17 23:30 UTC by Jakub Filak
Modified: 2019-05-29 00:11 UTC
CC List: 14 users

Fixed In Version:
Clone Of:
Clones: 1619176 (view as bug list)
Environment:
Last Closed: 2019-05-29 00:11:39 UTC
Type: Bug
Embargoed:


Attachments
complete stacktrace (115.53 KB, text/plain)
2018-08-17 23:30 UTC, Jakub Filak

Description Jakub Filak 2018-08-17 23:30:22 UTC
Created attachment 1476727 [details]
complete stacktrace

Description of problem:
Unfortunately, our images have sometimes hundreds and always tens of Gigabytes.
When I try to export such an image I get a coredump of docker-current.

Version-Release number of selected component (if applicable):
docker-1.13.1-61.git9cb56fd.fc28.x86_64

How reproducible:
Always

Steps to Reproduce:
1. docker run -it --name big fedora /bin/bash
2. dd if=/dev/urandom of=/var/tmp/bigfile bs=4096 count=$((4*1024*1024)) status=progress
3. docker export big | dd of=/var/tmp/backup.tar.gz status=progress

Actual results:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
0+0 records in
0+0 records out
0 bytes copied, 107.724 s, 0.0 kB/s

Expected results:
16G bytes copied ...

Additional info:

Few lines from logs (the complete backtrace has ~1000 lines)

dockerd-current[3399]: fatal error: runtime: out of memory
dockerd-current[3399]: runtime stack:
dockerd-current[3399]: runtime.throw(0x18b563e, 0x16)
dockerd-current[3399]: runtime.sysMap(0xc7f2630000, 0x197ad0000, 0x0, 0x2511678)
dockerd-current[3399]: runtime.(*mheap).sysAlloc(0x24f6840, 0x197ad0000, 0x0)
dockerd-current[3399]: runtime.(*mheap).grow(0x24f6840, 0xcbd67, 0x0)

Comment 1 Daniel Walsh 2018-08-19 17:28:24 UTC
Any chance you could try this out with podman to see if it works there?

Comment 2 Jakub Filak 2018-08-20 09:16:37 UTC
Well, if you tell me how to get podman working in parallel with docker, then I have no problem running those 3 commands on my machine.

At the moment, I have several docker images but `podman images` returns nothing.

By the way, I am getting the same results on RHEL-7.5 : docker-1.13.1-63.git94f4240.el7.x86_64

Comment 3 Daniel Walsh 2018-08-20 12:27:30 UTC
Podman does not use the Docker database.  You would have to pull the images into podman.

Theoretically you can pull images directly out of the docker-daemon.

# docker images | grep centos
docker.io/centos                      7                   49f7960eb7e4        2 months ago        200 MB
docker.io/centos/ruby-22-centos7      latest              e42d0dccf073        2 months ago        566 MB
# podman pull docker-daemon:docker.io/centos:7
Getting image source signatures
Copying blob sha256:bcc97fbfc9e1a709f0eb78c1da59caeb65f43dc32cd5deeb12b8c1784e5b8237
 198.59 MB / 198.59 MB [====================================================] 1s
Copying config sha256:49f7960eb7e4cb46f1a02c1f8174c6fac07ebf1eb6d8deffbcb5c695f1c9edd5
 2.15 KB / 2.15 KB [========================================================] 0s
Writing manifest to image destination
Storing signatures
49f7960eb7e4cb46f1a02c1f8174c6fac07ebf1eb6d8deffbcb5c695f1c9edd5
# podman images | grep centos
docker.io/library/centos            7        49f7960eb7e4   2 months ago   208MB

Comment 4 Jakub Filak 2018-08-21 08:27:42 UTC
I installed podman, imported an image from docker-daemon, ran my test, and podman successfully exported the container.

How can I use podman to export a running docker container (on direct-lvm)?

Can I update the configuration in /etc/containers/ to use stuff created by docker?

How come I could create a 16GB file when podman uses overlay,
where image size should not exceed 10GB?

Comment 5 Daniel Walsh 2018-08-21 18:56:30 UTC
Sorry, podman cannot use docker containers, only import its images.  Overlay does not have size limits unless you turn on quota.  Devicemapper sets a default max size for its devices.

Would it be possible to convert your workload over from Docker to Podman?
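For reference, the devicemapper default size Daniel mentions can be raised with the dm.basesize storage option. A minimal sketch, assuming the daemon.json storage-opts key supported by this docker version; /tmp/daemon.json is used here only for illustration (the real file lives at /etc/docker/daemon.json), and the new size only applies to images created after the base device is grown and the daemon restarted:

```shell
# Hypothetical sketch: raise devicemapper's 10 GB default base device size.
# Written to a temp path here; on a real host this belongs in
# /etc/docker/daemon.json, followed by: systemctl restart docker
cat > /tmp/daemon.json <<'EOF'
{
  "storage-opts": ["dm.basesize=50G"]
}
EOF
cat /tmp/daemon.json
```

The equivalent one-off form is passing --storage-opt dm.basesize=50G on the dockerd command line.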

Comment 6 Jakub Filak 2018-08-21 21:32:20 UTC
Yes, but only if I can save the container which is running right now. The tools I am working with these days cannot be automated, and I spent a few hours configuring the container content (alright, the tools can be automated, but we don't know how).

Would it be possible to fix the crash first?

I need export because I have no disk space for commit.

I am happy to migrate from Docker to Podman because I hope that Podman is more verbose (i.e. reports progress for commit or export) and detects a full disk (I ran docker commit and it didn't stop after consuming the entire LVM).

Comment 7 Daniel Walsh 2018-08-22 10:48:23 UTC
Vivek is there a way to expand the default volume size in devmapper for Docker?

Comment 8 Daniel Walsh 2018-08-22 10:49:44 UTC
You might be able to do
docker export containerid | podman import -f -

I am a little shaky on the syntax.
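For what it's worth, a sketch of that pipe with the syntax from the podman-import man page: podman import reads the tarball from stdin when the source is '-', and takes an optional target image name rather than an -f flag (exact syntax may vary by podman version):

```shell
# Hypothetical: stream the container's filesystem straight into podman's
# image store, without an intermediate tarball on disk.
# 'big' is the container name from the reproduction steps above;
# 'big-backup' is an arbitrary name for the resulting image.
docker export big | podman import - big-backup
```

This still depends on 'docker export' itself not crashing, of course.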

Comment 9 Vivek Goyal 2018-08-22 12:54:38 UTC
The docker backtrace suggests out of memory. How much memory does this machine have? I suspect that if you make more memory available, it will work.

The question of optimizing docker will remain, though. I don't know if it is due to docker consuming too much memory or too little memory being available on the box.

Comment 10 Jakub Filak 2018-08-22 13:30:22 UTC
> Vivek is there a way to expand the default volume size in devmapper for Docker?

I could add more space by adding a new volume group and expanding the docker pool lvm, but I have no physical device I can attach to that machine ...

> docker export containerid | podman import -f -

The problem is that "docker export" does not work - that's the subject of this bug report.

> I don't know if it is due ot docker consuming too much of memory or it is because of too little memory available on the box.

$ free -h
              total        used        free      shared  buff/cache   available
Mem:            15G        8.0G        3.7G        950M        3.7G        6.2G
Swap:          7.8G          0B        7.8G

What about teaching docker to read layers in small chunks instead of pulling them into memory completely?

We have 200GB images; I hope I don't need to buy more RAM to be able to export those images :)

Anyways, I was able to export big containers with older versions of Docker.
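The chunked reading asked about above is essentially what a shell pipeline already does: memory use is bounded by the buffer size, not the stream size. A minimal illustration of the property (not docker's actual export code path):

```shell
# Stream 8 MiB through a fixed 1 MiB buffer; peak memory stays ~1 MiB
# no matter how large the input grows.
head -c $((8 * 1024 * 1024)) /dev/zero | dd bs=1M of=/tmp/chunked.out 2>/dev/null
wc -c < /tmp/chunked.out
```

A streaming export would apply the same bounded-buffer pattern to each layer instead of materializing it in memory.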

Comment 11 Antonio Murdaca 2018-08-27 14:17:11 UTC
can you test this out by removing all authorization plugins? basically:

1. systemctl edit --full docker.service
2. remove this line (remove the backslash as well): "--authorization-plugin=rhel-push-plugin \"

with that, it shouldn't panic anymore and it should work.
Please test it out and report back if you can :) I'm working on a fix

Comment 13 Jakub Filak 2018-08-28 15:25:14 UTC
I can confirm that removing the option '--authorization-plugin=rhel-push-plugin' from dockerd-current command line arguments fixes the crash for me.

Thank you very much indeed!

Now, I can start migrating to podman.

Comment 14 Antonio Murdaca 2018-08-28 15:39:20 UTC
(In reply to Jakub Filak from comment #13)
> I can confirm that removing the option
> '--authorization-plugin=rhel-push-plugin' from dockerd-current command line
> arguments fixes the crash for me.
> 
> Thank you very much indeed!
> 
> Now, I can start migrating to podman.

awesome, next docker release is going to have the fix even with authz plugins enabled, so if you need that, just wait for a docker update.

Comment 15 Michal Minar 2018-08-29 10:18:52 UTC
Confirming that the latest projectatomic/docker:docker-1.13.1-rhel branch fixes the problem. Thanks!

Comment 16 Ben Cotton 2019-05-02 19:06:58 UTC
This message is a reminder that Fedora 28 is nearing its end of life.
On 2019-May-28 Fedora will stop maintaining and issuing updates for
Fedora 28. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 28 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed, as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 17 Ben Cotton 2019-05-29 00:11:39 UTC
Fedora 28 changed to end-of-life (EOL) status on 2019-05-28. Fedora 28 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

