Created attachment 1248347 [details]
docker inspect for the running container

Description of problem:
When trying out the docker live-restore function I am having issues with "docker exec" after a daemon restart.

[root@osemaster1 docker]# docker exec -it b2e277f2bc59 bash
rpc error: code = 13 desc = invalid header field value "oci runtime error: exec failed: container_linux.go:247: starting container process caused \"process_linux.go:75: starting setns process caused \\\"fork/exec /proc/self/exe: no such file or directory\\\"\"\n"

Container seems to be working fine and requests are served. also "docker logs" works.

[root@osemaster1 docker]# docker logs b2e277f2bc59
---> Running application ...
dbug: Microsoft.AspNetCore.Hosting.Internal.WebHost[3]
      Hosting starting
dbug: Microsoft.AspNetCore.Hosting.Internal.WebHost[4]
      Hosting started
Hosting environment: Production
Content root path: /opt/app-root/src
Now listening on: http://0.0.0.0:8080
Now listening on: https://0.0.0.0:8081
Application started. Press Ctrl+C to shut down.

Version-Release number of selected component (if applicable):
[root@osemaster1 docker]# docker version
Client:
 Version:         1.12.5
 API version:     1.24
 Package version: docker-common-1.12.5-14.el7.x86_64
 Go version:      go1.7.4
 Git commit:      047e51b/1.12.5
 Built:           Wed Jan 11 17:53:20 2017
 OS/Arch:         linux/amd64

Server:
 Version:         1.12.5
 API version:     1.24
 Package version: docker-common-1.12.5-14.el7.x86_64
 Go version:      go1.7.4
 Git commit:      047e51b/1.12.5
 Built:           Wed Jan 11 17:53:20 2017
 OS/Arch:         linux/amd64

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Docker on RHEL does not support daemon restart and live-restore. The RHEL kernel has a bug which forces us to run the docker daemon within its own mount namespace. You can see this in the docker.service unit file:
```
grep slave /etc/systemd/system/docker.service
MountFlags=slave
```
I am writing a blog post on why we do this. But the bottom line is that, for now, we cannot support --live-restore on RHEL kernels. Hopefully this will be fixed with the RHEL 7.4 kernel.
*** Bug 1419871 has been marked as a duplicate of this bug. ***
Live-restore will not work on RHEL until we fix an issue in the kernel. We run the docker daemon within its own mount namespace to prevent mount points from leaking out of the docker daemon, which causes the docker daemon to crash. We have a fix in upstream kernels that allows us to run the docker daemon in the host's mount namespace. The RHEL7 kernel will be fixed in the RHEL7.4 update.
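For anyone who wants to confirm whether the daemon really runs in its own mount namespace, here is a minimal sketch in Python. It assumes a Linux host where `/proc/<pid>/ns/mnt` is a symlink of the form `mnt:[<inode>]`; the helper names (`parse_ns_link`, `mount_ns_link`, `shares_mount_namespace`) are made up for illustration and are not part of docker or systemd.

```python
import os
import re

# Namespace symlinks under /proc/<pid>/ns/ read like "mnt:[4026531840]".
_NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(link):
    """Split a namespace link such as 'mnt:[4026531840]' into (type, inode)."""
    match = _NS_LINK.match(link)
    if match is None:
        raise ValueError("unexpected namespace link: %r" % link)
    return match.group(1), int(match.group(2))


def mount_ns_link(pid):
    """Read the mount namespace link of a process (Linux only).

    Reading the link of another user's process usually requires root.
    """
    return os.readlink("/proc/%d/ns/mnt" % pid)


def shares_mount_namespace(link_a, link_b):
    """Two processes share a mount namespace when type and inode both match."""
    return parse_ns_link(link_a) == parse_ns_link(link_b)
```

On a host you would compare `mount_ns_link(1)` with `mount_ns_link(<pid of the docker daemon>)`. If they differ, the daemon is in a private mount namespace, which is the configuration described above.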
Lokesh, have we fixed this issue in the rhel7-3.3 release? You should remove the live-restore flag from the /etc/docker/daemon.json file, then restart the docker daemon. After this, all containers will shut down when the daemon restarts. live-restore will be fixed in the RHEL7.4 release.
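The workaround above can be scripted. This is a rough sketch in Python, assuming the option is stored under the top-level `live-restore` key of daemon.json; the function name `disable_live_restore` is hypothetical. It only transforms the JSON text and leaves reading and writing the file to the caller.

```python
import json


def disable_live_restore(daemon_json_text):
    """Return daemon.json content with the 'live-restore' key removed.

    An empty or whitespace-only file is treated as an empty config.
    All other keys are preserved.
    """
    if daemon_json_text.strip():
        config = json.loads(daemon_json_text)
    else:
        config = {}
    # pop() with a default is a no-op when the key is absent.
    config.pop("live-restore", None)
    return json.dumps(config, indent=2) + "\n"
```

After writing the result back to /etc/docker/daemon.json, the docker daemon still has to be restarted for the change to take effect. Keep a backup of the original file before overwriting it.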
I should have said in rhel7-3.4 release, which is happening any day now.
Following the steps in comment#10, this works fine with docker-1.12.6-31.git3a6eaeb.el7.x86_64 on kernel 3.10.0-514.25.2.el7.x86_64.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1620
Created attachment 1395933 [details] reproducible in 7.4 sosreport
Do they have the Mount flag in the docker unit file?
They need to remove this flag from the docker unit file, which should allow the restart to happen. We should fix this in our next release.
Check the systemd unit file and see if the MountFlags=slave line is in there. If it is, then the problem is not fixed.
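The check can also be done programmatically. Below is a minimal sketch in Python that scans unit file text for `MountFlags=` in the `[Service]` section. The function names are illustrative, and the parser is deliberately simplified: it ignores line continuations and drop-in overrides, so treat a negative result with some caution.

```python
def mount_flags(unit_text):
    """Return the last MountFlags= value in [Service], or None if unset."""
    section = None
    value = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments ('#' or ';' in unit files).
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section == "Service" and "=" in line:
            key, _, val = line.partition("=")
            if key.strip() == "MountFlags":
                value = val.strip()  # a later assignment overrides an earlier one
    return value


def has_slave_mount_flag(unit_text):
    """True when the unit still sets MountFlags=slave."""
    return mount_flags(unit_text) == "slave"
```

For example, passing the contents of /etc/systemd/system/docker.service to `has_slave_mount_flag` returns `True` when the line quoted earlier in this bug is still present.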
The comments and the errata mentioned in comment 16 (which closed the BZ) conflict. The comments state this is a kernel bug and align with the Red Hat article - https://access.redhat.com/articles/2938171 However, the errata link in comment 16 that closed this BZ has docker RPMs only and makes no mention of any dependency on a kernel update. Based on the previous comments I would have expected this errata to cover a kernel update. Then Daniel Walsh mentions in comment 21 that a docker update is required to remove the "MountFlags=slave" from the docker.service unit file. Can someone clarify?
I would think that both are required: the latest docker package and the kernel update.
I agree, but the errata that closed this issue in comment 16 makes no mention of a kernel update, which I think would be a prerequisite to the docker update that supports live-restore, if I understand this issue correctly. Not to mention that comment 21 stated that the docker update that removes the "MountFlags=slave" had not yet come out, so it left me wondering what the errata in comment 16 actually did to resolve this BZ. All I was trying to say is that the order of the comments vs. the contents of the errata is very confusing.