Bug 1419877 - Problem with docker exec after daemon restart
Summary: Problem with docker exec after daemon restart
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: docker
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Daniel Walsh
QA Contact: atomic-bugs@redhat.com
URL:
Whiteboard:
Keywords: Extras
Duplicates: 1419871
Depends On: 1247935 1347821
Blocks:
Reported: 2017-02-07 10:11 UTC by Jonas Nordell
Modified: 2019-03-06 00:56 UTC (History)

Clone Of:
Last Closed: 2017-06-28 15:39:34 UTC


Attachments
docker inspect for the running container (14.72 KB, text/plain)
2017-02-07 10:11 UTC, Jonas Nordell
reproducible in 7.4 sosreport (12.13 MB, application/x-xz)
2018-02-14 13:45 UTC, Daniele


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:1620 normal SHIPPED_LIVE docker bug fix and enhancement update 2017-06-28 19:33:52 UTC
Red Hat Knowledge Base (Solution) 2991041 None None None 2017-04-04 07:04 UTC

Description Jonas Nordell 2017-02-07 10:11:24 UTC
Created attachment 1248347
docker inspect for the running container

Description of problem:

When trying out the docker live-restore function I am having issues with "docker exec" after a daemon restart.

[root@osemaster1 docker]# docker exec -it b2e277f2bc59 bash
rpc error: code = 13 desc = invalid header field value "oci runtime error: exec failed: container_linux.go:247: starting container process caused \"process_linux.go:75: starting setns process caused \\\"fork/exec /proc/self/exe: no such file or directory\\\"\"\n"

The container seems to be working fine and requests are served.

Also, "docker logs" works.

[root@osemaster1 docker]# docker logs b2e277f2bc59
---> Running application ...
dbug: Microsoft.AspNetCore.Hosting.Internal.WebHost[3]
      Hosting starting
dbug: Microsoft.AspNetCore.Hosting.Internal.WebHost[4]
      Hosting started
Hosting environment: Production
Content root path: /opt/app-root/src
Now listening on: http://0.0.0.0:8080
Now listening on: https://0.0.0.0:8081
Application started. Press Ctrl+C to shut down.

Version-Release number of selected component (if applicable):

[root@osemaster1 docker]# docker version
Client:
 Version:         1.12.5
 API version:     1.24
 Package version: docker-common-1.12.5-14.el7.x86_64
 Go version:      go1.7.4
 Git commit:      047e51b/1.12.5
 Built:           Wed Jan 11 17:53:20 2017
 OS/Arch:         linux/amd64

Server:
 Version:         1.12.5
 API version:     1.24
 Package version: docker-common-1.12.5-14.el7.x86_64
 Go version:      go1.7.4
 Git commit:      047e51b/1.12.5
 Built:           Wed Jan 11 17:53:20 2017
 OS/Arch:         linux/amd64


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 Daniel Walsh 2017-02-07 13:53:56 UTC
Docker on RHEL does not support daemon restart and live-restore.

The RHEL kernel has a bug which forces us to run the docker daemon within its own mount namespace. You can see this in the docker.service unit file:

```
grep slave /etc/systemd/system/docker.service 
MountFlags=slave
```

I am writing a blog post on why we do this, but the bottom line is that for now we cannot support --live-restore on RHEL kernels. Hopefully this will be fixed with the RHEL 7.4 kernel.

Comment 3 Daniel Walsh 2017-02-07 13:54:44 UTC
*** Bug 1419871 has been marked as a duplicate of this bug. ***

Comment 5 Daniel Walsh 2017-02-20 17:17:59 UTC
Live-restore will not work on RHEL until we fix an issue in the kernel. We run the docker daemon within its own mount namespace to prevent mount points from leaking out of the docker daemon, which causes the docker daemon to crash. We have a fix in upstream kernels that allows us to run the docker daemon in the host's mount namespace. The RHEL 7 kernel will be fixed in the RHEL 7.4 update.
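
Whether the daemon really runs in a separate mount namespace can be checked by comparing the namespace links under /proc. The following is only a minimal sketch, assuming a kernel that exposes /proc/<pid>/ns/mnt; the function name same_mount_ns is made up for illustration and is not part of docker or systemd.

```shell
# Hypothetical helper: succeed (0) if two PIDs share a mount namespace,
# fail (1) if they differ, return 2 if either namespace link is unreadable.
same_mount_ns() {
    local a b
    a=$(readlink "/proc/$1/ns/mnt") || return 2
    b=$(readlink "/proc/$2/ns/mnt") || return 2
    [ "$a" = "$b" ]
}
```

Run as root, something like `same_mount_ns 1 "$(systemctl show -p MainPID docker | cut -d= -f2)"` would compare PID 1 with the daemon. A non-zero result of 1 would suggest the daemon is in its own mount namespace, as described above.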

Comment 9 Daniel Walsh 2017-04-04 07:36:30 UTC
Lokesh, have we fixed this issue in the rhel7-3.3 release?

You should remove the live-restore flag from the /etc/docker/daemon.json file, then restart the docker daemon. After this, all containers will shut down when the daemon restarts.

live-restore will be fixed in the RHEL 7.4 release.
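
To confirm whether the flag is still set before restarting, a rough text check of the configuration file may be enough. This is a sketch only: the helper name live_restore_enabled is hypothetical, and it uses a plain pattern match on the "live-restore" key, not a JSON parser, so unusual formatting could fool it.

```shell
# Hypothetical helper: succeed if the given daemon.json appears to
# contain "live-restore": true. Plain text match, not a JSON parser.
live_restore_enabled() {
    grep -Eq '"live-restore"[[:space:]]*:[[:space:]]*true' "$1"
}
```

For example, `live_restore_enabled /etc/docker/daemon.json && echo "live-restore is still set"`. Once the key has been removed, `systemctl restart docker` restarts the daemon.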

Comment 12 Daniel Walsh 2017-04-10 18:28:58 UTC
I should have said in the rhel7-3.4 release, which is happening any day now.

Comment 14 Luwen Su 2017-06-19 09:46:04 UTC
Following the steps in comment 10, this works fine with:
docker-1.12.6-31.git3a6eaeb.el7.x86_64
3.10.0-514.25.2.el7.x86_64

Comment 16 errata-xmlrpc 2017-06-28 15:39:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1620

Comment 18 Daniele 2018-02-14 13:45 UTC
Created attachment 1395933
reproducible in 7.4 sosreport

Comment 19 Daniel Walsh 2018-02-15 15:40:20 UTC
Do they have the MountFlags=slave setting in the docker unit file?

Comment 21 Daniel Walsh 2018-02-21 14:09:55 UTC
They need to remove this flag from the docker unit, which should allow the restart to happen. We should fix this in our next release.

Comment 23 Daniel Walsh 2018-03-12 15:06:21 UTC
Check the systemd unit file and see if the MountFlags=slave line is in there.  If it is, then the problem is not fixed.
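
That check can be scripted. This is a minimal sketch, assuming the unit file path is passed in as an argument; the helper name has_slave_mountflags is hypothetical, and commented-out lines are deliberately not counted.

```shell
# Hypothetical helper: succeed if the unit file has an active
# MountFlags=slave line (commented-out lines do not match).
has_slave_mountflags() {
    grep -Eq '^[[:space:]]*MountFlags[[:space:]]*=[[:space:]]*slave[[:space:]]*$' "$1"
}
```

For example, `has_slave_mountflags /etc/systemd/system/docker.service && echo "MountFlags=slave is still present"`. The unit may live elsewhere on a given host (for instance under /usr/lib/systemd/system, which is an assumption here); `systemctl cat docker` prints the unit file that systemd actually loaded, together with any drop-ins.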

Comment 26 Aaron 2018-10-18 17:03:13 UTC
The comments conflict with the errata mentioned in comment 16 (which closed the BZ).

The comments state this is a kernel bug and align with the Red Hat article: https://access.redhat.com/articles/2938171

However, the errata linked in comment 16 that closed this BZ contains docker RPMs only and makes no mention of any dependency on a kernel update. Based on the previous comments, I would have expected this errata to include a kernel update.

Then Daniel Walsh mentions in comment 21 that a docker update is required to remove the "MountFlags=slave" line from the docker.service unit file.

Can someone clarify?

Comment 27 Daniel Walsh 2018-10-20 11:27:34 UTC
I would think that both are required: the latest docker package and the kernel update.

Comment 28 Aaron 2018-10-22 17:49:36 UTC
I agree, but the errata that closed this issue in comment 16 makes no mention of a kernel update, which I think would be a prerequisite to the docker update that supports live-restore, if I understand this issue correctly.

Also, comment 21 stated that the docker update that removes the "MountFlags=slave" line had not yet come out, so it left me wondering what the errata in comment 16 actually did to resolve this BZ.

All I was trying to say is that the order of the comments compared with the contents of the errata is very confusing.

