Bug 1412148

Summary: containerd: container did not start before the specified timeout
Product: [Fedora] Fedora Reporter: Lukas Slebodnik <lslebodn>
Component: dockerAssignee: Antonio Murdaca <amurdaca>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 25CC: adimania, admiller, amurdaca, dwalsh, ichavero, jcajka, jchaloup, lsm5, marianne, miminar, nalin, riek, vbatts
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: docker-1.12.6-3.git51ef5a8.fc25 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1412164 (view as bug list) Environment:
Last Closed: 2017-01-13 02:23:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1412164    

Description Lukas Slebodnik 2017-01-11 11:53:23 UTC
Description of problem:
I run CI test in docker containers with various distributions
and int sometimes happen to me that "container did not start before the specified timeout"


Version-Release number of selected component (if applicable):
sh# rpm -q docker
docker-1.12.5-4.git03508cc.fc25.x86_64

How reproducible:
non-deterministic

Steps to Reproduce:
Unfortunately, I do not have a reliable reproducer.

Actual results:
//there is just a created container; but it is not running.
sh# docker ps -a
3e9c420dc892        temp/debian_testing_ci   "sh -e -c 'sudo su -c"   10 hours ago        Created                                                      pensive_ardinghelli

Expected results:
No such intermittent issue

Additional info:
related part of journal log

Jan 11 02:15:32 vm-153.example.com docker-containerd-current[11813]: time="2017-01-11T02:15:32.323752307+01:00" level=error msg="containerd: start container" error="containerd: container did not start before the specified timeout" id=3e9c420dc892144aeb404a4f3acc7f3b3b96ccf3e5f1f2161434a27a05f8b56f
Jan 11 02:15:32 vm-153.example.com systemd[1]: libcontainer-23026-systemd-test-default-dependencies.scope: Scope has no PIDs. Refusing.
Jan 11 02:15:32 vm-153.example.com systemd[1]: libcontainer-23026-systemd-test-default-dependencies.scope: Scope has no PIDs. Refusing.
Jan 11 02:15:33 vm-153.example.com dockerd-current[11882]: time="2017-01-11T02:15:33.099431736+01:00" level=error msg="Create container failed with error: containerd: container did not start before the specified timeout"
Jan 11 02:15:34 vm-153.example.com kernel: docker0: port 2(veth132dd3f) entered disabled state
Jan 11 02:15:34 vm-153.example.com kernel: device veth132dd3f left promiscuous mode
Jan 11 02:15:34 vm-153.example.com kernel: docker0: port 2(veth132dd3f) entered disabled state
Jan 11 02:15:34 vm-153.example.com audit: ANOM_PROMISCUOUS dev=veth132dd3f prom=0 old_prom=256 auid=4294967295 uid=0 gid=0 ses=4294967295
Jan 11 02:15:34 vm-153.example.com NetworkManager[1128]: <info>  [1484097334.5298] device (vetha4686a3): driver 'veth' does not support carrier detection.
Jan 11 02:15:34 vm-153.example.com NetworkManager[1128]: <info>  [1484097334.5309] device (veth132dd3f): driver 'veth' does not support carrier detection.
Jan 11 02:15:35 vm-153.example.com kernel: XFS (dm-6): Unmounting Filesystem
Jan 11 02:15:35 vm-153.example.com dockerd-current[11882]: time="2017-01-11T02:15:35.307872220+01:00" level=error msg="Handler for POST /v1.24/containers/3e9c420dc892144aeb404a4f3acc7f3b3b96ccf3e5f1f2161434a27a05f8b56f/start returned error: containerd: container did not start before the specified timeout"

Comment 1 Lukas Slebodnik 2017-01-11 11:56:53 UTC
docker inspect says:

[
    {
        "Id": "3e9c420dc892144aeb404a4f3acc7f3b3b96ccf3e5f1f2161434a27a05f8b56f",
        "Created": "2017-01-11T01:15:06.130476175Z",
        "Path": "sh",
        "Args": [
            "-e",
            "-c",
            "sudo su -c \"apt-get update || apt-get update\" \u0026\u0026 sudo su -c \"DEBIAN_FRONTEND=noninteractive apt-get upgrade -y\" \u0026\u0026 sudo su -c \"DEBIAN_FRONTEND=noninteractive apt-get dist-upgrade -y\" \u0026\u0026 git pull -v \u0026\u0026 git am -3 /home/build/*.patch \u0026\u0026 contrib/ci/run -f -p $HOSTNAME:/home/build/"
        ],
        "State": {
            "Status": "created",
            "Running": false,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 0,
            "ExitCode": 128,
            "Error": "containerd: container did not start before the specified timeout",
            "StartedAt": "0001-01-01T00:00:00Z",
            "FinishedAt": "0001-01-01T00:00:00Z"
        },
        "Image": "sha256:3a50a5f12f83b1b9e37d0413c54938894fdb21507788f140541b565217451b28",
        "ResolvConfPath": "/var/lib/docker/containers/3e9c420dc892144aeb404a4f3acc7f3b3b96ccf3e5f1f2161434a27a05f8b56f/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/3e9c420dc892144aeb404a4f3acc7f3b3b96ccf3e5f1f2161434a27a05f8b56f/hostname",
        "HostsPath": "/var/lib/docker/containers/3e9c420dc892144aeb404a4f3acc7f3b3b96ccf3e5f1f2161434a27a05f8b56f/hosts",
        "LogPath": "",
        "Name": "/pensive_ardinghelli",
        "RestartCount": 0,
        "Driver": "devicemapper",
        "MountLabel": "system_u:object_r:container_file_t:s0:c276,c582",
        "ProcessLabel": "system_u:system_r:container_t:s0:c276,c582",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": null,
            "ContainerIDFile": "debian_testing_ci_container_id.1484097304",
            "LogConfig": {
                "Type": "journald",
                "Config": {}
            },
            "NetworkMode": "default",
            "PortBindings": {},
            "RestartPolicy": {
                "Name": "no",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "CapAdd": null,
            "CapDrop": null,
            "Dns": [],
            "DnsOptions": [],
            "DnsSearch": [],
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": null,
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "oci",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": [],
            "DiskQuota": 0,
            "KernelMemory": 0,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": -1,
            "OomKillDisable": false,
            "PidsLimit": 1024,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0
        },
        "GraphDriver": {
            "Name": "devicemapper",
            "Data": {
                "DeviceId": "56435",
                "DeviceName": "docker-253:1-261561-a1043dd326699e7af5affeac3486fb48529124a26f3bcc3688b2c127521dc120",
                "DeviceSize": "107374182400"
            }
        },
        "Mounts": [],
        "Config": {
            "Hostname": "3e9c420dc892",
            "Domainname": "",
            "User": "build",
            "AttachStdin": false,
            "AttachStdout": true,
            "AttachStderr": true,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PATH=/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "TALLOC_FREE_FILL=222",
                "MALLOC_PERTURB_=190",
                "USER=build"
            ],
            "Cmd": [
                "sh",
                "-e",
                "-c",
                "sudo su -c \"apt-get update || apt-get update\" \u0026\u0026 sudo su -c \"DEBIAN_FRONTEND=noninteractive apt-get upgrade -y\" \u0026\u0026 sudo su -c \"DEBIAN_FRONTEND=noninteractive apt-get dist-upgrade -y\" \u0026\u0026 git pull -v \u0026\u0026 git am -3 /home/build/*.patch \u0026\u0026 contrib/ci/run -f -p $HOSTNAME:/home/build/"
            ],
            "Image": "temp/debian_testing_ci",
            "Volumes": null,
            "WorkingDir": "/home/build",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": {}
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "600cd344921d7b3e75f49438c61e7a60e532404d385480e1887e94019ee5c26d",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": null,
            "SandboxKey": "/var/run/docker/netns/600cd344921d",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {
                "bridge": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "NetworkID": "add43d5a6a68390c2573ab139b1411accf2fd8292f87c8b1c717d5db405e0d92",
                    "EndpointID": "",
                    "Gateway": "",
                    "IPAddress": "",
                    "IPPrefixLen": 0,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": ""
                }
            }
        }
    }
]

Comment 2 Lukas Slebodnik 2017-01-11 11:58:19 UTC
And I could not see such problems on fedora24 with older docker 1.10.3-55.gite03ddb8.fc24

Comment 3 Antonio Murdaca 2017-01-11 12:03:06 UTC
Alright, we need to add --start-timeout=2m to the docker-containerd unit file. This is happening in Fedora because we use a different containerd process while docker upstream embeds containerd spawning. I'll update the unit file and release an update. Meanwhile, could you add "--start-timeout=2m" to the docker-containerd.service ExecStart and re-test? 

Source: https://github.com/docker/docker/issues/22226

Comment 4 Lukas Slebodnik 2017-01-11 12:33:15 UTC
OK; I tried workaround. I home that 2 mins will be enough

sh# mkdir /etc/systemd/system/docker-containerd.service.d/

sh# cat /etc/systemd/system/docker-containerd.service.d/timeout.conf 
[Service]
ExecStart=
ExecStart=/usr/libexec/docker/docker-containerd-current --listen unix:///run/containerd.sock --shim /usr/libexec/docker/docker-containerd-shim-current --start-timeout=2m

sh# systemctl daemon-reload
sh# systemctl restart docker.service docker-containerd.service

Comment 5 Fedora Update System 2017-01-11 13:29:29 UTC
docker-1.12.6-3.git51ef5a8.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2017-dbc2b618eb

Comment 6 Fedora Update System 2017-01-12 07:52:33 UTC
docker-1.12.6-3.git51ef5a8.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-dbc2b618eb

Comment 7 Fedora Update System 2017-01-13 02:23:54 UTC
docker-1.12.6-3.git51ef5a8.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

Comment 8 Lukas Slebodnik 2017-01-17 08:16:48 UTC
FYI, I have been testing docker-containerd.service with shorter timeout. Just 1 minute. And I could not see any problem yet.

Comment 9 Antonio Murdaca 2017-01-17 08:23:44 UTC
(In reply to Lukas Slebodnik from comment #8)
> FYI, I have been testing docker-containerd.service with shorter timeout.
> Just 1 minute. And I could not see any problem yet.

Cool, 2min has no harm afaict though, just following what upstrea does. Thanks for testing Lukas.