Description of problem:
Builds are failing due to a conflict in the container name:

rpc error: code = 2 desc = Error response from daemon: {"message":"Conflict. The name \"/k8s_sti-build_cakephp-mysql-example-26-build_proj27_f0d04e4c-455a-11e7-9b8e-02ca84fd3e44_0\" is already in use by container e0d916b0ba24721582405d5517cb1408049310e6d18dc35b66b782c19b7fee18. You have to remove (or rename) that container to be able to reuse that name."} 0 10m

oc logs for the pod shows the following:

Error from server (BadRequest): container "sti-build" in pod "cakephp-mysql-example-26-build" is waiting to start: rpc error: code = 2 desc = Error response from daemon: {"message":"Conflict. The name \"/k8s_sti-build_cakephp-mysql-example-26-build_proj22_f0dd6098-455a-11e7-9b8e-02ca84fd3e44_0\" is already in use by container 7f8d403b92c7db283dfc8cb18744897ad068378c12e01c497f54f609abb24f38. You have to remove (or rename) that container to be able to reuse that name."}

Version-Release number of selected component (if applicable):
openshift v3.6.79
kubernetes v1.6.1+5115d708d7
etcd 3.1.0

Client:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-1.12.6-28.git1398f24.el7.x86_64
 Go version:      go1.7.4
 Git commit:      1398f24/1.12.6
 Built:           Wed May 17 01:16:44 2017
 OS/Arch:         linux/amd64

Server:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-1.12.6-28.git1398f24.el7.x86_64
 Go version:      go1.7.4
 Git commit:      1398f24/1.12.6
 Built:           Wed May 17 01:16:44 2017

How reproducible:
Running 30 concurrent builds on 2 m4.xlarge compute nodes.

Actual results:
Builds fail due to the error above.

Expected results:
Builds should pass.

Additional info:
Attaching node logs.
Created attachment 1283508 [details] compute node logs
The issue is that the attempt to create a pod sandbox is timing out when interacting with the docker runtime. This is demonstrated by all of the "operation timeout: context deadline exceeded" errors throughout the logs, and is typically symptomatic of an overwhelmed docker daemon.

Some questions:

how many concurrent pods are running on this node?
how many of those pods are builds (it sounds like ~30, but are they spread across different nodes)?
what are the resource requirements for the build pods in this scenario?
did this scenario work in earlier versions of the product? i.e. have we ever been able to support 30 concurrent build pods in this environment? Is 30 a tipping point, and things worked great with 10, 20, 25, etc.? It would be good to have metrics data collected for some of the fields I reference below, so we know where the system actually tips over with longer tail latency in these types of bug reports.
we did not appear to set --system-reserved values for this node. why? We should be running all test scenarios with at least some resources reserved for system daemons, the kubelet, and the docker daemon to function.
do we know if docker is cpu starved in this scenario? Are you able to get usage stats for docker relative to other end-user pods during this run?

From the logs, it appears we would be setting the /kubepods.slice cgroup to have cpu shares equal to 4096, and /system.slice must have the default of 1024. This ratio may be out of whack if we are actually cpu starved. I would like us to actually start monitoring usage of the docker daemon in this scenario, and understand that we should be reserving resources, and how much. Setting --system-reserved=cpu=1,memory={what we think /system.slice is using in this scenario} would be a good start. We could then look to enforce node allocatable not just on end-user pods but also on /system.slice by setting --enforce-node-allocatable=pods,system-reserved, but I would like to start by actually setting some cpu and memory reservation before doing that, to see if the cpu ratio of 4096:1024 was just not right versus, say, 3096:1024 (assuming you actually reserve cpu=1).

It would also be good to know the metrics for the following fields on this node in this scenario. Can you collect the output of the following so we know what the 50, 90, and 95% latency measurements are, and attach it to the bz?

$ oc get --raw /api/v1/nodes/{node-name}/proxy/metrics

In particular, I am interested in seeing what the tail latencies are for the following:

pod_start_latency_microseconds
docker_operations_latency_microseconds
docker_operations
docker_operations_errors
docker_operations_timeout
pod_worker_start_latency_microseconds
pleg_relist_latency_microseconds
pleg_relist_interval_microseconds
runtime_operations
runtime_operations_latency_microseconds
runtime_operations_errors

I suspect we are seeing high tail latency on pleg_relist and docker_operations when this is happening.
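(For reference, a minimal sketch of how that metrics dump could be collected and filtered; the node name and output file below are placeholders, not values from this cluster:

$ NODE=ip-172-31-5-23.example.internal        # placeholder, substitute the actual node name
$ oc get --raw /api/v1/nodes/$NODE/proxy/metrics > node-metrics.txt
$ grep -E 'pod_start_latency|docker_operations|pod_worker_start_latency|pleg_relist|runtime_operations' node-metrics.txt

The *_latency_microseconds metrics are summaries, so the quantile label on each latency line should give the percentile measurements directly.)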
Hi Derek, please see answers below.

how many concurrent pods are running on this node?

Current status on the node is the following:

Containers: 125
 Running: 2
 Paused: 0
 Stopped: 123
Images: 378
Server Version: 1.12.6

how many of those pods are builds (it sounds like ~30 but are they spread across different nodes)?

They are spread across 2 compute nodes, so ideally each node should have ~15 build pods running at any one point. There is nothing else in the projects, just build configs.

what are the resource requirements for the build pods in this scenario?

I am attaching "oc describe node" and "oc describe build pod" for this info.

did this scenario work in earlier versions of the product?

We have run 50 concurrent builds before on 2 compute nodes and never seen this error. We have seen more failures at 50 concurrent builds on 2 nodes, but 30 builds have been stable.

we did not appear to set --system-reserved values for this node. why?

$ oc get --raw /api/v1/nodes/{node-name}/proxy/metrics

Please see the output attached.
Created attachment 1284161 [details] describe build pod
Created attachment 1284163 [details] describe node
Created attachment 1284164 [details] oc get raw output
This is an openshift-ansible install. If --system-reserved is not set, it is because the install did not set it. @sdodson - should I open a bz for that?

30 concurrent builds is a workload which was successful in previous releases. On an AWS EC2 m4.xlarge (4 vCPU/16 GB RAM) instance, 50 concurrent builds was the "normal" outer limit in previous releases (3.5, 3.4, etc.).
Mike,

Yeah, let's open a bug to add system reserved resources. It's come up in the past as a good idea, but we've never done it. We'd need guidance on what to default those values to; if that's documented somewhere already, let us know.
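(For reference, a rough sketch of where such a reservation would land on a node installed by openshift-ansible; the cpu/memory values in the comment are placeholders, not a recommendation — sizing them is exactly the guidance being asked for above:

$ grep -A3 kubeletArguments /etc/origin/node/node-config.yaml   # confirm what, if anything, is set today

# A reservation would then be merged into that stanza, e.g. (placeholder values):
#   kubeletArguments:
#     system-reserved:
#     - "cpu=1000m,memory=1Gi"

$ systemctl restart atomic-openshift-node                       # pick up the change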
Thanks for the data, I will investigate further.

From the metrics, I see the following:

kubelet_docker_operations{operation_type="create_container"} 5336
kubelet_docker_operations{operation_type="info"} 1283
kubelet_docker_operations{operation_type="inspect_container"} 55139
kubelet_docker_operations{operation_type="inspect_image"} 16749
kubelet_docker_operations{operation_type="list_containers"} 343141
kubelet_docker_operations{operation_type="list_images"} 20207
kubelet_docker_operations{operation_type="pull_image"} 7
kubelet_docker_operations{operation_type="remove_container"} 4600
kubelet_docker_operations{operation_type="remove_image"} 848
kubelet_docker_operations{operation_type="start_container"} 3090
kubelet_docker_operations{operation_type="stop_container"} 5858
kubelet_docker_operations{operation_type="version"} 51625
# HELP kubelet_docker_operations_errors Cumulative number of Docker operation errors by operation type.
# TYPE kubelet_docker_operations_errors counter
kubelet_docker_operations_errors{operation_type="create_container"} 2246
kubelet_docker_operations_errors{operation_type="inspect_container"} 177
kubelet_docker_operations_errors{operation_type="inspect_image"} 7
kubelet_docker_operations_errors{operation_type="list_containers"} 23975
kubelet_docker_operations_errors{operation_type="list_images"} 3222
kubelet_docker_operations_errors{operation_type="remove_container"} 432
kubelet_docker_operations_errors{operation_type="remove_image"} 232
kubelet_docker_operations_errors{operation_type="start_container"} 1
kubelet_docker_operations_errors{operation_type="stop_container"} 48
kubelet_docker_operations_errors{operation_type="version"} 7076

It looks like roughly 1/2 of our attempts to create a container resulted in an error, followed by a fair number of errors when listing containers and listing images. I suspect most of these were similar deadline-exceeded errors or their follow-on. I need to research whether the move to the CRI versus running without the CRI would have had a major impact here on the underlying docker interaction.
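(A quick way to eyeball the per-operation error ratio from the two counter families above — a rough sketch, assuming the metrics were saved to a file such as node-metrics.txt; the file name is just a placeholder:

$ grep -E '^kubelet_docker_operations(_errors)?[{]' node-metrics.txt |
    awk '{
           split($1, a, "\"");                                  # a[2] = operation_type value
           if ($1 ~ /_errors/) err[a[2]] = $2; else total[a[2]] = $2
         }
         END {
           for (op in err)
             printf "%-20s %7d errors / %7d calls (%.0f%%)\n",
                    op, err[op], total[op], 100 * err[op] / total[op]
         }'

With the numbers pasted above, create_container comes out around 42%, in the same ballpark as the "roughly 1/2" read.)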
Possible variant of https://bugzilla.redhat.com/show_bug.cgi?id=1451902 ?
I agree this is a variant of the same underlying problem referenced in 1451902.
Following the life of the pod cakephp-mysql-example-26-build_proj11:

May 30 13:10:55 ip-172-31-5-23 atomic-openshift-node: I0530 13:10:55.236933 15333 kubelet.go:1813] SyncLoop (ADD, "api"): "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)"
May 30 13:10:55 ip-172-31-5-23 atomic-openshift-node: I0530 13:10:55.566783 15333 kuberuntime_manager.go:370] No sandbox for pod "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)" can be found. Need to start a new one
May 30 13:12:56 ip-172-31-5-23 atomic-openshift-node: E0530 13:12:56.524357 15333 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)" failed: rpc error: code = 2 desc = failed to create a sandbox for pod "cakephp-mysql-example-26-build": operation timeout: context deadline exceeded

The first attempt to create the pod sandbox took longer than 2m to respond. This deadline is imposed by the CRI, so any attempt to recover in the dockershim for a conflicting name will not be handled. The theory here is that the initial create of the sandbox container just took longer than 2m and may or may not have succeeded. Note from the log that the attempt to create the container was >2m.

For the pod in question, we try for many minutes to create the pod sandbox (for example, 3 minutes later):

May 30 13:15:19 ip-172-31-5-23 atomic-openshift-node: E0530 13:15:19.110654 15333 pod_workers.go:182] Error syncing pod f0f7f873-455a-11e7-9b8e-02ca84fd3e44 ("cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)"), skipping: failed to "CreatePodSandbox" for "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)" with CreatePodSandboxError: "CreatePodSandbox for pod \"cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)\" failed: rpc error: code = 2 desc = failed to create a sandbox for pod \"cakephp-mysql-example-26-build\": operation timeout: context deadline exceeded"

At each attempt, the operation is timing out after 2m:

May 30 13:15:44 ip-172-31-5-23 dockerd-current: time="2017-05-30T13:15:44.727273103-04:00" level=error msg="Handler for POST /v1.24/containers/create?name=k8s_POD_cakephp-mysql-example-26-build_proj11_f0f7f873-455a-11e7-9b8e-02ca84fd3e44_1 returned error: Conflict. The name \"/k8s_POD_cakephp-mysql-example-26-build_proj11_f0f7f873-455a-11e7-9b8e-02ca84fd3e44_1\" is already in use by container 98b847320b96651007b7be98abb4104bacf9334c2a28057b822f7f0431c8717b. You have to remove (or rename) that container to be able to reuse that name."
May 30 13:15:44 ip-172-31-5-23 dockerd-current: time="2017-05-30T13:15:44.727322814-04:00" level=error msg="Handler for POST /v1.24/containers/create returned error: Conflict. The name \"/k8s_POD_cakephp-mysql-example-26-build_proj11_f0f7f873-455a-11e7-9b8e-02ca84fd3e44_1\" is already in use by container 98b847320b96651007b7be98abb4104bacf9334c2a28057b822f7f0431c8717b. You have to remove (or rename) that container to be able to reuse that name."
May 30 13:15:44 ip-172-31-5-23 atomic-openshift-node: W0530 13:15:44.741984 15333 docker_sandbox.go:263] NetworkPlugin cni failed on the status hook for pod "cakephp-mysql-example-26-build_proj11": Unexpected command output nsenter: cannot open : No such file or directory
May 30 13:15:44 ip-172-31-5-23 atomic-openshift-node: W0530 13:15:44.755155 15333 docker_sandbox.go:263] NetworkPlugin cni failed on the status hook for pod "cakephp-mysql-example-26-build_proj11": Unexpected command output nsenter: cannot open : No such file or directory

Later, the docker daemon finally gives a response that a container with the specified name exists (presumably it actually completed in one of the many prior attempts that had exceeded 2m earlier), but the kubelet has no idea whether the sandbox had been created or not, since everything just timed out. Approximately 30s later, the initial sandbox container died:

May 30 13:16:16 ip-172-31-5-23 atomic-openshift-node: I0530 13:16:16.002501 15333 kubelet.go:1847] SyncLoop (PLEG): "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)", event: &pleg.PodLifecycleEvent{ID:"f0f7f873-455a-11e7-9b8e-02ca84fd3e44", Type:"ContainerDied", Data:"98b847320b96651007b7be98abb4104bacf9334c2a28057b822f7f0431c8717b"}

And the kubelet again tries to create a new sandbox:

May 30 13:16:53 ip-172-31-5-23 atomic-openshift-node: E0530 13:16:53.658582 15333 remote_runtime.go:86] RunPodSandbox from runtime service failed: rpc error: code = 2 desc = failed to create a sandbox for pod "cakephp-mysql-example-26-build": Error response from daemon: {"message":"Conflict. The name \"/k8s_POD_cakephp-mysql-example-26-build_proj11_f0f7f873-455a-11e7-9b8e-02ca84fd3e44_1\" is already in use by container 98b847320b96651007b7be98abb4104bacf9334c2a28057b822f7f0431c8717b. You have to remove (or rename) that container to be able to reuse that name."}
May 30 13:16:53 ip-172-31-5-23 atomic-openshift-node: E0530 13:16:53.658655 15333 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)" failed: rpc error: code = 2 desc = failed to create a sandbox for pod "cakephp-mysql-example-26-build": Error response from daemon: {"message":"Conflict. The name \"/k8s_POD_cakephp-mysql-example-26-build_proj11_f0f7f873-455a-11e7-9b8e-02ca84fd3e44_1\" is already in use by container 98b847320b96651007b7be98abb4104bacf9334c2a28057b822f7f0431c8717b. You have to remove (or rename) that container to be able to reuse that name."}

After many more minutes, the docker daemon is still overwhelmed, and calls to create a container are taking more than 2m to respond:

May 30 13:22:37 ip-172-31-5-23 atomic-openshift-node: E0530 13:22:37.826214 15333 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)" failed: rpc error: code = 2 desc = failed to create a sandbox for pod "cakephp-mysql-example-26-build": operation timeout: context deadline exceeded
May 30 13:22:37 ip-172-31-5-23 atomic-openshift-node: E0530 13:22:37.826236 15333 kuberuntime_manager.go:619] createPodSandbox for pod "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)" failed: rpc error: code = 2 desc = failed to create a sandbox for pod "cakephp-mysql-example-26-build": operation timeout: context deadline exceeded

I think it's safe to say that this node was not healthy; the kubelet was trying to do what it could given the environment.
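(For anyone walking through this on another reproduction, a sketch of one way to pull the same timeline out of a node's journal — the pod UID, pod name, time window, and unit names here are the ones from this report and would differ elsewhere:

$ journalctl -u atomic-openshift-node -u docker \
    --since "2017-05-30 13:10:00" --until "2017-05-30 13:25:00" |
    grep -E 'f0f7f873-455a-11e7-9b8e-02ca84fd3e44|cakephp-mysql-example-26-build_proj11'

That surfaces the kubelet SyncLoop/CreatePodSandbox lines and the dockerd "Conflict. The name ... is already in use" responses for this one pod in order.)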
We need to know the root cause of the 2m timeout; as a result, I am duping this on the referenced bug.

*** This bug has been marked as a duplicate of bug 1451902 ***
On second thought, I think this is slightly different from 1451902 in that the core timeout is happening on the CreateContainer operation (repeatedly exceeding 2 minutes, over a span of many minutes), which is different from the PLEG calling list containers. We need to understand why a create container call would take >2m, which is the timeout on the kubelet -> CRI shim interaction.
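(If it helps narrow this down: the 2m figure lines up with the kubelet's per-request timeout toward the CRI shim, which I believe is the --runtime-request-timeout kubelet flag with a 2m default in this release — treat that as an assumption to verify. A quick check of what the node is actually running with, paths assumed from a default openshift-ansible install:

$ grep -A2 runtime-request-timeout /etc/origin/node/node-config.yaml   # empty output suggests the default applies
$ ps -ef | grep -o 'runtime-request-timeout=[^ ]*'

Raising that timeout would only hide the symptom; the interesting question remains why a single container create takes minutes under this load.)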
Was the forced delete option on for the docker daemon?
This was a typical openshift-ansible install; there were no changes made to the docker settings after the install. How do I check for the force delete option?
--storage-opts dm.use_deferred_deletion=true
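(A sketch for checking what the daemon is actually running with — the paths and flag spellings below are assumptions from a default RHEL/openshift-ansible setup, not a definitive reference:

$ docker info 2>/dev/null | grep -i deferred          # devicemapper reports deferred removal/deletion state here
$ grep -i deferred /etc/sysconfig/docker-storage      # where docker-storage-setup writes dm storage options
$ ps -ef | grep -o 'dm.use_deferred_deletion=[^ ]*'

The docker info line shows whether deferred deletion is actually in effect; the other two show whether it was explicitly configured.)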