Bug 1456943 - Build failures due to conflict in container name
Summary: Build failures due to conflict in container name
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Containers
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.7.0
Assignee: Jhon Honce
QA Contact: DeShuai Ma
URL:
Whiteboard: aos-scalability-36
Depends On:
Blocks:
 
Reported: 2017-05-30 17:30 UTC by Vikas Laad
Modified: 2017-09-25 13:19 UTC
CC: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-22 16:33:14 UTC
Target Upstream Version:
Embargoed:


Attachments
compute node logs (1.25 MB, application/zip)
2017-05-30 17:33 UTC, Vikas Laad
describe build pod (4.36 KB, text/plain)
2017-06-01 14:30 UTC, Vikas Laad
describe node (5.96 KB, text/plain)
2017-06-01 14:31 UTC, Vikas Laad
oc get raw output (264.57 KB, text/plain)
2017-06-01 14:31 UTC, Vikas Laad

Description Vikas Laad 2017-05-30 17:30:37 UTC
Description of problem:
Builds are failing due to a conflict in the container name:

rpc error: code = 2 desc = Error response from daemon: {"message":"Conflict. The name \"/k8s_sti-build_cakephp-mysql-example-26-build_proj27_f0d04e4c-455a-11e7-9b8e-02ca84fd3e44_0\" is already in use by container e0d916b0ba24721582405d5517cb1408049310e6d18dc35b66b782c19b7fee18. You have to remove (or rename) that container to be able to reuse that name."}   0          10m

oc logs for the pod shows the following:
Error from server (BadRequest): container "sti-build" in pod "cakephp-mysql-example-26-build" is waiting to start: rpc error: code = 2 desc = Error response from daemon: {"message":"Conflict. The name \"/k8s_sti-build_cakephp-mysql-example-26-build_proj22_f0dd6098-455a-11e7-9b8e-02ca84fd3e44_0\" is already in use by container 7f8d403b92c7db283dfc8cb18744897ad068378c12e01c497f54f609abb24f38. You have to remove (or rename) that container to be able to reuse that name."}
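
For reference, the conflict can be cleared manually on the affected node by removing the stale container named in the error (a hedged workaround sketch only, using the container ID reported in the message above; it does not address the underlying cause):

# locate the stale container and remove it by the ID from the error message
$ docker ps -a --no-trunc | grep k8s_sti-build_cakephp-mysql-example-26-build
$ docker rm 7f8d403b92c7db283dfc8cb18744897ad068378c12e01c497f54f609abb24f38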


Version-Release number of selected component (if applicable):
openshift v3.6.79
kubernetes v1.6.1+5115d708d7
etcd 3.1.0


Client:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-1.12.6-28.git1398f24.el7.x86_64
 Go version:      go1.7.4
 Git commit:      1398f24/1.12.6
 Built:           Wed May 17 01:16:44 2017
 OS/Arch:         linux/amd64

Server:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-1.12.6-28.git1398f24.el7.x86_64
 Go version:      go1.7.4
 Git commit:      1398f24/1.12.6
 Built:           Wed May 17 01:16:44 2017

How reproducible:
Running 30 concurrent builds on 2 m4.xlarge compute nodes.

Actual results:
Builds fail due to the error.

Expected results:
Builds should pass.

Additional info:
Attaching node logs

Comment 1 Vikas Laad 2017-05-30 17:33:02 UTC
Created attachment 1283508 [details]
compute node logs

Comment 2 Derek Carr 2017-06-01 02:29:39 UTC
the issue is that the attempt to create a pod sandbox is timing out when interacting with the docker runtime.  this is demonstrated by all of the "operation timeout: context deadline exceeded" errors throughout the logs.  this is typically symptomatic of an overwhelmed docker daemon.

how many concurrent pods are running on this node?

how many of those pods are builds (it sounds like ~30 but are they spread across different nodes)?

what are the resource requirements for the build pods in this scenario?
 
did this scenario work in earlier versions of the product?  i.e. have we ever been able to support 30 concurrent build pods in this environment?  is 30 a tipping point, and things worked great with 10, 20, 25, etc?  it would be good to have metrics data collected for some of the fields i reference below to know where the system actually tips over with longer tail latency in these types of bug reports.

we did not appear to set --system-reserved values for this node.  why?  we should be running all test scenarios with at least some resources reserved for system daemons, the kubelet, and the docker daemon to function.  do we know if docker is cpu starved in this scenario?  are you able to get usage stats for docker relative to other end-user pods during this run?

from the logs, it appears we are setting the /kubepods.slice cgroup to have cpu shares equal to 4096, while /system.slice must have the default of 1024.  this ratio may be out of whack if we are actually cpu starved.  i would like us to start monitoring usage of the docker daemon in this scenario and understand how much resource we should be reserving.  setting --system-reserved=cpu=1,memory={whatever we think /system.slice is using in this scenario} would be a good start.  we could then look to enforce node allocatable not just on end-user pods but also on /system.slice by setting --enforce-node-allocatable=pods,system-reserved, but i would like to start by setting some cpu and memory reservation first, to see whether the 4096:1024 cpu ratio was simply not right versus, say, 3096:1024 (assuming you actually reserve cpu=1).
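
As a quick check of the ratio described above, the current shares for both slices can be read straight from the cgroup filesystem on the node, and the proposed reservation would be a kubelet setting (a sketch only, assuming the systemd cgroup layout; the reservation values are placeholders, since sizing them is exactly the open question here):

# inspect the current cpu.shares ratio for end-user pods vs. system daemons
$ cat /sys/fs/cgroup/cpu/kubepods.slice/cpu.shares
$ cat /sys/fs/cgroup/cpu/system.slice/cpu.shares
# live per-slice cpu/memory usage, to see whether docker (in system.slice) is starved
$ systemd-cgtop --depth=2

# hedged sketch of the kubelet flags being proposed (placeholder values)
--system-reserved=cpu=1,memory=2Gi
--enforce-node-allocatable=pods,system-reserved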

it would be good to know what the metrics are for the following fields on this node in this scenario as well.

can you collect the output of the following so we know what the 50, 90, and 95% latency measurements are, and attach it to the bz?

$ oc get --raw /api/v1/nodes/{node-name}/proxy/metrics

in particular, i am interested in seeing what the tail latencies are for the following:

pod_start_latency_microseconds
docker_operations_latency_microseconds
docker_operations
docker_operations_errors
docker_operations_timeout
pod_worker_start_latency_microseconds
pleg_relist_latency_microseconds
pleg_relist_interval_microseconds
runtime_operations
runtime_operations_latency_microseconds
runtime_operations_errors
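
A sketch of how these series could be collected and filtered in one pass (assumes a logged-in oc client; the node name is a placeholder):

$ NODE=ip-172-31-5-23.example.internal   # placeholder node name
$ oc get --raw /api/v1/nodes/$NODE/proxy/metrics \
    | grep -E 'pod_start_latency|docker_operations|pod_worker_start_latency|pleg_relist|runtime_operations' \
    > node-metrics.txt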

i suspect we are seeing high tail latency on pleg_relist and docker_operations when this is happening.

Comment 3 Vikas Laad 2017-06-01 14:30:30 UTC
Hi Derek,

Please see other answers below

how many concurrent pods are running on this node?
Current status of the node is the following:

Containers: 125
 Running: 2
 Paused: 0
 Stopped: 123
Images: 378
Server Version: 1.12.6

how many of those pods are builds (it sounds like ~30 but are they spread across different nodes)?
They are spread across 2 compute nodes, so ideally each node should have about 15 build pods running at any one point. There is nothing else in the projects, just build configs.


what are the resource requirements for the build pods in this scenario?
I am attaching "oc describe node" and "oc describe build pod" for this info.

did this scenario work in earlier versions of the product? 
We have run 50 concurrent builds on 2 compute nodes before and never seen this particular error. We have seen more failures at 50 concurrent builds on 2 nodes, but 30 builds have been stable.

we did not appear to set --system-reserved values for this node.  why?


$ oc get --raw /api/v1/nodes/{node-name}/proxy/metrics
Please see output attached.

Comment 4 Vikas Laad 2017-06-01 14:30:57 UTC
Created attachment 1284161 [details]
describe build pod

Comment 5 Vikas Laad 2017-06-01 14:31:18 UTC
Created attachment 1284163 [details]
describe node

Comment 6 Vikas Laad 2017-06-01 14:31:55 UTC
Created attachment 1284164 [details]
oc get raw output

Comment 7 Mike Fiedler 2017-06-01 14:34:27 UTC
This is an openshift-ansible install.  If --system-reserved is not set, it is because the install did not set it.   @sdodson - should I open a bz for that?

30 concurrent builds is a workload that was successful in previous releases. On an AWS EC2 m4.xlarge (4 vCPU/16 GB RAM) instance, 50 concurrent builds was the "normal" outer limit in previous releases (3.5, 3.4, etc).

Comment 8 Scott Dodson 2017-06-01 15:16:56 UTC
Mike,

Yeah, let's open a bug to add system reserved resources. It's come up in the past as a good idea but we've never done it. We'd need guidance on what to default those values to; if that's documented somewhere already, let us know.

Comment 9 Derek Carr 2017-06-01 22:55:30 UTC
Thanks for the data, I will investigate further.

From the metrics, I see the following:

kubelet_docker_operations{operation_type="create_container"} 5336
kubelet_docker_operations{operation_type="info"} 1283
kubelet_docker_operations{operation_type="inspect_container"} 55139
kubelet_docker_operations{operation_type="inspect_image"} 16749
kubelet_docker_operations{operation_type="list_containers"} 343141
kubelet_docker_operations{operation_type="list_images"} 20207
kubelet_docker_operations{operation_type="pull_image"} 7
kubelet_docker_operations{operation_type="remove_container"} 4600
kubelet_docker_operations{operation_type="remove_image"} 848
kubelet_docker_operations{operation_type="start_container"} 3090
kubelet_docker_operations{operation_type="stop_container"} 5858
kubelet_docker_operations{operation_type="version"} 51625
# HELP kubelet_docker_operations_errors Cumulative number of Docker operation errors by operation type.
# TYPE kubelet_docker_operations_errors counter
kubelet_docker_operations_errors{operation_type="create_container"} 2246
kubelet_docker_operations_errors{operation_type="inspect_container"} 177
kubelet_docker_operations_errors{operation_type="inspect_image"} 7
kubelet_docker_operations_errors{operation_type="list_containers"} 23975
kubelet_docker_operations_errors{operation_type="list_images"} 3222
kubelet_docker_operations_errors{operation_type="remove_container"} 432
kubelet_docker_operations_errors{operation_type="remove_image"} 232
kubelet_docker_operations_errors{operation_type="start_container"} 1
kubelet_docker_operations_errors{operation_type="stop_container"} 48
kubelet_docker_operations_errors{operation_type="version"} 7076

It looks like roughly 40% (2,246 of 5,336) of our attempts to create a container resulted in an error, followed by a fair number of errors when listing containers and listing images. I suspect most of these were similar deadline-exceeded errors or their follow-on effects.

I need to research whether the move to the CRI, versus running without it, would have had a major impact here on the underlying docker interaction.
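
One possible A/B test for that question (hedged; this assumes the kubelet in this build still honors the 1.6-era flag, which on OpenShift would be passed through kubeletArguments in node-config.yaml) would be to rerun the same 30-build workload with the docker CRI integration turned off and compare the create_container latencies:

# hedged sketch: fall back to the pre-CRI docker integration for a comparison run
--enable-cri=false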

Comment 10 Jeremy Eder 2017-06-01 23:54:49 UTC
Possible variant of https://bugzilla.redhat.com/show_bug.cgi?id=1451902 ?

Comment 11 Derek Carr 2017-06-02 14:52:54 UTC
I agree this is a variant of the same underlying problem referenced in 1451902.

Comment 12 Derek Carr 2017-06-07 18:13:31 UTC
Following the life of the pod: cakephp-mysql-example-26-build_proj11

May 30 13:10:55 ip-172-31-5-23 atomic-openshift-node: I0530 13:10:55.236933   15333 kubelet.go:1813] SyncLoop (ADD, "api"): "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)"
May 30 13:10:55 ip-172-31-5-23 atomic-openshift-node: I0530 13:10:55.566783   15333 kuberuntime_manager.go:370] No sandbox for pod "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)" can be found. Need to start a new one
May 30 13:12:56 ip-172-31-5-23 atomic-openshift-node: E0530 13:12:56.524357   15333 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)" failed: rpc error: code = 2 desc = failed to create a sandbox for pod "cakephp-mysql-example-26-build": operation timeout: context deadline exceeded

The first attempt to create the pod sandbox took longer than 2m to respond.  This deadline is imposed by the CRI, so any attempt in the dockershim to recover from a conflicting name will not be handled.  The theory here is that the initial create of the sandbox container simply took longer than 2m and may or may not have succeeded; note from the log that the attempt to create the container took more than 2m.

For the pod in question, we try for many minutes to create the pod sandbox (for example, 3 minutes later)

May 30 13:15:19 ip-172-31-5-23 atomic-openshift-node: E0530 13:15:19.110654   15333 pod_workers.go:182] Error syncing pod f0f7f873-455a-11e7-9b8e-02ca84fd3e44 ("cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)"), skipping: failed to "CreatePodSandbox" for "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)" with CreatePodSandboxError: "CreatePodSandbox for pod \"cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)\" failed: rpc error: code = 2 desc = failed to create a sandbox for pod \"cakephp-mysql-example-26-build\": operation timeout: context deadline exceeded"

At each attempt, the operation is timing out after 2m:

May 30 13:15:44 ip-172-31-5-23 dockerd-current: time="2017-05-30T13:15:44.727273103-04:00" level=error msg="Handler for POST /v1.24/containers/create?name=k8s_POD_cakephp-mysql-example-26-build_proj11_f0f7f873-4
55a-11e7-9b8e-02ca84fd3e44_1 returned error: Conflict. The name \"/k8s_POD_cakephp-mysql-example-26-build_proj11_f0f7f873-455a-11e7-9b8e-02ca84fd3e44_1\" is already in use by container 98b847320b96651007b7be98ab
b4104bacf9334c2a28057b822f7f0431c8717b. You have to remove (or rename) that container to be able to reuse that name."
May 30 13:15:44 ip-172-31-5-23 dockerd-current: time="2017-05-30T13:15:44.727322814-04:00" level=error msg="Handler for POST /v1.24/containers/create returned error: Conflict. The name \"/k8s_POD_cakephp-mysql-e
xample-26-build_proj11_f0f7f873-455a-11e7-9b8e-02ca84fd3e44_1\" is already in use by container 98b847320b96651007b7be98abb4104bacf9334c2a28057b822f7f0431c8717b. You have to remove (or rename) that container to b
e able to reuse that name."
May 30 13:15:44 ip-172-31-5-23 atomic-openshift-node: W0530 13:15:44.741984   15333 docker_sandbox.go:263] NetworkPlugin cni failed on the status hook for pod "cakephp-mysql-example-26-build_proj11": Unexpected command output nsenter: cannot open : No such file or directory
May 30 13:15:44 ip-172-31-5-23 atomic-openshift-node: W0530 13:15:44.755155   15333 docker_sandbox.go:263] NetworkPlugin cni failed on the status hook for pod "cakephp-mysql-example-26-build_proj11": Unexpected command output nsenter: cannot open : No such file or directory

Later, the docker daemon finally responds that a container with the specified name already exists (presumably it was actually created in one of the many prior attempts that exceeded 2m), but the kubelet has no way to know whether the sandbox had been created earlier, since every attempt simply timed out.

Approximately 30s later, the initial sandbox container died.

May 30 13:16:16 ip-172-31-5-23 atomic-openshift-node: I0530 13:16:16.002501   15333 kubelet.go:1847] SyncLoop (PLEG): "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)", event: &pleg.PodLifecycleEvent{ID:"f0f7f873-455a-11e7-9b8e-02ca84fd3e44", Type:"ContainerDied", Data:"98b847320b96651007b7be98abb4104bacf9334c2a28057b822f7f0431c8717b"}

And the kubelet again tries to create a new sandbox:

May 30 13:16:53 ip-172-31-5-23 atomic-openshift-node: E0530 13:16:53.658582   15333 remote_runtime.go:86] RunPodSandbox from runtime service failed: rpc error: code = 2 desc = failed to create a sandbox for pod "cakephp-mysql-example-26-build": Error response from daemon: {"message":"Conflict. The name \"/k8s_POD_cakephp-mysql-example-26-build_proj11_f0f7f873-455a-11e7-9b8e-02ca84fd3e44_1\" is already in use by container 98b847320b96651007b7be98abb4104bacf9334c2a28057b822f7f0431c8717b. You have to remove (or rename) that container to be able to reuse that name."}
May 30 13:16:53 ip-172-31-5-23 atomic-openshift-node: E0530 13:16:53.658655   15333 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)" failed: rpc error: code = 2 desc = failed to create a sandbox for pod "cakephp-mysql-example-26-build": Error response from daemon: {"message":"Conflict. The name \"/k8s_POD_cakephp-mysql-example-26-build_proj11_f0f7f873-455a-11e7-9b8e-02ca84fd3e44_1\" is already in use by container 98b847320b96651007b7be98abb4104bacf9334c2a28057b822f7f0431c8717b. You have to remove (or rename) that container to be able to reuse that name."}


After many more minutes, the docker daemon is still overwhelmed, and calls to create a container are still taking more than 2m to respond.

May 30 13:22:37 ip-172-31-5-23 atomic-openshift-node: E0530 13:22:37.826214   15333 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)" failed: rpc error: code = 2 desc = failed to create a sandbox for pod "cakephp-mysql-example-26-build": operation timeout: context deadline exceeded
May 30 13:22:37 ip-172-31-5-23 atomic-openshift-node: E0530 13:22:37.826236   15333 kuberuntime_manager.go:619] createPodSandbox for pod "cakephp-mysql-example-26-build_proj11(f0f7f873-455a-11e7-9b8e-02ca84fd3e44)" failed: rpc error: code = 2 desc = failed to create a sandbox for pod "cakephp-mysql-example-26-build": operation timeout: context deadline exceeded

I think it's safe to say that this node was not healthy; the kubelet was trying to do what it could given the environment.

We need to know the root cause of the 2m timeout; as a result, I am duping this onto the referenced bug.

*** This bug has been marked as a duplicate of bug 1451902 ***

Comment 13 Derek Carr 2017-06-07 18:16:44 UTC
On second thought, I think this is slightly different from 1451902 in that the core timeout is happening on the CreateContainer operation (taking more than 2 minutes, repeatedly, over many minutes), which is different from the PLEG calling list containers. We need to understand why a create container call would take more than 2m, which is the timeout on the kubelet -> CRI shim interaction.
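
For context, the 2m deadline on each kubelet -> CRI call matches the kubelet's runtime request timeout, which defaults to 2m0s. Raising it would only mask a slow daemon rather than fix it, but it is the knob involved (a sketch; on OpenShift 3.x this would be passed through kubeletArguments in node-config.yaml):

# hedged sketch: lengthen the kubelet's per-request deadline to the container runtime
--runtime-request-timeout=10m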

Comment 14 Jhon Honce 2017-06-14 15:49:57 UTC
Was the forced delete option on for the docker daemon?

Comment 15 Vikas Laad 2017-06-27 17:19:41 UTC
This was a typical openshift-ansible install; there were no changes made to docker settings after the install.

How do I check for the force delete option?

Comment 16 Jhon Honce 2017-07-17 23:27:29 UTC
--storage-opt dm.use_deferred_deletion=true
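
For anyone checking this on a node: the devicemapper settings are reported by docker info, and on RHEL the storage options are typically carried in the storage sysconfig file (a sketch; paths and values assume the stock RHEL docker packaging and are illustrative only):

# check whether deferred removal/deletion are enabled for the devicemapper driver
$ docker info | grep -i deferred

# typically set via /etc/sysconfig/docker-storage, e.g.:
DOCKER_STORAGE_OPTIONS="--storage-opt dm.use_deferred_removal=true --storage-opt dm.use_deferred_deletion=true"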

