Bug 1437121

Summary: Builds stuck in Pending state for hours
Product: OpenShift Container Platform
Reporter: Vikas Laad <vlaad>
Component: Build
Assignee: Ben Parees <bparees>
Status: CLOSED ERRATA
QA Contact: Vikas Laad <vlaad>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 3.5.0
CC: aos-bugs, bparees, jokerman, mmccomas, vgoyal, vlaad, wsun, xtian
Target Milestone: ---
Target Release: 3.5.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Containers were being force killed by the build process.
Consequence: Some container resources were not freed when containers were force killed, ultimately resulting in resource exhaustion and the inability to run new builds.
Fix: Containers are now killed gracefully instead of being force killed, allowing them to clean up their resources.
Result: Resource exhaustion no longer occurs, and builds continue to run normally.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-06-15 18:37:06 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1442875
Bug Blocks:
Attachments:
  desc build (flags: none)
  desc pod (flags: none)

Description Vikas Laad 2017-03-29 14:29:09 UTC
Description of problem:
I have a concurrent-builds test running that creates thousands of builds. While these builds were running, docker started running out of disk space on the worker nodes. Some builds were stuck in Pending, and even after the container failed to create, the builds remained stuck in Pending. If I start a new build it works, but these builds are stuck in Pending forever.

root@ip-172-31-25-76: ~ # oc get build --all-namespaces |grep Pending

NAMESPACE   NAME                        TYPE      FROM          STATUS                               STARTED         DURATION
proj1       cakephp-mysql-example-195   Source    Git           Pending                                              
proj12      cakephp-mysql-example-199   Source    Git           Pending                                              
proj14      cakephp-mysql-example-181   Source    Git           Pending                                              
proj25      cakephp-mysql-example-204   Source    Git           Pending                                              
proj32      cakephp-mysql-example-195   Source    Git           Pending                                              
proj37      cakephp-mysql-example-192   Source    Git           Pending                                              
proj39      cakephp-mysql-example-178   Source    Git           Pending                                              
proj43      cakephp-mysql-example-210   Source    Git           Pending                                              

See the attached logs for the describe output of the build and pod.

root@ip-172-31-25-76: ~ # oc get pods -n proj1 | grep 195
cakephp-mysql-example-195-build   0/1       RunContainerError    0          8h


Version-Release number of selected component (if applicable):
openshift v3.5.3
kubernetes v1.5.2+43a9be4
etcd 3.1.0

Steps to Reproduce:
1. Create a cluster with 1 master, 1 etcd, 1 infra and 2 worker nodes (m4.xlarge)
2. Start concurrent builds in batches of 25 (see the sketch below)
3. After around 10K builds I ran into this problem.
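
A minimal sketch of step 2, assuming each project (proj1, proj2, ...) already contains the cakephp-mysql-example BuildConfig; the project names, project count, and pause are illustrative:

  #!/bin/bash
  # Trigger builds in batches of 25. 'oc start-build' only kicks off a build,
  # so the real test harness also polls build status between batches.
  for batch in $(seq 1 2); do                      # number of batches is illustrative
    for i in $(seq 1 25); do
      proj="proj$(( (batch - 1) * 25 + i ))"
      oc start-build cakephp-mysql-example -n "$proj"
    done
    # Hypothetical pause between batches; the actual test waits on build status.
    sleep 300
  done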

Actual results:
Builds stuck in Pending

Expected results:
Build should fail since container creation failed

Additional info:

Comment 1 Vikas Laad 2017-03-29 14:30:17 UTC
Created attachment 1267310 [details]
desc build

Comment 2 Vikas Laad 2017-03-29 14:30:37 UTC
Created attachment 1267311 [details]
desc pod

Comment 3 Ben Parees 2017-03-30 13:26:10 UTC
This seems like a scheduler problem: the pod is stuck in Pending and never got scheduled/rescheduled after the initial failure to start the container.

Either the pod should be marked Failed, or it should eventually run. The build can't do much about it if the pod sits in Pending indefinitely.

Comment 4 Derek Carr 2017-03-30 20:49:56 UTC
The describe output for the pod in question:

Error syncing pod, skipping: failed to "StartContainer" for "sti-build" with RunContainerError:
 "runContainer: Error response from daemon: {\"message\":\"devmapper: Thin Pool has 6075 free data blocks which is less than minimum required 6083 free data blocks. Create more free space in thin pool or use dm.min_free_space option to change behavior\ [output truncated]


Was this error actually fixed?  It looks like you are still consuming too much disk space.
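
For reference, a quick way to check how close the thin pool is to that devmapper cutoff (a sketch, assuming the docker_vg/docker-pool layout shown later in this bug; the 10% value is only an example):

  # Thin pool data/metadata usage straight from LVM
  lvs docker_vg/docker-pool -o lv_name,data_percent,metadata_percent

  # docker info also reports the thin pool numbers, including
  # "Thin Pool Minimum Free Space"
  docker info | grep 'Space'

  # dm.min_free_space is a devicemapper storage option; on RHEL it is typically
  # set via DOCKER_STORAGE_OPTIONS in /etc/sysconfig/docker-storage, e.g.
  #   DOCKER_STORAGE_OPTIONS="--storage-opt dm.min_free_space=10%"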

Comment 5 Seth Jennings 2017-03-30 21:37:03 UTC
The reason the thin pool is full is that docker is storing a lot of dangling layers, i.e. thin devices that don't correlate to any container or image.

docker containers:
http://pastebin.test.redhat.com/469849

there were only 2 images, even with docker images -a

docker layers:
http://pastebin.test.redhat.com/469960

yet there are almost 700 thin devices (layers)

Could be the result of failed builds.

docker is not cleaning these up, and once they are orphaned like this, I'm not sure how to clean them up.

Moving to Containers for more investigation.
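
A rough way to see the mismatch described above (a sketch; it compares counts only, since the devicemapper device names are internal layer IDs and do not map one-to-one to container or image IDs):

  # Thin devices tracked by the devicemapper graph driver
  ls /var/lib/docker/devicemapper/metadata | wc -l

  # Containers and images docker itself knows about
  docker ps -aq | wc -l
  docker images -aq | wc -l

  # Each image accounts for several layers, so only a large gap (e.g. ~700
  # devices vs. 2 images here) points at orphaned thin devices.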

Comment 6 Vikas Laad 2017-03-31 14:22:26 UTC
I deleted all the projects; here are the stats after that:

docker ps -a 
root@ip-172-31-8-87: /var/lib/docker/devicemapper/mnt # docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES


root@ip-172-31-8-87: /var/lib/docker/devicemapper/mnt # ls -l | wc -l
652

root@ip-172-31-8-87: /var/lib/docker/devicemapper/mnt # ~/get-thin-ls.sh | wc -l
652

root@ip-172-31-8-87: /var/lib/docker/devicemapper/mnt # lvs
  LV          VG        Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool docker_vg twi-a-t--- 29.70g             86.61  53.91                           
root@ip-172-31-8-87: /var/lib/docker/devicemapper/mnt # vgs
  VG        #PV #LV #SN Attr   VSize  VFree  
  docker_vg   1   1   0 wz--n- 30.00g 236.00m

Especially after deleting the projects, the space should have been reclaimed. Please let me know if anything else is needed; I still have the env.

Comment 7 Jhon Honce 2017-04-04 14:45:21 UTC
Please attach the output of "docker info". We are looking for dead containers.

Comment 8 Vikas Laad 2017-04-04 15:18:35 UTC
Docker info on both worker nodes:

root@ip-172-31-8-87: /var/lib/docker/devicemapper/mnt # docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 1.12.6
Storage Driver: devicemapper
 Pool Name: docker_vg-docker--pool
 Pool Blocksize: 524.3 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file: 
 Metadata file: 
 Data Space Used: 27.62 GB
 Data Space Total: 31.89 GB
 Data Space Available: 4.27 GB
 Metadata Space Used: 18.09 MB
 Metadata Space Total: 33.55 MB
 Metadata Space Available: 15.47 MB
 Thin Pool Minimum Free Space: 3.189 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Library Version: 1.02.135-RHEL7 (2016-11-16)
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
 Volume: local
 Network: overlay host bridge null
 Authorization: rhel-push-plugin
Swarm: inactive
Runtimes: docker-runc runc
Default Runtime: docker-runc
Security Options: seccomp selinux
Kernel Version: 3.10.0-514.10.2.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.3 (Maipo)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 4
Total Memory: 15.26 GiB
Name: ip-172-31-8-87.us-west-2.compute.internal
ID: VHAL:ZTY2:A33U:LPS5:F437:E7LG:MLOZ:YLIW:AG5G:XPNJ:OQOR:PICC
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://registry.ops.openshift.com/v1/
Insecure Registries:
 registry.ops.openshift.com
 127.0.0.0/8
Registries: registry.ops.openshift.com (insecure), registry.access.redhat.com (secure), docker.io (secure)



root@ip-172-31-34-20: ~ # docker info
Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 2
Server Version: 1.12.6
Storage Driver: devicemapper
 Pool Name: docker_vg-docker--pool
 Pool Blocksize: 524.3 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file: 
 Metadata file: 
 Data Space Used: 26.36 GB
 Data Space Total: 31.89 GB
 Data Space Available: 5.536 GB
 Metadata Space Used: 14.86 MB
 Metadata Space Total: 33.55 MB
 Metadata Space Available: 18.69 MB
 Thin Pool Minimum Free Space: 3.189 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Library Version: 1.02.135-RHEL7 (2016-11-16)
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
 Volume: local
 Network: host bridge null overlay
 Authorization: rhel-push-plugin
Swarm: inactive
Runtimes: docker-runc runc
Default Runtime: docker-runc
Security Options: seccomp selinux
Kernel Version: 3.10.0-514.10.2.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.3 (Maipo)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 4
Total Memory: 15.26 GiB
Name: ip-172-31-34-20.us-west-2.compute.internal
ID: VHAL:ZTY2:A33U:LPS5:F437:E7LG:MLOZ:YLIW:AG5G:XPNJ:OQOR:PICC
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://registry.ops.openshift.com/v1/
Insecure Registries:
 registry.ops.openshift.com
 127.0.0.0/8
Registries: registry.ops.openshift.com (insecure), registry.access.redhat.com (secure), docker.io (secure)

Comment 9 Vivek Goyal 2017-04-04 19:28:17 UTC
Are you doing forced removal of containers? I suspect forced removal might be going on. That means the container will disappear from the "docker info" view, but it will still be present in the devicemapper graph driver and will keep the thin pool full.

If so, we need to stop doing forced removal of containers.
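
For illustration, the difference being described here expressed as docker CLI calls (a sketch only; the actual change was made in the s2i/build code, not in a shell script, and the container id is hypothetical):

  CONTAINER_ID=<build container id>

  # Forced removal: the container disappears from docker's view, but per the
  # analysis above the devicemapper graph driver can be left holding its thin
  # device, which keeps the thin pool full.
  docker rm -f "$CONTAINER_ID"

  # Graceful path: stop the container (SIGTERM, then SIGKILL after the
  # timeout), then remove it, giving docker a chance to release the thin
  # devices cleanly.
  docker stop -t 10 "$CONTAINER_ID"
  docker rm "$CONTAINER_ID"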

Comment 10 Vikas Laad 2017-04-04 20:16:18 UTC
Ben already checked containers are not force removed.

Comment 11 Ben Parees 2017-04-05 02:38:23 UTC
If this is the conversation I remember from IRC, I was asked whether we force delete pods, which we don't.

We do force delete containers, at least until the recent changes to s2i are propagated into origin (as dug up by Jhon).

I'll make a pull to bump s2i tomorrow.

If we think that's the only issue here, please feel free to assign the bug to me.

Comment 12 Ben Parees 2017-04-05 12:42:43 UTC
s2i bump here: 
https://github.com/openshift/origin/pull/13633

Comment 13 Vivek Goyal 2017-04-05 12:51:04 UTC
I believe this issue is happening due to forced container removal. Another way to verify this is to look in the /var/lib/docker/devicemapper/metadata/ directory; if you see a bunch of files with "-init" as a suffix in their name, you know these are container metadata files.

For example:

c26669fe1dd29a6b0268a63072edd22a588151e60c47ef97ba49b53350b01752-init

If that's the case, there is a chance these are leftovers from forced container removal, and that's what is consuming space in the thin pool.

Comment 14 Vikas Laad 2017-04-05 13:29:45 UTC
I see 429 *-init files on one node and 259 on another one.

e8ecd1344de5f381dcbfca75619fe3ff31189c0fd753c2ad086521f73e10ff93-init
e98efcd0e01611fc889e60b98f23f9dbbe9b36f58d6e5588adc13dec80861842-init
eb19b2cfd5e5655ebb89fc079de6094a8886bd5ab9e18e19c2856920106021e7-init
eb455a45a32dfb89e230b012c14966b8fb76e389c89122df9892ec404f9b7700-init
ebb80ecca6d87ecbe183231ed3e3c53eca38d9021f6c6df6491183c62168efb4-init
ebebae8d818a8d26191ea89c94886d49ee69e5caade69f2e52db8c541a7b905d-init
ec993c8e3225a023f806d106b00fc125bcb811928a06a4489993059e135ab047-init
ed03d3f07d20040fea4ceef770509b4109561b261eceff9877d2c268d7af64b3-init
ee1a34ffa5d7e2a600be014481e707288feb9c3fcde6c61105f288ebe3e14bdc-init
efd1639dc9a921873f174624e584987919d857bd069ea14517440869a1402a9b-init
efdc99a4d6e742a42be8c57ecbf60745ab2c7ffd48894612cd9bf282c988b54b-init
f06fd584d21b608b5cff06b701b826399fc34162c9369801578ed6b5bf6a72d9-init
f0c59543627c4e9e0be4a652b395d88e636459f50357807084acae9957cf2d46-init
f12ca63c98bcbf11fbfa569e7adf3f20f236ddca228ddb32f8ced09463362935-init
f4d2ffe51cbe09102be559c285f7131cbce6384cb6f2e855ccc08b83a2d730d2-init
f54bef1e9308f2211d7de0eb036f801792e2e1b902f063871168c10fd3e389d1-init
f638ddbea61ddcd46a36e61e46695228c56a02b5cdec6b5cc0814e9d247520cc-init
f8b295891d2ad650e06f8c3fc374ba0c919587878ed94cb23eaef031203a7b76-init
f8ce10e254918339c14e13cdccebe34ffef2e502574b8feb93b854040cd7a5b7-init
fb4a4c0c6c23fcec4e59b1d1babc900ddcee6f851bed78e1e10045e983e6d36a-init
fd16a5c26652e21242704fefabd7b6b05e0c24d82450683205a15893d1f75869-init
fef53ef677614303086b5ea7c30f58baf0a205c9a8fd85368057cafb4977a2de-init
ff31c0dfa6588e570ebf71e47d8d132d1aac6bf139985a5c3b05a39f2b80fcc4-init
ffaae4fe1dcd687dd32cd1e549ac8a7babfc202c30c6f66502ea9e4fe04062c4-init
ffd178798f92d49eab7ad5c0854af990b49f03afa52724223f1ccd583d9b6dd8-init

Comment 15 Vivek Goyal 2017-04-05 13:33:47 UTC
So this is a case of leftover containers after forced removal. Let's fix that, re-run, and see if the problem is still visible.

Comment 16 Ben Parees 2017-04-05 13:39:41 UTC
OK, assigning to me. The PR is already created; I'll move this to ON_QA for re-testing once it merges.

Comment 17 Vivek Goyal 2017-04-05 13:45:20 UTC
(In reply to Vikas Laad from comment #14)
> I see 429 *-init files on one node and 259 on another one.

Vikas, 

Remember, the presence of an -init file by itself does not mean we have a case of forced container removal. It just means that the file belongs to container metadata (and not image metadata).

If these containers are not visible in "docker ps -a" output, that means devicemapper has data about containers that are not visible to the docker upper layer. And that means that somebody did a forced removal of containers.
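
A quick way to apply this check on a node (a sketch; it compares counts only, since the metadata file names are internal devicemapper IDs rather than container IDs):

  # Container metadata entries held by devicemapper (one -init file per container layer)
  ls /var/lib/docker/devicemapper/metadata/*-init 2>/dev/null | wc -l

  # Containers docker actually knows about
  docker ps -aq | wc -l

  # If the first number is much larger than the second, devicemapper is holding
  # container layers for containers docker no longer tracks, which points at
  # forced removals.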

Comment 18 Vikas Laad 2017-04-05 13:54:16 UTC
(In reply to Vivek Goyal from comment #17)

Right, "docker ps -a" returns 2 containers total from both the nodes.

Comment 19 Wei Sun 2017-04-13 06:42:06 UTC
Hello Vikas, could you help verify this bug?

Comment 20 Vikas Laad 2017-04-13 14:38:20 UTC
Wei, I started my tests to verify this.

Comment 21 Vikas Laad 2017-04-17 14:06:27 UTC
I tested again on the following openshift version:
openshift v3.5.5.5
kubernetes v1.5.2+43a9be4
etcd 3.1.0

I ran lots of cakephp app builds; after some time the nodes started running out of disk space and builds started failing. Here are the completed and failed build counts:
root@ip-172-31-22-217: ~ # oc get build --all-namespaces | wc -l
2550
root@ip-172-31-22-217: ~ # oc get build --all-namespaces | grep Complete |wc -l                                                                               
1334
root@ip-172-31-22-217: ~ # oc get build --all-namespaces | grep Failed |wc -l                                                                                 
1216

Stats on one of the worker nodes:
root@ip-172-31-34-80: /var/lib/docker/devicemapper/metadata # ls -l *-init | wc -l
63
root@ip-172-31-34-80: /var/lib/docker/devicemapper/metadata # lvs
  LV          VG        Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool docker_vg twi-aot--- 29.70g             88.82  46.51                           
root@ip-172-31-34-80: /var/lib/docker/devicemapper/metadata # vgs
  VG        #PV #LV #SN Attr   VSize  VFree  
  docker_vg   1   1   0 wz--n- 30.00g 236.00m

After checking, I deleted all the projects, and now I do not see any running containers on the worker nodes. Here are the stats after deleting the projects:

root@ip-172-31-34-80: /var/lib/docker/devicemapper/metadata # lvs
  LV          VG        Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool docker_vg twi-a-t--- 29.70g             75.60  38.12                           
root@ip-172-31-34-80: /var/lib/docker/devicemapper/metadata # vgs
  VG        #PV #LV #SN Attr   VSize  VFree  
  docker_vg   1   1   0 wz--n- 30.00g 236.00m
root@ip-172-31-34-80: /var/lib/docker/devicemapper/metadata # ls -l *-init | wc -l
69

As we can see above, *-init files still exist in the docker directory, and the disk space has not been reclaimed by docker.

Ben, please let me know if I am running an old build or if there is something wrong.

Comment 22 Ben Parees 2017-04-17 14:24:12 UTC
The change to not force kill containers was put into origin (https://github.com/openshift/origin/pull/13633); it was not backported to 3.5.

Comment 23 Vikas Laad 2017-04-28 14:00:54 UTC
Verification of this bug is blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1442875

Comment 24 Vikas Laad 2017-05-11 18:24:46 UTC
I am able to run my tests with the following docker version:

root@ip-172-31-4-211: ~ # docker version
Client:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-common-1.12.6-16.el7.x86_64
 Go version:      go1.7.4
 Git commit:      3a094bd/1.12.6
 Built:           Tue Mar 21 13:30:59 2017
 OS/Arch:         linux/amd64

Server:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-common-1.12.6-16.el7.x86_64
 Go version:      go1.7.4
 Git commit:      3a094bd/1.12.6
 Built:           Tue Mar 21 13:30:59 2017
 OS/Arch:         linux/amd64

There were many build failures; here are the stats on both nodes. I see a smaller number of -init files now.
root@ip-172-31-17-182: /var/lib/docker/devicemapper/metadata # docker ps -a | wc -l
43
root@ip-172-31-17-182: /var/lib/docker/devicemapper/metadata # ls *-init | wc -l   
52

Another node
root@ip-172-31-43-54: /var/lib/docker/devicemapper/metadata # docker ps -a | wc -l
55
root@ip-172-31-43-54: /var/lib/docker/devicemapper/metadata # ls *-init | wc -l
81

I am still running tests; I will close this after a couple of cycles.

Comment 26 errata-xmlrpc 2017-06-15 18:37:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1425