Bug 1292845

Summary: Cleaning up docker space
Product: OpenShift Container Platform Reporter: Miheer Salunke <misalunk>
Component: Containers    Assignee: Michal Fojtik <mfojtik>
Status: CLOSED NOTABUG QA Contact: weiwei jiang <wjiang>
Severity: low Docs Contact:
Priority: low    
Version: 3.1.0    CC: aos-bugs, bchilds, dmcphers, dmoessne, eminguez, erich, erjones, fgrosjea, geliu, haowang, jfoots, jokerman, jolee, jsafrane, knakayam, mfojtik, miminar, misalunk, mmccomas, mturansk, nschuetz, rhowe, vgoyal, wmeng
Target Milestone: ---    Keywords: Performance
Target Release: 3.7.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-11-12 21:22:19 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1292964    
Bug Blocks:    

Description Miheer Salunke 2015-12-18 13:44:48 UTC
Description of problem:

On busy nodes the docker storage (the docker-vg volume group) constantly fills up, and even the cleanup playbook ( https://github.com/openshift/openshift-ansible/blob/master/playbooks/adhoc/docker_storage_cleanup/docker_storage_cleanup.yml ) does not fix it.

[root@ose-node2 ~]# lvs
  LV          VG             Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool docker-vg      twi-aot--- 99.43g             92.35  6.18                            
  root        rhel_ose-node2 -wi-ao---- 17.47g                                                    
  swap        rhel_ose-node2 -wi-a-----  2.00g                                 

We tried evicting the node:

oadm manage-node --schedulable=false --selector='env=ewu,kubernetes.io/hostname=ose-node2.example.com,router=ewu'

oadm manage-node --evacuate --selector='env=ewu,kubernetes.io/hostname=ose-node2.example.com,router=ewu'

We tried the playbook's commands, which reclaim only minimal space, at most a few GB:

"docker ps -a | awk '/Exited|Dead/ {print $1}' | xargs --no-run-if-empty docker rm"
"docker images -q -f dangling=true | xargs --no-run-if-empty docker rmi"
"docker images | grep -v -e registry.access.redhat.com -e docker-registry.usersys.redhat.com -e docker-registry.ops.rhcloud.com | awk '{print $3}' | xargs --no-run-if-empty docker rmi 2>/dev/null"

We tried rebooting afterwards.

Docker images show about 4GB of images:

[root@ose-node2 ~]# docker images -a
REPOSITORY                                                        TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
registry.access.redhat.com/openshift3/ose-docker-builder          v3.1.0.4            90857752fab7        4 weeks ago         395.3 MB
registry.access.redhat.com/openshift3/ose-sti-builder             v3.1.0.4            27fb9b206f9c        4 weeks ago         395.3 MB
registry.access.redhat.com/openshift3/ose-haproxy-router          v3.1.0.4            81c50204da05        4 weeks ago         410.2 MB
registry.access.redhat.com/openshift3/ose-deployer                v3.1.0.4            f8f3ccba9dd4        4 weeks ago         395.3 MB
<none>                                                            <none>              9380de59a89c        4 weeks ago         284.4 MB
registry.access.redhat.com/openshift3/ose-keepalived-ipfailover   v3.1.0.4            bb53dc265861        4 weeks ago         291.7 MB
<none>                                                            <none>              2eb816e9a7d0        4 weeks ago         395.3 MB
registry.access.redhat.com/openshift3/ose-pod                     v3.1.0.4            092ca40663d5        4 weeks ago         327.4 MB
<none>                                                            <none>              f38f7d43a5da        4 weeks ago         270.4 MB
<none>                                                            <none>              6883d5422f4e        4 weeks ago         201.7 MB

Still about 98GB of space is used:

[root@ose-node2 ~]# docker info
Containers: 9
Images: 10
Storage Driver: devicemapper
 Pool Name: docker--vg-docker--pool
 Pool Blocksize: 524.3 kB
 Backing Filesystem: xfs
 Data file: 
 Metadata file: 
 Data Space Used: 98.6 GB
 Data Space Total: 106.8 GB
 Data Space Available: 8.168 GB
 Metadata Space Used: 6.742 MB
 Metadata Space Total: 109.1 MB
 Metadata Space Available: 102.3 MB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Library Version: 1.02.107-RHEL7 (2015-10-14)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.10.0-327.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.2 (Maipo)
CPUs: 2
Total Memory: 7.64 GiB
Name: ose-node2.ewu.tb.noris.de
ID: FTHB:UJOW:SUYJ:GXFC:4INH:M5AR:27HB:6HJJ:GKF4:XPR3:HI5Q:LSBM
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled


Version-Release number of selected component (if applicable):
3.1

How reproducible:
Reproducible in the customer's environment

Steps to Reproduce:
1. See the description above.

Actual results:
Deleting images does not free up the corresponding space in the docker thin pool.

Expected results:
Deleting images should free up the corresponding space in the docker thin pool.

Additional info:

Comment 1 Vivek Goyal 2015-12-21 19:43:23 UTC
What about containers? Are there any containers on the system? (docker ps -a)

Comment 2 Alexander Koksharov 2015-12-22 07:53:17 UTC
The containers were deleted. The customer ran "docker rm $(docker ps -a -q)" to delete them all.

Comment 3 Vivek Goyal 2015-12-22 12:59:58 UTC
The docker info output reports 9 containers:

[root@ose-node2 ~]# docker info
Containers: 9
Images: 10

It would be a good idea to check the output of 'docker ps' with the customer again and make sure all containers have been deleted.

Comment 7 Wang Haoran 2016-01-27 01:59:23 UTC
Maybe this will be helpful, the garbage collection documentation for OSE:
https://docs.openshift.com/enterprise/3.1/admin_guide/garbage_collection.html
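
As a rough illustration of what that guide configures (the values, node-config path and service name below are assumptions based on OSE 3.x defaults, not taken from this bug), the kubelet GC thresholds go into node-config.yaml, for example:

    # Illustrative only -- merge these keys into an existing kubeletArguments
    # stanza instead of blindly appending if one is already present.
    cat >> /etc/origin/node/node-config.yaml <<'EOF'
    kubeletArguments:
      image-gc-high-threshold:
      - "80"
      image-gc-low-threshold:
      - "60"
      maximum-dead-containers:
      - "20"
      minimum-container-ttl-duration:
      - "10s"
    EOF
    systemctl restart atomic-openshift-node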

Comment 8 Michal Minar 2016-01-27 09:59:04 UTC
Can anybody explain how deleting volumes on the rootfs can reclaim space on the thin pool? Sure, we can add periodic removal of orphaned volumes to the garbage collector and remove containers with `-v`, but I don't think it solves the original issue.

I'll try to reproduce on rhel.
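
As a sketch of the orphaned-volume cleanup mentioned above (assuming the docker >= 1.9 volume API; note this frees space on the rootfs, not on the thin pool):

    # Remove exited/dead containers together with their anonymous volumes
    docker ps -a | awk '/Exited|Dead/ {print $1}' | xargs --no-run-if-empty docker rm -v
    # Remove volumes no longer referenced by any container
    docker volume ls -qf dangling=true | xargs --no-run-if-empty docker volume rm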

Comment 9 Michal Minar 2016-01-27 13:57:50 UTC
With docker-1.9.1-10.el7 using thin-pool storage, I created around 400 containers with volumes. Deleting the containers freed some of the space. There were 165 orphaned volumes left; deleting them freed space on the rootfs without any impact on the thin pool. After deleting all images, the free space on the thin pool was back to its original value.
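
A rough outline of that reproduction (the image name, mount path and exact count are placeholders, not necessarily what I used):

    # Create many containers, each with an anonymous volume
    for i in $(seq 1 400); do
      docker run -d -v /data --name "voltest-$i" rhel7 sleep 3600
    done
    docker rm -f $(docker ps -aq)       # containers gone, anonymous volumes left orphaned on the rootfs
    docker rmi $(docker images -q)      # images gone; thin pool usage returns to its baseline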

Shortened Docker info:

    $ docker info
    Containers: 0
    Images: 0
    Server Version: 1.9.1-el7
    Storage Driver: devicemapper
     Pool Name: vgdocker-tp
     Pool Blocksize: 65.54 kB
     Base Device Size: 107.4 GB
     Backing Filesystem: xfs
     Data file: 
     Metadata file: 
     Data Space Used: 53.74 MB
     Data Space Total: 32.14 GB
     Data Space Available: 32.09 GB
     Metadata Space Used: 188.4 kB
     Metadata Space Total: 33.55 MB
     Metadata Space Available: 33.37 MB
     Udev Sync Supported: true
     Deferred Removal Enabled: false
     Deferred Deletion Enabled: false
     Deferred Deleted Device Count: 0
     Library Version: 1.02.107-RHEL7 (2015-10-14)
    Execution Driver: native-0.2
    Logging Driver: json-file
    Kernel Version: 3.10.0-327.el7.x86_64
    Operating System: Red Hat Enterprise Linux Server 7.3 Beta (Maipo)

I'm not sure what needs to be fixed here. Shall we investigate further why devicemapper does not free the space after images are removed (which I failed to reproduce on the first try with just Docker), or shall we make sure that all orphaned volumes are deleted?

Miheer or Alexander, any thoughts?

Comment 10 Vivek Goyal 2016-01-27 16:11:42 UTC
Michal, agreed that volumes live on the rootfs and removing volumes does not free space in the thin pool.

I think in the example above there were still some containers on the system, which implies there must have been some images. I suspect those containers might have written a lot of data of their own and might be consuming significant space in the thin pool.
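
One quick way to check that theory (standard docker CLI, nothing specific to this bug): the --size flag reports each container's writable-layer usage, which is what lives in the thin pool:

    docker ps -a --size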

Comment 18 Michal Fojtik 2016-10-27 07:54:31 UTC
Setting this for an upcoming release, as there were no significant improvements to pruning in 3.4; however, you can now use scheduled jobs (alpha) to automatically prune images.
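
As a very rough sketch of such a scheduled prune job (the API group/kind changed between releases during the alpha, ScheduledJob vs. CronJob, and the "pruner" service account, its permissions, and token wiring are all assumptions, not part of this bug):

    oc create -f - <<'EOF'
    apiVersion: batch/v2alpha1
    kind: CronJob
    metadata:
      name: prune-images
    spec:
      schedule: "0 1 * * *"
      jobTemplate:
        spec:
          template:
            spec:
              serviceAccountName: pruner
              restartPolicy: Never
              containers:
              - name: prune-images
                image: registry.access.redhat.com/openshift3/ose
                # auth against the API/registry is omitted here; in practice the
                # job needs a token bound to the image-pruner role
                command: ["oc", "adm", "prune", "images", "--confirm", "--keep-tag-revisions=3"]
    EOF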

Comment 21 Michal Fojtik 2017-04-04 10:36:25 UTC
Documentation PR: https://github.com/openshift/origin/pull/11317

Comment 22 ge liu 2017-04-17 07:47:44 UTC
@mfojtik, this bug's original target was the docker volume space reclamation issue, while the latest fix is about cron jobs to auto-prune images. Could you give more clues about this issue and how to verify it? Many thanks in advance!

Comment 28 Ryan Howe 2018-11-12 21:22:19 UTC
Closing Not a bug. 

Documentation for OpenShift shows how to configure garbage collection. 

  https://docs.openshift.com/container-platform/3.7/admin_guide/garbage_collection.html



Docker storage is meant to be ephemeral; one can manually wipe and recreate the docker storage, or delete all images and containers with docker commands.

Delete all containers
# docker rm -f $(docker ps -aq)
Delete all images
# docker rmi $(docker images -q)
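
For the "wipe and recreate" path, a rough sketch assuming an LVM thin pool managed by docker-storage-setup (the VG/LV names are taken from the lvs output earlier in this bug; this destroys all local images, containers and volumes):

    systemctl stop docker
    rm -rf /var/lib/docker/*            # all local images, containers and volumes are lost
    lvremove -y docker-vg/docker-pool   # drop the thin pool so it can be recreated
    docker-storage-setup                # recreate it from /etc/sysconfig/docker-storage-setup
    systemctl start docker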

Comment 31 Red Hat Bugzilla 2023-09-14 23:58:48 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days