Description of problem:

On busy nodes the docker space (docker-vg) is constantly filling up, and even the cleanup playbook
(https://github.com/openshift/openshift-ansible/blob/master/playbooks/adhoc/docker_storage_cleanup/docker_storage_cleanup.yml)
does not fix it.

[root@ose-node2 ~]# lvs
  LV          VG             Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool docker-vg      twi-aot--- 99.43g             92.35  6.18
  root        rhel_ose-node2 -wi-ao---- 17.47g
  swap        rhel_ose-node2 -wi-a-----  2.00g

We tried evacuating the node:

oadm manage-node --schedulable=false --selector='env=ewu,kubernetes.io/hostname=ose-node2.example.com,router=ewu'
oadm manage-node --evacuate --selector='env=ewu,kubernetes.io/hostname=ose-node2.example.com,router=ewu'

We tried the playbook, which reclaims only minimal space, a few GB at most. It runs:

docker ps -a | awk '/Exited|Dead/ {print $1}' | xargs --no-run-if-empty docker rm
docker images -q -f dangling=true | xargs --no-run-if-empty docker rmi
docker images | grep -v -e registry.access.redhat.com -e docker-registry.usersys.redhat.com -e docker-registry.ops.rhcloud.com | awk '{print $3}' | xargs --no-run-if-empty docker rmi 2>/dev/null

We tried rebooting afterwards.

docker images shows about 4 GB of images:

[root@ose-node2 ~]# docker images -a
REPOSITORY                                                        TAG        IMAGE ID       CREATED       VIRTUAL SIZE
registry.access.redhat.com/openshift3/ose-docker-builder          v3.1.0.4   90857752fab7   4 weeks ago   395.3 MB
registry.access.redhat.com/openshift3/ose-sti-builder             v3.1.0.4   27fb9b206f9c   4 weeks ago   395.3 MB
registry.access.redhat.com/openshift3/ose-haproxy-router          v3.1.0.4   81c50204da05   4 weeks ago   410.2 MB
registry.access.redhat.com/openshift3/ose-deployer                v3.1.0.4   f8f3ccba9dd4   4 weeks ago   395.3 MB
<none>                                                            <none>     9380de59a89c   4 weeks ago   284.4 MB
registry.access.redhat.com/openshift3/ose-keepalived-ipfailover   v3.1.0.4   bb53dc265861   4 weeks ago   291.7 MB
<none>                                                            <none>     2eb816e9a7d0   4 weeks ago   395.3 MB
registry.access.redhat.com/openshift3/ose-pod                     v3.1.0.4   092ca40663d5   4 weeks ago   327.4 MB
<none>                                                            <none>     f38f7d43a5da   4 weeks ago   270.4 MB
<none>                                                            <none>     6883d5422f4e   4 weeks ago   201.7 MB

Still, about 98 GB of space is used:

[root@ose-node2 ~]# docker info
Containers: 9
Images: 10
Storage Driver: devicemapper
 Pool Name: docker--vg-docker--pool
 Pool Blocksize: 524.3 kB
 Backing Filesystem: xfs
 Data file:
 Metadata file:
 Data Space Used: 98.6 GB
 Data Space Total: 106.8 GB
 Data Space Available: 8.168 GB
 Metadata Space Used: 6.742 MB
 Metadata Space Total: 109.1 MB
 Metadata Space Available: 102.3 MB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Library Version: 1.02.107-RHEL7 (2015-10-14)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.10.0-327.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.2 (Maipo)
CPUs: 2
Total Memory: 7.64 GiB
Name: ose-node2.ewu.tb.noris.de
ID: FTHB:UJOW:SUYJ:GXFC:4INH:M5AR:27HB:6HJJ:GKF4:XPR3:HI5Q:LSBM
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Version-Release number of selected component (if applicable):
3.1

How reproducible:
Reproducible on customer end

Steps to Reproduce:
1. See the description above.

Actual results:
The space is not freed after deleting images.

Expected results:
The space should be freed after deleting images.

Additional info:
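For anyone debugging this, it may help to compare what the thin pool is tracking against what docker itself knows about. This is a rough sketch assuming the default devicemapper layout under /var/lib/docker; the file count in the metadata directory is only approximate (roughly one file per thin device, plus a couple of bookkeeping files):

# thin devices known to devicemapper
dmsetup ls | grep docker

# per-device metadata files docker keeps for the pool (approximate device count)
ls /var/lib/docker/devicemapper/metadata | wc -l

# objects docker itself knows about
docker ps -aq | wc -l
docker images -aq | wc -l

If the first two counts are far larger than containers plus images, the pool is holding on to orphaned devices.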
What about containers? Are there any containers in the system? (docker ps -a)
Containers were deleted. The customer ran "docker rm $(docker ps -a -q)" to delete them all.
The docker info output above says there are 9 containers:

[root@ose-node2 ~]# docker info
Containers: 9
Images: 10

It would be a good idea to check with the customer again the output of 'docker ps' and make sure all containers have been deleted.
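If it helps, docker ps -a can be narrowed to just the fields that matter here (--format needs docker >= 1.8; the template fields are standard ones):

docker ps -a --format '{{.ID}} {{.Status}} {{.Image}}'

Anything still in Created, Exited or Dead state will show up there even though it is not running.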
Maybe this will be helpful: the garbage collection documentation for OSE, https://docs.openshift.com/enterprise/3.1/admin_guide/garbage_collection.html
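For reference, on 3.1 the garbage collection described there is driven by kubelet arguments in /etc/origin/node/node-config.yaml. A minimal sketch; the threshold values below are made-up examples, not recommendations:

kubeletArguments:
  image-gc-high-threshold:
    - "80"
  image-gc-low-threshold:
    - "60"
  maximum-dead-containers:
    - "20"

The node service (atomic-openshift-node) has to be restarted for the change to take effect.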
Can anybody explain how deleting volumes on the rootfs could reclaim space in the thin pool? Sure, we can add periodic removal of orphaned volumes to the garbage collector and remove containers with `-v`, but I don't think that solves the original issue. I'll try to reproduce on RHEL.
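To make the distinction concrete: with the devicemapper driver, volume data lives as plain files on the rootfs, so freeing it cannot show up in the pool. A quick check, assuming the default layout:

# volume data is plain files on the rootfs
du -sh /var/lib/docker/volumes

# the thin pool only backs image and container layers
lvs docker-vg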
With docker-1.9.1-10.el7 and thinpool storage I created around 400 containers with volumes. Deleting the containers freed some of the space. There were 165 orphaned volumes left; deleting them freed space on the rootfs without any impact on the thin pool. After deleting all images, the free space on the thin pool was back to its original value.

Shortened docker info:

$ docker info
Containers: 0
Images: 0
Server Version: 1.9.1-el7
Storage Driver: devicemapper
 Pool Name: vgdocker-tp
 Pool Blocksize: 65.54 kB
 Base Device Size: 107.4 GB
 Backing Filesystem: xfs
 Data file:
 Metadata file:
 Data Space Used: 53.74 MB
 Data Space Total: 32.14 GB
 Data Space Available: 32.09 GB
 Metadata Space Used: 188.4 kB
 Metadata Space Total: 33.55 MB
 Metadata Space Available: 33.37 MB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Library Version: 1.02.107-RHEL7 (2015-10-14)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.10.0-327.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.3 Beta (Maipo)

I'm not sure what needs to be fixed here. Shall we investigate further whether devicemapper does not free the space after images are removed (which I failed to reproduce on the first try with plain Docker), or shall we make sure that all orphaned volumes are deleted? Miheer or Alexander, any thoughts?
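Roughly how I scripted the reproduction, in case someone wants to repeat it (busybox and the loop count are arbitrary stand-ins):

# create containers that each leave an anonymous volume behind
for i in $(seq 1 400); do
    docker run -d -v /data busybox true
done

# remove the containers; without -v the volumes stay orphaned on the rootfs
docker rm $(docker ps -aq)

# remove the orphaned volumes (the dangling filter should be available on docker >= 1.9)
docker volume ls -qf dangling=true | xargs --no-run-if-empty docker volume rm

docker info | grep 'Data Space Used'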
Michael, agreed that volumes are on the rootfs and removing volumes does not free space in the thin pool. I think in the example above there were still some containers in the system, which implies there must have been some images. I suspect that these containers might have written a lot of data of their own and might be consuming significant space in the thin pool.
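That theory should be easy to confirm on the customer node; the SIZE column of docker ps shows how much each container has written to its copy-on-write layer, which is exactly what consumes the thin pool:

# can be slow on devicemapper, but shows the per-container writable-layer size
docker ps -a -s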
Setting as upcoming release, as there were no significant improvements to pruning in 3.4; however, you can now use scheduled jobs (alpha) to automatically prune images.
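For anyone landing here: the pruning referred to is oadm prune images, which a scheduled job can wrap. A minimal invocation sketch; the values are examples, and it needs a user with image-pruning permissions and a reachable registry:

# dry run without --confirm; --confirm actually deletes
oadm prune images --keep-tag-revisions=3 --keep-younger-than=60m --confirm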
Documentation PR: https://github.com/openshift/origin/pull/11317
@mfojtik, this bug originally targets the docker volume space reclamation issue, but the latest fix is about cron jobs to auto-prune images. Could you give some more clues about this issue and how to verify it? Many thanks in advance!
Closing: not a bug. The OpenShift documentation shows how to configure garbage collection:
https://docs.openshift.com/container-platform/3.7/admin_guide/garbage_collection.html

Docker storage is meant to be ephemeral; one can manually wipe and recreate the docker storage, or delete all images and containers with docker commands.

Delete all containers:
# docker rm -f $(docker ps -aq)

Delete all images:
# docker rmi $(docker images -q)
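For completeness, "wipe and recreate" on a devicemapper node looks roughly like the sequence below. The VG/LV names are taken from this report, so double-check them (and the docker-storage-setup documentation) before running anything:

systemctl stop docker
rm -rf /var/lib/docker
lvremove -f docker-vg/docker-pool   # pool name from the lvs output above
docker-storage-setup                # recreates the pool from /etc/sysconfig/docker-storage-setup
systemctl start docker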
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days