Description of problem:
Followed https://docs.openshift.org/latest/install_config/install/prerequisites.html#configuring-docker-storage to set up a docker-pool volume as Docker's back-end storage. Here is my environment:

# cat /etc/sysconfig/docker-storage
DOCKER_STORAGE_OPTIONS=--storage-driver devicemapper --storage-opt dm.fs=xfs --storage-opt dm.thinpooldev=/dev/mapper/rhel72-docker--pool --storage-opt dm.use_deferred_removal=true

# ps -ef | grep docker
root 15538 1 0 Jan20 ? 00:08:01 /usr/bin/docker daemon --insecure-registry=172.31.0.0/16 --selinux-enabled --storage-driver devicemapper --storage-opt dm.fs=xfs --storage-opt dm.thinpooldev=/dev/mapper/rhel72-docker--pool --storage-opt dm.use_deferred_removal=true -b=lbr0 --mtu=1450 --add-registry rcm-img-docker01.build.eng.bos.redhat.com:5001 --add-registry registry.access.redhat.com --insecure-registry 0.0.0.0/0

# lvs
  LV          VG     Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool rhel72 twi-aot--- 17.44g             73.35  9.57
  root        rhel72 -wi-ao---- 10.00g
  swap        rhel72 -wi-ao----  2.00g

# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/rhel72-root   10G  2.7G  7.4G  27% /
devtmpfs                 1.9G     0  1.9G   0% /dev
tmpfs                    1.9G     0  1.9G   0% /dev/shm
tmpfs                    1.9G  191M  1.7G  11% /run
tmpfs                    1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/vda1                497M  214M  284M  43% /boot

If docker-registry is deployed with external storage (e.g. NFS), images pulled by the node land in the docker-pool volume, S2I builds push their images to the external storage, and the root partition is used less and less.

After setting up the environment, I followed https://docs.openshift.org/latest/admin_guide/garbage_collection.html to configure image garbage collection, then restarted the node service:

kubeletArguments:
  image-gc-high-threshold:
  - '20'
  image-gc-low-threshold:
  - '10'

The node log shows that ImageManager monitors root-disk usage by default:

<--snip-->
Jan 20 18:49:20 openshift-149 atomic-openshift-node: I0120 18:49:20.682843 32634 image_manager.go:202] [ImageManager]: Disk usage on "/dev/mapper/rhel72-root" (/) is at 26% which is over the high threshold (20%). Trying to free 672878592 bytes
Jan 20 18:49:20 openshift-149 docker: time="2016-01-20T18:49:20.683168834+08:00" level=info msg="GET /images/json"
Jan 20 18:49:20 openshift-149 docker: time="2016-01-20T18:49:20.768170253+08:00" level=info msg="GET /containers/json?all=1"
Jan 20 18:49:20 openshift-149 atomic-openshift-node: I0120 18:49:20.770445 32634 image_manager.go:254] [ImageManager]: Removing image "05f86996004c05346d746261b53a406a43e9016753f0ab3bd3a62756828db551" to free 235319119 bytes
<--snip-->

Garbage-collecting images based on root-disk usage is sometimes NOT what the user wants. In the scenario above, I want image garbage collection driven by docker-pool usage. The image garbage collection configuration should allow the user to specify which disk partition the thresholds are evaluated against.

Version-Release number of selected component (if applicable):
# openshift version
openshift v3.1.1.5
kubernetes v1.1.0-origin-1107-g4c8e6f4
etcd 2.1.2

How reproducible:
Always

Steps to Reproduce:
1. Set up a docker-pool thin-pool volume as Docker's back-end storage per the prerequisites doc.
2. Configure image-gc-high-threshold / image-gc-low-threshold under kubeletArguments and restart the node service.
3. Watch the atomic-openshift-node log for ImageManager disk-usage messages.

Actual results:
ImageManager monitors the root partition ("/dev/mapper/rhel72-root") and garbage-collects based on its usage.

Expected results:
Image garbage collection is driven by (or at least configurable against) the partition that actually holds Docker images, here the docker-pool volume.
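Additional info:
A quick way to see the mismatch on a node is to compare the device backing Docker's image storage with the device ImageManager samples (a diagnostic sketch; the exact docker info field names vary a little between Docker versions):

# Show the thin pool backing Docker image storage
docker info | grep -E 'Storage Driver|Pool Name|Data Space'
# Compare with the device ImageManager reports in the node log
journalctl -u atomic-openshift-node | grep 'ImageManager.*Disk usage'

Here docker info names the rhel72-docker--pool thin pool, while the log line quotes "/dev/mapper/rhel72-root", which is exactly the mismatch reported above.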
Note: the registry has no impact on kubelet image garbage collection, which targets only images in the Docker daemon's graph storage. A registry is not needed to test image garbage collection, and if one is deployed, its storage configuration (emptyDir vs. hostPath vs. NFS vs. PV) does not affect node image storage or garbage collection.
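To see this for yourself, the two storage locations can be inspected independently (a sketch; assumes the default docker-registry deployment name and the oc volume command available in this release):

# Registry storage is a volume on the registry pod:
oc volume dc/docker-registry --list
# Node image storage is the Docker graph on each node, independent of the above:
docker images

Swapping the registry volume among emptyDir, hostPath, NFS, or a PV changes only the first; kubelet image garbage collection looks only at the second.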
Possible fix here: https://github.com/google/cadvisor/issues/944#issuecomment-173665149. Waiting on more discussion with upstream before I proceed further.
cadvisor PR: https://github.com/google/cadvisor/pull/1070

Once this is merged, we'll need PRs for Kubernetes and Origin to pull in the updated cadvisor. Or we may cherry-pick the fix into Origin in the short term if we aren't comfortable bumping all of cadvisor for this fix.
Kube PRs: https://github.com/kubernetes/kubernetes/pull/19354, https://github.com/kubernetes/kubernetes/pull/20395
https://github.com/kubernetes/kubernetes/pull/19354 - merged 1/29
https://github.com/kubernetes/kubernetes/pull/20395 - not yet merged, but tagged lgtm
Taking bug in Andy's absence and will look to cherry-pick.
https://github.com/kubernetes/kubernetes/pull/20395 just merged upstream. This will be picked up in the next rebase into Origin.
Will be in next puddle
Verified on openshift v3.1.1.904.

[root@openshift-135 ~]# openshift version
openshift v3.1.1.904
kubernetes v1.2.0-alpha.7-703-gbc4550d
etcd 2.2.5

[root@openshift-129 ~]# cat /etc/sysconfig/docker-storage
DOCKER_STORAGE_OPTIONS=--storage-driver devicemapper --storage-opt dm.fs=xfs --storage-opt dm.thinpooldev=/dev/mapper/rhel72-docker--pool --storage-opt dm.use_deferred_removal=true

[root@openshift-129 ~]# lvs
  LV          VG     Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool rhel72 twi-aot--- 17.44g             20.99  5.33
  root        rhel72 -wi-ao---- 10.00g
  swap        rhel72 -wi-ao----  2.00g

atomic-openshift-node logs:

I0222 15:01:45.411470 27751 image_manager.go:230] [ImageManager]: Disk usage on "rhel72-docker--pool" () is at 70% which is over the high threshold (20%). Trying to free 11319902208 bytes
I0222 15:01:45.508647 27751 docker.go:357] Docker Container: /atomic-openshift-node is not managed by kubelet.
I0222 15:01:45.508679 27751 docker.go:357] Docker Container: /openvswitch is not managed by kubelet.
I0222 15:01:45.508687 27751 docker.go:357] Docker Container: /small_wozniak is not managed by kubelet.
I0222 15:01:45.508914 27751 image_manager.go:287] [ImageManager]: Removing image "0192cfcebeb04ff778cf44aa7f6d336e43ee9fdc6cfb02de091364248a700cfc" to free 490397531 bytes
I0222 15:01:45.651197 27751 docker.go:357] Docker Container: /atomic-openshift-node is not managed by kubelet.
I0222 15:01:45.651227 27751 docker.go:357] Docker Container: /openvswitch is not managed by kubelet.
I0222 15:01:45.651235 27751 docker.go:357] Docker Container: /small_wozniak is not managed by kubelet.
I0222 15:01:45.655037 27751 kubelet.go:2409] SyncLoop (housekeeping)
I0222 15:01:45.668928 27751 image_manager.go:287] [ImageManager]: Removing image "0bbc57b809f12a1a21c8105fa428f714bee4c588e9d47beb4b053bccffb68416" to free 603628030 bytes

ImageManager now monitors the docker-pool device instead of the root filesystem.
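A rough recipe for repeating this verification on a node with the same thin-pool setup (a sketch; the image pulled and the thresholds are only examples):

# Note the pool's Data% before
lvs
# Pull images until Data% crosses the configured 20% high threshold
docker pull registry.access.redhat.com/rhel7
# Watch ImageManager act on the pool rather than the root filesystem
journalctl -u atomic-openshift-node -f | grep ImageManager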
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2016:1064