Description of problem:
Image garbage collection is not cleaning up dangling images.

Version-Release number of selected component (if applicable):
3.11.216

How reproducible:
always

Steps to Reproduce:
1. Check the list of dangling images on the node after an image garbage collection run has finished.

Actual results:
Dangling images remain on the node; it is not clear whether the image garbage collector is supposed to delete them.

Expected results:
No dangling images remain on the node.

Additional info:
This is causing excessive disk usage.
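For reference, dangling (untagged) images can be listed on a node with the standard docker CLI; these are generic commands, not specific to this environment:

    # list dangling images left on the node
    docker images --filter dangling=true

    # same list as IDs only, with a count
    docker images --filter dangling=true -q | wc -l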
Hi, could you please provide further information? rgds,
Hi Joel, the customer has tested setting "minimum-container-ttl-duration" to 0, but the issue persists. From the shared session today, I collected information about an image that should be deleted by image GC but is not.

On the affected node, there is this dangling image:

docker images -a | grep vass
docker-registry.default.svc:5000/default/vass-netutils   <none>   74327aedbb2c   22 months ago   342 MB

The image RepoDigest information is:

docker inspect 74327aedbb2c
"Id": "sha256:74327aedbb2cd18a8a73b47d565929af154d88db47608701dd2abd0538805ab5",
"RepoTags": [],
"RepoDigests": [
    "docker-registry.default.svc:5000/default/vass-netutils@sha256:495c416a7fd930d1ad244b077b3bc81f1824ef1afe0d8746ad026425f73721dd"
]
....

Checking the default project, no container is using the image in question:

$ oc get pods -n default -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}' | sort
docker-registry-6-4jm5l:  registry.redhat.io/openshift3/ose-docker-registry:v3.11.216,
docker-registry-6-tbdfz:  registry.redhat.io/openshift3/ose-docker-registry:v3.11.216,
docker-registry-6-zs45z:  registry.redhat.io/openshift3/ose-docker-registry:v3.11.216,
registry-console-5-z8l62: registry.redhat.io/openshift3/registry-console:v3.11.216,
router-1-f7nwx:           registry.redhat.io/openshift3/ose-haproxy-router:v3.11.216,
router-1-grw2r:           registry.redhat.io/openshift3/ose-haproxy-router:v3.11.216,
router-1-qr5hp:           registry.redhat.io/openshift3/ose-haproxy-router:v3.11.216,

After enabling debug logging, the log shows that image_gc_manager.go adds the image to the currentImages list but never deletes it:

journalctl -u atomic-openshift-node.service --since "1 hour ago" -f | grep -i "image_gc_manager.go" | grep 74327aedbb2c
Oct 01 15:03:54 ********** atomic-openshift-node[31041]: I1001 15:03:53.887799 31056 image_gc_manager.go:242] Image ID sha256:74327aedbb2cd18a8a73b47d565929af154d88db47608701dd2abd0538805ab5 is new
Oct 01 15:03:54 ********** atomic-openshift-node[31041]: I1001 15:03:53.887806 31056 image_gc_manager.go:254] Image ID sha256:74327aedbb2cd18a8a73b47d565929af154d88db47608701dd2abd0538805ab5 has size 341959589
Oct 01 15:08:54 ********** atomic-openshift-node[31041]: I1001 15:08:54.126915 31056 image_gc_manager.go:237] Adding image ID sha256:74327aedbb2cd18a8a73b47d565929af154d88db47608701dd2abd0538805ab5 to currentImages

Why was this image not collected by the image GC process? Is there any additional verification that needs to be performed? Looking forward to your reply. Regards,
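One generic thing worth ruling out here (my own suggestion, not taken from the customer's data): the kubelet's image GC is threshold-driven, so unused images are only deleted once the usage of the filesystem backing docker storage rises above image-gc-high-threshold, and space is then freed down to image-gc-low-threshold. If disk usage stays below the high threshold, a dangling image can legitimately survive indefinitely. A quick check on the node could look like this (the node-config.yaml path is the usual 3.11 location; if the thresholds are unset, the kubelet defaults of 85%/80% should apply):

    # current usage of the filesystem backing docker storage
    df -h /var/lib/docker

    # GC thresholds configured for this node, if any
    grep -A2 'image-gc' /etc/origin/node/node-config.yaml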
Hi! Could you please provide feedback on this issue? Have you been able to reproduce it? Sorry to push so hard, but the customer is quite worried about the disk usage caused by the dangling images. Don't hesitate to contact us if further information is needed. Many thanks in advance. Rgds,
Hi Joel, the test has been performed as requested in the TEST environment and it worked, but on PROD no images have been cleaned up. All logs are now available in the drive. Please don't hesitate to contact us if further information is needed. Regards,
Hi Joel, Information requested is attached. Cheers
Hi Joel Smith, do I need to verify this fix on RHEL Atomic Host? If my 3.11 cluster is running on RHEL 7.7 nodes, can I verify it there?
Hi Joel Smith, I created a 3.11 cluster on the openstack-upshift platform. Flexy job: https://mastern-jenkins-csb-openshift-qe.cloud.paas.psi.redhat.com/job/Launch%20Environment%20Flexy/128942/artifact/host.spec/*view*/ openstack console: https://rhos-d.infra.prod.upshift.rdu2.redhat.com/dashboard/project/instances/ (you can filter by name min1224-311). It would be great if you could add an Atomic Host node or tell me how to add one. Thanks.
Hi Joel Smith, I created an Atomic Host cluster, but I can't find any filesystem mounted on /var/lib/docker, so how should I verify this bug? FYI:

[root@minmli-0111311node-1 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Atomic Host release 7.7

[root@minmli-0111311node-1 ~]# df -h
Filesystem                 Size  Used Avail Use% Mounted on
devtmpfs                   3.8G     0  3.8G   0% /dev
tmpfs                      3.9G     0  3.9G   0% /dev/shm
tmpfs                      3.9G  2.5M  3.9G   1% /run
tmpfs                      3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/mapper/atomicos-root   60G  7.2G   53G  12% /sysroot
/dev/vda1                  297M  115M  183M  39% /boot
tmpfs                      3.9G   32K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/e745e6cb-53b8-11eb-9b1a-fa163ee44674/volumes/kubernetes.io~secret/sync-token-cdqq5
tmpfs                      3.9G   32K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/e748d8df-53b8-11eb-9b1a-fa163ee44674/volumes/kubernetes.io~secret/sdn-token-xcswj
tmpfs                      3.9G   32K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/e74d5bfd-53b8-11eb-9b1a-fa163ee44674/volumes/kubernetes.io~secret/sdn-token-xcswj
overlay                     60G  7.2G   53G  12% /var/lib/docker/overlay2/43fc458381b097bcd04478eb26586a9a14762dddcf1df4c0368029bf0cdd06c7/merged
overlay                     60G  7.2G   53G  12% /var/lib/docker/overlay2/2b67f299295400066fc20808d846362ae6db9da805d18bc7df13e80ed8f0ba12/merged
shm                         64M     0   64M   0% /var/lib/docker/containers/9ab5201235a39c4118a2ce44e7948a722f67249566c15c53e8d95f6e89d58868/shm
shm                         64M     0   64M   0% /var/lib/docker/containers/8e1a757cbafaa24fe0fcf8f26cc25508cc1a2f92adf8943c624a63987d8a464b/shm
overlay                     60G  7.2G   53G  12% /var/lib/docker/overlay2/b513d4618b1bab852c9eaa996b831102ebda57ecffb9331c5ce8c11160eb50c7/merged
overlay                     60G  7.2G   53G  12% /var/lib/docker/overlay2/697c6be477d5d7e42168e0a87075657596001f47c6738f4543338c1b2be1de30/merged
overlay                     60G  7.2G   53G  12% /var/lib/docker/overlay2/3da666ecdbe36c4f5102162c4ddae7b7fe2fb0507454422db2bdd5b19b53a3c4/merged
shm                         64M     0   64M   0% /var/lib/docker/containers/785d4971aa1694236840870a5d42ec52893563f9fd95d245a36d67dfa1719a53/shm
overlay                     60G  7.2G   53G  12% /var/lib/docker/overlay2/bfc91a1c4b7677d984e5709b72be93e1cee15e79a6f001a57d94c4c6405a0929/merged
tmpfs                      3.9G   32K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/b0a50d3d-53b9-11eb-9b15-fa163ee44674/volumes/kubernetes.io~secret/node-exporter-token-b6c5z
tmpfs                      3.9G  8.0K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/b0a50d3d-53b9-11eb-9b15-fa163ee44674/volumes/kubernetes.io~secret/node-exporter-tls
overlay                     60G  7.2G   53G  12% /var/lib/docker/overlay2/c575d0329009f8f92c260ba433b254461aa839010cf42cac48acb5be5e256123/merged
shm                         64M     0   64M   0% /var/lib/docker/containers/d179c88ba0bf13125c592e685da83a08dd918a7aee52905e6696f3376cad616c/shm
overlay                     60G  7.2G   53G  12% /var/lib/docker/overlay2/d97104d9488de82a22b16bc94ac1c065efce6e569aa54faf5c37295d4d6d7b46/merged
overlay                     60G  7.2G   53G  12% /var/lib/docker/overlay2/9cc307e33dd151e2f9753d94c83ba44a906f69bcd895d11a769da0e14b1a4afa/merged
tmpfs                      3.9G   32K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/36462dbe-53ba-11eb-9b15-fa163ee44674/volumes/kubernetes.io~secret/default-token-2bwb9
overlay                     60G  7.2G   53G  12% /var/lib/docker/overlay2/44f0ef3c3dccc77e299f2898751f522d0ef2a24873a00a119c98273882c746ee/merged
shm                         64M     0   64M   0% /var/lib/docker/containers/05d479b5f1f4a46de48e092b24ea2e1f0ccd4ffa2da4d471960c4dcb1ca9b517/shm
overlay                     60G  7.2G   53G  12% /var/lib/docker/overlay2/5b79a0dacbd438b99cb2e3c3da0279eda4c5eeeceda49f3b62a4a9745679cf17/merged
tmpfs                      3.9G   32K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/49f462e9-53ba-11eb-9b15-fa163ee44674/volumes/kubernetes.io~secret/default-token-2bwb9
overlay                     60G  7.2G   53G  12% /var/lib/docker/overlay2/4b1ad42610f3c0331508581d1f82462ab206fc71e765dcc8dfa289e56a6dc868/merged
shm                         64M     0   64M   0% /var/lib/docker/containers/4e3e14b054cb7e183eb1eeff5c15256afcc80f353746e41cc766ec0bea191731/shm
overlay                     60G  7.2G   53G  12% /var/lib/docker/overlay2/a6da61a301aef0a1ef21ee9bdeeed89c6ec5a6670c18a3f365e79ca1d4456dea/merged
tmpfs                      3.9G   32K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/9f798941-53e7-11eb-9b15-fa163ee44674/volumes/kubernetes.io~secret/default-token-m56p7
overlay                     60G  7.2G   53G  12% /var/lib/docker/overlay2/1e00a077a53c069e70bcd206440b62bacd5f28f217735580877992aa094bf42b/merged
shm                         64M     0   64M   0% /var/lib/docker/containers/22210c5f01cd9f7bf0aa0e224dc9ee5bcecaf009c4f6384aa9b38af4c4f67b35/shm
overlay                     60G  7.2G   53G  12% /var/lib/docker/overlay2/2c804cf068a2a1fe5ef0480b1a9a04df599afd911cc0c82618a5215b57185259/merged
tmpfs                      783M     0  783M   0% /run/user/0

[root@minmli-0111311node-1 ~]# lsblk -fs
NAME            FSTYPE      LABEL UUID                                   MOUNTPOINT
vda1            xfs               e6db7aea-4e85-4d5d-b9d4-f262eef3baab   /boot
└─vda
atomicos-root   xfs               b10b2665-e2a5-4dd6-a1b7-57a708d11399   /sysroot
└─vda2          LVM2_member       waOc6G-LE2o-HORr-qBHh-gIlP-eANU-vVdfxI
  └─vda

[root@minmli-0111311node-1 ~]# docker info | grep Root.Dir
WARNING: You're not using the default seccomp profile
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
Docker Root Dir: /var/lib/docker

# oc get --raw /api/v1/nodes/minmli-0111311node-1/proxy/metrics/cadvisor | grep 'container_fs_\(usage\|limit\)_bytes.*,name=""'
container_fs_limit_bytes{container_name="",device="/dev/mapper/atomicos-root",id="/",image="",name="",namespace="",pod_name=""} 6.4095256576e+10
container_fs_limit_bytes{container_name="",device="/dev/vda1",id="/",image="",name="",namespace="",pod_name=""} 3.1107072e+08
container_fs_limit_bytes{container_name="",device="shm",id="/",image="",name="",namespace="",pod_name=""} 6.7108864e+07
container_fs_limit_bytes{container_name="",device="tmpfs",id="/",image="",name="",namespace="",pod_name=""} 6.7108864e+07
container_fs_usage_bytes{container_name="",device="/dev/mapper/atomicos-root",id="/",image="",name="",namespace="",pod_name=""} 7.647141888e+09
container_fs_usage_bytes{container_name="",device="/dev/vda1",id="/",image="",name="",namespace="",pod_name=""} 1.1993088e+08
container_fs_usage_bytes{container_name="",device="shm",id="/",image="",name="",namespace="",pod_name=""} 0
container_fs_usage_bytes{container_name="",device="tmpfs",id="/",image="",name="",namespace="",pod_name=""} 0

# oc get --raw /api/v1/nodes/minmli-0111311node-1/proxy/metrics/cadvisor | grep 'container_fs_usage_bytes.*,name="[^"]' | head -1
container_fs_usage_bytes{container_name="POD",device="/dev/mapper/atomicos-root",id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod9f798941_53e7_11eb_9b15_fa163ee44674.slice/docker-22210c5f01cd9f7bf0aa0e224dc9ee5bcecaf009c4f6384aa9b38af4c4f67b35.scope",image="registry.access.stage.redhat.com/openshift3/ose-pod:v3.11.346",name="k8s_POD_httpd-1-sgqcp_httpd_9f798941-53e7-11eb-9b15-fa163ee44674_0",namespace="httpd",pod_name="httpd-1-sgqcp"} 61440

I am also attaching the output of the command "cat /proc/$(systemctl show --property MainPID atomic-openshift-node.service | sed 's/.*=//')/mountinfo".
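In case it helps with verification on a node without a dedicated docker filesystem, one possible approach is to lower the GC thresholds below the current root filesystem usage so that the next GC pass has to free space. This is only a sketch; the threshold values below are illustrative and simply need to sit below the node's current usage (about 12% here):

    # pull an image that no pod uses, so an unused image is present on the node
    # (example image, any unused image works)
    docker pull registry.access.redhat.com/rhel7:latest

    # in /etc/origin/node/node-config.yaml, lower the GC thresholds, for example:
    #   kubeletArguments:
    #     image-gc-high-threshold:
    #     - "10"
    #     image-gc-low-threshold:
    #     - "5"

    # restart the node service and wait for the next image GC pass
    systemctl restart atomic-openshift-node.service

    # the unused/dangling images should then be removed
    docker images --filter dangling=true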
Created attachment 1746225 [details] /proc/self/mountinfo
[root@minmli-0111311node-1 ~]# runc exec atomic-openshift-node df -h
Filesystem                 Size  Used Avail Use% Mounted on
/dev/mapper/atomicos-root   60G  7.2G   53G  12% /
devtmpfs                   3.8G     0  3.8G   0% /dev
shm                         64M     0   64M   0% /dev/shm
tmpfs                      3.9G     0  3.9G   0% /sys/fs/cgroup
tmpfs                      3.9G  2.5M  3.9G   1% /run
tmpfs                      3.9G   32K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/e745e6cb-53b8-11eb-9b1a-fa163ee44674/volumes/kubernetes.io~secret/sync-token-cdqq5
tmpfs                      3.9G   32K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/e748d8df-53b8-11eb-9b1a-fa163ee44674/volumes/kubernetes.io~secret/sdn-token-xcswj
tmpfs                      3.9G   32K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/e74d5bfd-53b8-11eb-9b1a-fa163ee44674/volumes/kubernetes.io~secret/sdn-token-xcswj
tmpfs                      3.9G     0  3.9G   0% /rootfs/dev/shm
shm                         64M     0   64M   0% /rootfs/var/lib/docker/containers/9ab5201235a39c4118a2ce44e7948a722f67249566c15c53e8d95f6e89d58868/shm
shm                         64M     0   64M   0% /rootfs/var/lib/docker/containers/8e1a757cbafaa24fe0fcf8f26cc25508cc1a2f92adf8943c624a63987d8a464b/shm
overlay                     60G  7.2G   53G  12% /rootfs/var/lib/docker/overlay2/43fc458381b097bcd04478eb26586a9a14762dddcf1df4c0368029bf0cdd06c7/merged
overlay                     60G  7.2G   53G  12% /rootfs/var/lib/docker/overlay2/2b67f299295400066fc20808d846362ae6db9da805d18bc7df13e80ed8f0ba12/merged
/dev/vda1                  297M  115M  183M  39% /rootfs/boot
tmpfs                       64M     0   64M   0% /tmp
tmpfs                      3.9G   32K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/b0a50d3d-53b9-11eb-9b15-fa163ee44674/volumes/kubernetes.io~secret/node-exporter-token-b6c5z
tmpfs                      3.9G  8.0K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/b0a50d3d-53b9-11eb-9b15-fa163ee44674/volumes/kubernetes.io~secret/node-exporter-tls
tmpfs                      3.9G   32K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/36462dbe-53ba-11eb-9b15-fa163ee44674/volumes/kubernetes.io~secret/default-token-2bwb9
tmpfs                      3.9G   32K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/49f462e9-53ba-11eb-9b15-fa163ee44674/volumes/kubernetes.io~secret/default-token-2bwb9
tmpfs                      3.9G   32K  3.9G   1% /var/lib/origin/openshift.local.volumes/pods/9f798941-53e7-11eb-9b15-fa163ee44674/volumes/kubernetes.io~secret/default-token-m56p7
tmpfs                      783M     0  783M   0% /run/user/0
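Side note on the containerized node: since the host filesystem shows up inside the atomic-openshift-node container under /rootfs, it may be worth confirming whether the kubelet process also sees /var/lib/docker as a mount in its own namespace. A generic check from the host could be (the mountinfo grep and findmnt are my own suggestion, not something required by the fix):

    # PID of the containerized node process
    NODE_PID=$(systemctl show --property MainPID atomic-openshift-node.service | sed 's/.*=//')

    # mount entries for /var/lib/docker as seen by the kubelet
    grep '/var/lib/docker' /proc/${NODE_PID}/mountinfo

    # host view for comparison
    findmnt /var/lib/docker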
On Atomic Host, I checked /var/lib/containers/atomic/atomic-openshift-node.0/config.json:

{
    "type": "bind",
    "source": "/var/lib/docker",
    "destination": "/var/lib/docker",
    "options": [
        "bind",
        "slave",
        "rw",
        "mode=755"
    ]
},

and verified as in Comment 14: when the image volume usage reaches the gc-high-threshold, the dangling images are deleted.
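For anyone else verifying this: one way to drive the image filesystem above the gc-high-threshold without reconfiguring the node is simply to fill the filesystem backing /var/lib/docker. The file name and size below are illustrative only:

    # create a large file so usage crosses the configured image-gc-high-threshold
    dd if=/dev/zero of=/var/lib/docker/gc-test.img bs=1M count=50000

    # image GC runs periodically; dangling images should disappear after the next pass
    watch 'docker images --filter dangling=true'

    # clean up afterwards
    rm -f /var/lib/docker/gc-test.img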
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 3.11.374 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:0079