Checked with:

# oc version
oc v3.9.30
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-3-197.ec2.internal:8443
openshift v3.9.30
kubernetes v1.9.1+a0ce1bc657

and the issue cannot be reproduced.

# journalctl -u atomic-openshift-node | grep -i "image_gc_manager" | grep -i " used"
May 30 02:43:06 ip-172-18-6-28.ec2.internal atomic-openshift-node[108519]: I0530 02:43:06.641943 108519 image_gc_manager.go:334] Image ID sha256:45e0e3dae5ec197a44fe104bf30f9341a6e3d29faeff1c6da30399fb925a7679 is being used
May 30 02:43:06 ip-172-18-6-28.ec2.internal atomic-openshift-node[108519]: I0530 02:43:06.641959 108519 image_gc_manager.go:334] Image ID sha256:adf66bf8d4cc4e7f7555378452767949b23d5608e9cadcbf0b7e97a2e47d7252 is being used
May 30 02:43:06 ip-172-18-6-28.ec2.internal atomic-openshift-node[108519]: I0530 02:43:06.641972 108519 image_gc_manager.go:334] Image ID sha256:4eca8aeae35d502fb560f8bd95c09d569adf7e9b907745cdac116344d659a1df is being used
May 30 02:43:06 ip-172-18-6-28.ec2.internal atomic-openshift-node[108519]: I0530 02:43:06.641986 108519 image_gc_manager.go:334] Image ID sha256:a813b03690b5b20bbaaed50aae05f775d92f183af0a5a1b092f741274d24b4f8 is being used
May 30 02:43:06 ip-172-18-6-28.ec2.internal atomic-openshift-node[108519]: I0530 02:43:06.642003 108519 image_gc_manager.go:334] Image ID sha256:bb05bf5ecdfa35ca58f6e6d2790611869c97cc05c6cacf448c79e1deef241940 is being used
May 30 02:43:06 ip-172-18-6-28.ec2.internal atomic-openshift-node[108519]: I0530 02:43:06.642019 108519 image_gc_manager.go:334] Image ID sha256:41f631bcc32083027c523935b78fd2f9a3c668c09855a7848dad71d2fa584ea6 is being used
May 30 02:43:06 ip-172-18-6-28.ec2.internal atomic-openshift-node[108519]: I0530 02:43:06.642033 108519 image_gc_manager.go:334] Image ID sha256:75e79260a34f5da432b408f596c4179f750cc22757b96405b47fc572658cba56 is being used
May 30 02:43:06 ip-172-18-6-28.ec2.internal atomic-openshift-node[108519]: I0530 02:43:06.642046 108519 image_gc_manager.go:334] Image ID sha256:c9499ed94d429dbfbe1396ab71383778ed34e47b714a14f441890fe889783fa9 is being used
May 30 02:43:06 ip-172-18-6-28.ec2.internal atomic-openshift-node[108519]: I0530 02:43:06.642060 108519 image_gc_manager.go:334] Image ID sha256:a721a89b2b9b8078974c469b3e81957465f5b618135c6d49b951eb347cf56102 is being used
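If you need to collect the in-use image IDs from a large journal dump like the one above, they can be extracted mechanically. A minimal Python sketch (the sample line is copied from the journal output above; the regex is an assumption based on that log format):

```python
import re

# One journal line from the image_gc_manager output above.
line = (
    "May 30 02:43:06 ip-172-18-6-28.ec2.internal atomic-openshift-node[108519]: "
    "I0530 02:43:06.641943 108519 image_gc_manager.go:334] Image ID "
    "sha256:45e0e3dae5ec197a44fe104bf30f9341a6e3d29faeff1c6da30399fb925a7679 is being used"
)

# Pull out the full sha256 digest of each image the GC manager marks as in use.
match = re.search(r"Image ID sha256:([0-9a-f]{64}) is being used", line)
if match:
    # Print the 12-character short ID, as shown by `docker images`.
    print(match.group(1)[:12])
```

In practice you would pipe `journalctl -u atomic-openshift-node` into this and apply the regex per line to build the full in-use set.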
Reopening the bug. In a containerized env, image GC tries to remove "openshift3/openvswitch" and "openshift3/node":

oc v3.9.30
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://qe-stage39master-etcd-nfs-1:8443
openshift v3.9.30
kubernetes v1.9.1+a0ce1bc657

[root@qe-stage39master-etcd-nfs-1 ~]# oc describe no qe-stage39node-registry-router-1
Name:               qe-stage39node-registry-router-1
Roles:              compute
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=431ac1fb-1463-4527-b3d1-79245dd698e1
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=regionOne
                    failure-domain.beta.kubernetes.io/zone=nova
                    kubernetes.io/hostname=qe-stage39node-registry-router-1
                    logging-infra-fluentd=true
                    node-role.kubernetes.io/compute=true
                    registry=enabled
                    role=node
                    router=enabled
Annotations:        volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:             <none>
CreationTimestamp:  Mon, 04 Jun 2018 22:39:26 -0400
Conditions:
  Type            Status  LastHeartbeatTime                 LastTransitionTime                Reason                      Message
  ----            ------  -----------------                 ------------------                ------                      -------
  OutOfDisk       False   Tue, 05 Jun 2018 03:27:02 -0400   Mon, 04 Jun 2018 22:39:19 -0400   KubeletHasSufficientDisk    kubelet has sufficient disk space available
  MemoryPressure  False   Tue, 05 Jun 2018 03:27:02 -0400   Mon, 04 Jun 2018 22:39:19 -0400   KubeletHasSufficientMemory  kubelet has sufficient memory available
  DiskPressure    False   Tue, 05 Jun 2018 03:27:02 -0400   Mon, 04 Jun 2018 22:39:19 -0400   KubeletHasNoDiskPressure    kubelet has no disk pressure
  Ready           True    Tue, 05 Jun 2018 03:27:02 -0400   Tue, 05 Jun 2018 03:14:53 -0400   KubeletReady                kubelet is posting ready status
Addresses:
  InternalIP:  172.16.120.48
  ExternalIP:  10.8.248.170
  Hostname:    qe-stage39node-registry-router-1
Capacity:
  cpu:     4
  memory:  8009420Ki
  pods:    250
Allocatable:
  cpu:     4
  memory:  7907020Ki
  pods:    250
System Info:
  Machine ID:                 16100d3c3dae46ad8a4ff7fbc9fa554b
  System UUID:                15694DD8-A91A-4A73-AC7F-AF23A21B7633
  Boot ID:                    7a5bf782-57ab-41c5-8dff-129e23388157
  Kernel Version:             3.10.0-862.3.2.el7.x86_64
  OS Image:                   Red Hat Enterprise Linux Server 7.5 (Maipo)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://1.13.1
  Kubelet Version:            v1.9.1+a0ce1bc657
  Kube-Proxy Version:         v1.9.1+a0ce1bc657
ExternalID:                   15694dd8-a91a-4a73-ac7f-af23a21b7633
Non-terminated Pods:          (10 in total)
  Namespace                          Name                            CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------                          ----                            ------------  ----------  ---------------  -------------
  default                            docker-registry-1-94jp5         100m (2%)     0 (0%)      256Mi (3%)       0 (0%)
  default                            router-1-xfswm                  100m (2%)     0 (0%)      256Mi (3%)       0 (0%)
  hasha                              postgresql-1-mznj5              0 (0%)        0 (0%)      512Mi (6%)       512Mi (6%)
  openshift-ansible-service-broker   asb-etcd-1-ctqsh                0 (0%)        0 (0%)      0 (0%)           0 (0%)
  openshift-infra                    heapster-h8fsc                  0 (0%)        0 (0%)      937500k (11%)    3750M (46%)
  openshift-metrics                  prometheus-node-exporter-zm62n  100m (2%)     200m (5%)   30Mi (0%)        50Mi (0%)
  openshift-template-service-broker  apiserver-hwwrr                 0 (0%)        0 (0%)      0 (0%)           0 (0%)
  wen                                django-psql-example-1-qhwkl     0 (0%)        0 (0%)      512Mi (6%)       512Mi (6%)
  wen                                frontend-1-wx6wp                0 (0%)        0 (0%)      0 (0%)           0 (0%)
  wen                                postgresql-1-674qq              0 (0%)        0 (0%)      512Mi (6%)       512Mi (6%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests   Memory Limits
  ------------  ----------  ---------------   -------------
  300m (7%)     200m (5%)   3116440928 (38%)  5413041536 (66%)
Events:
  Type     Reason                   Age                From                                       Message
  ----     ------                   ----               ----                                       -------
  Normal   Starting                 12m                kubelet, qe-stage39node-registry-router-1  Starting kubelet.
  Normal   NodeAllocatableEnforced  12m                kubelet, qe-stage39node-registry-router-1  Updated Node Allocatable limit across pods
  Normal   NodeNotReady             12m                kubelet, qe-stage39node-registry-router-1  Node qe-stage39node-registry-router-1 status is now: NodeNotReady
  Normal   NodeHasSufficientDisk    12m (x3 over 12m)  kubelet, qe-stage39node-registry-router-1  Node qe-stage39node-registry-router-1 status is now: NodeHasSufficientDisk
  Normal   NodeHasSufficientMemory  12m (x3 over 12m)  kubelet, qe-stage39node-registry-router-1  Node qe-stage39node-registry-router-1 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    12m (x3 over 12m)  kubelet, qe-stage39node-registry-router-1  Node qe-stage39node-registry-router-1 status is now: NodeHasNoDiskPressure
  Normal   NodeReady                12m                kubelet, qe-stage39node-registry-router-1  Node qe-stage39node-registry-router-1 status is now: NodeReady
  Warning  ImageGCFailed            7m                 kubelet, qe-stage39node-registry-router-1  wanted to free 8134389760 bytes, but freed 8978280283 bytes space with errors in image deletion: [rpc error: code = Unknown desc = Error response from daemon: conflict: unable to delete 98871f35af21 (cannot be forced) - image has dependent child images, rpc error: code = Unknown desc = Error response from daemon: conflict: unable to delete a8fd5c530c44 (cannot be forced) - image has dependent child images]
  Warning  ImageGCFailed            2m                 kubelet, qe-stage39node-registry-router-1  wanted to free 4316168192 bytes, but freed 4918712134 bytes space with errors in image deletion: [rpc error: code = Unknown desc = Error response from daemon: conflict: unable to delete a8fd5c530c44 (cannot be forced) - image has dependent child images, rpc error: code = Unknown desc = Error response from daemon: conflict: unable to delete 1fea394aac80 (cannot be forced) - image is being used by running container 15711776cedd, rpc error: code = Unknown desc = Error response from daemon: conflict: unable to delete e42d0dccf073 (cannot be forced) - image has dependent child images, rpc error: code = Unknown desc = Error response from daemon: conflict: unable to delete 0dbd08ad57f2 (cannot be forced) - image has dependent child images, rpc error: code = Unknown desc = Error response from daemon: conflict: unable to delete e37239ae2fa3 (cannot be forced) - image is being used by running container 6e6475a7a625]

//On node
[root@qe-stage39node-registry-router-1 ~]# docker images | grep 'a8fd5c530c44 \| 1fea394aac80 \| 0dbd08ad57f2 \| e42d0dccf073 \| e37239ae2fa3'
docker.io/centos/ruby-22-centos7                         <none>   e42d0dccf073  3 days ago   566 MB
registry.access.stage.redhat.com/openshift3/openvswitch  v3.9.30  e37239ae2fa3  5 days ago   1.46 GB
registry.access.stage.redhat.com/openshift3/node         v3.9.30  1fea394aac80  5 days ago   1.46 GB
registry.access.stage.redhat.com/rhscl/python-35-rhel7   <none>   0dbd08ad57f2  13 days ago  627 MB
registry.access.stage.redhat.com/rhscl/nodejs-4-rhel7    <none>   a8fd5c530c44  13 days ago  533 MB
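The ImageGCFailed event messages pack every per-image failure into one string, which makes them tedious to read against `docker images` output. A minimal Python sketch that pulls out each stuck image ID and the reason deletion failed (the sample string is abridged to two of the errors from the 2m event above; the regex is an assumption based on that daemon error format):

```python
import re

# Abridged ImageGCFailed event message, copied from the node events above.
event = (
    "wanted to free 4316168192 bytes, but freed 4918712134 bytes space with "
    "errors in image deletion: [rpc error: code = Unknown desc = Error response "
    "from daemon: conflict: unable to delete a8fd5c530c44 (cannot be forced) - "
    "image has dependent child images, rpc error: code = Unknown desc = Error "
    "response from daemon: conflict: unable to delete 1fea394aac80 (cannot be "
    "forced) - image is being used by running container 15711776cedd]"
)

# Match the 12-char short image ID and the conflict reason after the dash.
pattern = re.compile(
    r"unable to delete ([0-9a-f]{12}) \(cannot be forced\) - ([^,\]]+)"
)
for image_id, reason in pattern.findall(event):
    print(image_id, "->", reason.strip())
```

Running this over the full event text lists one line per undeletable image, which can then be cross-checked against the `docker images` grep above.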
Created attachment 1447750 [details]
node.log
Checked with v3.9.31 and the issue cannot be reproduced. Since the containerized env is not covered by this bug, moving to VERIFIED.