Bug 1584850
| Summary: | [3.6] Image garbage collection tries to delete ose-pod image from node | ||||||
|---|---|---|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Seth Jennings <sjenning> | ||||
| Component: | Node | Assignee: | Seth Jennings <sjenning> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | weiwei jiang <wjiang> | ||||
| Severity: | unspecified | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 3.6.0 | CC: | aos-bugs, dma, jokerman, mmccomas, vlaad, wsun | ||||
| Target Milestone: | --- | ||||||
| Target Release: | 3.6.z | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | No Doc Update | |||||
| Doc Text: | | Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2018-10-08 13:12:05 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
Description
Seth Jennings
2018-05-31 19:30:21 UTC
Checked; the issue cannot be reproduced.
# oc version
oc v3.6.173.0.123
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://qe-wjiang-36-master-etcd-1:8443
openshift v3.6.173.0.123
kubernetes v1.6.1+5115d708d7
# journalctl -f -u atomic-openshift-node | grep -i image_gc_manager | grep -i "391abeee98a1"
May 31 22:56:17 qe-wjiang-36-node-registry-router-1 atomic-openshift-node[27715]: I0531 22:56:17.524278 27715 image_gc_manager.go:224] Adding image ID sha256:391abeee98a163bd213df4f66e0fe8abe642c66f225b2f4ab9c2569e05b4299d to currentImages
May 31 22:56:17 qe-wjiang-36-node-registry-router-1 atomic-openshift-node[27715]: I0531 22:56:17.524285 27715 image_gc_manager.go:237] Setting Image ID sha256:391abeee98a163bd213df4f66e0fe8abe642c66f225b2f4ab9c2569e05b4299d lastUsed to 2018-05-31 22:56:17.524193864 -0400 EDT
May 31 22:56:17 qe-wjiang-36-node-registry-router-1 atomic-openshift-node[27715]: I0531 22:56:17.524294 27715 image_gc_manager.go:241] Image ID sha256:391abeee98a163bd213df4f66e0fe8abe642c66f225b2f4ab9c2569e05b4299d has size 213851187
May 31 22:56:17 qe-wjiang-36-node-registry-router-1 atomic-openshift-node[27715]: I0531 22:56:17.524356 27715 image_gc_manager.go:319] Image ID sha256:391abeee98a163bd213df4f66e0fe8abe642c66f225b2f4ab9c2569e05b4299d is being used
# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
......
registry.reg-aws.openshift.com:443/openshift3/ose-pod v3.6.173.0.123 391abeee98a1 5 hours ago 213.9 MB
......

Reopening the bug: in a containerized environment, image GC still tries to remove the 'openshift3/node' image.
[root@host-8-250-12 ~]# oc describe no host-8-248-76.host.centralci.eng.rdu2.redhat.com
Name: host-8-248-76.host.centralci.eng.rdu2.redhat.com
Role:
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/hostname=host-8-248-76.host.centralci.eng.rdu2.redhat.com
registry=enabled
role=node
router=enabled
Annotations: volumes.kubernetes.io/controller-managed-attach-detach=true
Taints: <none>
CreationTimestamp: Tue, 05 Jun 2018 03:58:26 +0000
Phase:
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
OutOfDisk False Tue, 05 Jun 2018 05:23:50 +0000 Tue, 05 Jun 2018 03:58:25 +0000 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Tue, 05 Jun 2018 05:23:50 +0000 Tue, 05 Jun 2018 03:58:25 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Tue, 05 Jun 2018 05:23:50 +0000 Tue, 05 Jun 2018 03:58:25 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
Ready True Tue, 05 Jun 2018 05:23:50 +0000 Tue, 05 Jun 2018 05:18:09 +0000 KubeletReady kubelet is posting ready status
Addresses: 10.8.248.76,10.8.248.76,host-8-248-76.host.centralci.eng.rdu2.redhat.com
Capacity:
cpu: 4
memory: 8010112Ki
pods: 40
Allocatable:
cpu: 4
memory: 7907712Ki
pods: 40
System Info:
Machine ID: b22028a5947d45bdb80fc81401f32472
System UUID: B22028A5-947D-45BD-B80F-C81401F32472
Boot ID: db465c71-5f3f-4bbb-b7ca-0067ce99ed0b
Kernel Version: 3.10.0-693.17.1.el7.x86_64
OS Image: Red Hat Enterprise Linux Server 7.5 (Maipo)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://1.12.6
Kubelet Version: v1.6.1+5115d708d7
Kube-Proxy Version: v1.6.1+5115d708d7
ExternalID: host-8-248-76.host.centralci.eng.rdu2.redhat.com
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
default docker-registry-1-b7zqk 100m (2%) 0 (0%) 256Mi (3%) 0 (0%)
default registry-console-1-r2v83 0 (0%) 0 (0%) 0 (0%) 0 (0%)
default router-1-b00c2 100m (2%) 0 (0%) 256Mi (3%) 0 (0%)
install-test mongodb-1-tqmdh 0 (0%) 0 (0%) 512Mi (6%) 512Mi (6%)
install-test nodejs-mongodb-example-1-f8tkt 0 (0%) 0 (0%) 512Mi (6%) 512Mi (6%)
openshift-ansible-service-broker asb-3851391291-v1xkm 0 (0%) 0 (0%) 0 (0%) 0 (0%)
openshift-ansible-service-broker etcd-1487946270-mj9ft 0 (0%) 0 (0%) 0 (0%) 0 (0%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
200m (5%) 0 (0%) 1536Mi (19%) 1Gi (13%)
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1h 1h 1 kubelet, host-8-248-76.host.centralci.eng.rdu2.redhat.com Normal Starting Starting kubelet.
1h 1h 1 kubelet, host-8-248-76.host.centralci.eng.rdu2.redhat.com Warning ImageGCFailed unable to find data for container /
1h 1h 2 kubelet, host-8-248-76.host.centralci.eng.rdu2.redhat.com Normal NodeHasSufficientDisk Node host-8-248-76.host.centralci.eng.rdu2.redhat.com status is now: NodeHasSufficientDisk
1h 1h 2 kubelet, host-8-248-76.host.centralci.eng.rdu2.redhat.com Normal NodeHasSufficientMemory Node host-8-248-76.host.centralci.eng.rdu2.redhat.com status is now: NodeHasSufficientMemory
1h 1h 2 kubelet, host-8-248-76.host.centralci.eng.rdu2.redhat.com Normal NodeHasNoDiskPressure Node host-8-248-76.host.centralci.eng.rdu2.redhat.com status is now: NodeHasNoDiskPressure
1h 1h 1 kubelet, host-8-248-76.host.centralci.eng.rdu2.redhat.com Normal NodeReady Node host-8-248-76.host.centralci.eng.rdu2.redhat.com status is now: NodeReady
6m 6m 1 kubelet, host-8-248-76.host.centralci.eng.rdu2.redhat.com Warning ImageGCFailed unable to find data for container /
6m 6m 1 kubelet, host-8-248-76.host.centralci.eng.rdu2.redhat.com Normal Starting Starting kubelet.
6m 6m 1 kubelet, host-8-248-76.host.centralci.eng.rdu2.redhat.com Normal NodeHasSufficientDisk Node host-8-248-76.host.centralci.eng.rdu2.redhat.com status is now: NodeHasSufficientDisk
6m 6m 1 kubelet, host-8-248-76.host.centralci.eng.rdu2.redhat.com Normal NodeHasSufficientMemory Node host-8-248-76.host.centralci.eng.rdu2.redhat.com status is now: NodeHasSufficientMemory
6m 6m 1 kubelet, host-8-248-76.host.centralci.eng.rdu2.redhat.com Normal NodeHasNoDiskPressure Node host-8-248-76.host.centralci.eng.rdu2.redhat.com status is now: NodeHasNoDiskPressure
6m 6m 1 kubelet, host-8-248-76.host.centralci.eng.rdu2.redhat.com Normal NodeNotReady Node host-8-248-76.host.centralci.eng.rdu2.redhat.com status is now: NodeNotReady
5m 5m 1 kubelet, host-8-248-76.host.centralci.eng.rdu2.redhat.com Normal NodeReady Node host-8-248-76.host.centralci.eng.rdu2.redhat.com status is now: NodeReady
1m 1m 1 kubelet, host-8-248-76.host.centralci.eng.rdu2.redhat.com Warning ImageGCFailed wanted to free 4296435302, but freed 4735317795 space with errors in image deletion: [rpc error: code = 2 desc = Error response from daemon: {"message":"conflict: unable to delete cb62aa831578 (cannot be forced) - image is being used by running container f6df88e8e41f"}, rpc error: code = 2 desc = Error response from daemon: {"message":"conflict: unable to delete cb5a29f4d401 (cannot be forced) - image has dependent child images"}, rpc error: code = 2 desc = Error response from daemon: {"message":"conflict: unable to delete 722e3b173b3f (cannot be forced) - image is being used by running container 1f26b39dd6f4"}]
[root@host-8-248-76 ~]# docker ps|grep 1f26b39dd6f4
1f26b39dd6f4 openshift3/node:v3.6.173.0.123 "/usr/local/bin/origi" 7 minutes ago Up 7 minutes atomic-openshift-node
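The ImageGCFailed message above packs several Docker daemon errors into one line, which makes it easy to miss which image is pinned by which container. A minimal sketch for pulling the pairs out, assuming the message keeps the exact wording quoted in this event; the names `CONFLICT_RE`, `parse_gc_conflicts`, and `EVENT` are hypothetical helpers, not part of the kubelet or Docker.

```python
import re

# Matches the two conflict wordings that appear in the ImageGCFailed event
# quoted in this bug. Other daemon versions may word these differently.
CONFLICT_RE = re.compile(
    r"unable to delete (?P<image>[0-9a-f]{12}) \(cannot be forced\) - image "
    r"(?:is being used by running container (?P<container>[0-9a-f]{12})"
    r"|has dependent child images)"
)


def parse_gc_conflicts(message):
    """Return (image_id, container_id) pairs; container_id is None when the
    image was kept because it has dependent child images."""
    return [(m.group("image"), m.group("container"))
            for m in CONFLICT_RE.finditer(message)]


# The error list from the event above, copied verbatim (prefix trimmed).
EVENT = (
    '[rpc error: code = 2 desc = Error response from daemon: '
    '{"message":"conflict: unable to delete cb62aa831578 (cannot be forced) '
    '- image is being used by running container f6df88e8e41f"}, '
    'rpc error: code = 2 desc = Error response from daemon: '
    '{"message":"conflict: unable to delete cb5a29f4d401 (cannot be forced) '
    '- image has dependent child images"}, '
    'rpc error: code = 2 desc = Error response from daemon: '
    '{"message":"conflict: unable to delete 722e3b173b3f (cannot be forced) '
    '- image is being used by running container 1f26b39dd6f4"}]'
)

for image_id, container_id in parse_gc_conflicts(EVENT):
    print(image_id, container_id or "(dependent child images)")
```

Each container ID returned can then be looked up with `docker ps`, as done above for 1f26b39dd6f4.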
Created attachment 1447710: node.log
Ok, this bz has become a mess. This change is in a shipped version of 3.6 now, but the errata does not reflect it. QE should only verify that no attempts are made to remove the ose-pod image. Attempts to remove the node image are outside the scope of this fix.

Per comment 3, the ose-pod case has already been verified, so moving to VERIFIED.
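For anyone repeating the ose-pod check, the grep pipeline from the earlier verification comment can be sketched in Python. This assumes the node log keeps the `image_gc_manager.go` "is being used" wording shown in that comment; `image_marked_in_use` and `SAMPLE` are hypothetical names. A matching line only shows that the GC manager treated the image as in use during that pass; it is not proof that no deletion was attempted at another time.

```python
def image_marked_in_use(log_lines, short_id):
    """Case-insensitive equivalent of
    `grep -i image_gc_manager | grep -i <short_id>`, returning True when a
    matching line also says the image "is being used"."""
    needle = short_id.lower()
    for line in log_lines:
        lowered = line.lower()
        if ("image_gc_manager" in lowered
                and needle in lowered
                and "is being used" in lowered):
            return True
    return False


# Two of the journal lines from the verification comment (host prefix trimmed).
SAMPLE = [
    "I0531 22:56:17.524278 27715 image_gc_manager.go:224] Adding image ID "
    "sha256:391abeee98a163bd213df4f66e0fe8abe642c66f225b2f4ab9c2569e05b4299d "
    "to currentImages",
    "I0531 22:56:17.524356 27715 image_gc_manager.go:319] Image ID "
    "sha256:391abeee98a163bd213df4f66e0fe8abe642c66f225b2f4ab9c2569e05b4299d "
    "is being used",
]

print(image_marked_in_use(SAMPLE, "391abeee98a1"))
```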