Created attachment 1829973 [details]
Reproducer

Description of problem:
Kubelet rejects pods that use resources that should have been freed by completed pods.

Version-Release number of selected component (if applicable):
4.9.0-0.ci-2021-10-06-085105

How reproducible:
Always

Steps to Reproduce:
1. Create an SNO cluster or a cluster with a single worker node. Use an 8 vCPU worker node (or adjust the attached complete.sh reproducer script).
2. Run the ./complete.sh reproducer script.
3. Reboot the worker node.

Actual results:
The pods that were running prior to the reboot end up in OutOfcpu state.

$ oc get po
NAME        READY   STATUS      RESTARTS   AGE
complete1   0/1     Completed   0          100m
complete2   0/1     Completed   0          100m
complete3   0/1     Completed   0          100m
complete4   0/1     Completed   0          100m
complete5   0/1     Completed   0          100m
complete6   0/1     Completed   0          100m
complete7   0/1     OutOfcpu    0          100m
complete8   0/1     OutOfcpu    0          100m
running1    0/1     OutOfcpu    0          100m
running2    0/1     OutOfcpu    0          100m
running3    0/1     OutOfcpu    0          100m
running4    0/1     OutOfcpu    0          100m
running5    0/1     OutOfcpu    0          100m

Expected results:
The pods that were running prior to the reboot are Running again.

Additional info:
https://bugzilla.redhat.com/show_bug.cgi?id=1997657#c34
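For reference, a minimal sketch of what a reproducer along these lines could look like. Pod names, counts, the image, and the CPU requests are assumptions made here for illustration; the actual attached complete.sh may differ.

#!/usr/bin/env bash
# Sketch of a reproducer: saturate the node with short-lived pods, wait for
# them to complete so their CPU requests are released, then create
# long-running pods that only fit because of the released requests.
set -euo pipefail

for i in $(seq 1 8); do
  oc create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: complete${i}
spec:
  restartPolicy: Never
  containers:
  - name: c
    image: registry.access.redhat.com/ubi8/ubi-minimal
    command: ["sleep", "5"]
    resources:
      requests:
        cpu: "1"
EOF
  # Poll until the pod has finished so its CPU request no longer counts.
  while true; do
    phase="$(oc get pod "complete${i}" -o jsonpath='{.status.phase}')"
    echo "\"${phase}\""
    if [ "${phase}" = "Succeeded" ]; then
      break
    fi
    sleep 5
  done
done

for i in $(seq 1 6); do
  oc create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: running${i}
spec:
  containers:
  - name: c
    image: registry.access.redhat.com/ubi8/ubi-minimal
    command: ["sleep", "infinity"]
    resources:
      requests:
        cpu: "1"
EOF
done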
I have been able to reproduce this on an upstream single-node cluster brought up with local-up-cluster.sh - filed https://github.com/kubernetes/kubernetes/issues/105523
Upstream fix PR: https://github.com/kubernetes/kubernetes/pull/105527
E2E test showing the behaviour is broken on HEAD: https://github.com/kubernetes/kubernetes/pull/105552
Cherry-pick verifying the behaviour works on 1.21: https://github.com/kubernetes/kubernetes/pull/105553

The test name is "[sig-node] Restart [Serial] [Slow] [Disruptive] Kubelet should correctly account for terminated pods after restart" and it is part of the node serial suite.

We are waiting for the final CI results before LGTM/approval. Only the first PR should merge, and it will then be backported to 1.22. I created a "/test pull-kubernetes-node-kubelet-serial-122" job for testing this against the 1.22 branch.
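For anyone wanting to run that test locally, a rough sketch of invoking it by focus from a kubernetes.git checkout (assumes a node E2E environment is already set up; the exact make variables can vary between branches):

# Run only the serial restart test referenced above.
make test-e2e-node FOCUS="Kubelet should correctly account for terminated pods after restart"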
Issue not fixed on 4.10.0-0.nightly-2021-10-08-050801, which is still in Ready state (not Accepted) as of 2021-10-08T05:08:01Z. It may not have the PR included yet. Waiting for the next build to check.

$ oc get po
NAME        READY   STATUS      RESTARTS   AGE
complete1   0/1     Completed   0          3m56s
complete2   0/1     Completed   0          3m51s
complete3   0/1     Completed   0          3m46s
complete4   0/1     Completed   0          3m40s
complete5   0/1     Completed   0          3m37s
complete6   0/1     Completed   0          3m32s
complete7   0/1     Completed   0          3m26s
complete8   0/1     Completed   0          3m21s
running1    1/1     Running     1          3m16s
running2    0/1     OutOfcpu    0          3m15s
running3    0/1     OutOfcpu    0          3m14s
running4    0/1     OutOfcpu    0          3m13s
running5    0/1     OutOfcpu    0          3m12s
running6    0/1     OutOfcpu    0          3m11s
Hey Weinan, the fix is not in 4.10.0-0.nightly-2021-10-08-050801 yet, thanks for checking! One way to check is to look at the git log of openshift/kubernetes and compare it with the kubelet version: you need a commit equal to or newer than 931224322c58da67eb8b3e9d4d3ff0e7dbf81cf2. You can get the kubelet version from the output of:

$ oc get no
NAME                                                    STATUS   ROLES    AGE     VERSION
jmencak-fxfd2-master-0.c.openshift-gce-devel.internal   Ready    master   5m27s   v1.22.1+4d7e196
jmencak-fxfd2-master-1.c.openshift-gce-devel.internal   Ready    master   5m39s   v1.22.1+4d7e196
jmencak-fxfd2-master-2.c.openshift-gce-devel.internal   Ready    master   5m40s   v1.22.1+4d7e196

4d7e196 indicates a kubelet build that does not have the fix.
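One way to script that check (a sketch, assuming a local clone of openshift/kubernetes; the short hash after the "+" in the kubelet version string is the build commit):

# Check whether the kubelet build commit already contains the fix commit.
git clone https://github.com/openshift/kubernetes.git
cd kubernetes
# 4d7e196 is the short hash from the VERSION column of `oc get no` (v1.22.1+4d7e196).
if git merge-base --is-ancestor 931224322c58da67eb8b3e9d4d3ff0e7dbf81cf2 4d7e196; then
  echo "fix is included"
else
  echo "fix is NOT included"
fi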
$ oc get no
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-128-201.us-east-2.compute.internal   Ready    worker   48m   v1.22.1+4d7e196
ip-10-0-142-115.us-east-2.compute.internal   Ready    master   57m   v1.22.1+4d7e196
ip-10-0-165-183.us-east-2.compute.internal   Ready    master   58m   v1.22.1+4d7e196
ip-10-0-206-28.us-east-2.compute.internal    Ready    master   57m   v1.22.1+4d7e196

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-10-08-090421   True        False         35m     Cluster version is 4.10.0-0.nightly-2021-10-08-090421

@Jiří, thanks! 4.10.0-0.nightly-2021-10-08-090421 does not have the fix yet.
*** Bug 2005647 has been marked as a duplicate of this bug. ***
*** Bug 2009092 has been marked as a duplicate of this bug. ***
Verified to be fixed.

$ ./complete.sh
pod/complete1 created
"Pending" "Pending" "Succeeded"
pod/complete2 created
"Pending" "Pending" "Succeeded"
pod/complete3 created
"Pending" "Pending" "Succeeded"
pod/complete4 created
"Pending" "Pending" "Pending" "Succeeded"
pod/complete5 created
"Pending" "Pending" "Succeeded"
pod/complete6 created
"Pending" "Pending" "Succeeded"
pod/complete7 created
"Pending" "Pending" "Succeeded"
pod/complete8 created
"Pending" "Pending" "Succeeded"
pod/running1 created
pod/running2 created
pod/running3 created
pod/running4 created
pod/running5 created
pod/running6 created

$ oc get po
NAME        READY   STATUS      RESTARTS   AGE
complete1   0/1     Completed   0          6m29s
complete2   0/1     Completed   0          6m23s
complete3   0/1     Completed   0          6m17s
complete4   0/1     Completed   0          6m11s
complete5   0/1     Completed   0          6m4s
complete6   0/1     Completed   0          5m58s
complete7   0/1     Completed   0          5m52s
complete8   0/1     Completed   0          5m46s
running1    1/1     Running     0          5m40s
running2    1/1     Running     0          5m39s
running3    0/1     Pending     0          5m38s
running4    0/1     Pending     0          5m37s
running5    0/1     Pending     0          5m36s
running6    0/1     Pending     0          5m35s

$ oc get no
NAME                                 STATUS   ROLES           AGE   VERSION
ci-ln-xxnd56k-f76d1-dx979-master-0   Ready    master,worker   26m   v1.22.1+9312243

[weinliu@rhel8 verification-tests]$ oc debug node/ci-ln-xxnd56k-f76d1-dx979-master-0
Starting pod/ci-ln-xxnd56k-f76d1-dx979-master-0-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.0.3
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# reboot

Removing debug pod ...
error: unable to delete the debug pod "ci-ln-xxnd56k-f76d1-dx979-master-0-debug": Delete https://api.ci-ln-xxnd56k-f76d1.origin-ci-int-gce.dev.openshift.com:6443/api/v1/namespaces/default/pods/ci-ln-xxnd56k-f76d1-dx979-master-0-debug: unexpected EOF

[weinliu@rhel8 verification-tests]$ oc get no
NAME                                 STATUS   ROLES           AGE   VERSION
ci-ln-xxnd56k-f76d1-dx979-master-0   Ready    master,worker   29m   v1.22.1+9312243

$ oc get po
NAME                                       READY   STATUS              RESTARTS   AGE
ci-ln-xxnd56k-f76d1-dx979-master-0-debug   0/1     Completed           1          4m17s
complete1                                  0/1     Completed           0          11m
complete2                                  0/1     Completed           0          11m
complete3                                  0/1     Completed           0          10m
complete4                                  0/1     Completed           0          10m
complete5                                  0/1     Completed           0          10m
complete6                                  0/1     Completed           0          10m
complete7                                  0/1     Completed           0          10m
complete8                                  0/1     Completed           0          10m
running1                                   0/1     ContainerCreating   1          10m
running2                                   0/1     ContainerCreating   1          10m
running3                                   0/1     Pending             0          10m
running4                                   0/1     Pending             0          10m
running5                                   0/1     Pending             0          10m
running6                                   0/1     Pending             0          10m

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-10-16-173656   True        False         14m     Cluster version is 4.10.0-0.nightly-2021-10-16-173656
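A side note on the reboot step above: the same reboot can also be triggered non-interactively from the workstation, e.g. (a sketch, using the node name from this cluster):

oc debug node/ci-ln-xxnd56k-f76d1-dx979-master-0 -- chroot /host systemctl reboot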
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056