|Summary:||Changing the namespace of a static pod manifest may not ensure teardown of pod in old namespace|
|Product:||OpenShift Container Platform||Reporter:||Maru Newby <mnewby>|
|Component:||Node||Assignee:||Elana Hashman <ehashman>|
|Node sub component:||Kubelet||QA Contact:||Sunil Choudhary <schoudha>|
|Status:||CLOSED INSUFFICIENT_DATA||Docs Contact:|
|Last Closed:||2021-05-10 16:44:49 UTC||Type:||Bug|
Description Maru Newby 2021-05-03 17:59:20 UTC
Per https://bugzilla.redhat.com/show_bug.cgi?1956081, updating a static pod manifest so that it defines a pod in a different namespace did not appear to consistently tear down the old pod's containers. It would be useful to understand whether, and why, the kubelet would continue to run containers for a static pod that is no longer present in the manifest path.
Comment 1 Elana Hashman 2021-05-03 21:44:54 UTC
Just tried this with a test static pod manifest on upstream Kubernetes (HEAD = v1.22.0-alpha.1.22+19c7089245268e) with local-up-cluster and was unable to reproduce. When I changed the namespace of a static pod manifest, it deleted and recreated the pod with no issue:

from /tmp/kubelet.log
I0503 14:17:40.470278 46569 kubelet.go:1932] "SyncLoop ADD" source="file" pods=[kube-system/test-pod-127.0.0.1]
*changed pod manifest namespace*
I0503 14:20:56.117070 46569 kubelet.go:1932] "SyncLoop ADD" source="file" pods=[default/test-pod-127.0.0.1]
I0503 14:21:00.334319 46569 kubelet.go:1942] "SyncLoop REMOVE" source="file" pods=[kube-system/test-pod-127.0.0.1]

It's possible in this particular instance we hit a race condition. I'll keep this bug open in case we encounter this again but unless I can reproduce I won't be able to fix this. I can try to reproduce on OpenShift as well.
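For anyone trying to reproduce this, the SyncLoop entries quoted above can be checked mechanically. The following is a minimal sketch, not part of the kubelet or any official tooling: it assumes the log format shown in this comment, and the function name `pods_still_present` and the handling of multiple pods in one `pods=[...]` list (split on commas or whitespace) are assumptions of this example.

```python
import re

# Matches the kubelet log entries quoted in this comment, for example:
#   "SyncLoop ADD" source="file" pods=[kube-system/test-pod-127.0.0.1]
SYNC_RE = re.compile(r'"SyncLoop (ADD|REMOVE)" source="file" pods=\[([^\]]*)\]')


def pods_still_present(log_text):
    """Return namespace/name entries that were ADDed from the file source
    and never REMOVEd later in the log."""
    present = set()
    for op, pod_list in SYNC_RE.findall(log_text):
        # Assumption: several pods in one entry are separated by commas or spaces.
        for pod in re.split(r"[,\s]+", pod_list.strip()):
            if not pod:
                continue
            if op == "ADD":
                present.add(pod)
            else:
                present.discard(pod)
    return present


# The three log lines from this comment.
sample_log = (
    'I0503 14:17:40.470278 46569 kubelet.go:1932] "SyncLoop ADD" source="file" '
    "pods=[kube-system/test-pod-127.0.0.1]\n"
    'I0503 14:20:56.117070 46569 kubelet.go:1932] "SyncLoop ADD" source="file" '
    "pods=[default/test-pod-127.0.0.1]\n"
    'I0503 14:21:00.334319 46569 kubelet.go:1942] "SyncLoop REMOVE" source="file" '
    "pods=[kube-system/test-pod-127.0.0.1]\n"
)

remaining = pods_still_present(sample_log)
print(sorted(remaining))
```

On the log above, only the pod in the new namespace remains. Note that this only tracks what the sync loop reports; it says nothing about whether the old pod's containers were actually stopped in the runtime.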
Comment 2 Elana Hashman 2021-05-10 16:44:49 UTC
I haven't encountered any other reports/instances of this issue. Will close unless we can reproduce.
Comment 3 Maru Newby 2021-05-10 20:49:03 UTC
How many containers were present in the pod you tested with? The flaky behavior in question was triggered by a static pod with two containers. The first container was consistently shut down when the manifest was overwritten by a pod with a different namespace, but the second (insecure-readyz) would frequently remain running: https://github.com/openshift/cluster-kube-apiserver-operator/blob/master/bindata/bootkube/bootstrap-manifests/kube-apiserver-pod.yaml
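One way to confirm a leftover container after the namespace change is to inspect the runtime's container list for anything still labeled with the old namespace. The sketch below is illustrative only: it assumes JSON shaped like the output of `crictl ps -o json` (a top-level `containers` array whose entries carry a `labels` map) and the `io.kubernetes.*` labels the kubelet applies to containers. The namespace names, pod name, and first container name in the sample data are hypothetical.

```python
import json

# Labels the kubelet is assumed to set on each container it starts.
NS_LABEL = "io.kubernetes.pod.namespace"
POD_LABEL = "io.kubernetes.pod.name"
CTR_LABEL = "io.kubernetes.container.name"


def leftover_containers(runtime_json, old_namespace, pod_name):
    """Return the names of running containers that still belong to the pod
    in its old namespace."""
    data = json.loads(runtime_json)
    names = []
    for container in data.get("containers", []):
        labels = container.get("labels", {})
        if labels.get(NS_LABEL) == old_namespace and labels.get(POD_LABEL) == pod_name:
            names.append(labels.get(CTR_LABEL, "<unknown>"))
    return sorted(names)


# Hypothetical sample: the pod was moved from "old-ns" to "new-ns", but one
# container from the old pod is still listed by the runtime.
sample_json = json.dumps({
    "containers": [
        {"labels": {NS_LABEL: "new-ns", POD_LABEL: "example-pod-node1",
                    CTR_LABEL: "first-container"}},
        {"labels": {NS_LABEL: "new-ns", POD_LABEL: "example-pod-node1",
                    CTR_LABEL: "insecure-readyz"}},
        {"labels": {NS_LABEL: "old-ns", POD_LABEL: "example-pod-node1",
                    CTR_LABEL: "insecure-readyz"}},
    ]
})

leftovers = leftover_containers(sample_json, "old-ns", "example-pod-node1")
print(leftovers)
```

An empty result for the old namespace would indicate that teardown completed. A non-empty one matches the behavior described in this comment.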