Bug 1956455 - Changing the namespace of a static pod manifest may not ensure teardown of pod in old namespace [NEEDINFO]
Summary: Changing the namespace of a static pod manifest may not ensure teardown of pod in old namespace
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: Elana Hashman
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-05-03 17:59 UTC by Maru Newby
Modified: 2021-05-10 20:49 UTC
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-10 16:44:49 UTC
Target Upstream Version:
Flags: mnewby: needinfo? (ehashman)




Links
Red Hat Bugzilla 1956081 (urgent, ASSIGNED): kube-apiserver setup fail while installing SNO due to port being used (last updated 2021-05-14 18:29:58 UTC)

Description Maru Newby 2021-05-03 17:59:20 UTC
As per https://bugzilla.redhat.com/show_bug.cgi?1956081, updating a static pod manifest so that it defines a pod in a different namespace did not appear to consistently ensure teardown of the old pod's containers. It might be useful to understand if/why the kubelet would continue to run containers for a static pod that is no longer present in the manifest path.
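For illustration, a minimal sketch of the kind of manifest involved (file path, pod name, and image below are hypothetical); the operation at issue is editing only metadata.namespace in a file already present in the kubelet's static pod manifest path:

# Hypothetical static pod manifest, e.g. /etc/kubernetes/manifests/example-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
  namespace: kube-system   # edited in place to a different namespace, e.g. "default"
spec:
  containers:
  - name: main
    image: registry.example.com/example:latest

After such an edit, the kubelet is expected to treat kube-system/example-pod as removed and default/example-pod as added; the report here is that containers of the removed pod were sometimes left running.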

Comment 1 Elana Hashman 2021-05-03 21:44:54 UTC
Just tried this with a test static pod manifest on upstream Kubernetes (HEAD = v1.22.0-alpha.1.22+19c7089245268e) with local-up-cluster and was unable to reproduce.

When I changed the namespace of a static pod manifest, it deleted and recreated the pod with no issue:

from /tmp/kubelet.log

I0503 14:17:40.470278   46569 kubelet.go:1932] "SyncLoop ADD" source="file" pods=[kube-system/test-pod-127.0.0.1]

*changed pod manifest namespace*

I0503 14:20:56.117070   46569 kubelet.go:1932] "SyncLoop ADD" source="file" pods=[default/test-pod-127.0.0.1]
I0503 14:21:00.334319   46569 kubelet.go:1942] "SyncLoop REMOVE" source="file" pods=[kube-system/test-pod-127.0.0.1]

It's possible in this particular instance we hit a race condition.

I'll keep this bug open in case we encounter this again, but unless I can reproduce it I won't be able to fix this. I can try to reproduce on OpenShift as well.

Comment 2 Elana Hashman 2021-05-10 16:44:49 UTC
I haven't encountered any other reports/instances of this issue. Will close unless we can reproduce.

Comment 3 Maru Newby 2021-05-10 20:49:03 UTC
How many containers were present in the pod you tested with? The flaky behavior in question was triggered by a static pod containing 2 containers. The first container was consistently shut down when the manifest was overwritten with a pod in a different namespace, but the second (insecure-readyz) would frequently remain running:

https://github.com/openshift/cluster-kube-apiserver-operator/blob/master/bindata/bootkube/bootstrap-manifests/kube-apiserver-pod.yaml
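
For context, a rough two-container sketch along those lines (container layout only; images and other fields are placeholders, the linked manifest is the authoritative definition):

# Hypothetical approximation of the bootstrap kube-apiserver static pod linked above
apiVersion: v1
kind: Pod
metadata:
  name: bootstrap-kube-apiserver
  namespace: kube-system      # the field that was rewritten to a different namespace
spec:
  containers:
  - name: kube-apiserver
    image: registry.example.com/kube-apiserver:placeholder
  - name: insecure-readyz
    image: registry.example.com/insecure-readyz:placeholder

In the flaky case, the kube-apiserver container was shut down after the manifest was overwritten, but insecure-readyz often kept running.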

