Bug 2078387 - Pod is deleted without considering terminationGracePeriodSeconds when the container restart fails
Summary: Pod is deleted without considering terminationGracePeriodSeconds when the container restart fails
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.10
Hardware: x86_64
OS: Linux
medium
unspecified
Target Milestone: ---
Target Release: ---
Assignee: Sai Ramesh Vanka
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-04-25 07:58 UTC by MinLi
Modified: 2022-11-24 09:27 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-16 10:57:39 UTC
Target Upstream Version:
Embargoed:



Description MinLi 2022-04-25 07:58:47 UTC
Description of problem:
The pod is deleted without considering terminationGracePeriodSeconds when the container fails to restart after a liveness probe failure.

Version-Release number of selected component (if applicable):
4.10.0-0.nightly-2022-04-24-083512

How reproducible:


Steps to Reproduce:
1. Create a pod with a liveness probe.
# oc create -f testpod_probe.yaml 
pod/test-pod created

testpod_probe.yaml :
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  annotations:
    "unsupported.do-not-use.openshift.io/override-liveness-grace-period-seconds": "10"
spec:
  terminationGracePeriodSeconds: 500
  containers:
    - name: test
      image: nginx
      imagePullPolicy: IfNotPresent
      command: [bash, -c, "sleep 100000000"]
      ports:
        - containerPort: 8080
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        failureThreshold: 1
        periodSeconds: 60

2. Wait for a while; the liveness probe fails and the container is restarted within 10 seconds (the override grace period).
# oc get pod
NAME                                            READY   STATUS    RESTARTS      AGE
test-pod                                        1/1     Running   3 (15s ago)   3m25s

# oc describe pod test-pod
Name:         test-pod
Namespace:    default
Priority:     0
Node:         ip-10-0-149-123.us-east-2.compute.internal/10.0.149.123
Start Time:   Mon, 25 Apr 2022 03:20:29 -0400
Labels:       <none>
...
Containers:
  test:
    Container ID:  cri-o://22bf49750397e98911546b242d9f29038779f3327bb873550dee1194491480d3
    Image:         nginx
    Image ID:      docker.io/library/nginx@sha256:61face6bf030edce7ef6d7dd66fe452298d6f5f7ce032afdd01683ef02b2b841
    Port:          8080/TCP
    Host Port:     0/TCP
    Command:
      bash
      -c
      sleep 100000000
    State:          Running
      Started:      Mon, 25 Apr 2022 03:23:40 -0400
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Mon, 25 Apr 2022 03:22:40 -0400
      Finished:     Mon, 25 Apr 2022 03:23:39 -0400
    Ready:          True
    Restart Count:  3
    Liveness:       http-get http://:8080/healthz delay=0s timeout=1s period=60s #success=1 #failure=1
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8hqt9 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  kube-api-access-8hqt9:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute for 300s
                             node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason          Age                  From                                                 Message
  ----     ------          ----                 ----                                                 -------
  Normal   Scheduled       <unknown>                                                                 Successfully assigned default/test-pod to ip-10-0-149-123.us-east-2.compute.internal
  Normal   AddedInterface  3m42s                multus                                               Add eth0 [10.131.0.30/23] from openshift-sdn
  Normal   Pulled          3m38s                kubelet, ip-10-0-149-123.us-east-2.compute.internal  Successfully pulled image "nginx" in 3.810190275s
  Normal   Pulled          2m33s                kubelet, ip-10-0-149-123.us-east-2.compute.internal  Successfully pulled image "nginx" in 440.530889ms
  Normal   Pulled          93s                  kubelet, ip-10-0-149-123.us-east-2.compute.internal  Successfully pulled image "nginx" in 512.481887ms
  Warning  Unhealthy       44s (x3 over 2m44s)  kubelet, ip-10-0-149-123.us-east-2.compute.internal  Liveness probe failed: Get "http://10.131.0.30:8080/healthz": dial tcp 10.131.0.30:8080: connect: connection refused
  Normal   Killing         44s (x3 over 2m44s)  kubelet, ip-10-0-149-123.us-east-2.compute.internal  Container test failed liveness probe, will be restarted
  Normal   Pulling         34s (x4 over 3m42s)  kubelet, ip-10-0-149-123.us-east-2.compute.internal  Pulling image "nginx"
  Normal   Created         33s (x4 over 3m38s)  kubelet, ip-10-0-149-123.us-east-2.compute.internal  Created container test
  Normal   Started         33s (x4 over 3m38s)  kubelet, ip-10-0-149-123.us-east-2.compute.internal  Started container test
  Normal   Pulled          33s                  kubelet, ip-10-0-149-123.us-east-2.compute.internal  Successfully pulled image "nginx" in 489.994881ms


3. Delete the pod:
# oc delete pod test-pod

Actual results:
3. The pod is deleted immediately, without applying terminationGracePeriodSeconds.

Expected results:
3. The pod is deleted only after the 500-second terminationGracePeriodSeconds elapses.

Additional info:
The kubelet log shows:
Apr 25 03:47:37 ip-10-0-205-106 hyperkube[1378]: I0425 03:47:37.526449    1378 kubelet.go:2217] "SyncLoop (probe)" probe="liveness" status="unhealthy" pod="default/test-pod"
Apr 25 03:47:37 ip-10-0-205-106 hyperkube[1378]: I0425 03:47:37.526884    1378 kuberuntime_manager.go:710] "Message for Container of pod" containerName="test" containerStatusID={Type:cri-o ID:09400e0c5ea9b0cdf885ff4807bae6bfef98cfb938b954e4f65482f95de0501c} pod="default/test-pod" containerMessage="Container test failed liveness probe, will be restarted"
Apr 25 03:47:37 ip-10-0-205-106 hyperkube[1378]: I0425 03:47:37.526921    1378 kuberuntime_container.go:729] "Killing container with a grace period" pod="default/test-pod" podUID=57fcf2de-9c4c-4b1c-8772-f352049e8ae6 containerName="test" containerID="cri-o://09400e0c5ea9b0cdf885ff4807bae6bfef98cfb938b954e4f65482f95de0501c" gracePeriod=10
Apr 25 03:47:41 ip-10-0-205-106 hyperkube[1378]: I0425 03:47:41.654486    1378 kubelet.go:2123] "SyncLoop DELETE" source="api" pods=[default/test-pod]
Apr 25 03:47:47 ip-10-0-205-106 hyperkube[1378]: E0425 03:47:47.706769    1378 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"test\" with CrashLoopBackOff: \"back-off 1m20s restarting failed container=test pod=test-pod_default(57fcf2de-9c4c-4b1c-8772-f352049e8ae6)\"" pod="default/test-pod" podUID=57fcf2de-9c4c-4b1c-8772-f352049e8ae6
Apr 25 03:47:48 ip-10-0-205-106 hyperkube[1378]: I0425 03:47:48.346105    1378 kubelet.go:2145] "SyncLoop (PLEG): event for pod" pod="default/test-pod" event=&{ID:57fcf2de-9c4c-4b1c-8772-f352049e8ae6 Type:ContainerDied Data:09400e0c5ea9b0cdf885ff4807bae6bfef98cfb938b954e4f65482f95de0501c}
Apr 25 03:47:49 ip-10-0-205-106 hyperkube[1378]: I0425 03:47:49.348929    1378 kubelet.go:2145] "SyncLoop (PLEG): event for pod" pod="default/test-pod" event=&{ID:57fcf2de-9c4c-4b1c-8772-f352049e8ae6 Type:ContainerDied Data:eef2a463e5cc918a3b726c8224f96119a677e94ff696a49b950209bbd20b8f8c}
Apr 25 03:47:49 ip-10-0-205-106 hyperkube[1378]: I0425 03:47:49.371989    1378 kubelet.go:2123] "SyncLoop DELETE" source="api" pods=[default/test-pod]
Apr 25 03:47:49 ip-10-0-205-106 hyperkube[1378]: I0425 03:47:49.380464    1378 kubelet.go:2117] "SyncLoop REMOVE" source="api" pods=[default/test-pod]

Comment 1 MinLi 2022-04-25 08:01:09 UTC
Indeed, sometimes the pod deletion does honour the configured terminationGracePeriodSeconds, as expected.
The logs show:

Apr 25 07:23:29 ip-10-0-149-123 hyperkube[1387]: I0425 07:23:29.424151    1387 kubelet.go:2217] "SyncLoop (probe)" probe="liveness" status="unhealthy" pod="default/test-pod"
Apr 25 07:23:29 ip-10-0-149-123 hyperkube[1387]: I0425 07:23:29.424624    1387 kuberuntime_manager.go:710] "Message for Container of pod" containerName="test" containerStatusID={Type:cri-o ID:9845a358af080b2e197aaca44b296efc25d14b28db19f152a323a87d86da5009} pod="default/test-pod" containerMessage="Container test failed liveness probe, will be restarted"
Apr 25 07:23:29 ip-10-0-149-123 hyperkube[1387]: I0425 07:23:29.424662    1387 kuberuntime_container.go:729] "Killing container with a grace period" pod="default/test-pod" podUID=1887263e-b2a9-4316-9ca0-fa9cb23ba272 containerName="test" containerID="cri-o://9845a358af080b2e197aaca44b296efc25d14b28db19f152a323a87d86da5009" gracePeriod=10
Apr 25 07:23:40 ip-10-0-149-123 hyperkube[1387]: I0425 07:23:40.259102    1387 kubelet.go:2145] "SyncLoop (PLEG): event for pod" pod="default/test-pod" event=&{ID:1887263e-b2a9-4316-9ca0-fa9cb23ba272 Type:ContainerDied Data:9845a358af080b2e197aaca44b296efc25d14b28db19f152a323a87d86da5009}
Apr 25 07:23:40 ip-10-0-149-123 hyperkube[1387]: I0425 07:23:40.259126    1387 kubelet.go:2145] "SyncLoop (PLEG): event for pod" pod="default/test-pod" event=&{ID:1887263e-b2a9-4316-9ca0-fa9cb23ba272 Type:ContainerStarted Data:22bf49750397e98911546b242d9f29038779f3327bb873550dee1194491480d3}
Apr 25 07:24:18 ip-10-0-149-123 hyperkube[1387]: I0425 07:24:18.209380    1387 kubelet.go:2123] "SyncLoop DELETE" source="api" pods=[default/test-pod]
Apr 25 07:24:18 ip-10-0-149-123 hyperkube[1387]: I0425 07:24:18.209611    1387 kuberuntime_container.go:729] "Killing container with a grace period" pod="default/test-pod" podUID=1887263e-b2a9-4316-9ca0-fa9cb23ba272 containerName="test" containerID="cri-o://22bf49750397e98911546b242d9f29038779f3327bb873550dee1194491480d3" gracePeriod=500
Apr 25 07:32:39 ip-10-0-149-123 hyperkube[1387]: I0425 07:32:39.300273    1387 kubelet.go:2145] "SyncLoop (PLEG): event for pod" pod="default/test-pod" event=&{ID:1887263e-b2a9-4316-9ca0-fa9cb23ba272 Type:ContainerDied Data:22bf49750397e98911546b242d9f29038779f3327bb873550dee1194491480d3}
Apr 25 07:32:39 ip-10-0-149-123 hyperkube[1387]: I0425 07:32:39.300294    1387 kubelet.go:2145] "SyncLoop (PLEG): event for pod" pod="default/test-pod" event=&{ID:1887263e-b2a9-4316-9ca0-fa9cb23ba272 Type:ContainerDied Data:81fb4b66c527fae7d9aa3bdf34bd7d97c3ae17efdd93ca7c4bde1c33ab5cb229}
Apr 25 07:32:39 ip-10-0-149-123 hyperkube[1387]: I0425 07:32:39.343402    1387 kubelet.go:2123] "SyncLoop DELETE" source="api" pods=[default/test-pod]
Apr 25 07:32:39 ip-10-0-149-123 hyperkube[1387]: I0425 07:32:39.343445    1387 kubelet.go:2117] "SyncLoop REMOVE" source="api" pods=[default/test-pod]

Comment 2 Sai Ramesh Vanka 2022-05-13 08:34:50 UTC
Hello @minmli,

Whenever the liveness probe fails, the corresponding container is restarted and the following log can be observed:
Log: "Killing container with a grace period" pod="default/test-pod" podUID=1887263e-b2a9-4316-9ca0-fa9cb23ba272 containerName="test" containerID="cri-o://9845a358af080b2e197aaca44b296efc25d14b28db19f152a323a87d86da5009" gracePeriod=10

The above "gracePeriod=10" looks to be derived from the pod's annotation "unsupported.do-not-use.openshift.io/override-liveness-grace-period-seconds": "10" from the following piece of code [1]
Reference LOC:
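
For illustration only, here is a minimal Go sketch of that precedence as I read it (my own simplified code, not the referenced kubelet source; the function name resolveKillGracePeriod is made up):

// Sketch (assumption, not the kubelet code in [1]): the override annotation,
// when present, wins over the pod's terminationGracePeriodSeconds.
package main

import (
	"fmt"
	"strconv"
)

const overrideAnnotation = "unsupported.do-not-use.openshift.io/override-liveness-grace-period-seconds"

// resolveKillGracePeriod returns the grace period (seconds) used to kill a
// container after a liveness probe failure.
func resolveKillGracePeriod(annotations map[string]string, terminationGracePeriodSeconds *int64) int64 {
	// If the override annotation is present and parses, it takes precedence.
	if v, ok := annotations[overrideAnnotation]; ok {
		if secs, err := strconv.ParseInt(v, 10, 64); err == nil {
			return secs
		}
	}
	// Otherwise fall back to the pod's terminationGracePeriodSeconds.
	if terminationGracePeriodSeconds != nil {
		return *terminationGracePeriodSeconds
	}
	// Kubernetes default grace period.
	return 30
}

func main() {
	tgps := int64(500)
	withOverride := map[string]string{overrideAnnotation: "10"}
	fmt.Println(resolveKillGracePeriod(withOverride, &tgps)) // 10 -> matches gracePeriod=10 in the probe-kill log
	fmt.Println(resolveKillGracePeriod(nil, &tgps))          // 500 -> matches gracePeriod=500 in comment 1
}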

When the container inside a pod is continuously restarting, the kubelet, according to the container restart policy [2], restarts the exited container with an exponential back-off delay of 10s, 20s, 40s, 80s, 160s, capped at a maximum of 5 minutes (300s); during this time the Pod is in the "CrashLoopBackOff" state.
So, when a pod is in the CrashLoopBackOff state, I think there is no need to honour the "terminationGracePeriodSeconds" field, as there is no container process up and running.
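
For illustration, a minimal Go sketch of that back-off progression (my own simplified code, not the kubelet implementation):

// Sketch of the restart back-off described above: start at 10s, double after
// each failed restart, cap at 5 minutes.
package main

import (
	"fmt"
	"time"
)

func restartBackoff(previousFailures int) time.Duration {
	const (
		initial = 10 * time.Second
		max     = 5 * time.Minute
	)
	delay := initial
	for i := 0; i < previousFailures; i++ {
		delay *= 2
		if delay >= max {
			return max
		}
	}
	return delay
}

func main() {
	// Prints: 10s 20s 40s 1m20s 2m40s 5m0s 5m0s
	for i := 0; i < 7; i++ {
		fmt.Print(restartBackoff(i), " ")
	}
	fmt.Println()
}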

Hence, when a Pod is in the "Running" state (which means the container is also running), the "terminationGracePeriodSeconds" field is honoured; when a Pod is in the "CrashLoopBackOff" state (which means the container has not started), the field is not honoured.
The same behaviour is observed in your case and is expected.


[1] https://github.com/openshift/kubernetes/blob/master/pkg/kubelet/kuberuntime/kuberuntime_container.go#L698-L701
[2] https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy

Comment 3 MinLi 2022-05-16 10:15:22 UTC
Hi Sai Ramesh Vanka,

Thanks for your explanation. I agree it is acceptable that when a Pod is in the "CrashLoopBackOff" state, terminationGracePeriodSeconds can be ignored for the delete operation.


