Bug 1372425

Summary:

PreStop hooks are no longer blocking

Product:

OpenShift Container Platform

Reporter:

Matt Wringe <mwringe>

Component:

Node

Assignee:

Avesh Agarwal <avagarwa>

Status:

CLOSED NOTABUG

QA Contact:

DeShuai Ma <dma>

Severity:

unspecified

Docs Contact:

Priority:

unspecified

Version:

3.3.0

CC:

aos-bugs, ccoleman, decarr, jokerman, mmccomas, mwringe

Target Milestone:

---

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

If docs needed, set a value

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2016-09-01 19:12:36 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
sample pod template with rc	none

Description Matt Wringe 2016-09-01 16:51:55 UTC

Description of problem:
PreStop hooks are supposed to be blocking:

"PreStop
This hook is called immediately before a container is terminated. No parameters are passed to the handler. This event handler is blocking, and must complete before the call to delete the container is sent to the Docker daemon. The SIGTERM notification sent by Docker is also still sent. A more complete description of termination behavior can be found in Termination of Pods." [http://kubernetes.io/docs/user-guide/container-environment/]

But the pods are still being sent the SIGKILL signal after the timeout occurs, instead of the SIGTERM signal.

This can cause some serious problems, especially if you are relying on things like cleaning up disk usage on persistent storage using a preStop hook.

The PreStop hooks used to be blocking and functioning properly in 3.1.0. I am not sure about 3.2.0.

How reproducible:
Always

Steps to Reproduce:
1. Create a pod.
2. Add a blocking preStop hook to the pod (e.g. make sure it's catching the SIGTERM).
3. Notice that the pod is still killed after 30 seconds, even if the script has been configured to not be killable with a SIGTERM.

Actual results:
Pod is killed

Expected results:
Pod should not be killed until the script is finished

Additional info:
Simple preStop script which should cause the pod to never exit:

#!/bin/bash

# Print TRAP instead of exiting on SIGINT/SIGTERM; the handler also fires on EXIT.
trap 'echo TRAP' EXIT INT TERM

while :; do
  echo "foo"
  sleep 10
done
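
For contrast with the never-exiting script above, here is a minimal sketch of a hook that bounds its own cleanup work, which fits the conclusion reached later in this bug that a hook only gets the grace period to run. The function name bounded_cleanup, the 25-second budget, and the throwaway demo directory are all hypothetical and not taken from the attached template.

```shell
#!/bin/bash
# Sketch only: a cleanup routine that stops once its time budget is spent.
# bounded_cleanup and its arguments are hypothetical names for illustration.

# Remove entries under directory $1, but stop deleting after $2 seconds.
# Sets the global counters "removed" and "skipped".
bounded_cleanup() {
  local dir=$1 budget=$2
  local deadline=$(( SECONDS + budget ))
  local f
  removed=0
  skipped=0
  for f in "$dir"/*; do
    [ -e "$f" ] || continue              # glob did not match: nothing to do
    if [ "$SECONDS" -ge "$deadline" ]; then
      skipped=$(( skipped + 1 ))         # out of time: leave it in place
      continue
    fi
    rm -rf -- "$f"
    removed=$(( removed + 1 ))
  done
  echo "removed=$removed skipped=$skipped"
}

# Demo on a throwaway directory; 25 is an assumed budget chosen to stay
# under the 30-second default grace period mentioned in this bug.
demo_dir=$(mktemp -d)
touch "$demo_dir/a" "$demo_dir/b" "$demo_dir/c"
bounded_cleanup "$demo_dir" 25
```

A hook written this way returns on its own before the deadline, so whatever it could not finish is left in a known state rather than being cut off by a forced kill.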

Comment 1 Derek Carr 2016-09-01 17:16:47 UTC

Matt - is there a sample pod yaml that you can provide?

Avesh - can you attempt to reproduce?

Comment 2 Derek Carr 2016-09-01 17:24:30 UTC

See step 5: http://kubernetes.io/docs/user-guide/pods/#termination-of-pods

This is probably expected behavior (30-second grace period + 2 seconds).

Comment 3 Avesh Agarwal 2016-09-01 17:28:01 UTC

Working on reproducing this. Yes, a sample pod YAML would help if available.

Comment 4 Matt Wringe 2016-09-01 17:39:27 UTC

Created attachment 1196910 [details]
sample pod template with rc

Comment 5 Matt Wringe 2016-09-01 17:41:55 UTC

Sample pod template is attached.

Note that a SIGTERM is supposed to be applied to the script after 30 seconds, but not a SIGKILL. The script traps the SIGTERM and should prevent it from terminating the script.

This pod should essentially not be stoppable from OpenShift. If you want to stop it, you would then need to manually kill it with Docker.
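
The signal semantics at issue can be illustrated locally. The sketch below is not taken from this bug and involves no Kubernetes or Docker: it assumes only bash with the standard kill and sleep commands. It starts a stand-in process that ignores SIGTERM, waits a shortened grace period (DEMO_GRACE_SECONDS, a hypothetical variable standing in for the 30 seconds discussed here), and then sends SIGKILL.

```shell
#!/bin/bash
# Local simulation only: a process can ignore SIGTERM but not SIGKILL.
DEMO_GRACE_SECONDS=${DEMO_GRACE_SECONDS:-2}   # shortened stand-in for 30 s

# Stand-in for the preStop script: ignore TERM/INT and loop forever.
bash -c 'trap "" TERM INT; while :; do sleep 1; done' &
pid=$!
sleep 1                      # give the child time to install its trap

kill -TERM "$pid"            # polite request, which the child ignores
sleep "$DEMO_GRACE_SECONDS"  # the "grace period"

# kill -0 sends no signal; it only checks that the process still exists.
if kill -0 "$pid" 2>/dev/null; then
  survived_term=yes
else
  survived_term=no
fi

kill -KILL "$pid"            # cannot be trapped or ignored
status=0
wait "$pid" 2>/dev/null || status=$?   # 128 + 9 when killed by SIGKILL

echo "survived TERM: $survived_term, exit status after KILL: $status"
```

The stand-in survives the SIGTERM and is still running at the end of the grace period, but the SIGKILL ends it regardless of the trap. This matches the reported observation that the pod goes away after the timeout.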

Comment 6 Avesh Agarwal 2016-09-01 17:58:09 UTC

I think I am able to reproduce it as follows:

1. I modified the provided yaml to make it work with kube.
# cat test-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: prestop-test
spec:
  containers:
    - image: mwringe/prestop-test:latest
      name: prestop-test
      lifecycle:
        preStop:
          exec:
            command:
            - "/tmp/prestop.sh"


2. Created the pod: kubectl create -f test-pod.yaml 
3. Once the pod was running successfully, I deleted the pod explicitly:
kubectl delete -f test-pod.yaml

And I see that after a certain time, the pod disappears. So it does not block on the preStop script.


But per the link Derek provided: http://kubernetes.io/docs/user-guide/pods/#termination-of-pods

It clearly says that the only difference with having preStop hooks is that
the grace period is extended by 2 more seconds if needed; it does not say
anything about blocking on preStop hooks. So it seems to be working as expected.

Comment 7 Derek Carr 2016-09-01 19:12:36 UTC

This is working as designed.  

Allowing a pod to block deletion on a preStop hook would be a denial of service attack.

There is no event returned to the user when their preStop hook exceeds their grace period, while there is an event generated when their hook fails. We should unify the behavior to provide an event in either case.

Opened https://github.com/kubernetes/kubernetes/issues/31902 to track generation of an event in a future release, but closing this as not a bug.

Comment 8 Matt Wringe 2016-09-01 19:30:23 UTC

Preventing deletion of pods is exactly what the preStop hook is supposed to do, and it is how it used to work. The postStart scripts also prevented pods from being deleted until the postStart script passed.

This is a regression from past behaviour in 3.1.

Comment 9 Clayton Coleman 2016-09-01 19:38:45 UTC

This is the designed behavior of the system - the docs are wrong.

Comment 10 Clayton Coleman 2016-09-01 19:39:15 UTC

The system does not allow unbounded execution of any component, including pods. Set the grace period (terminationGracePeriodSeconds in the pod spec) to a long enough duration to allow your handler to complete.
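
As a sketch of that advice, the script below writes out the pod spec from Comment 6 with an explicit terminationGracePeriodSeconds. The value of 120 seconds and the POD_GRACE_SECONDS variable are assumptions for illustration; the right value depends on how long the handler actually needs.

```shell
#!/bin/bash
# Sketch: generate the Comment 6 pod spec with an explicit grace period.
# POD_GRACE_SECONDS is a hypothetical knob; 120 is an assumed value.
POD_GRACE_SECONDS=${POD_GRACE_SECONDS:-120}
pod_file=$(mktemp)

cat > "$pod_file" <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: prestop-test
spec:
  # Upper bound on graceful shutdown, including the preStop hook.
  terminationGracePeriodSeconds: ${POD_GRACE_SECONDS}
  containers:
    - image: mwringe/prestop-test:latest
      name: prestop-test
      lifecycle:
        preStop:
          exec:
            command:
            - "/tmp/prestop.sh"
EOF

cat "$pod_file"
```

Saved as test-pod.yaml, the output could be created and deleted with the same kubectl create -f and kubectl delete -f commands used in Comment 6. A hook that never returns, like the sample script in this bug, would still be killed once the longer period expires.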