Bug 1462067

Summary: web console doesn't support grace-period option to delete pods
Product: OpenShift Container Platform Reporter: Kenjiro Nakayama <knakayam>
Component: Management Console Assignee: Samuel Padgett <spadgett>
Status: CLOSED ERRATA QA Contact: XiaochuanWang <xiaocwan>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.5.1 CC: anli, aos-bugs, deads, hasha, jforrest, jokerman, knakayam, mmccomas, spadgett, xiaocwan, xtian, xxia
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
The web console previously did not give you the option to delete a pod with grace period 0. This prevented users from deleting pods in the console when they were stuck in the `Terminating` state. In the delete confirmation dialog, the web console now shows a checkbox that lets users delete a pod immediately, without waiting for it to gracefully terminate.
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-10 05:28:09 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description | Flags
message option shows when delete a pod | none
console DELETE request with header of graceful | none
the pods that is stuck terminating | none
force delete succeed on console | none
Safari developer tools network tab | none

Description Kenjiro Nakayama 2017-06-16 05:09:39 UTC
When pods are stuck in the Terminating state, users need to delete them with --grace-period=0 (or --now=true). The web console has no option to specify a grace period when deleting a pod.
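
For illustration, a minimal sketch of how a grace period maps onto the body of the DELETE request. The `kind`, `apiVersion`, and `gracePeriodSeconds` fields are the ones that appear in the request bodies logged later in this report; the helper name `delete_options` is hypothetical and is not taken from the CLI or the web console.

```python
import json


def delete_options(grace_period_seconds=None):
    """Build a v1 DeleteOptions body (hypothetical helper, for illustration only).

    Leaving grace_period_seconds as None omits the field, so the server
    falls back to the pod's own termination grace period.
    """
    body = {"kind": "DeleteOptions", "apiVersion": "v1"}
    if grace_period_seconds is not None:
        if grace_period_seconds < 0:
            raise ValueError("grace period must be zero or positive")
        body["gracePeriodSeconds"] = grace_period_seconds
    return body


# Body for an immediate delete, serialized compactly like the logged requests.
print(json.dumps(delete_options(0), separators=(",", ":")))
# prints {"kind":"DeleteOptions","apiVersion":"v1","gracePeriodSeconds":0}
```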

Comment 2 openshift-github-bot 2017-06-16 16:05:54 UTC
Commit pushed to master at https://github.com/openshift/origin-web-console

https://github.com/openshift/origin-web-console/commit/635ca9e0ae96809e1a5c87c8936065fea88e31bf
Bug 1462067 - Give options to delete pod without grace period

Give the user an option to delete pods immediately with no grace period
in the delete pod confirmation dialog.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1462067

Comment 3 Jessica Forrester 2017-06-16 16:18:14 UTC
We added the option to force the pod to delete immediately in the pod Delete dialog.

Comment 5 XiaochuanWang 2017-06-29 05:13:49 UTC
Created attachment 1292785 [details]
message option shows when delete a pod

The option message is shown when deleting a Terminating pod; please refer to the screenshot.

Verified on OCP 3.6
OpenShift Master:  v3.6.126.1
Kubernetes Master: v1.6.1+5115d708d7

Comment 7 XiaochuanWang 2017-06-30 09:25:54 UTC
Created attachment 1293165 [details]
console DELETE request with header of graceful

Comment 8 XiaochuanWang 2017-06-30 09:29:43 UTC
The console's request has the relevant payload, which is the same as the CLI's, as shown below; please refer to the attachment "console DELETE request with header of graceful":
I0630 17:21:59.686284   12109 request.go:991] Request Body: {"kind":"DeleteOptions","apiVersion":"v1","gracePeriodSeconds":1}

Checked on 
OpenShift Master:  v3.6.128
Kubernetes Master: v1.6.1+5115d708d7

Comment 9 XiaochuanWang 2017-07-03 01:52:02 UTC
The CLI has been updated (please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1466720#c2),
so the web console needs an extra fix equivalent to the CLI's "--force" for the final deletion.

Comment 10 Samuel Padgett 2017-07-03 13:02:03 UTC
As far as I can tell, we're sending the same delete options as the CLI. The only difference is we're specifying Foreground propagation policy (which might be the problem?).

Here is what I see from oc when I bump up the log level.

```
$ oc delete pod hello-openshift-1-jxqb8 -n hello --grace-period=0 --force

[...]

I0703 08:46:43.239622   34230 request.go:991] Request Body: {"kind":"DeleteOptions","apiVersion":"v1","gracePeriodSeconds":0}
I0703 08:46:43.239665   34230 round_trippers.go:386] curl -k -v -XDELETE  -H "Authorization: Bearer 9rYqtcjBx118kb00ZFcD0LQqr_dFFAUtAPepu_GND5c" -H "User-Agent: oc/v1.6.1+5115d708d7 (darwin/amd64) kubernetes/314edd5" -H "Accept: application/json, */*" -H "Content-Type: application/json" https://127.0.0.1:8443/api/v1/namespaces/hello/pods/hello-openshift-1-jxqb8
I0703 08:46:43.252178   34230 round_trippers.go:405] DELETE https://127.0.0.1:8443/api/v1/namespaces/hello/pods/hello-openshift-1-jxqb8 200 OK in 12 milliseconds
```

Related CLI logic: 

https://github.com/spadgett/origin/blob/fc34104d2fcdebd84209845e4fe640fd7014b2d3/vendor/k8s.io/kubernetes/pkg/kubectl/cmd/delete.go#L219-L229
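
As a rough sketch, the request that `oc` logs above can be rebuilt with the Python standard library. This only constructs the request object and does not send it. The function name and the `<token>` placeholder are illustrative assumptions; the URL layout, headers, and body mirror the log lines.

```python
import json
import urllib.request


def build_pod_delete_request(api_server, namespace, pod, token, grace_period_seconds=0):
    """Build (but do not send) a pod DELETE request carrying a DeleteOptions body."""
    url = f"{api_server.rstrip('/')}/api/v1/namespaces/{namespace}/pods/{pod}"
    body = json.dumps({
        "kind": "DeleteOptions",
        "apiVersion": "v1",
        "gracePeriodSeconds": grace_period_seconds,
    }).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json, */*",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="DELETE")


# Placeholder token; substitute a real bearer token before actually sending.
req = build_pod_delete_request(
    "https://127.0.0.1:8443", "hello", "hello-openshift-1-jxqb8", "<token>"
)
print(req.get_method(), req.full_url)
# prints DELETE https://127.0.0.1:8443/api/v1/namespaces/hello/pods/hello-openshift-1-jxqb8
```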

Comment 11 Samuel Padgett 2017-07-03 14:14:15 UTC
(In reply to XiaochuanWang from comment #8)
> Console's request has the related payload which is same as CLI as below,
> please refer to the attachment "console DELETE request with header of
> graceful":
> I0630 17:21:59.686284   12109 request.go:991] Request Body:
> {"kind":"DeleteOptions","apiVersion":"v1","gracePeriodSeconds":1}
> 
> Checked on 
> OpenShift Master:  v3.6.128
> Kubernetes Master: v1.6.1+5115d708d7

Console is passing `gracePeriodSeconds: 0`, not `gracePeriodSeconds: 1`.

Console DELETE request body:

```
{kind: "DeleteOptions", apiVersion: "v1", propagationPolicy: "Foreground", gracePeriodSeconds: 0}
```
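
To make the comparison with the CLI concrete, here is a small sketch that diffs the two request bodies quoted in this report. The helper `diff_delete_options` is hypothetical; the two dictionaries are copied from the `oc` log in comment 10 and the console body above.

```python
def diff_delete_options(cli, console):
    """Return {field: (cli_value, console_value)} for every field that differs.

    A field missing on one side is reported with None for that side.
    """
    missing = object()  # sentinel so an absent key never equals a real value
    keys = sorted(set(cli) | set(console))
    return {
        key: (cli.get(key), console.get(key))
        for key in keys
        if cli.get(key, missing) != console.get(key, missing)
    }


cli_body = {"kind": "DeleteOptions", "apiVersion": "v1", "gracePeriodSeconds": 0}
console_body = {
    "kind": "DeleteOptions",
    "apiVersion": "v1",
    "propagationPolicy": "Foreground",
    "gracePeriodSeconds": 0,
}

print(diff_delete_options(cli_body, console_body))
# prints {'propagationPolicy': (None, 'Foreground')}
```

The only field that differs is `propagationPolicy`, which matches the observation in comment 10.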

Comment 12 XiaochuanWang 2017-07-05 07:58:39 UTC
The CLI can terminate the pod promptly and successfully, for example with the steps below:

1. Put a pod in Terminating status (for example):
1) oc run mypod --image=aosqe/hello-openshift --generator='run-pod/v1'
2) Stop node of the pod: systemctl stop atomic-openshift-node.service
3) oc delete pod mypod
Then the pod "mypod" is in Terminating status

2. Force delete the pod by CLI:
oc delete pod mypod --grace-period=0 --force # or use --now=true, which has same result
Then the pod is removed immediately. 

However, the console (v3.6.133) doesn't work: it still leaves the pod in Terminating status. Trying again with the CLI command afterwards, the pod can no longer be removed. Something goes wrong when terminating it on the console.

I'm not sure about 'propagationPolicy: "Foreground"'; could you help look into it?

Comment 13 David Eads 2017-07-05 13:31:19 UTC
Do you have a YAML of one of the pods that is stuck terminating?  Terminating pods seem to mean different things in different areas of our codebase.

Comment 14 Samuel Padgett 2017-07-05 16:17:00 UTC
Moving this to modified since the following PR removes propagation policy from the DELETE request. The web console should now be consistent with the CLI.

https://github.com/openshift/origin-web-console/pull/1792

There is a related issue remaining, however, that's not specific to the web console:

https://github.com/openshift/origin/issues/15044

Comment 15 Jessica Forrester 2017-07-05 19:48:36 UTC
@xiaocwan we would still like to get the YAML of the pods that are stuck Terminating as requested in comment 13 so that we can try and do further debugging for issue 15044

Comment 17 XiaochuanWang 2017-07-10 02:30:22 UTC
Created attachment 1295674 [details]
the pods that is stuck terminating

Comment 18 XiaochuanWang 2017-07-10 02:33:30 UTC
Information is attached. Compare the pod YAML output between the earlier Terminating status and the stuck Terminating status:

$ diff previousTerminatingPod.yaml keepTerminatingPod.yaml
9,10c9,12
<     deletionGracePeriodSeconds: 30
<     deletionTimestamp: 2017-07-10T02:27:17Z
---
>     deletionGracePeriodSeconds: 0
>     deletionTimestamp: 2017-07-10T02:26:47Z
>     finalizers:
>     - foregroundDeletion
13c15
<     resourceVersion: "6121"
---
>     resourceVersion: "6136"
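
The diff above shows that the stuck pod carries a `foregroundDeletion` finalizer, and Kubernetes does not remove an object that is marked for deletion until its finalizer list is empty. A minimal sketch of that check follows. It assumes the diffed fields sit under `metadata` in the pod YAML, and the helper name `pending_finalizers` is hypothetical.

```python
def pending_finalizers(pod):
    """Return the finalizers still holding back a pod that is marked for deletion.

    An empty list means nothing in metadata.finalizers blocks removal
    (or the pod has not been marked for deletion at all).
    """
    meta = pod.get("metadata", {})
    if "deletionTimestamp" not in meta:
        return []
    return list(meta.get("finalizers") or [])


# Values copied from the two sides of the diff above.
previous_pod = {"metadata": {
    "deletionGracePeriodSeconds": 30,
    "deletionTimestamp": "2017-07-10T02:27:17Z",
}}
stuck_pod = {"metadata": {
    "deletionGracePeriodSeconds": 0,
    "deletionTimestamp": "2017-07-10T02:26:47Z",
    "finalizers": ["foregroundDeletion"],
}}

print(pending_finalizers(previous_pod))  # prints []
print(pending_finalizers(stuck_pod))     # prints ['foregroundDeletion']
```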

Comment 19 David Eads 2017-07-10 12:32:33 UTC
Thanks, I was able to find something: .

If you have a GC problem in the future, please attach the controller logs.  I got a bunch of these in my reproduction:

```
E0710 08:14:16.467285   19016 garbagecollector.go:167] Error syncing item &garbagecollector.node{identity:garbagecollector.objectReference{OwnerReference:v1.OwnerReference{APIVersion:"v1", Kind:"Pod", Name:"mypod", UID:"3f670242-6569-11e7-8a94-28d2447dc82b", Controller:(*bool)(nil), BlockOwnerDeletion:(*bool)(nil)}, Namespace:"default"}, dependentsLock:sync.RWMutex{w:sync.Mutex{state:0, sema:0x0}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}, dependents:map[*garbagecollector.node]struct {}{}, deletingDependents:true, deletingDependentsLock:sync.RWMutex{w:sync.Mutex{state:0, sema:0x0}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}, beingDeleted:true, beingDeletedLock:sync.RWMutex{w:sync.Mutex{state:0, sema:0x0}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}, owners:[]v1.OwnerReference{}}: pods "mypod" is forbidden: unable to validate against any security context constraint: [fsGroup: Invalid value: []int64{1000090000}: 1000090000 is not an allowed group seLinuxOptions.level: Invalid value: "s0:c10,c0": seLinuxOptions.level on mypod does not match required level.  Found s0:c10,c0, wanted s0:c5,c0 securityContext.runAsUser: Invalid value: 1000090000: UID on container mypod does not match required range.  Found 1000090000, required min: 1000020000 max: 1000029999 seLinuxOptions.level: Invalid value: "s0:c10,c0": seLinuxOptions.level on mypod does not match required level.  Found s0:c10,c0, wanted s0:c5,c0]
```

Comment 20 David Eads 2017-07-10 12:33:02 UTC
Link didn't paste.  This pull here: https://github.com/openshift/origin/pull/15112

Comment 21 XiaochuanWang 2017-07-14 06:30:53 UTC
Created attachment 1298136 [details]
force delete succeed on console

Comment 22 XiaochuanWang 2017-07-14 06:32:21 UTC
Verified on latest web console:
OpenShift Master:      v3.6.144
Kubernetes Master:     v1.6.1+5115d708d7 

Please refer to the screenshot of the DELETE request on the console.

Comment 23 shahan 2017-07-31 08:27:09 UTC
I just want confirmation of whether the fix is supported on Safari. I found that I still could not delete the terminating pod with the grace-period option in Safari.

Comment 24 Samuel Padgett 2017-08-01 17:10:05 UTC
It should work on Safari. Can you enable developer tools in Safari and check the network tab for the DELETE request to the pod? The request body should look like

{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":null,"gracePeriodSeconds":0}

and it should have status 200.
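
A quick way to check a body copied from the network tab is sketched below. The function name `is_immediate_delete` is hypothetical, and treating a null or absent `propagationPolicy` as acceptable is an assumption based on the body shown above and on comment 14.

```python
import json


def is_immediate_delete(raw_body):
    """True if a captured DELETE body asks for grace period 0 with no propagation policy."""
    body = json.loads(raw_body)
    return (
        body.get("kind") == "DeleteOptions"
        and body.get("gracePeriodSeconds") == 0
        and body.get("propagationPolicy") is None
    )


captured = '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":null,"gracePeriodSeconds":0}'
print(is_immediate_delete(captured))  # prints True
```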

Comment 25 Samuel Padgett 2017-08-01 17:56:58 UTC
Created attachment 1307729 [details]
Safari developer tools network tab

Here is what the delete request should look like in developer tools

Comment 26 Samuel Padgett 2017-08-01 17:58:04 UTC
@shahan Just confirming as well this isn't a websocket issue in Safari, which can block websockets when using self-signed certificates.

Comment 27 shahan 2017-08-03 10:37:49 UTC
@Samuel Padgett, it was my mistake; I can delete the terminating pod normally with the grace-period option in Safari. Thanks!

Comment 29 errata-xmlrpc 2017-08-10 05:28:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716