Bug 1462067
| Field | Value | Field | Value |
|---|---|---|---|
| Summary | web console doesn't support grace-period option to delete pods | | |
| Product | OpenShift Container Platform | Reporter | Kenjiro Nakayama <knakayam> |
| Component | Management Console | Assignee | Samuel Padgett <spadgett> |
| Status | CLOSED ERRATA | QA Contact | XiaochuanWang <xiaocwan> |
| Severity | medium | Docs Contact | |
| Priority | unspecified | | |
| Version | 3.5.1 | CC | anli, aos-bugs, deads, hasha, jforrest, jokerman, knakayam, mmccomas, spadgett, xiaocwan, xtian, xxia |
| Target Milestone | --- | | |
| Target Release | --- | | |
| Hardware | Unspecified | | |
| OS | Unspecified | | |
| Whiteboard | | | |
| Fixed In Version | | Doc Type | Bug Fix |
| Doc Text | The web console previously did not give you the option to delete a pod with grace period 0. This prevented users from deleting pods in the console when they were stuck in the `Terminating` state. In the dialog confirmation prompt, the web console now shows a checkbox that lets users delete a pod immediately, without waiting for it to gracefully terminate. | | |
| Story Points | --- | | |
| Clone Of | | Environment | |
| Last Closed | 2017-08-10 05:28:09 UTC | Type | Bug |
| Regression | --- | Mount Type | --- |
| Documentation | --- | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | --- | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
| Embargoed | | | |
| Attachments | | | |

Description
Kenjiro Nakayama
2017-06-16 05:09:39 UTC
Commit pushed to master at https://github.com/openshift/origin-web-console

https://github.com/openshift/origin-web-console/commit/635ca9e0ae96809e1a5c87c8936065fea88e31bf
Bug 1462067 - Give options to delete pod without grace period

Give the user an option to delete pods immediately with no grace period in the delete pod confirmation dialog.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1462067

We added the option to force the pod to delete immediately in the pod Delete dialog.

Created attachment 1292785
message option shows when delete a pod

The option message is shown when deleting a terminating pod; please refer to the screenshot.
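
As a rough illustration of what the new checkbox changes, the sketch below builds the `DeleteOptions` body for a pod DELETE request. It is written in Python for brevity and is not the web console's actual code; the function name `build_delete_options` and the `delete_immediately` flag are hypothetical stand-ins for the dialog state. Only the field names and values come from the request bodies quoted in this bug.

```python
import json


def build_delete_options(delete_immediately: bool) -> dict:
    """Return a DeleteOptions body for a pod DELETE request.

    `delete_immediately` is a hypothetical name for the state of the
    "delete immediately" checkbox in the confirmation dialog.
    """
    options = {"kind": "DeleteOptions", "apiVersion": "v1"}
    if delete_immediately:
        # Grace period 0 asks for deletion without waiting for
        # graceful termination.
        options["gracePeriodSeconds"] = 0
    return options


if __name__ == "__main__":
    # Compact separators give the same shape as the bodies in the logs.
    print(json.dumps(build_delete_options(True), separators=(",", ":")))
    # {"kind":"DeleteOptions","apiVersion":"v1","gracePeriodSeconds":0}
```

When the checkbox is left unchecked, this sketch omits `gracePeriodSeconds` so that the pod's own termination grace period applies.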
Verified on OCP 3.6
OpenShift Master: v3.6.126.1
Kubernetes Master: v1.6.1+5115d708d7
Created attachment 1293165
console DELETE request with header of graceful
The console's request has the related payload, which is the same as the CLI's, as shown below; please refer to the attachment "console DELETE request with header of graceful":

    I0630 17:21:59.686284 12109 request.go:991] Request Body: {"kind":"DeleteOptions","apiVersion":"v1","gracePeriodSeconds":1}

Checked on
OpenShift Master: v3.6.128
Kubernetes Master: v1.6.1+5115d708d7

Since the CLI has been updated (please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1466720#c2), the web console needs an extra fix equivalent to the CLI's "--force" for the final deletion.

As far as I can tell, we're sending the same delete options as the CLI. The only difference is we're specifying Foreground propagation policy (which might be the problem?). Here is what I see from oc when I bump up the log level.

```
$ oc delete pod hello-openshift-1-jxqb8 -n hello --grace-period=0 --force
[...]
I0703 08:46:43.239622 34230 request.go:991] Request Body: {"kind":"DeleteOptions","apiVersion":"v1","gracePeriodSeconds":0}
I0703 08:46:43.239665 34230 round_trippers.go:386] curl -k -v -XDELETE -H "Authorization: Bearer 9rYqtcjBx118kb00ZFcD0LQqr_dFFAUtAPepu_GND5c" -H "User-Agent: oc/v1.6.1+5115d708d7 (darwin/amd64) kubernetes/314edd5" -H "Accept: application/json, */*" -H "Content-Type: application/json" https://127.0.0.1:8443/api/v1/namespaces/hello/pods/hello-openshift-1-jxqb8
I0703 08:46:43.252178 34230 round_trippers.go:405] DELETE https://127.0.0.1:8443/api/v1/namespaces/hello/pods/hello-openshift-1-jxqb8 200 OK in 12 milliseconds
```

Related CLI logic:
https://github.com/spadgett/origin/blob/fc34104d2fcdebd84209845e4fe640fd7014b2d3/vendor/k8s.io/kubernetes/pkg/kubectl/cmd/delete.go#L219-L229

(In reply to XiaochuanWang from comment #8)
> The console's request has the related payload, which is the same as the CLI's, as shown below;
> please refer to the attachment "console DELETE request with header of
> graceful":
> I0630 17:21:59.686284 12109 request.go:991] Request Body:
> {"kind":"DeleteOptions","apiVersion":"v1","gracePeriodSeconds":1}
>
> Checked on
> OpenShift Master: v3.6.128
> Kubernetes Master: v1.6.1+5115d708d7

Console is passing `gracePeriodSeconds: 0`, not `gracePeriodSeconds: 1`.

Console DELETE request body:

```
{kind: "DeleteOptions", apiVersion: "v1", propagationPolicy: "Foreground", gracePeriodSeconds: 0}
```

The CLI can terminate the pod promptly and successfully, for example with the steps below:

1. Put a pod in Terminating status (for example):
   1) oc run mypod --image=aosqe/hello-openshift --generator='run-pod/v1'
   2) Stop the node of the pod: systemctl stop atomic-openshift-node.service
   3) oc delete pod mypod

   Then the pod "mypod" is in Terminating status.
2. Force delete the pod by CLI:
   oc delete pod mypod --grace-period=0 --force # or use --now=true, which has same result

   Then the pod is removed immediately.

However, the console (v3.6.133) doesn't work; it still leaves the pod in Terminating status. Trying again with the CLI command, the pod now can't be removed. Something went wrong when terminating it from the console.
I'm not very sure about 'propagationPolicy: "Foreground"'; could you help look into it?

Do you have a yaml of one of the pods that is stuck terminating? Terminating pods seems to mean different things in different areas of our codebase.

Moving this to modified since the following PR removes propagation policy from the DELETE request. The web console should now be consistent with the CLI.

https://github.com/openshift/origin-web-console/pull/1792

There is a related issue remaining, however, that's not specific to the web console:

https://github.com/openshift/origin/issues/15044

@xiaocwan we would still like to get the YAML of the pods that are stuck Terminating, as requested in comment 13, so that we can try to do further debugging for issue 15044.

Created attachment 1295674
the pods that is stuck terminating
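
The comparison made earlier in this thread, between the body `oc` logs and the body the console sends, can be reproduced mechanically. The sketch below is only an illustration in Python: the helper name `diff_delete_options` is hypothetical, and the console body has been rewritten as strict JSON (the comment quotes it as a JavaScript object literal). Note that this helper treats a missing field and an explicit `null` as equal.

```python
import json

# Request bodies as quoted in the comments above. The console body is
# rewritten here as strict JSON.
CLI_BODY = '{"kind":"DeleteOptions","apiVersion":"v1","gracePeriodSeconds":0}'
CONSOLE_BODY = (
    '{"kind":"DeleteOptions","apiVersion":"v1",'
    '"propagationPolicy":"Foreground","gracePeriodSeconds":0}'
)


def diff_delete_options(left: str, right: str) -> dict:
    """Return {field: (left_value, right_value)} for fields that differ.

    A field missing on one side is reported with None for that side.
    """
    a, b = json.loads(left), json.loads(right)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }


if __name__ == "__main__":
    print(diff_delete_options(CLI_BODY, CONSOLE_BODY))
    # {'propagationPolicy': (None, 'Foreground')}
```

The only differing field is `propagationPolicy`, which matches the observation that both clients already send `gracePeriodSeconds: 0`.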
Information is attached. Compare the pod output yaml between the previous Terminating status and the stuck Terminating status:

    $ diff previousTerminatingPod.yaml keepTerminatingPod.yaml
    9,10c9,12
    < deletionGracePeriodSeconds: 30
    < deletionTimestamp: 2017-07-10T02:27:17Z
    ---
    > deletionGracePeriodSeconds: 0
    > deletionTimestamp: 2017-07-10T02:26:47Z
    > finalizers:
    > - foregroundDeletion
    13c15
    < resourceVersion: "6121"
    ---
    > resourceVersion: "6136"

Thanks, I was able to find something: . If you have a GC problem in the future, please attach the controller logs. I got a bunch of these in my reproduction:

```
E0710 08:14:16.467285 19016 garbagecollector.go:167] Error syncing item &garbagecollector.node{identity:garbagecollector.objectReference{OwnerReference:v1.OwnerReference{APIVersion:"v1", Kind:"Pod", Name:"mypod", UID:"3f670242-6569-11e7-8a94-28d2447dc82b", Controller:(*bool)(nil), BlockOwnerDeletion:(*bool)(nil)}, Namespace:"default"}, dependentsLock:sync.RWMutex{w:sync.Mutex{state:0, sema:0x0}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}, dependents:map[*garbagecollector.node]struct {}{}, deletingDependents:true, deletingDependentsLock:sync.RWMutex{w:sync.Mutex{state:0, sema:0x0}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}, beingDeleted:true, beingDeletedLock:sync.RWMutex{w:sync.Mutex{state:0, sema:0x0}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}, owners:[]v1.OwnerReference{}}: pods "mypod" is forbidden: unable to validate against any security context constraint: [fsGroup: Invalid value: []int64{1000090000}: 1000090000 is not an allowed group seLinuxOptions.level: Invalid value: "s0:c10,c0": seLinuxOptions.level on mypod does not match required level. Found s0:c10,c0, wanted s0:c5,c0 securityContext.runAsUser: Invalid value: 1000090000: UID on container mypod does not match required range. Found 1000090000, required min: 1000020000 max: 1000029999 seLinuxOptions.level: Invalid value: "s0:c10,c0": seLinuxOptions.level on mypod does not match required level. Found s0:c10,c0, wanted s0:c5,c0]
```

Link didn't paste. This pull here: https://github.com/openshift/origin/pull/15112

Created attachment 1298136
force delete succeed on console
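
The diff above shows what distinguishes the stuck pod: it has a `deletionTimestamp` and also carries the `foregroundDeletion` finalizer, and an object that is marked for deletion is not removed while its finalizer list is non-empty. The following Python sketch checks pod metadata for that condition. The function name `blocking_finalizers` is hypothetical, and the two dictionaries contain only the fields visible in the diff, not the full pod YAML from the attachment.

```python
def blocking_finalizers(metadata: dict) -> list:
    """Return the finalizers holding back deletion of an object.

    Returns an empty list when no delete has been requested
    (no deletionTimestamp) or when no finalizers are set.
    """
    if "deletionTimestamp" not in metadata:
        return []
    return list(metadata.get("finalizers", []))


# Only the metadata fields that appear in the diff above.
previous_terminating = {
    "deletionGracePeriodSeconds": 30,
    "deletionTimestamp": "2017-07-10T02:27:17Z",
    "resourceVersion": "6121",
}
stuck_terminating = {
    "deletionGracePeriodSeconds": 0,
    "deletionTimestamp": "2017-07-10T02:26:47Z",
    "finalizers": ["foregroundDeletion"],
    "resourceVersion": "6136",
}

if __name__ == "__main__":
    print(blocking_finalizers(previous_terminating))  # []
    print(blocking_finalizers(stuck_terminating))     # ['foregroundDeletion']
```

A check like this only identifies the finalizer; it does not explain why the garbage collector fails to clear it, which is what the controller log in this thread points at.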
Verified on the latest web console:
OpenShift Master: v3.6.144
Kubernetes Master: v1.6.1+5115d708d7
Please refer to the screenshot of the DELETE request on the console.

Just want confirmation of whether the fix is supported on Safari. I found that I still could not delete the terminating pod with the grace-period option in Safari.

It should work on Safari. Can you enable developer tools in Safari and check the network tab for the DELETE request to the pod? The request body should look like

    {"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":null,"gracePeriodSeconds":0}

and it should have status 200.

Created attachment 1307729
Safari developer tools network tab

Here is what the delete request should look like in developer tools
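
The manual check described above (inspect the DELETE request body and its status in the browser's network tab) can also be expressed as a small predicate. This is a Python sketch under assumptions: the function name `is_immediate_delete` is hypothetical, and the body string is whatever was copied out of developer tools. The expected values are the ones stated in the comment: `gracePeriodSeconds` of 0, no propagation policy, and HTTP status 200.

```python
import json


def is_immediate_delete(request_body: str, status: int) -> bool:
    """Check a captured pod DELETE request against the expected shape."""
    body = json.loads(request_body)
    return (
        status == 200
        and body.get("kind") == "DeleteOptions"
        and body.get("gracePeriodSeconds") == 0
        # Absent and null are both acceptable here; "Foreground" is not.
        and body.get("propagationPolicy") is None
    )


if __name__ == "__main__":
    captured = (
        '{"kind":"DeleteOptions","apiVersion":"v1",'
        '"propagationPolicy":null,"gracePeriodSeconds":0}'
    )
    print(is_immediate_delete(captured, 200))  # True
```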
@shahan Just confirming as well that this isn't a websocket issue in Safari, which can block websockets when using self-signed certificates.

@Samuel Padgett, it was my mistake; I can delete the terminating pod normally with the grace-period option in Safari. Thanks!

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716