Bug 1314270
| Summary: | Canceling a deployment doesn't cancel a deployment |
|---|---|
| Product: | OpenShift Container Platform |
| Component: | openshift-controller-manager |
| Version: | 3.1.0 |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Status: | CLOSED ERRATA |
| Severity: | medium |
| Priority: | unspecified |
| Reporter: | Alexander Koksharov <akokshar> |
| Assignee: | Michail Kargakis <mkargaki> |
| QA Contact: | zhou ying <yinzhou> |
| CC: | akokshar, aos-bugs, mkargaki, tdawson |
| Target Milestone: | --- |
| Target Release: | --- |
| Type: | Bug |
| Doc Type: | Bug Fix |
| Last Closed: | 2016-05-12 16:31:18 UTC |
| Attachments: | cancelled_deployment_still_a_pod.PNG (attachment 1132718) |
Can you provide the yaml from that pod, its replication controller, and the deploymentconfig? I can see cancellations lagging a bit, but eventually the cancelled deployment's pods are scaled down. Have you noticed something different from this?

I haven't noticed any strange behavior other than the fact that old pods lag for a while before being scaled down when a deployment is marked as cancelled (it may also be related to other bugs such as https://bugzilla.redhat.com/show_bug.cgi?id=1281286). I don't consider this a blocker bug, so I am marking it for the upcoming release.

This happens due to the resync interval of the deploymentconfig controller (currently 2 minutes). When a deployment is cancelled, its deploymentconfig is not resynced on the spot. There are two possible solutions: 1) reduce the dc controller resync interval, or 2) force a reconcile of the dc from the deployer controller right after the cancelled deployment is marked as failed. The latter option immediately scales down the cancelled deployment, but it adds another update site for deployment configs (the deployer controller), which slightly increases the update-conflict surface for dcs. The latter option is implemented in https://github.com/openshift/origin/pull/8147 (a sketch of this approach follows the verification exchange below).

Commit pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/cd5302abc821dd23d5df1e6bf53e9fe576e82886

Bug 1314270: force dc reconciliation on canceled deployments

Force a deploymentconfig reconciliation when its running deployment is canceled instead of relying on the deploymentconfig cache sync interval for rolling back.

I verified on AMI devenv_rhel_3849 (openshift v1.1.4-296-g8e98dcc, kubernetes v1.2.0-36-g4a3f9c5, etcd 2.2.5) with the following steps:

1. Start a build: `oc process -f /data/src/github.com/openshift/origin/examples/sample-app/application-template-stibuild.json | oc create -f -`
2. Follow the build logs: `oc build-logs ruby-sample-build-2`
3. As soon as the image is pushed successfully, cancel the deployment: `oc deploy frontend --cancel`

The deployment was cancelled and the issue did not reproduce. Are these steps suitable?

> The deployment was cancelled and the issue did not reproduce. Are these steps suitable?
Yes, they are OK, but it would be nice to also test with an older complete deployment, so you can watch the old deployment being scaled back up as soon as the new one is cancelled.
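For context on the fix, here is a minimal, self-contained sketch of the "force a reconcile" idea in the client-go workqueue style. The annotation keys, the `rcMeta` stand-in, and the handler name are illustrative assumptions, not the actual Origin code from the PR above.

```go
package main

import (
	"fmt"

	"k8s.io/client-go/util/workqueue"
)

// rcMeta is a minimal stand-in for the deployer replication controller's
// metadata; in a real controller this would come from an informer event.
type rcMeta struct {
	Namespace   string
	Annotations map[string]string
}

const (
	// Approximate annotation keys; Origin's real constants may differ.
	cancelledAnnotation = "openshift.io/deployment.cancelled"
	dcNameAnnotation    = "openshift.io/deployment-config.name"
)

// onDeployerRCUpdate is a hypothetical update handler for deployer RCs.
// When a deployment is marked cancelled, it enqueues the owning
// deploymentconfig immediately instead of waiting for the periodic
// cache resync (2 minutes at the time of this bug).
func onDeployerRCUpdate(queue workqueue.Interface, rc rcMeta) {
	if rc.Annotations[cancelledAnnotation] != "true" {
		return
	}
	dcName, ok := rc.Annotations[dcNameAnnotation]
	if !ok {
		return
	}
	// namespace/name key, the usual convention for controller workqueues
	queue.Add(fmt.Sprintf("%s/%s", rc.Namespace, dcName))
}

func main() {
	queue := workqueue.New()
	onDeployerRCUpdate(queue, rcMeta{
		Namespace: "zhouy",
		Annotations: map[string]string{
			cancelledAnnotation: "true",
			dcNameAnnotation:    "frontend",
		},
	})
	key, _ := queue.Get()
	fmt.Println("reconcile:", key) // prints: reconcile: zhouy/frontend
}
```

The dc worker that pops the key then runs the normal reconcile, which scales the cancelled deployment down (and an older complete deployment back up) right away. This is also why the approach adds one more writer to deployment configs and slightly widens the update-conflict surface, as noted above.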
Confirmed on the latest OSE; the issue is fixed.

```
[root@openshift-147 ~]# openshift version
openshift v3.2.0.8
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5
```

After the build pushed the image, the new deployment was cancelled and scaled down immediately:

```
latest: digest: sha256:d36d4166122ad206d23eb77a2a54db0a3e2e137c9ebdf0b666f468acaf82d6ca size: 89224
I0329 06:30:39.300583       1 sti.go:277] Successfully pushed 172.31.39.241:5000/zhouy/origin-ruby-sample:latest

[root@zhouy testjson]# oc deploy frontend --cancel
Cancelled deployment #2

[root@zhouy testjson]# oc get pods
NAME                        READY     STATUS             RESTARTS   AGE
database-1-izax4            1/1       Running            0          24m
frontend-1-53riz            1/1       Running            0          20m
frontend-1-y74en            1/1       Running            0          20m
frontend-2-deploy           0/1       DeadlineExceeded   0          <invalid>
frontend-2-hook-pre         0/1       DeadlineExceeded   0          <invalid>
ruby-sample-build-1-build   0/1       Completed          0          24m
ruby-sample-build-2-build   0/1       Completed          0          1m

[root@zhouy testjson]# oc get rc
NAME         DESIRED   CURRENT   AGE
database-1   1         1         24m
frontend-1   2         2         20m
frontend-2   0         0         <invalid>
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:1064
Created attachment 1132718 [details]
cancelled_deployment_still_a_pod.PNG

Description of problem:

I have a build config that pushes to an image stream, and a deployment config that uses that stream as a trigger. I started a build, then cancelled it immediately. The build must have successfully pushed the image before the cancellation, because the deployment config started a new deployment, which was then cancelled automatically moments later. However, it still deployed a pod, and that pod is still running, despite the deployment being in a "Cancelled" state. It really is a strange situation, so I have attached a screenshot for clarity.