Red Hat Bugzilla – Bug 1314270
Canceling a deployment doesn't cancel a deployment
Last modified: 2016-05-12 12:31:18 EDT
Created attachment 1132718
Description of problem:
I have a build config that pushes to an image stream, and a deploy config that uses the stream as a trigger. I started a build, then cancelled it immediately. It must have pushed the image before the cancellation, because the deploy config started a deployment, which was then cancelled automatically moments later. However, it still deployed a pod, and that pod is still running, even though the deploy config is in a "cancelled" state. It really is a strange situation, so I have attached an image for clarity.
Can you provide the yaml from that pod, its replication controller, and the deploymentconfig?
I have noticed cancellations lagging a bit, but eventually the cancelled deployment's pods are scaled down. Have you noticed anything different from this?
I haven't noticed any strange behavior other than old pods lagging for a while before being scaled down when a deployment is marked as cancelled (it may also be related to other bugs such as https://bugzilla.redhat.com/show_bug.cgi?id=1281286). I don't consider this a blocker bug, so I am marking it for the upcoming release.
This happens due to the resync interval of the deploymentconfig controller (currently 2 minutes). When a deployment is cancelled, its deploymentconfig is not resynced on the spot. There are two possible solutions: 1) reduce the dc controller resync interval, or 2) force a reconcile of the dc from the deployer controller right after the cancelled deployment is marked as failed. The latter option would immediately scale down the cancelled deployment, but it adds another update site for deployment configs (the deployer controller), which slightly increases the surface for update conflicts on dcs. The latter option is implemented in https://github.com/openshift/origin/pull/8147.
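The second option can be sketched as a toy model: when the deployer controller sees a deployment marked as cancelled, it enqueues the owning deploymentconfig for reconciliation immediately, rather than waiting for the periodic cache resync. All type and function names below are illustrative, not the actual origin code.

```go
package main

import "fmt"

// queue stands in for the dc controller's work queue.
type queue struct{ items []string }

func (q *queue) enqueue(key string) { q.items = append(q.items, key) }

// deployment is a minimal stand-in for a deployment (replication
// controller) owned by a deploymentconfig.
type deployment struct {
	name      string
	dcName    string // owning deploymentconfig
	cancelled bool
}

// handleDeployment models the deployer controller's status handler:
// on cancellation, it forces an immediate reconcile of the owning dc
// so the cancelled deployment is scaled down right away.
func handleDeployment(d deployment, dcQueue *queue) {
	if d.cancelled {
		dcQueue.enqueue(d.dcName)
	}
}

func main() {
	q := &queue{}
	handleDeployment(deployment{name: "frontend-2", dcName: "frontend", cancelled: true}, q)
	fmt.Println(q.items) // the dc "frontend" is queued without waiting for resync
}
```

Under option 1, by contrast, the worst-case delay before the cancelled deployment is scaled down stays coupled to the resync interval, which is why the forced enqueue was preferred.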
Commit pushed to master at https://github.com/openshift/origin
Bug 1314270: force dc reconciliation on canceled deployments
Force a deploymentconfig reconciliation when its running deployment
is canceled instead of relying on the deploymentconfig cache sync
interval for rolling back.
I confirmed on ami devenv_rhel_3849 with the following steps:
1. Start a build:
`oc process -f /data/src/github.com/openshift/origin/examples/sample-app/application-template-stibuild.json |oc create -f -`
2. Follow the build logs:
`oc build-logs ruby-sample-build-2`
3. When the image is pushed successfully, cancel the deployment immediately:
`oc deploy frontend --cancel`
The deployment was cancelled and the issue did not reproduce. Are my steps suitable?
> The deployment was cancelled and the issue did not reproduce. Are my steps suitable?
Yes, they are OK, but it would be nice to also test with an older complete deployment, so you can see the old deployment being scaled back up as soon as the new one is cancelled.
Confirmed on the latest OSE; the issue has been fixed.
[root@openshift-147 ~]# openshift version
latest: digest: sha256:d36d4166122ad206d23eb77a2a54db0a3e2e137c9ebdf0b666f468acaf82d6ca size: 89224
I0329 06:30:39.300583 1 sti.go:277] Successfully pushed 172.31.39.241:5000/zhouy/origin-ruby-sample:latest
[root@zhouy testjson]# oc deploy frontend --cancel
Cancelled deployment #2
[root@zhouy testjson]# oc get pods
NAME                        READY     STATUS             RESTARTS   AGE
database-1-izax4            1/1       Running            0          24m
frontend-1-53riz            1/1       Running            0          20m
frontend-1-y74en            1/1       Running            0          20m
frontend-2-deploy           0/1       DeadlineExceeded   0          <invalid>
frontend-2-hook-pre         0/1       DeadlineExceeded   0          <invalid>
ruby-sample-build-1-build   0/1       Completed          0          24m
ruby-sample-build-2-build   0/1       Completed          0          1m
[root@zhouy testjson]# oc get rc
NAME         DESIRED   CURRENT   AGE
database-1   1         1         24m
frontend-1   2         2         20m
frontend-2   0         0         <invalid>
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.