Description of problem:
After cancelling a jenkinsPipeLine build, its stages keep running and eventually complete.

Version-Release number of selected component (if applicable):
openshift v1.3.0-alpha.2+8d2b7b2
kubernetes v1.3.0+57fb9ac
etcd 2.3.0+git

How reproducible:
Always

Steps to Reproduce:
1. Preparation
$ oc create -f https://raw.githubusercontent.com/openshift/origin/master/examples/image-streams/image-streams-centos7.json -n openshift
$ oc create -f https://raw.githubusercontent.com/openshift/origin/master/examples/jenkins/pipeline/jenkinstemplate.json -n openshift
2. Create jenkinsPipeLine bc (and build)
$ oc new-app -f https://raw.githubusercontent.com/openshift/origin/master/examples/jenkins/pipeline/samplepipeline.json
3. Start build
$ oc start-build sample-pipeline
4. Cancel build, e.g. just after the 1st stage begins running
$ oc cancel-build sample-pipeline-1
5. Check build and stages in CLI or web console
1> $ oc get build sample-pipeline-1
2> Check build and deploy stages
$ oc get pod
3> Check in web console

Actual results:
5. 1> The build is cancelled
NAME                TYPE              FROM   STATUS      STARTED         DURATION
sample-pipeline-1   JenkinsPipeline          Cancelled   6 minutes ago   2m3s
2> The build and deploy stages keep running and eventually complete.
NAME                        READY   STATUS      RESTARTS   AGE
...
frontend-2-358l7            1/1     Running     0          1m
frontend-2-zzkpt            1/1     Running     0          1m
ruby-sample-build-1-build   0/1     Completed   0          2m
...
3> See attachment

Expected results:
5. 2> and 3> The stages should be cancelled in cascade with the jenkinsPipeLine build

Additional info:
Created attachment 1181411 [details] jenkinspipeline_build_stages_not_cancelled
I've also opened this as a github issue against the sync plugin: https://github.com/fabric8io/openshift-jenkins-sync-plugin/issues/96
Created attachment 1183595 [details] snapshot1
Created attachment 1183598 [details] snapshot2
Tested with:
openshift v1.3.0-alpha.2+5c862c0
kubernetes v1.3.0+57fb9ac
etcd 2.3.0+git

See results in attachments: In snapshot1, the pipeline is cancelled. In snapshot2, the remaining stages still go ahead.
BTW, the version in comment 5 is latest AMI devenv-rhel7_4656
Xingxing is this still an issue? it looks like the associated github issue was closed based on feedback from you:
https://github.com/fabric8io/openshift-jenkins-sync-plugin/issues/96
https://trello.com/c/ntM08mDF/868-8-pipeline-job-synchronizer#comment-5791eecd3aa58fb35923b878

"@jdyson Yes, cancel build in jenkins webconsole will waste a while but will sync to openshift cli successfully."
Hi Ben Parees,

(In reply to Ben Parees from comment #7)
> Xingxing is this still an issue? it looks like the associated github issue
> was closed based on feedback from you:

It was not feedback from me, but that doesn't matter :) (it seems it was from xiuwang).

I tested in devenv-rhel7_4849 today. Still reproduced, but I found some extra info:
Cancelling the pipeline build cannot stop the build stage, as reported above. But if the cancellation is issued during, e.g., a "sleep 30" step, the cancellation does stop the build stage (and the whole pipeline, of course). Details follow.

After `oc new-app -f https://raw.githubusercontent.com/openshift/origin/master/examples/jenkins/pipeline/samplepipeline.json` and once the jenkins pod is running, edit bc sample-pipeline like this:

node('maven') {
  stage 'build'
  sleep 30
  openshiftBuild(buildConfig: 'ruby-sample-build', showBuildLogs: 'true')
  stage 'deploy'
  sleep 30
  openshiftDeploy(deploymentConfig: 'frontend')
}

Then try:

A. Cancel the pipeline build while it is at "sleep 30": the whole job is stopped (not just shown as "Cancelled"), see att1. Its logs on the jenkins web console are:

OpenShift Build xxia-proj/sample-pipeline-3
[Pipeline] node
Running on maven-77b8ed73799 in /tmp/workspace/sample-pipeline
[Pipeline] {
[Pipeline] stage (build)
Entering stage build
Proceeding
[Pipeline] sleep
Aborted by Jenkins Admin
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
Finished: ABORTED

B. (Same as comment 0 in fact, just with jenkins logs pasted) Cancel the pipeline build while it is building ruby-sample-build: the whole pipeline build is not stopped (though it shows "Cancelled" in CLI / OpenShift web console), see att2.
Its logs on jenkins web console are:

OpenShift Build xxia-proj/sample-pipeline-6
[Pipeline] node
Running on maven-a236ecd0f04 in /tmp/workspace/sample-pipeline
[Pipeline] {
[Pipeline] stage (build)
Entering stage build
Proceeding
[Pipeline] sleep
[Pipeline] openshiftBuild
Starting the "Trigger OpenShift Build" step with build config "ruby-sample-build" from the project "xxia-proj".
Started build "ruby-sample-build-4" and waiting for build completion ...
Downloading "https://github.com/openshift/ruby-hello-world.git" ...
---> Installing application source ...
---> Building your Ruby application from source ...
---> Running 'bundle install --deployment' ...
Fetching gem metadata from https://rubygems.org/..........
Installing rake 10.3.2
Installing i18n 0.6.11
Installing json 1.8.3
Installing minitest 5.4.2
Installing thread_safe 0.3.4
Installing tzinfo 1.2.2
Installing activesupport 4.1.7
Installing builder 3.2.2
Installing activemodel 4.1.7
Installing arel 5.0.1.20140414130214
Installing activerecord 4.1.7
Installing mysql2 0.3.16
Installing rack 1.5.2
Installing rack-protection 1.5.3
Installing tilt 1.4.1
Installing sinatra 1.4.5
Installing sinatra-activerecord 2.0.3
Using bundler 1.7.8
Your bundle is complete!
It was installed into ./bundle
---> Cleaning up unused ruby gems ...
Aborted by anonymous
Aborted by anonymous
Running post commit hook ...
/opt/rh/rh-ruby22/root/usr/bin/ruby -I"lib" -I"/opt/app-root/src/bundle/ruby/gems/rake-10.3.2/lib" "/opt/app-root/src/bundle/ruby/gems/rake-10.3.2/lib/rake/rake_test_loader.rb" "test/*_test.rb"
Run options: --seed 47730
# Running:
.
Finished in 0.002358s, 424.1641 runs/s, 424.1641 assertions/s.
1 runs, 1 assertions, 0 failures, 0 errors, 0 skips
Pushing image 172.30.56.231:5000/xxia-proj/origin-ruby-sample:latest ...
Pushed 3/10 layers, 30% complete
Pushed 4/10 layers, 40% complete
Pushed 5/10 layers, 50% complete
Pushed 6/10 layers, 60% complete
Pushed 7/10 layers, 70% complete
Pushed 8/10 layers, 80% complete
Pushed 9/10 layers, 90% complete
Pushed 10/10 layers, 100% complete
Push successful
Exiting "Trigger OpenShift Build" successfully; build "ruby-sample-build-4" has completed with status: [Complete].
[Pipeline] stage (deploy)
Entering stage deploy
Proceeding
[Pipeline] sleep
Click here to forcibly terminate running steps
Click here to forcibly terminate running steps
[Pipeline] openshiftDeploy
Starting "Trigger OpenShift Deployment" with deployment config "frontend" from the project "xxia-proj".
Exiting "Trigger OpenShift Deployment" successfully; deployment "frontend-8" has completed with status: [Complete].
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
Finished: ABORTED
Created attachment 1191552 [details] att1
Created attachment 1191553 [details] att2
I wonder if this is related to https://issues.jenkins-ci.org/browse/JENKINS-34637. We should try with updated pipeline plugins.
I've tried to fix this, but can't seem to. I've tried to force abort the job via the Jenkins console but that doesn't work either so it's unrelated to the sync plugin. There are two possibilities: 1. It's a jenkins bug that you can't force close build steps. 2. The build steps have to somehow be abortable. I'm not sure if this is possible but it would seem remiss if there was no way to interrupt any generic step.
@jimmi can you open a bug against jenkins/pipeline for this and we'll see what they say?
So I've done some more digging & it looks like it's the build steps swallowing InterruptedExceptions. Ultimately the openshiftBuild step runs in a separate thread. When a build is cancelled that thread is interrupted & in Java it is the responsibility of the running thread to notice that it has been interrupted by regularly checking Thread.currentThread().isInterrupted() or by catching InterruptedException if sleeping. Both of these indicate that the step should be interrupted (cancelled) & cleanup should take place. As you can see in IOpenShiftBuilder.waitOnBuild (https://github.com/openshift/jenkins-plugin/blob/master/src/main/java/com/openshift/jenkins/plugins/pipeline/model/IOpenShiftBuilder.java#L111-L171) InterruptedExceptions are swallowed & not handled properly, plus there are no checks for isInterrupted in the loops. Without this there is no way for the step to be cancelled. @bparees As this isn't something to do with the sync plugin I'd like this to be reassigned appropriately please. I'm also keeping https://github.com/fabric8io/openshift-jenkins-sync-plugin/issues/96 closed as it is not related to sync plugin.
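To illustrate the point about interruption, here is a minimal sketch of a polling loop that stays cancellable. It is not the plugin's code: the class name InterruptibleWait, the method waitFor and its parameters are hypothetical, chosen only for this example. The two things it shows are the explicit interrupt check inside the loop and letting InterruptedException from Thread.sleep propagate instead of catching and ignoring it.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch; the names here are illustrative and not taken from the plugin.
final class InterruptibleWait {

    /**
     * Polls the condition until it is true or the timeout elapses.
     * Returns true if the condition was met, false on timeout.
     * Throws InterruptedException as soon as an interrupt is noticed,
     * so the caller can stop the step and clean up.
     */
    static boolean waitFor(BooleanSupplier done, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            // Covers interrupts that arrive while the thread is not sleeping.
            if (Thread.interrupted()) {
                throw new InterruptedException("wait cancelled");
            }
            if (done.getAsBoolean()) {
                return true;
            }
            // No catch-and-ignore here: an interrupt during sleep propagates to the caller.
            Thread.sleep(pollMillis);
        }
        return false;
    }
}
```

A caller that must catch InterruptedException for other reasons would typically restore the flag with Thread.currentThread().interrupt() before returning, so that code further up the stack can still see the cancellation.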
@jimmi cool, thanks for the investigation.
The question will be whether it is worth driving this down to the openshift-restclient-java level. @Jeff Cantril - any benefit to the eclipse client in adding an onInterrupted() method to com.openshift.restclient.capability.resources.IPodLogRetrievalAsync.IPodLogListener ?
Forgot to cc: Jeff beforehand :-) @Jeff Cantril - see my question in #Comment 16 - thanks
Realized I should clarify ... I'll be adding the protection in our calls to sleep minimally ... just curious if we want to address further down the stack as well.
Jeff and I talked on IRC ... bottom line, "no" to surfacing this semantic in the rest client. Long term, as part of pipeline-plugin 2.0, we'll look at leveraging a restclient watch with a future for the max wait condition.
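As a rough idea of what "a watch with a future for the max wait condition" could look like, here is a sketch using only java.util.concurrent. Everything in it is an assumption for illustration: BuildWatcher, onPhase and awaitTerminalPhase are hypothetical names, no real openshift-restclient-java watch API is used, and the phase strings are taken to be the terminal build phases. The point is that a blocking get with a timeout replaces the sleep loop and is interruptible by itself.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; no real rest client API is used here.
final class WatchWithFuture {

    /** Stand-in for a watch listener: a watch would call onPhase for each build event. */
    static final class BuildWatcher {
        final CompletableFuture<String> terminalPhase = new CompletableFuture<>();

        void onPhase(String phase) {
            // Assumed terminal phases; the future completes on the first one seen.
            if (phase.equals("Complete") || phase.equals("Failed")
                    || phase.equals("Error") || phase.equals("Cancelled")) {
                terminalPhase.complete(phase);
            }
        }
    }

    /**
     * Blocks until the watcher reports a terminal phase or maxWaitMillis elapses.
     * Returns the phase, or "TimedOut" if the max wait was reached.
     * An interrupt while waiting surfaces as InterruptedException.
     */
    static String awaitTerminalPhase(BuildWatcher watcher, long maxWaitMillis)
            throws InterruptedException {
        try {
            return watcher.terminalPhase.get(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "TimedOut";
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

With this shape there is no hand-written loop in which an interrupt could be swallowed; cancelling the Jenkins build interrupts the waiting thread and the wait ends immediately.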
The pipeline-plugin has been updated via https://github.com/openshift/jenkins-plugin/pull/106

v1.0.30 is being cut on the jenkins download center as I type. When ready, we'll merge https://github.com/openshift/jenkins/pull/182

I'll move this bug to on qa when the jenkins centos images on docker hub are updated.
The jenkins-1-centos7 image is updated with v1.0.30 of the pipeline plugin and is up on docker hub. There has been an issue pushing the jenkins-2-centos7 image from ci.openshift. I'll track whether that changes and update this bug accordingly, but QE should be able to verify using the jenkins-1-centos7 image, so I'm moving the bug to their attention.
Tested with docker.io/openshift/jenkins-1-centos7@sha256:34c35866bb6dc9ddfbe098b35590313d0b3a1774e22ff716f5126b39d97be3da
openshift-pipeline 1.0.30
openshift-sync 0.0.14

openshift v3.4.0.19+346a31d
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

Actual result:
After cancelling the pipeline build, the nodejs-mongodb-example-1 build is also cancelled, but only after a delay of several minutes.
$ oc get build
nodejs-mongodb-example-1   Source            Git@69b359b   Cancelled   4 minutes ago   1m30s
nodejs-mongodb-example-2   Source            Git           Cancelled   2 minutes ago
nodejs-mongodb-example-3   Source            Git           Cancelled
sample-pipeline-1          JenkinsPipeline                 Cancelled   2 minutes ago   1s
FYI I've also added some more cancellation stuff into the sync plugin so even if steps don't handle the cancellation properly the Jenkins build will still be cancelled. This does mean that async steps that e.g. start other OpenShift builds will continue but that the Jenkins build itself will be cancelled as will the OpenShift build that caused the Jenkins build to be triggered.
That sounds great, thanks Jimmi!