Bug 1390491 - [devexp_public_878]job displays successful when setting the wrong "the container in which to execute the command" in openshift exec build of jenkins
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: ImageStreams
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Justin Pierce
QA Contact: Wang Haoran
URL:
Whiteboard:
Depends On: 1387992 1390417
Blocks:
Reported: 2016-11-01 08:50 UTC by wewang
Modified: 2017-03-08 18:43 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Lack of clarity in what constituted "failure" in exec action. Also an NPE in certain exec invocations.
Consequence:
Fixes: https://github.com/openshift/jenkins-plugin/pull/105 https://github.com/openshift/jenkins-plugin/pull/107
Result: Modified behavior and improved documentation. The new behavior should be as follows:
- Timeouts/failures/errors should cause traditional builds to fail.
- DSL jobs will only fail because of exec timeouts. This means DSL scripts can check for non-timeout errors/failures programmatically and handle them accordingly.
Clone Of: 1390417
Environment:
Last Closed: 2017-03-08 18:31:10 UTC
Target Upstream Version:


Comment 1 wewang 2016-11-01 09:01:30 UTC
Also, in a DSL job, setting a wrong command in the script, such as command: 'sleep1', still results in a successful build.

Comment 2 Justin Pierce 2016-11-01 21:45:00 UTC
Success in your log means that the exec API was triggered and returned. The result of that API call is printed, but the Jenkins step did not try to interpret it.

This PR should improve the behavior: https://github.com/openshift/jenkins-plugin/pull/105

The new behavior should be as follows:
- Timeouts/failures/errors should cause traditional builds to fail.
- DSL jobs will only fail because of exec timeouts. This means DSL scripts can check for non-timeout errors/failures programmatically and handle them accordingly.
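
The two rules above can be sketched as a small decision function. This is only an illustrative model of the described behavior, not the plugin's actual code: the class `ExecOutcomePolicy`, its enums, and the method `failsBuild` are hypothetical names introduced here.

```java
/** Hypothetical model of the exec outcome rules described above; not the plugin's real code. */
final class ExecOutcomePolicy {
    /** Kind of Jenkins job running the exec step (names are assumptions). */
    enum JobType { FREESTYLE, PIPELINE_DSL }

    /** Coarse result of an exec invocation (names are assumptions). */
    enum ExecOutcome { SUCCESS, ERROR, FAILURE, TIMEOUT }

    /** Returns true when the outcome should mark the build as failed. */
    static boolean failsBuild(JobType jobType, ExecOutcome outcome) {
        if (outcome == ExecOutcome.SUCCESS) {
            return false;
        }
        if (jobType == JobType.FREESTYLE) {
            // Traditional builds fail on any timeout, failure, or error.
            return true;
        }
        // DSL jobs fail only on timeouts; other problems are left to the script.
        return outcome == ExecOutcome.TIMEOUT;
    }

    public static void main(String[] args) {
        // Print the full decision table for both job types.
        for (JobType jobType : JobType.values()) {
            for (ExecOutcome outcome : ExecOutcome.values()) {
                System.out.printf("%-12s %-8s -> build fails: %b%n",
                        jobType, outcome, failsBuild(jobType, outcome));
            }
        }
    }
}
```

Under this model, a DSL job that hits a non-timeout error still finishes successfully unless the script itself inspects the result and decides to fail.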

Comment 3 Dongbo Yan 2016-11-03 10:03:50 UTC
Tested with docker.io/openshift/jenkins-1-centos7@sha256:34c35866bb6dc9ddfbe098b35590313d0b3a1774e22ff716f5126b39d97be3da

openshift v3.4.0.19+346a31d
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

1. For a freestyle job, when the wrong value is set in "The pod in which to execute the command", the job failed with this output:
Starting "OpenShift Exec" with project "dyan7".
ERROR: Build step failed with exception
java.lang.NullPointerException
	at com.openshift.jenkins.plugins.pipeline.model.IOpenShiftExec.coreLogic(IOpenShiftExec.java:48)
	at com.openshift.jenkins.plugins.pipeline.model.IOpenShiftPlugin.doItCore(IOpenShiftPlugin.java:303)
	at com.openshift.jenkins.plugins.pipeline.model.IOpenShiftPlugin.doIt(IOpenShiftPlugin.java:316)
	at com.openshift.jenkins.plugins.pipeline.OpenShiftBaseStep.perform(OpenShiftBaseStep.java:81)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:782)
	at hudson.model.Build$BuildExecution.build(Build.java:205)
	at hudson.model.Build$BuildExecution.doRun(Build.java:162)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
	at hudson.model.Run.execute(Run.java:1738)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:98)
	at hudson.model.Executor.run(Executor.java:410)
Build step 'OpenShift Exec' marked build as failure
Finished: FAILURE

2. When an existing pod is set, the job still failed with the same error log.

3. For the DSL step, the namespace must be defined. The job build succeeded, but the output contained an error like:
Starting "OpenShift Exec" with project "dyan7".
Operation will timeout after 180000 milliseconds
Connection opened for exec operation
stdout> 
stdout> rpc error: code = 13 desc = invalid header field value "oci runtime error: exec failed: container_linux.go:247: starting container process caused \"exec: \\\"echo ok\\\": executable file not found in $PATH\"\n"

ERROR: Error during exec: command terminated with non-zero exit code: Error executing in Docker Container: 126
Connection closed for exec operation [1000]: 


Exiting "OpenShift Exec" unsuccessfully; the API response included an error or failure.
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
Finished: SUCCESS

Comment 4 Justin Pierce 2016-11-03 15:23:53 UTC
(In reply to Dongbo Yan from comment #3)

For #1 and #2: PR https://github.com/openshift/jenkins-plugin/pull/107 fixes the NullPointerException.

For #3, it appears you have written DSL like the following:
    command: 'echo ok'
Instead of passing 'ok' as an argument, this instructs the runtime to execute a binary named 'echo ok', which does not exist.
There are three valid DSL forms to accomplish what you are trying to execute:
    command : [ 'echo', 'ok' ]
OR
    command : 'echo',
    arguments : [ 'ok' ]
OR
    command : 'echo',
    arguments : [ [ value : 'ok' ] ]
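
To illustrate why 'echo ok' is treated as a single binary name, here is a minimal sketch of how a command and its arguments could be assembled into an argument vector. It assumes the command string is used verbatim as the first element with no shell-style splitting, which is consistent with the "executable file not found" error in comment #3. The class `ExecArgv` and the method `buildArgv` are hypothetical names, not part of the plugin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper showing argv assembly without shell splitting; not the plugin's real code. */
final class ExecArgv {
    /** Builds an argument vector: the command is element 0, followed by the arguments as given. */
    static List<String> buildArgv(String command, List<String> arguments) {
        List<String> argv = new ArrayList<>();
        argv.add(command);       // used verbatim, so "echo ok" stays one token
        argv.addAll(arguments);  // each argument is its own token
        return argv;
    }

    public static void main(String[] args) {
        // One token: the runtime would look for a binary literally named "echo ok".
        List<String> wrong = buildArgv("echo ok", Collections.<String>emptyList());
        // Two tokens: binary "echo" with the single argument "ok".
        List<String> right = buildArgv("echo", Arrays.asList("ok"));

        System.out.println(wrong.size() + " token(s): " + wrong);
        System.out.println(right.size() + " token(s): " + right);
    }
}
```

The first call yields a one-element vector, while the second yields the binary name and its argument as separate elements, matching the second DSL form above.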

Comment 5 Gabe Montero 2016-11-03 18:13:15 UTC
v1.0.31 of openshift-pipeline is cooking, and it has Justin's fix for the NPE.

Once it is ready, I'll work on updating the jenkins-*-centos7 images with this version of the plugin, and report back here when they are available on Docker Hub.

Comment 6 Gabe Montero 2016-11-03 22:25:31 UTC
OK, the jenkins-*-centos7 images on Docker Hub now include v1.0.31 of openshift-pipeline, which includes Justin's latest version of the fix for this.

QE should be able to attempt to verify.

Comment 7 wewang 2016-11-04 02:37:47 UTC
Tested with docker.io/openshift/jenkins-1-centos7  ba0552050043
OpenShift Pipeline Jenkins Plugin  v1.0.31

a. Tested in a freestyle job; it works now. If the wrong container is set, the build fails with:

Starting "OpenShift Exec" with project "wewang".
Operation will timeout after 180000 milliseconds
ERROR: Failure during exec: Expected HTTP 101 response but was '400 Bad Request'
Exiting "OpenShift Exec" unsuccessfully; the API response included an error or failure.
ERROR: "OpenShift Exec" failed
Finished: FAILURE

b. But in a DSL job, if the wrong container is set, the build history still shows the build as successful.

Starting "OpenShift Exec" with project "wewang".
Operation will timeout after 180000 milliseconds
ERROR: Failure during exec: Expected HTTP 101 response but was '400 Bad Request'

Exiting "OpenShift Exec" unsuccessfully; the API response included an error or failure.
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
Finished: SUCCESS

scripts:

Comment 8 Justin Pierce 2016-11-04 12:19:57 UTC
(In reply to wewang from comment #7)

Thanks. That is the expected behavior. Any exec error during a freestyle job marks the job as failed. In a pipeline DSL script, errors returned by the API can be handled by the script itself, so they are not treated as a build failure.
The differing outcomes are described in this comment: https://github.com/openshift/jenkins-plugin/pull/105#issuecomment-257867055

They are also described in the README.MD documentation changes from PR 105: https://github.com/openshift/jenkins-plugin/pull/105/files

Comment 9 wewang 2016-11-07 02:14:52 UTC
Thanks. I have updated the test cases with the expected results, so the bug is fixed.

