Created attachment 1512501 [details]
sample declarative pipeline

Description of problem:

We have verified that declarative pipelines work with jenkins-client-plugin 1.0.12 and 1.0.16, but they do not work with 1.0.17, 1.0.18, 1.0.19, 1.0.20, 1.0.21, or 1.0.22 (we tested each of these versions).

The error we get on the broken versions looks like:

```
[BFA] Scanning build for known causes...
[BFA] No failure causes found
[BFA] Done. 0s
java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
	at java.lang.ProcessImpl.start(ProcessImpl.java:134)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from 10.128.4.1/10.128.4.1:58840
		at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1741)
		at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)
		at hudson.remoting.Channel$2.adapt(Channel.java:990)
		at hudson.remoting.Channel$2.adapt(Channel.java:986)
		at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
		at com.openshift.jenkins.plugins.util.ClientCommandRunner.run(ClientCommandRunner.java:148)
		at com.openshift.jenkins.plugins.pipeline.OcAction$Execution.run(OcAction.java:198)
		at com.openshift.jenkins.plugins.pipeline.OcAction$Execution.run(OcAction.java:121)
		at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1$1.call(AbstractSynchronousNonBlockingStepExecution.java:47)
		at hudson.security.ACL.impersonate(ACL.java:290)
		at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1.run(AbstractSynchronousNonBlockingStepExecution.java:44)
		at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
		at java.util.concurrent.FutureTask.run(FutureTask.java:266)
		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Caused: java.io.IOException: Cannot run program "oc" (in directory "/home/jenkins/workspace/yellowdog-cicd-ian/yellowdog-cicd-ian-vue-app-pipeline"): error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
	at hudson.Proc$LocalProc.<init>(Proc.java:249)
	at hudson.Proc$LocalProc.<init>(Proc.java:218)
	at com.openshift.jenkins.plugins.util.ClientCommandRunner$OcCallable.invoke(ClientCommandRunner.java:99)
	at com.openshift.jenkins.plugins.util.ClientCommandRunner$OcCallable.invoke(ClientCommandRunner.java:80)
	at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3085)
	at hudson.remoting.UserRequest.perform(UserRequest.java:212)
	at hudson.remoting.UserRequest.perform(UserRequest.java:54)
	at hudson.remoting.Request$2.run(Request.java:369)
	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:93)
Caused: java.util.concurrent.ExecutionException
	at hudson.remoting.Channel$2.adapt(Channel.java:992)
	at hudson.remoting.Channel$2.adapt(Channel.java:986)
	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
	at com.openshift.jenkins.plugins.util.ClientCommandRunner.run(ClientCommandRunner.java:148)
	at com.openshift.jenkins.plugins.pipeline.OcAction$Execution.run(OcAction.java:198)
	at com.openshift.jenkins.plugins.pipeline.OcAction$Execution.run(OcAction.java:121)
	at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1$1.call(AbstractSynchronousNonBlockingStepExecution.java:47)
	at hudson.security.ACL.impersonate(ACL.java:290)
	at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1.run(AbstractSynchronousNonBlockingStepExecution.java:44)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Finished: FAILURE
```

Version-Release number of selected component (if applicable):

VERIFIED BROKEN - 1.0.17, 1.0.18, 1.0.19, 1.0.20, 1.0.21, and 1.0.22
VERIFIED WORKS  - 1.0.12, 1.0.16

How reproducible:

Always on the versions listed

Steps to Reproduce:
1. Attempt to use the jenkins-client-plugin in a declarative pipeline on any of the listed broken versions.

Actual results:

It does not work.

Expected results:

It works.

Additional info:

Attaching the sample pipeline this was tested with; the relevant broken step is "Build Image".
As I ran into this too during hackday, I opened https://jira.coreos.com/browse/DEVEXP-217 as well
Ignore comment #2 ... wrong bug
I have been playing with this more, trying to use 1.0.22, and noticed that in Blue Ocean I get slightly more info:

```
java.io.IOException: Cannot run program "/bin//oc" (in directory "/tmp"): error=2, No such file or directory
```
Forgot to add: that error message makes it seem like this could be related to https://github.com/openshift/jenkins-client-plugin/issues/180 in some way.
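For anyone else trying to narrow this down, here is a hypothetical diagnostic stage (not part of the pipelines in this bug) that can help confirm whether `oc` is actually present on the agent container and where, versus the plugin simply computing the wrong path; it assumes an agent container named 'maven' like the reproducer further down in this thread:

```
stage('debug oc location') {
    steps {
        container('maven') {
            sh '''
                echo "PATH=$PATH"
                which oc || echo "oc not found on PATH"
                # list common locations the plugin might resolve oc from
                ls -l /bin/oc /usr/local/bin/oc /usr/bin/oc 2>/dev/null || true
            '''
        }
    }
}
```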
OK, getting back to this. @Ian, I suspect your comment #4 and comment #5 could be relevant here, though I'm also pursuing a couple of other angles. About to build a simplified version of https://github.com/openshift/jenkins-client-plugin/issues/201#issuecomment-445003352 to hopefully repro, then fix.
OK I've reproduced using one of the openshift jenkins sample agent images:

```
pipeline {
  agent {
    kubernetes {
      cloud 'openshift'
      label 'mypod'
      defaultContainer 'jnlp'
      yaml """
apiVersion: v1
kind: Pod
metadata:
  labels:
    some-label: some-label-value
spec:
  containers:
  - name: maven
    image: docker.io/openshift/jenkins-agent-maven-35-centos7:v4.0
    command:
    - cat
    tty: true
"""
    }
  }
  stages {
    stage('build') {
      steps {
        container('maven') {
          script {
            openshift.withCluster() {
              openshift.withProject() {
                def dcSelector = openshift.selector("dc", "jenkins")
                dcSelector.describe()
              }
            }
          }
        }
      }
    }
  }
}
```

BTW, the statement in https://github.com/jenkinsci/kubernetes-plugin#container-group-support that the `container` step is alpha would give me pause as a consumer.

*Think* the fix might be quick (fingers crossed)
Was not quick
The pod / container relationship has something to do with it. If I change my above example so that I only have the jnlp container (more in line with how our sample agents work), i.e. change the refs of maven to jnlp so the spec has `- name: jnlp` instead of `- name: maven`, and get rid of the `command: - cat`, my example works fine. The process launching is able to find and exec the oc command.

The pod of course has only 1 container in this case. That must have some bearing on things, though I'll need to dig more as to why the former way we launched processes tolerated it (per @Ian's observations). The container step in the k8s plugin does a bunch of integration with the step execution context which our plugin has no access to, but I don't see the durable task plugin (the prior execution mechanism) leveraging that in any way either, so it *seems* orthogonal.

I've cc:ed Ben in case he has any recollections from his original experimenting with Jenkins agents on OpenShift and how the multi-container vs. single-container pod aspect could have some bearing here.

Also, while I start comparing with v1.0.16: @Ian, would your agent image tolerate moving to a single container named jnlp (with the cat command removed)? Could you see if that works for you as well?
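For reference, a minimal sketch of that single-container variant of the pod YAML from the reproducer above (same agent image; the only changes are renaming the container to jnlp and dropping the `command: - cat`; presumably the stage would then use `container('jnlp')` or just rely on `defaultContainer 'jnlp'`, though that detail is an assumption here):

```
apiVersion: v1
kind: Pod
metadata:
  labels:
    some-label: some-label-value
spec:
  containers:
  - name: jnlp   # instead of maven; jnlp is the only container in the pod
    image: docker.io/openshift/jenkins-agent-maven-35-centos7:v4.0
    tty: true
```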
Gabe,

> Also, while I start comparing with v1.0.16: @Ian, would your agent image tolerate moving to a single container named jnlp (with the cat command
> removed)? Could you see if that works for you as well?

Two issues:

a) we would need to build a new image that combines what our multiple images are doing today

b) I did some other experimenting with trying to override the jnlp image, because I had a hunch that the newer versions of the plugin were not obeying the `container()` statement, but no matter what I put in the YAML I could not get the jnlp container to override with my custom image; it always deployed with the alpine image and just ignored my `- name: jnlp` container. I was only trying with multiple containers and not just the jnlp container, but it is still an issue (https://issues.jenkins-ci.org/browse/JENKINS-55096).

For now we have things working by using the 1.0.16 version of the plugin. We have also had to work around bugs there in our pipeline, but it works, and I think it works well enough until we solve this issue.
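For context, the kind of jnlp override described above looks roughly like this in the pod template YAML (the custom image reference is a hypothetical placeholder; per JENKINS-55096 the override was being ignored and the default alpine jnlp image was used instead):

```
spec:
  containers:
  - name: jnlp
    image: quay.io/example/custom-jenkins-agent:latest   # hypothetical custom agent image; ignored per JENKINS-55096
  - name: maven
    image: docker.io/openshift/jenkins-agent-maven-35-centos7:v4.0
    command:
    - cat
    tty: true
```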
No problem / thanks for the info, Ian.

After some revisiting of our old form of process launching, the k8s plugin container step, and the durable task plugin Bourne shell, I think I know what is going on. Just need to figure out how to reconcile this issue with the issues that drove us to the current path.

I *think* I have a solution that doesn't require any opt-in for behaviors, but I will need to test a few different things. If any opt-ins are needed, the other scenario will probably be the one that has to opt in.
My prototype seems to be handling all the concerns ... after some more testing/iterating, hope to have a PR up later today
OK, PR https://github.com/openshift/jenkins-client-plugin/pull/207 is up as the set of changes that has addressed this problem across a pretty good subset of our various scenarios. Initiating PR e2e CI to cover a few more. Also creating some new tests which I still need to push.
The PR has merged and the v1.0.23 release of the plugin has been initiated.
Note: in https://bugzilla.redhat.com/show_bug.cgi?id=1657208#c7, the line `serviceAccount: jenkins` needs to be inserted before the `containers:` line in the pod template spec.
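Concretely, with that change the pod template spec from the comment #7 reproducer would start like this (only the serviceAccount line is new):

```
spec:
  serviceAccount: jenkins
  containers:
  - name: maven
    image: docker.io/openshift/jenkins-agent-maven-35-centos7:v4.0
    command:
    - cat
    tty: true
```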
OK, PR https://github.com/openshift/jenkins/pull/765 will trigger the job to update the CentOS image, and https://buildvm.openshift.eng.bos.redhat.com:8443/job/devex/job/devex%252Fjenkins-plugins/84/ should update the plugin RPM used by the pre-release 4.0 RHEL image.

Moving on to QE to try out once https://buildvm.openshift.eng.bos.redhat.com:8443/job/devex/job/devex%252Fjenkins-plugins/84/ results in a brew image they can use.
Updated the openshift-client jenkins plugin to 1.0.23 manually with brewregistry.stage.redhat.io/openshift/jenkins-2-rhel7:v4.0.0-0.101.0.0 and did not hit the error from comment #0 with a declarative pipeline (same jenkinsfile as comment #7). I will move this bug to verified.

@Gabe, however, I hit another permission issue; the jenkinsfile is the same as comment #7. Could you help take a look?

```
[describe] Error from server (Forbidden): deploymentconfigs.apps.openshift.io "jenkins" is forbidden: User "system:serviceaccount:xiu:default" cannot get deploymentconfigs.apps.openshift.io in the namespace "xiu": no RBAC policy matched
[Pipeline] }
[Pipeline] // script
[Pipeline] }
[Pipeline] // container
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // container
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // podTemplate
[Pipeline] End of Pipeline
ERROR: Error during describe; {reference={}, err=, verb=describe, cmd=oc --server=https://172.30.0.1:443 --certificate-authority=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt --namespace=xiu --token=XXXXX describe deploymentconfig/jenkins , out=Error from server (Forbidden): deploymentconfigs.apps.openshift.io "jenkins" is forbidden: User "system:serviceaccount:xiu:default" cannot get deploymentconfigs.apps.openshift.io in the namespace "xiu": no RBAC policy matched, status=1}
```

```
$ oc get rolebinding -n xiu
NAME                    AGE
admin                   48m
jenkins_edit            47m
system:deployers        48m
system:image-builders   48m
system:image-pullers    48m

$ oc get sa -n xiu
NAME       SECRETS   AGE
builder    2         48m
default    2         48m
deployer   2         48m
jenkins    2         48m

$ oc get sa default -o yaml -n xiu
apiVersion: v1
imagePullSecrets:
- name: default-dockercfg-zghg9
kind: ServiceAccount
metadata:
  creationTimestamp: 2018-12-20T07:48:41Z
  name: default
  namespace: xiu
  resourceVersion: "2633512"
  selfLink: /api/v1/namespaces/xiu/serviceaccounts/default
  uid: ab7446a2-042b-11e9-b0a7-0e40ce4b124a
secrets:
- name: default-token-gtlwm
- name: default-dockercfg-zghg9

$ oc get rolebinding -n xiu jenkins_edit -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  annotations:
    openshift.io/generated-by: OpenShiftNewApp
  creationTimestamp: 2018-12-20T07:49:20Z
  labels:
    app: jenkins-persistent
    template: jenkins-persistent-template
  name: jenkins_edit
  namespace: xiu
  resourceVersion: "2634196"
  selfLink: /apis/rbac.authorization.k8s.io/v1/namespaces/xiu/rolebindings/jenkins_edit
  uid: c2889ead-042b-11e9-b0a7-0e40ce4b124a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit
subjects:
- kind: ServiceAccount
  name: jenkins
```
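For what it's worth, that Forbidden error lines up with the agent pod running as the `default` service account, while the jenkins_edit rolebinding above only grants edit to the `jenkins` service account. Two possible ways around it, neither confirmed in this thread: add `serviceAccount: jenkins` to the pod template spec as noted a couple of comments up, or grant edit to the default service account in the project, e.g.:

```
oc policy add-role-to-user edit -z default -n xiu
```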
Thank you all.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758