Bug 1657208 - declarative pipeline not working with jenkins-client-plugin 1.0.17 and above
Summary: declarative pipeline not working with jenkins-client-plugin 1.0.17 and above
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Image
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.1.0
Assignee: Gabe Montero
QA Contact: XiuJuan Wang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-12-07 13:27 UTC by Ian Tewksbury
Modified: 2019-06-04 10:41 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The process-launching changes in v1.0.17 of the OpenShift Jenkins client plugin broke multicontainer declarative pipelines that use the Kubernetes plugin.
Consequence: IOExceptions occurred in pipeline runs when attempting multicontainer agent paths.
Fix: v1.0.23 of the OpenShift Jenkins client plugin handles the multicontainer scenario.
Result: Multicontainer Kubernetes plugin declarative pipeline paths now work again.
Clone Of:
Environment:
Last Closed: 2019-06-04 10:41:14 UTC
Target Upstream Version:


Attachments
sample declarative pipeline (3.85 KB, text/plain), 2018-12-07 13:27 UTC, Ian Tewksbury


Links
Red Hat Product Errata RHBA-2019:0758 (last updated 2019-06-04 10:41:20 UTC)

Description Ian Tewksbury 2018-12-07 13:27:32 UTC
Created attachment 1512501 [details]
sample declarative pipeline

Description of problem:

We have verified that declarative pipelines work with jenkins-client-plugin 1.0.12 and 1.0.16, but that they do not work with 1.0.17, 1.0.18, 1.0.19, 1.0.20, 1.0.21, or 1.0.22 (we tested each of these versions).

The error we get on the broken versions looks like:

```
[BFA] Scanning build for known causes...
[BFA] No failure causes found
[BFA] Done. 0s
java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
	at java.lang.ProcessImpl.start(ProcessImpl.java:134)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from 10.128.4.1/10.128.4.1:58840
		at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1741)
		at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)
		at hudson.remoting.Channel$2.adapt(Channel.java:990)
		at hudson.remoting.Channel$2.adapt(Channel.java:986)
		at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
		at com.openshift.jenkins.plugins.util.ClientCommandRunner.run(ClientCommandRunner.java:148)
		at com.openshift.jenkins.plugins.pipeline.OcAction$Execution.run(OcAction.java:198)
		at com.openshift.jenkins.plugins.pipeline.OcAction$Execution.run(OcAction.java:121)
		at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1$1.call(AbstractSynchronousNonBlockingStepExecution.java:47)
		at hudson.security.ACL.impersonate(ACL.java:290)
		at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1.run(AbstractSynchronousNonBlockingStepExecution.java:44)
		at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
		at java.util.concurrent.FutureTask.run(FutureTask.java:266)
		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Caused: java.io.IOException: Cannot run program "oc" (in directory "/home/jenkins/workspace/yellowdog-cicd-ian/yellowdog-cicd-ian-vue-app-pipeline"): error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
	at hudson.Proc$LocalProc.<init>(Proc.java:249)
	at hudson.Proc$LocalProc.<init>(Proc.java:218)
	at com.openshift.jenkins.plugins.util.ClientCommandRunner$OcCallable.invoke(ClientCommandRunner.java:99)
	at com.openshift.jenkins.plugins.util.ClientCommandRunner$OcCallable.invoke(ClientCommandRunner.java:80)
	at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3085)
	at hudson.remoting.UserRequest.perform(UserRequest.java:212)
	at hudson.remoting.UserRequest.perform(UserRequest.java:54)
	at hudson.remoting.Request$2.run(Request.java:369)
	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:93)
Caused: java.util.concurrent.ExecutionException
	at hudson.remoting.Channel$2.adapt(Channel.java:992)
	at hudson.remoting.Channel$2.adapt(Channel.java:986)
	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
	at com.openshift.jenkins.plugins.util.ClientCommandRunner.run(ClientCommandRunner.java:148)
	at com.openshift.jenkins.plugins.pipeline.OcAction$Execution.run(OcAction.java:198)
	at com.openshift.jenkins.plugins.pipeline.OcAction$Execution.run(OcAction.java:121)
	at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1$1.call(AbstractSynchronousNonBlockingStepExecution.java:47)
	at hudson.security.ACL.impersonate(ACL.java:290)
	at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1.run(AbstractSynchronousNonBlockingStepExecution.java:44)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Finished: FAILURE
```
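
For context: error=2 is ENOENT, i.e. the oc binary could not be found by the process forked on the agent. A quick spot check along these lines could confirm that (a sketch only; the pod name and the maven container name are placeholders for whatever your agent pod actually uses):

```
# Hypothetical spot check: is oc on the PATH inside the container the step ran in?
# Replace <agent-pod> with the actual agent pod name.
oc rsh -c maven <agent-pod> sh -c 'command -v oc || echo "oc not on PATH"'
```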


Version-Release number of selected component (if applicable):
VERIFIED BROKEN - 1.0.17, 1.0.18, 1.0.19, 1.0.20, 1.0.21, and 1.0.22
VERIFIED WORKS - 1.0.12, 1.0.16

How reproducible:
Always on the versions listed


Steps to Reproduce:
1. Attempt to use the jenkins-client-plugin in a declarative pipeline on any of the listed versions (see the attached sample pipeline).

Actual results:
The pipeline run fails with the IOException shown above.

Expected results:
The pipeline run completes successfully.


Additional info:
Attaching the sample pipeline this was tested with; the relevant broken step is "Build Image".

Comment 2 Gabe Montero 2018-12-07 15:37:52 UTC
As I ran into this too during hackday, I opened https://jira.coreos.com/browse/DEVEXP-217 as well

Comment 3 Gabe Montero 2018-12-07 15:39:40 UTC
Ignore comment #2 ... wrong bug

Comment 4 Ian Tewksbury 2018-12-10 08:00:11 UTC
I have been playing with this more, trying to use 1.0.22, and noticed that in Blue Ocean I get slightly more info:

```
java.io.IOException: Cannot run program "/bin//oc" (in directory "/tmp"): error=2, No such file or directory
```

Comment 5 Ian Tewksbury 2018-12-10 08:01:07 UTC
Forgot to add: that error message (note the doubled slash in "/bin//oc") makes it seem like this is related to https://github.com/openshift/jenkins-client-plugin/issues/180 in some way.

Comment 6 Gabe Montero 2018-12-11 18:11:47 UTC
OK, getting back to this. @Ian, I suspect your comment #4 and comment #5 could be relevant here, though I'm also pursuing a couple of other angles.

About to build a simplified version of https://github.com/openshift/jenkins-client-plugin/issues/201#issuecomment-445003352
to hopefully repro, then fix.

Comment 7 Gabe Montero 2018-12-11 19:05:13 UTC
OK, I've reproduced this using one of the OpenShift Jenkins sample agent images:

pipeline {
  agent {
    kubernetes {
      cloud 'openshift'
      label 'mypod'
      defaultContainer 'jnlp'
      yaml """
apiVersion: v1
kind: Pod
metadata:
  labels:
    some-label: some-label-value
spec:
  containers:
  - name: maven
    image: docker.io/openshift/jenkins-agent-maven-35-centos7:v4.0
    command:
    - cat
    tty: true
"""
    }
  }
    stages {
        stage('build') {
            steps {
                container('maven') {
                    script {
                        openshift.withCluster() {
                            openshift.withProject() {
                                def dcSelector = openshift.selector("dc", "jenkins")
                                dcSelector.describe()
                            }
                        }
                    }
                }
            }
        }
    }
}



BTW, the statement in https://github.com/jenkinsci/kubernetes-plugin#container-group-support that the `container` step is alpha would give me pause as a consumer.

*Think* the fix might be quick (fingers crossed)

Comment 8 Gabe Montero 2018-12-11 19:22:56 UTC
Was not quick

Comment 9 Gabe Montero 2018-12-11 21:41:08 UTC
The pod / container relationship has something to do with it.

If I change my above example so that I only have the jnlp container (more in line with how our sample agents work), and change refs of maven to jnlp, i.e. 

containers:
 - name: jnlp # instead of maven

and get rid of the 

command:
- cat

my example works fine.  The process launching is able to find and exec the oc command.
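
For reference, a minimal sketch of that working single-container variant (my reading of the changes described above; the container('maven') wrapper from comment #7 is dropped since jnlp is now the default and only container, and per comment #15 below a serviceAccount line may also be needed):

```
pipeline {
  agent {
    kubernetes {
      cloud 'openshift'
      label 'mypod'
      defaultContainer 'jnlp'
      yaml """
apiVersion: v1
kind: Pod
metadata:
  labels:
    some-label: some-label-value
spec:
  containers:
  - name: jnlp  # single container replacing maven; the cat command is removed
    image: docker.io/openshift/jenkins-agent-maven-35-centos7:v4.0
    tty: true
"""
    }
  }
  stages {
    stage('build') {
      steps {
        script {
          openshift.withCluster() {
            openshift.withProject() {
              // same oc-backed step as the original example
              openshift.selector("dc", "jenkins").describe()
            }
          }
        }
      }
    }
  }
}
```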

The pod of course has only 1 container in this case.  That must have some bearing on things,
though I'll need to dig more as to why the former way we launched processes tolerated it (per @Ian's observations).

The container step in the k8s plugin does a bunch of integration with the step execution context that our plugin has no access to. But I don't see the durable task plugin (the prior execution mechanism) leveraging that in any way either, so this *seems* orthogonal.

I've cc:ed Ben in case he has any recollections from his original experimenting with jenkins agents on openshift and how the multicontainer vs. single container pod aspect
could have some bearing here.

Also, while I start comparing with v1.0.16: @Ian, would your agent image tolerate moving to a single container named jnlp (with the cat command removed)? If so, could you see if that works for you as well?

Comment 10 Ian Tewksbury 2018-12-11 21:53:37 UTC
Gabe,

> Also, while I start comparing with v1.0.16, @Ian, would your agent image tolerate moving to a single container named jnlp (with cat command removed)
> and see if that works for you as well.


Two issues:

a) We would need to build a new image that combines what our multiple images are doing today.
b) I did some other experimenting with trying to override the jnlp image, because I had a hunch that the newer versions of the plugin were not obeying the `container()` statement. But no matter what I put in the YAML, I could not get the jnlp container to override with my custom image; it always deployed with the alpine image, as if it just ignored my `- name: jnlp` container. I was only trying with multiple containers and not just the jnlp container, but that is still an issue. (https://issues.jenkins-ci.org/browse/JENKINS-55096)

For now we have things working by using the 1.0.16 version of the plugin. We have also had to work around bugs there in our pipeline, but it works, and I think it works well enough until we solve this issue.

Comment 11 Gabe Montero 2018-12-12 14:13:19 UTC
No problem / thanks for the info Ian

After some revisiting of our old form of process launching, the k8s plugin container step, and the durable task plugin Bourne shell step, I think I know what is going on.

Just need to figure out how to reconcile this issue with the issues that drove us to the current path.

I *think* I have a solution that doesn't require any opt-in for behaviors, but I will need to test a few different things.

If any opt-ins are needed, the other scenario will probably be the one that has to opt in.

Comment 12 Gabe Montero 2018-12-12 18:53:20 UTC
My prototype seems to be handling all the concerns ... after some more testing/iterating, I hope to have a PR up later today.

Comment 13 Gabe Montero 2018-12-17 21:15:46 UTC
OK, PR https://github.com/openshift/jenkins-client-plugin/pull/207 is the set of changes that has addressed this problem across a pretty good subset of our various scenarios.

Initiating PR e2e ci to cover a few more.

Also creating some new tests which I still need to push.

Comment 14 Gabe Montero 2018-12-18 16:03:37 UTC
The PR has merged, and the v1.0.23 release of the plugin has been initiated.

Comment 15 Gabe Montero 2018-12-18 19:20:56 UTC
Note in https://bugzilla.redhat.com/show_bug.cgi?id=1657208#c7 the line

serviceAccount: jenkins

needs to be inserted before the "containers:" line in the pod template spec.
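
Applied to the pod template spec in comment #7, that section would read:

```
spec:
  serviceAccount: jenkins
  containers:
  - name: maven
    image: docker.io/openshift/jenkins-agent-maven-35-centos7:v4.0
    command:
    - cat
    tty: true
```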

Comment 16 Gabe Montero 2018-12-18 20:30:33 UTC
OK PR https://github.com/openshift/jenkins/pull/765 will trigger the job to update the centos image,
and https://buildvm.openshift.eng.bos.redhat.com:8443/job/devex/job/devex%252Fjenkins-plugins/84/ should update 
the plugin rpm used by the pre-release 4.0 rhel image.

Moving on to QE to try out once https://buildvm.openshift.eng.bos.redhat.com:8443/job/devex/job/devex%252Fjenkins-plugins/84/ results
in a brew image they can use.

Comment 17 XiuJuan Wang 2018-12-20 08:39:14 UTC
Updated the openshift-client Jenkins plugin to 1.0.23 manually with brewregistry.stage.redhat.io/openshift/jenkins-2-rhel7:v4.0.0-0.101.0.0.

I didn't hit the error from comment #0 with a declarative pipeline (same as comment #7), so I will move this bug to verified.

@Gabe
But I met another permission issue; the Jenkinsfile is the same as in comment #7. Could you help take a look?

[describe] Error from server (Forbidden): deploymentconfigs.apps.openshift.io "jenkins" is forbidden: User "system:serviceaccount:xiu:default" cannot get deploymentconfigs.apps.openshift.io in the namespace "xiu": no RBAC policy matched


[Pipeline] }
[Pipeline] // script
[Pipeline] }
[Pipeline] // container
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // container
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // podTemplate
[Pipeline] End of Pipeline
ERROR: Error during describe;
{reference={}, err=, verb=describe, cmd=oc --server=https://172.30.0.1:443 --certificate-authority=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt --namespace=xiu --token=XXXXX describe deploymentconfig/jenkins , out=Error from server (Forbidden): deploymentconfigs.apps.openshift.io "jenkins" is forbidden: User "system:serviceaccount:xiu:default" cannot get deploymentconfigs.apps.openshift.io in the namespace "xiu": no RBAC policy matched, status=1}


$ oc get rolebinding -n xiu
NAME                    AGE
admin                   48m
jenkins_edit            47m
system:deployers        48m
system:image-builders   48m
system:image-pullers    48m

$ oc get sa -n xiu
NAME       SECRETS   AGE
builder    2         48m
default    2         48m
deployer   2         48m
jenkins    2         48m

$ oc get sa default -o yaml  -n xiu 
apiVersion: v1
imagePullSecrets:
- name: default-dockercfg-zghg9
kind: ServiceAccount
metadata:
  creationTimestamp: 2018-12-20T07:48:41Z
  name: default
  namespace: xiu
  resourceVersion: "2633512"
  selfLink: /api/v1/namespaces/xiu/serviceaccounts/default
  uid: ab7446a2-042b-11e9-b0a7-0e40ce4b124a
secrets:
- name: default-token-gtlwm
- name: default-dockercfg-zghg9

$ oc get rolebinding -n xiu jenkins_edit -o yaml 
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  annotations:
    openshift.io/generated-by: OpenShiftNewApp
  creationTimestamp: 2018-12-20T07:49:20Z
  labels:
    app: jenkins-persistent
    template: jenkins-persistent-template
  name: jenkins_edit
  namespace: xiu
  resourceVersion: "2634196"
  selfLink: /apis/rbac.authorization.k8s.io/v1/namespaces/xiu/rolebindings/jenkins_edit
  uid: c2889ead-042b-11e9-b0a7-0e40ce4b124a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit
subjects:
- kind: ServiceAccount
  name: jenkins
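
That error indicates the oc step ran as the agent pod's own service account (system:serviceaccount:xiu:default), while the jenkins_edit rolebinding above only grants the edit role to the jenkins service account. Adding serviceAccount: jenkins to the pod template spec (per comment #15) would be the expected fix; alternatively, a sketch of granting edit to the default service account instead (neither route is verified here):

```
# Untested sketch: grant the edit role to the default service account
# that the agent pod ran as, in the xiu namespace.
oc policy add-role-to-user edit -z default -n xiu
```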

Comment 18 Ian Tewksbury 2019-01-08 16:08:12 UTC
Thank you all.

Comment 21 errata-xmlrpc 2019-06-04 10:41:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

