Bug 1573648
Summary: jenkins slave does not respect no_proxy

Product: OpenShift Container Platform
Component: Build
Version: 3.6.1
Target Release: 3.6.z
Hardware: Unspecified
OS: Unspecified
Severity: high
Priority: unspecified
Status: CLOSED ERRATA
Type: Bug
Last Closed: 2018-06-28 07:54:52 UTC
Reporter: Steven Walter <stwalter>
Assignee: Gabe Montero <gmontero>
QA Contact: Wenjing Zheng <wzheng>
CC: aos-bugs, bparees, clemens.utschig-utschig, gmontero, stwalter
Clone(s): 1578987 (view as bug list)

Doc Type: Bug Fix
Doc Text:
Cause: Jenkins core/remoting has subpar handling of the no_proxy environment variable, which affects communication between Jenkins agents and the master when starting a build with the Kubernetes plugin in the OpenShift Jenkins image.
Consequence: Pipelines using the Kubernetes plugin are unable to start agents with that plugin when http proxies are defined.
Fix: The sample Maven and NodeJS OpenShift Jenkins images have been updated to automatically add the server URL and tunnel hosts to the no_proxy list, ensuring that communication works when http proxies are defined.
Result: Jenkins Pipelines can now leverage the Kubernetes plugin to start pods based on the OpenShift Jenkins Maven and NodeJS images.
Description — Steven Walter, 2018-05-01 21:00:03 UTC
[Ben Parees]

> We note that the variables appear inside the running container but are not used.

Inside what running container? The slave pod? Your application container?

> Jenkins master respects the proxy variables but images built off our slave image do not.

Please provide the pod yaml for your master and slave pods.

> We are concerned because the variable which shows up is lowercase no_proxy and not uppercase NO_PROXY. Why is this set as lowercase and does this matter?

There's no great standard here; different software respects different cases. Some respect upper, some lower, some both. The safest thing is to ensure both are set.

What is failing in your build when the proxy vars are not set? Based on the information you provided, my current theory is that accessing your git repo requires going through a proxy, and the proxy is not being set when the git clone occurs on a slave node, but is being set when the git clone occurs on the master.

[Clemens Utschig-Utschig]

Hey @Ben, we run jenkins with a slave triggered through node('xxx'). In both pods (the jenkins master and the slave that is bootstrapped), env variables are set for HTTP_PROXY, HTTPS_PROXY, and NO_PROXY. The Jenkins slave callback to the master (during pod start) is failing with:

```
java.io.IOException: http://jenkins.bns-cd.svc:80/tcpSlaveAgentListener/ is invalid: 407 Proxy Authentication Required
        at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:165)
        at hudson.remoting.Engine.innerRun(Engine.java:335)
        at hudson.remoting.Engine.run(Engine.java:287)
```

To me this looks like HTTP_PROXY is honored but NO_PROXY is not, because a .svc call (e.g. http://jenkins.bns-cd.svc) should not go to the proxy at all, given the NO_PROXY settings quoted in the OP comment. We also tried to set the proxy and its exceptions in Jenkins under Manage Jenkins / Plugins / Advanced; this did NOT yield success either. Sorry for the badly filed bug in the first place; I hope this post now provides all the info needed.
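Clemens's expectation above follows the conventional no_proxy rule: a leading-dot entry such as `.svc` is a domain suffix that any matching host should bypass the proxy for. A minimal Python sketch of that matching (illustrative only; this is not Jenkins's actual code):

```python
# Hypothetical sketch of conventional no_proxy matching: hosts that equal
# an entry, or end with a leading-dot suffix entry, bypass the proxy.

def bypass_proxy(host, no_proxy):
    """Return True if `host` should skip the proxy per `no_proxy`."""
    for entry in (e.strip() for e in no_proxy.split(",") if e.strip()):
        if entry.startswith("."):
            # Leading dot means "any host under this domain suffix".
            if host.endswith(entry):
                return True
        elif host == entry:
            return True
    return False

no_proxy = ".svc,.default,.local,localhost"
print(bypass_proxy("jenkins.bns-cd.svc", no_proxy))  # True
print(bypass_proxy("github.com", no_proxy))          # False
```

Under this rule, the failing call to jenkins.bns-cd.svc should never have reached the proxy, which is exactly the discrepancy this bug tracks.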
...and we run on the latest jenkins & slave.

slave pod yaml:

```yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    openshift.io/scc: restricted
  creationTimestamp: '2018-04-30T13:44:39Z'
  labels:
    jenkins: slave
    jenkins/nodejs-6-angular: 'true'
  name: nodejs-6-angular-6746b6221e1cc
  namespace: bns-cd
  resourceVersion: '25329347'
  selfLink: /api/v1/namespaces/bns-cd/pods/nodejs-6-angular-6746b6221e1cc
  uid: a0e20491-4c7c-11e8-8865-0050569e3732
spec:
  containers:
  - args:
    - e7a7491e3d9421d206aa1642abdd03616d79f3a2db3313fe4c80e7a8230e39c3
    - nodejs-6-angular-6746b6221e1cc
    env:
    - name: JENKINS_LOCATION_URL
      value: 'https://jenkins-bns-cd.inh-devapps.eu.xxxx.com/'
    - name: JENKINS_SECRET
      value: e7a7491e3d9421d206aa1642abdd03616d79f3a2db3313fe4c80e7a8230e39c3
    - name: JENKINS_JNLP_URL
      value: >-
        http://jenkins.bns-cd.svc:80/computer/nodejs-6-angular-6746b6221e1cc/slave-agent.jnlp
    - name: JENKINS_TUNNEL
      value: 'jenkins-jnlp.bns-cd.svc:50000'
    - name: JENKINS_NAME
      value: nodejs-6-angular-6746b6221e1cc
    - name: JENKINS_URL
      value: 'http://jenkins.bns-cd.svc:80'
    - name: HOME
      value: /tmp
    image: 'docker-registry.default.svc:5000/cd/jenkins-nodejs-6-angular'
    imagePullPolicy: IfNotPresent
    name: jnlp
    resources: {}
    securityContext:
      capabilities:
        drop:
        - KILL
        - MKNOD
        - SETGID
        - SETUID
        - SYS_CHROOT
      privileged: false
      runAsUser: 1000240000
      seLinuxOptions:
        level: 's0:c16,c0'
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /tmp
      name: workspace-volume
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-3kxcl
      readOnly: true
    workingDir: /tmp
  dnsPolicy: ClusterFirst
  imagePullSecrets:
  - name: default-dockercfg-9nmlf
  nodeName: inhas65290.eu.boehringer.com
  nodeSelector:
    region: primary
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext:
    fsGroup: 1000240000
    seLinuxOptions:
      level: 's0:c16,c0'
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  volumes:
  - emptyDir: {}
    name: workspace-volume
  - name: default-token-3kxcl
    secret:
      defaultMode: 420
      secretName: default-token-3kxcl
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: '2018-04-30T13:44:39Z'
    status: 'True'
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: '2018-04-30T13:44:41Z'
    message: 'containers with unready status: [jnlp]'
    reason: ContainersNotReady
    status: 'False'
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: '2018-04-30T13:44:39Z'
    status: 'True'
    type: PodScheduled
  containerStatuses:
  - containerID: >-
      docker://e341e5aebef6836ce18d5bd957222046c195eae9f568a07e0ceb8b5b5ffae26f
    image: 'docker-registry.default.svc:5000/cd/jenkins-nodejs-6-angular:latest'
    imageID: >-
      docker-pullable://docker-registry.default.svc:5000/cd/jenkins-nodejs-6-angular@sha256:d3575d25a0c32d22c3f44504c77abc32d068f9a0890c61b8ff4922b8abed8756
    lastState: {}
    name: jnlp
    ready: false
    restartCount: 0
    state:
      terminated:
        containerID: >-
          docker://e341e5aebef6836ce18d5bd957222046c195eae9f568a07e0ceb8b5b5ffae26f
        exitCode: 255
        finishedAt: '2018-04-30T13:44:41Z'
        reason: Error
        startedAt: '2018-04-30T13:44:40Z'
  hostIP: 10.183.195.13
  phase: Failed
  qosClass: BestEffort
  startTime: '2018-04-30T13:44:39Z'
```

xxxx => omitted

The slave is inheriting from the latest slave image - see below:

```dockerfile
FROM registry.access.redhat.com/openshift3/jenkins-slave-base-rhel7
MAINTAINER Richard Attermeyer <richard.attermeyer>

# Labels consumed by Red Hat build service
LABEL com.redhat.component="jenkins-slave-nodejs-rhel7-docker" \
      name="openshift3/jenkins-slave-nodejs-rhel7" \
      version="3.6" \
      architecture="x86_64" \
      release="4" \
      io.k8s.display-name="Jenkins Slave Nodejs" \
      io.k8s.description="The jenkins slave nodejs image has the nodejs tools on top of the jenkins slave base image." \
      io.openshift.tags="openshift,jenkins,slave,nodejs"

ENV NODEJS_VERSION=6.10 \
    NPM_CONFIG_PREFIX=$HOME/.npm-global \
    PATH=$HOME/node_modules/.bin/:$HOME/.npm-global/bin/:$PATH \
    BASH_ENV=/usr/local/bin/scl_enable \
    ENV=/usr/local/bin/scl_enable \
    PROMPT_COMMAND=". /usr/local/bin/scl_enable"

# Install cypress dependencies
# Please note: xorg-x11-server-Xvfb is not available on RHEL via yum anymore,
# so "RUN yum install -y xorg-x11-server-Xvfb" won't work.
# Therefore this Dockerfile uses the version from CentOS instead.
ADD http://mirror.centos.org/centos/7/os/x86_64/Packages/xorg-x11-server-Xvfb-1.19.3-11.el7.x86_64.rpm /root/xorg-x11-server-Xvfb.x86_64.rpm
RUN yum -y install /root/xorg-x11-server-Xvfb.x86_64.rpm && \
    yum install -y gtk2-2.24* && \
    yum install -y libXtst*
# provides libXss
RUN yum install -y libXScrnSaver*
# provides libgconf-2
RUN yum install -y GConf2*
# provides libasound
RUN yum install -y alsa-lib* && \
    yum install -y nss-devel libnotify-devel gnu-free-sans-fonts

# Install NodeJS + Yarn + Angular CLI + cypress
# unfortunately nodejs6 is not yet available on rhel 7 on the scl
# see: https://www.softwarecollections.org/en/
# and the base image relies on scl_enable
COPY contrib/bin/scl_enable /usr/local/bin/scl_enable
COPY npmrc $HOME/.npmrc
RUN curl --silent --location https://rpm.nodesource.com/setup_6.x | bash - && \
    curl --silent --location https://dl.yarnpkg.com/rpm/yarn.repo -o /etc/yum.repos.d/yarn.repo && \
    yum install -y yarn nodejs gcc-c++ make && \
    yum clean all -y && \
    npm install -g @angular/cli.2 && \
    npm install -g cypress

# install google-chrome (for angular)
ADD https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm /root/google-chrome-stable_current_x86_64.rpm
RUN yum -y install /root/google-chrome-stable_current_x86_64.rpm && \
    ln -s /usr/lib64/libOSMesa.so.8 /opt/google/chrome/libosmesa.so && \
    yum clean all && \
    dbus-uuidgen > /etc/machine-id

RUN chown -R 1001:0 $HOME && \
    chmod -R g+rw $HOME

USER 1001
```

We have verified that the ENV with proxy settings is available in the slave container.

[Gabe Montero]

Some analysis of the data provided:

1) The stack trace snippet provided:

```
java.io.IOException: http://jenkins.bns-cd.svc:80/tcpSlaveAgentListener/ is invalid: 407 Proxy Authentication Required
        at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:165)
        at hudson.remoting.Engine.innerRun(Engine.java:335)
        at hudson.remoting.Engine.run(Engine.java:287)
```

"407 Proxy Authentication Required", based on various internet searches, means that it is hitting a URL that is expecting proxy authentication. That would imply that Jenkins is *NOT* finding the "http.proxyHost" or "http_proxy" env vars / sys props when it needs to, or that the slave-pod-to-master-pod communication is going through the proxy when it should not, i.e. the opposite of what was suggested initially in Comment #4.

2) And this snippet from the pod spec:

```yaml
    - name: JENKINS_JNLP_URL
      value: >-
        http://jenkins.bns-cd.svc:80/computer/nodejs-6-angular-6746b6221e1cc/slave-agent.jnlp
```

http://jenkins.bns-cd.svc:80/ should be coming from the jenkins url set in the k8s cloud configuration. The OpenShift jenkins image's kube-slave-common.sh sets the jenkins URL to http://${JENKINS_SERVICE_HOST}:${JENKINS_SERVICE_PORT} for the k8s cloud config in Jenkins when *INITIALLY* configuring jenkins on the very first start up of the jenkins image (assuming you are running jenkins on a PV); it leaves it alone on any subsequent jenkins restarts. By default, that resolves to an IP, not a host name. But typical differences in how the cluster is brought up might explain that. Or has there been a subsequent configuration change of that value?

So, after some inspection of the jenkins code:

a) The list of URLs JnlpAgentEndpointResolver.resolve is supposed to iterate appears to come from the cloud provider's jenkins url setting, i.e. what 2) discusses.

b) But it does look at the "http.proxyHost", "http_proxy" and "no_proxy" env vars when deciding if the proxy URL should be used in conjunction with the jenkins URL.

So, short term, what to do:

I) Can Clemens Utschig or whoever is appropriate confirm what is set for the "Jenkins URL" in the kubernetes cloud configuration, i.e.
from the console: "Manage Jenkins" -> "Configure System" -> "Cloud" -> the "Kubernetes" entry. I would expect "http://jenkins.bns-cd.svc:80" is there, but please confirm.

II) When Clemens says "we have verified that ENV with proxy settings is available in the slave container": please provide the precise keys and values of the proxy/noproxy variables found, and how they were obtained, so I can match them against the Jenkins code to confirm whether it should find them or not. As Ben mentioned, case sensitivity is in play here; I saw no presence of the lower-case env vars in the pod yaml provided.

III) Do we expect the slave-to-master pod communication to go through the proxy? If not, this sounds like a cluster construction issue. Again, the 407 means it is going through a proxy.

IV) On the "We also tried to set the proxy & exceptions in jenkins / manage jenkins / plugins / advanced - this did NOT yield success either." point: can the precise changes performed be provided, and can the precise exception / problems seen be provided?

[Clemens Utschig-Utschig]

> I) Can Clemens Utschig or whoever is appropriate confirm what is set for the "Jenkins URL" in the kubernetes cloud configuration ...

The jenkins url is set to http://jenkins.bns-cd.svc:80 (I assume; I can't reach the cluster right now).

> II) When Clemens says "we have verified that ENV with proxy settings is available in the slave container" ... please provide the precise keys and values ... I saw no presence of the lower case env vars in the pod yaml provided

We have provided them already - they are captured through the pod terminal by typing env:

```
no_proxy=.svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.4
HTTP_PROXY=http://xxx:yyyyyzzzze@inhproxy.eu.boehringer.com:80
HTTPS_PROXY=http://xxx:yyyyyzzzze@inhproxy.eu.boehringer.com:80
```

> III) Do we expect the slave to master pod communication to be going through the proxy?

Correct, it's NOT honoring the NO_PROXY settings - I would NOT expect the cluster to do any communication internally through the proxy.

> IV) ... can the precise changes performed be provided ...

Manage Jenkins / Plugins / Advanced: enter proxy host, port, user / pw, and also the exception list including *.svc; test with the jenkins callback url - the call works.

ps - we need the proxy for when the slave goes out to the internet to fetch stuff from github or other sources.
So we need NO_PROXY to work and also the proxy settings :)

[Clemens Utschig-Utschig]

Verified the slave url in jenkins / cloud / kubernetes:

```
jenkins url:    http://jenkins.bns-cd.svc:80
jenkins tunnel: jenkins-jnlp.bns-cd.svc:50000
```

Also checked that jenkins.bns-cd.svc is a valid service.

slave pod / terminal / env:

```
PWD=/tmp
JENKINS_JNLP_URL=http://jenkins.bns-cd.svc:80/computer/nodejs-6-angular-720426b0cf091/slave-agent.jnlp
KUBERNETES_PORT_53_UDP_PORT=53
JENKINS_URL=http://jenkins.bns-cd.svc:80
JENKINS_LOCATION_URL=https://jenkins-bns-cd.inh-devapps.eu.boehringer.com/
HTTPS_PROXY=http://x2inhocproxy:xxxxxxxxxxx@inhproxy.eu.boehringer.com:80
https_proxy=http://x2inhocproxy:xxxxxxxxxxx@inhproxy.eu.boehringer.com:80
JENKINS_TUNNEL=jenkins-jnlp.bns-cd.svc:50000
JENKINS_SERVICE_HOST=10.250.133.71
HOME=/tmp
JENKINS_SECRET=faec3d2c544c132579790af5db95c79a0d13076928cc3aa7fc2c34f2e5f49b31
SHLVL=2
KUBERNETES_PORT_53_UDP_PROTO=udp
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_SERVICE_PORT_HTTPS=443
no_proxy=.svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.4
HTTP_PROXY=http://x2inhocproxy:xxxxxxxxxxxxx@inhproxy.eu.boehringer.com:80
```

Re the precise exception - see above - the slave starts, and terminates immediately because it can't reach the master.

[Gabe Montero]

> -> correct, it's NOT honoring NO_PROXY settings - I would NOT expect the cluster to do any communication internally thru the proxy ..

The fact that you get the "407 Proxy Authentication Required" says it is, though.

> -> we have provided them already - they are captured thru the pod terminal -> typing env

Apologies for missing it... thanks for reposting. So, one of my concerns has been confirmed: Jenkins does not check the upper case HTTP_PROXY, etc., only http_proxy, etc., and Java's System.getenv is case sensitive on Linux, per the javadoc. It does look for "no_proxy" in lower case, though. So from what I see in the jenkins code (and I saw this in several versions of remoting.jar, based on the line numbers), it *WOULD* pick up the no_proxy setting.

As for:

> -> manage jenkins / plugins / advanced - enter proxy host, port, user / pw and also the exception list including *.svc - test with the jenkins callback url - call works.

That typically does not come into play for certain aspects of the Jenkins core, and I see that the plugin manager proxy setting is not leveraged in the remoting path.

Short term things to try:

- Set env vars on the slave container in the pod template config that set http_proxy=http://xxx:yyyyyzzzze@inhproxy.eu.boehringer.com:80 and https_proxy=http://xxx:yyyyyzzzze@inhproxy.eu.boehringer.com:80.
- In conjunction, you have two choices:
  a) leave the default no_proxy; based on what I see in the jenkins code, I would expect it to filter URLs ending with .svc, not apply the proxy setting, and you get the 407, or
  b) no-op the no_proxy setting; the http_proxy setting should be applied, and in theory you should not get the 407.
- Or perhaps change the jenkins url in the cloud config to the service ip / port; that way, you can better confirm it does not go through the proxy, and it would rule out http://jenkins.bns-cd.svc:80 resolving to something unexpected that does go through the proxy.

> re precise exception - see above - the slave starts - and terminates immediately as it can't reach master.

I assume you mean Comment #4. So the exception in comment #4 is when you configured the proxy in the advanced setting, or not?

OK ...
Given:

```
HTTPS_PROXY=http://x2inhocproxy:xxxxxxxxxxx@inhproxy.eu.boehringer.com:80
https_proxy=http://x2inhocproxy:xxxxxxxxxxx@inhproxy.eu.boehringer.com:80
```

and this snippet from the jenkins code:

```java
static URLConnection openURLConnection(URL url, String credentials, String proxyCredentials,
        SSLSocketFactory sslSocketFactory, boolean disableHttpsCertValidation) throws IOException {
    String httpProxy = null;
    // If http.proxyHost property exists, openConnection() uses it.
    if (System.getProperty("http.proxyHost") == null) {
        httpProxy = System.getenv("http_proxy");
    }
```

that env var needs to be "http_proxy", not "https_proxy".

[Clemens Utschig-Utschig]

Gabe (> clemens comments):

> So the exception in comment #4 is when you configured the proxy in adv setting, or not?

It occurs whether or NOT this is configured - so jenkins remoting does not care, it seems.

> set an env var on the slave container in the pod template config that sets http_proxy=... and https_proxy=...

How are they set? Where is this magic template? For me that's a sucky solution - we configured the global proxy settings on OC as per the doc - why on earth do I have to set something else?!

> a) leave the default no_proxy ... and you get the 407

So NOT set no_proxy? ... but that would only fly for cluster cnames, and not for anything outside of the cluster BUT inside our network :(

> b) no-op the no_proxy setting ...

No-op? Remove / leave empty ..?! It sounds to me like a fat jenkins bug...?!

Gabe - from the code snippet (2018-05-02 14:44:08 EDT), this is an even worse bug, because the global settings in /etc/origin/master/master-config.yaml:

```yaml
- name: HTTP_PROXY
  value: http://user:pass@xxxxxx.com:80
- name: HTTPS_PROXY
  value: http://user:pass@xxxxxx.com:80
- name: NO_PROXY
  value:
```

are NOT pushed to the slave ... :( So realistically the only chance we have is to start setting env vars?! ... which would be http_proxy and no_proxy then ... BUT ... what's the codebase to pick up no_proxy?

[Ben Parees]

I'm not sure how the variables are being set on either the jenkins master or the jenkins slave, but I'm pretty sure they do not come from the master-config. (I would have to see the context around the excerpt you provided to know for sure what that is setting.) I also need to take a step back in this thread to understand what behavior is desired. My understanding of the current issue is that "no_proxy" is not being respected by the Jenkins slave process, despite it being set (at least as lowercase) per the pod yaml you supplied. As a result, the jenkins slave process communication is going to the proxy and getting 407ed by the proxy.

So I think the next steps are:

1) Try manually setting NO_PROXY (since you already seem to have no_proxy set) to at least confirm that works (Gabe can continue checking the code to see what form of no_proxy it *should* respect).

2) We need to understand where all your proxy variables are coming from on the master and slave, to see if there's an openshift issue here with what we're setting up. Again, I do not think they are coming from your master-config.yaml.

Ben
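The remoting snippet quoted in this exchange consults only the lower-case `http_proxy` via `System.getenv`, which is case-sensitive on Linux. A defensive lookup that tolerates either spelling can be sketched as follows (illustrative Python, not Jenkins code; `get_proxy_var` is a hypothetical helper):

```python
import os

# Sketch: on Linux, environment lookups are case-sensitive, so code that
# reads only "http_proxy" misses a value exported only as HTTP_PROXY.
# Checking both spellings sidesteps the mismatch seen in this bug.

def get_proxy_var(name, environ=None):
    env = os.environ if environ is None else environ
    # Prefer the lower-case spelling (what Jenkins remoting reads),
    # then fall back to the upper-case one.
    return env.get(name.lower()) or env.get(name.upper())

# Simulated container environment with only the upper-case variable set.
fake_env = {"HTTP_PROXY": "http://proxy.example.com:80"}
print(get_proxy_var("http_proxy", fake_env))  # http://proxy.example.com:80
```

This is why the safest practice, as Ben noted earlier in the thread, is to export both cases of every proxy variable.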
re 1) try manually setting NO_PROXY (since you already seem to have no_proxy set) to at least confirm that works (Gabe can continue checking the code to see what form of no_proxy it *should* respect)
> how? where is the pod config template (it's not of type template) - so I have NO idea how to set it..
Should I set it on the jenkins DC? ... if so - with *.svc, or just .svc.
Seriously - the documentation (and support) for a pretty simple proxy case sucks!
we are completely dead in the water migrating from an AWS deployment of OC - without proxy inhouse.
Ben
re I'm not sure how the variables are being set on either the jenkins master or the jenkins slave, but i'm pretty sure they do not come from the master-config. (I would have to see the context around the excerpt you provided to know for sure what that is setting).
> we are NOT setting anything on the DC of jenkins - when we build the image - the env vars are injected without us doing anything in the BC or alike
Ok, that makes sense. They are being injected into the image when you build the image, because your master-config has build default env vars set up (presumably that is the section of the master-config you pasted). So those proxy/no_proxy env vars are baked directly into your slave image right now; they aren't coming from jenkins or the pod definition. If you haven't rebuilt your image since editing the master-config, you will need to do so, or the changes you made will not be reflected in the image you built. I would be interested to see the output of a docker inspect on your slave image, just to confirm what env vars are baked into it.

And to summarize, I think there are a few issues here:

1) Let's find a no_proxy/NO_PROXY env var that works when it is set in the slave container, assuming the jenkins slave process respects any such env var. Right now I think that means rebuilding your slave now that your master-config contains properly defined "no_proxy" and "NO_PROXY" build default env vars.

2) Setting those vars by baking them into the image isn't an ideal solution (I realize you probably didn't do it intentionally); Jenkins and its slaves should be configured properly with the proxy/noproxy information. That said, I think there are some limitations around doing that today with the proxy plugin for Jenkins. Gabe can confirm/elaborate, but I view that as a followup after we get *something* working.

[Clemens Utschig-Utschig]

Hey Ben, I am re-building master and slave now (ps - I have no idea how we can disable this automated injection into the build), and will verify tomorrow morning CET the settings. Can you tell me what I am supposed to look at? I will run env again to see what's in the master and what's in the slave image.

Best, Clemens

[Ben Parees]

> ps - I have no idea how we can disable this automated injection into the build

You'd have to remove the builddefaulter configuration that is currently doing it: https://docs.openshift.org/latest/install_config/build_defaults_overrides.html#manually-setting-global-build-defaults

> can you tell me what I am supposed to look at?

I'd like to see two things:

1) the output of env from within the slave pod (as you provided earlier)
2) the output of docker inspect yourslaveimage:tag (this will show us the env vars that are baked into the image itself)

If you can't run docker inspect, you can also get this information via "oc describe istag yourslaveimagestream:tag", assuming you're pushing the slave image to an openshift imagestream. (The information will be reported as Docker Labels.)

(Btw, removing that config from your builddefaulter will likely cause your builds to start failing, since I assume you need those proxy settings for your builds to succeed. The issue is that those env vars will both be present during the build and also baked into the resulting image produced by the build; the latter can cause confusion.)
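For reference, the build-default injection Ben describes is configured via the BuildDefaults admission plugin in master-config.yaml. A fragment of roughly that shape, reconstructed from the OpenShift 3.x documentation linked above (the proxy values here are placeholders, not this cluster's actual settings):

```yaml
admissionConfig:
  pluginConfig:
    BuildDefaults:
      configuration:
        apiVersion: v1
        kind: BuildDefaultsConfig
        env:
        - name: HTTP_PROXY
          value: http://user:pass@proxy.example.com:80
        - name: HTTPS_PROXY
          value: http://user:pass@proxy.example.com:80
        - name: NO_PROXY
          value: .svc,.local,localhost
        - name: no_proxy
          value: .svc,.local,localhost
```

Because these env vars are applied at build time, they end up baked into any image produced by a build on the cluster, which is exactly how they reached the slave image here.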
master env LC_ALL=en_US.UTF-8 NO_PROXY=.svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.4 KUBERNETES_PORT_53_UDP=udp://10.250.0.1:53 KUBERNETES_PORT_53_TCP_PORT=53 http_proxy=http://x2inhocproxy:xxxxxxxx@inhproxy.eu.boehringer.com:80 JENKINS_JNLP_PORT_50000_TCP_PORT=50000 KUBERNETES_PORT_53_UDP_ADDR=10.250.0.1 JENKINS_UC=https://updates.jenkins-ci.org OPENSHIFT_BUILD_NAMESPACE=bix-shared JENKINS_SERVICE_PORT=80 JENKINS_JNLP_PORT=tcp://10.250.27.49:50000 HTTPS_PROXY=http://x2inhocproxy:xxxxxxxxx@inhproxy.eu.boehringer.com:80 https_proxy=http://x2inhocproxy:xxxxxxxxx@inhproxy.eu.boehringer.com:80 JENKINS_SERVICE_HOST=10.250.133.71 KUBERNETES_MASTER=https://kubernetes.default:443 no_proxy=.svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.4 HTTP_PROXY=http://x2inhocproxy:xxxxxxx@inhproxy.eu.boehringer.com:80 slave env (as fast as I could grab it) KUBERNETES_PORT_53_UDP_PROTO=udp KUBERNETES_PORT_443_TCP_PROTO=tcp KUBERNETES_SERVICE_PORT_HTTPS=443 no_proxy=.svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0 /16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10 .183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.4 HTTP_PROXY=http://x2inhocproxy:xxxxxxxxxxx@inhproxy.eu.boehringer.com:8 0 JENKINS_JNLP_SERVICE_PORT_AGENT=50000 JENKINS_PORT_80_TCP_PORT=80 JENKINS_PORT=tcp://10.250.133.71:80 JENKINS_JNLP_PORT_50000_TCP=tcp://10.250.27.49:50000 KUBERNETES_PORT_53_TCP_PROTO=tcp KUBERNETES_SERVICE_PORT_DNS_TCP=53 KUBERNETES_PORT_443_TCP_ADDR=10.250.0.1 JENKINS_PORT_80_TCP_PROTO=tcp KUBERNETES_PORT_443_TCP=tcp://10.250.0.1:443 JENKINS_JNLP_SERVICE_PORT=50000 container=oci 
JENKINS_PORT_80_TCP_ADDR=10.250.133.71 C:\Users\utschig>oc describe istag jenkins:latest Image Name: sha256:3aaf2d384d0f19ac61dc81ea97177b243ad91e7e03660b9869fbe099e87962a9 Docker Image: docker-registry.default.svc:5000/bix-shared/jenkins@sha256:3aaf2d384d0f19ac61dc81ea97177b243ad91e7e03660b9869fbe099e87962a9 Name: sha256:3aaf2d384d0f19ac61dc81ea97177b243ad91e7e03660b9869fbe099e87962a9 Created: 10 hours ago Annotations: openshift.io/image.managed=true Image Size: 436.3 MB (first layer 254.9 kB, last binary layer 74.83 MB) Image Created: 10 hours ago Author: <none> Arch: amd64 Command: /usr/libexec/s2i/run Working Dir: <none> User: 1001 Exposes Ports: 50000/tcp, 8080/tcp Docker Labels: architecture=x86_64 authoritative-source-url=registry.access.redhat.com build-date=2017-09-01T16:17:26.812452 com.redhat.build-host=ip-10-29-120-11.ec2.internal com.redhat.component=openshift-jenkins-2-docker description=Jenkins is a continuous integration server distribution-scope=public io.k8s.description=Jenkins is a continuous integration server io.k8s.display-name=Jenkins 2 io.openshift.build.commit.author=Utschig-Utschig,Clemens (IT) BIG-AT-V \u003cclemens.utschig-utschig\u003e io.openshift.build.commit.date=Fri Jan 5 15:29:14 2018 +0000 io.openshift.build.commit.id=7c335219ed3fdae83fd0b8f87cf0c69d669faf88 io.openshift.build.commit.message=kube-slave-common.sh - replace more IPs with services io.openshift.build.commit.ref=master io.openshift.build.name=bixjenkins-8 io.openshift.build.namespace=bix-shared io.openshift.build.source-context-dir=jenkins-customization io.openshift.expose-services=8080:http io.openshift.s2i.scripts-url=image:///usr/libexec/s2i io.openshift.tags=jenkins,jenkins2,ci name=openshift3/jenkins-2-rhel7 release=17 summary=Provides the latest release of Red Hat Enterprise Linux 7 in a fully featured and supported base image. 
url=https://access.redhat.com/containers/#/registry.access.redhat.com/openshift3/jenkins-2-rhel7/images/v3.6.173.0.21-17 vcs-ref=0459742e070cfe8410f0b0b2cf72a3b87d020fb8 vcs-type=git vendor=Red Hat, Inc. version=v3.6.173.0.21 Environment: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin container=oci JENKINS_VERSION=2.46 HOME=/var/lib/jenkins JENKINS_HOME=/var/lib/jenkins JENKINS_UC=https://updates.jenkins-ci.org LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 HTTP_PROXY=http://x2inhocproxy:xxxxxx@inhproxy.eu.boehringer.com:80 HTTPS_PROXY=http://x2inhocproxy:xxxxx@inhproxy.eu.boehringer.com:80 NO_PROXY=.svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.4 http_proxy=http://x2inhocproxy:xxxxx@inhproxy.eu.boehringer.com:80 https_proxy=http://x2inhocproxy:xxxxx@inhproxy.eu.boehringer.com:80 no_proxy=.svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.4 JAVA_OPTS=-Dhudson.tasks.MailSender.SEND_TO_UNKNOWN_USERS=true -Dhudson.tasks.MailSender.SEND_TO_USERS_WITHOUT_READ=true OPENSHIFT_BUILD_NAME=bixjenkins-8 OPENSHIFT_BUILD_NAMESPACE=bix-shared OPENSHIFT_BUILD_SOURCE=https://bitbucket.bix-digital.com/scm/cicd/bixjenkins.git OPENSHIFT_BUILD_COMMIT=7c335219ed3fdae83fd0b8f87cf0c69d669faf88 Volumes: /var/lib/jenkins C:\Users\utschig>oc describe istag jenkins-nodejs-6-angular:latest Image Name: sha256:87aa6e93a6d88ce479fe7e273cc0fc4fadd0a3ba5822c277e49b4ac7e3ce07b2 Docker Image: docker-registry.default.svc:5000/cd/jenkins-nodejs-6-angular@sha256:87aa6e93a6d88ce479fe7e273cc0fc4fadd0a3ba5822c277e49b4ac7e3ce07b2 Name: sha256:87aa6e93a6d88ce479fe7e273cc0fc4fadd0a3ba5822c277e49b4ac7e3ce07b2 Created: 10 hours ago Annotations: 
openshift.io/image.managed=true Image Size: 1.214 GB (first layer 166.6 MB, last binary layer 74.87 MB) Image Created: 10 hours ago Author: Richard Attermeyer <richard.attermeyer> Arch: amd64 Entrypoint: /usr/local/bin/run-jnlp-client Working Dir: <none> User: 1001 Exposes Ports: <none> Docker Labels: License=GPLv2+ architecture=x86_64 authoritative-source-url=registry.access.redhat.com build-date=2018-04-18T04:07:14.688798 com.redhat.build-host=ip-10-29-120-29.ec2.internal com.redhat.component=jenkins-slave-nodejs-rhel7-docker description=The jenkins slave base image is intended to be built on top of, to add your own tools that your jenkins job needs. The slave base image includes all the jenkins logic to operate as a slave, so users just have to yum install any additional packages their specific jenkins job will need distribution-scope=public io.k8s.description=The jenkins slave nodejs image has the nodejs tools on top of the jenkins slave base image. io.k8s.display-name=Jenkins Slave Nodejs io.openshift.build.commit.author=Schweikert,Christian (IT BI X) BIX-DE-I \u003cchristian.schweikert\u003e io.openshift.build.commit.date=Thu Apr 12 15:18:10 2018 +0000 io.openshift.build.commit.id=846f0913831705b2f952748cedc86148774887cc io.openshift.build.commit.message=Merge pull request #5 in CICD/dockerimages-jenkins-slaves from feature/BIX-342.. io.openshift.build.commit.ref=master io.openshift.build.name=nodejs-6-angular-slave-32 io.openshift.build.namespace=cd io.openshift.build.source-context-dir=nodejs-6-angular io.openshift.tags=openshift,jenkins,slave,nodejs name=openshift3/jenkins-slave-nodejs-rhel7 release=4 summary=Provides the latest release of Red Hat Enterprise Linux 7 in a fully featured and supported base image. url=https://access.redhat.com/containers/#/registry.access.redhat.com/openshift3/jenkins-slave-base-rhel7/images/v3.6.173.0.113-3 vcs-ref=59fe52c7eb78ada3d2ba6ce9ec3be55656001a74 vcs-type=git vendor=Red Hat, Inc. 
version=3.6 Environment: PATH=/home/jenkins/node_modules/.bin/:/home/jenkins/.npm-global/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin container=oci HOME=/home/jenkins HTTP_PROXY=http://x2inhocproxy:xxxxxxxxxx@inhproxy.eu.boehringer.com:80 HTTPS_PROXY=http://x2inhocproxy:xxxxxxxxx@inhproxy.eu.boehringer.com:80 NO_PROXY=.svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.4 http_proxy=http://x2inhocproxy:xxxxxxxxxx@inhproxy.eu.boehringer.com:80 https_proxy=http://x2inhocproxy:xxxxxxxxx@inhproxy.eu.boehringer.com:80 no_proxy=.svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.4 NODEJS_VERSION=6.10 NPM_CONFIG_PREFIX=/home/jenkins/.npm-global BASH_ENV=/usr/local/bin/scl_enable ENV=/usr/local/bin/scl_enable PROMPT_COMMAND=. /usr/local/bin/scl_enable OPENSHIFT_BUILD_NAME=nodejs-6-angular-slave-32 OPENSHIFT_BUILD_NAMESPACE=cd OPENSHIFT_BUILD_SOURCE=https://bitbucket.bix-digital.com/scm/cicd/dockerimages-jenkins-slaves.git OPENSHIFT_BUILD_COMMIT=846f0913831705b2f952748cedc86148774887cc still failing - same error /usr/local/bin/scl_enable: line 3: scl_source: No such file or directory May 03, 2018 6:40:03 AM hudson.remoting.jnlp.Main createEngine INFO: Setting up slave: nodejs-6-angular-15002fa5876cd May 03, 2018 6:40:03 AM hudson.remoting.jnlp.Main$CuiListener <init> INFO: Jenkins agent is running in headless mode. 
May 03, 2018 6:40:03 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://jenkins.bns-cd.svc:80]
May 03, 2018 6:40:03 AM hudson.remoting.jnlp.Main$CuiListener error
SEVERE: http://jenkins.bns-cd.svc:80/tcpSlaveAgentListener/ is invalid: 407 Proxy Authentication Required
java.io.IOException: http://jenkins.bns-cd.svc:80/tcpSlaveAgentListener/ is invalid: 407 Proxy Authentication Required
 at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:165)
 at hudson.remoting.Engine.innerRun(Engine.java:335)
 at hudson.remoting.Engine.run(Engine.java:287)

ok so I see both the no_proxy and NO_PROXY env vars in your slave image but I don't see them in the pod env you grabbed. I am guessing that's because the node already had an old version of the slave image and thus did not pull the new one. I think you will need to enable the force pull option[1] in your slave pod template to ensure your slave is running w/ the updated image. Sorry I didn't think about that earlier.

If that still doesn't work we'll have to wait for the results of Gabe's investigation into whether the slave client properly respects no_proxy/NO_PROXY.

[1] https://github.com/jenkinsci/kubernetes-plugin/blob/master/src/main/resources/org/csanchez/jenkins/plugins/kubernetes/ContainerTemplate/config.jelly#L17

see comment: 2018-05-03 02:36:10 EDT slave env (as fast as I could grab it) - this shows what I could get quickly .. it contains no_proxy :)

But it doesn't contain NO_PROXY, right? (even though your new image does, which would imply the pod didn't run the new image)

Ben - I am rebuilding the image now as well - right now I am just not fast enough to grab the whole env, but I am pretty sure it's there. I have verified the latest image is pulled - it just stops too fast to grab the env - but describe shows that the ENV is there.
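The earlier back-and-forth about whether the lowercase or uppercase spelling matters can be made concrete. A minimal editorial sketch (not Jenkins code) of a lookup that consults both spellings, which is what robust tooling has to do given that different software honors different casings:

```java
public class NoProxyLookup {
    // Prefer the lower-case spelling, fall back to the upper-case one.
    // Tools that consult only one casing are exactly what this thread tripped over.
    static String effectiveNoProxy(String lower, String upper) {
        if (lower != null && !lower.isEmpty()) return lower;
        if (upper != null && !upper.isEmpty()) return upper;
        return "";
    }

    public static void main(String[] args) {
        // In a real agent these would come from System.getenv("no_proxy")
        // and System.getenv("NO_PROXY").
        System.out.println(effectiveNoProxy(null, ".svc,.local"));   // falls back to NO_PROXY
        System.out.println(effectiveNoProxy(".svc", ".svc,.local")); // lower-case wins
    }
}
```

This is why the safest setup, as noted above, is simply to export both `no_proxy` and `NO_PROXY` with identical values.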
Seems like no_proxy is supposed to work: https://issues.jenkins-ci.org/plugins/servlet/mobile#issue/jenkins-32326

Gabe, we may need to try to recreate this locally (set garbage proxy env vars and then set no_proxy and ensure the slave can reach the master)

i checked and no_proxy is set when doing env on the slave pod:

no_proxy=.svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.

running with Jenkins ver. 2.46.3 - which is pulled thru openshift3/jenkins-2-rhel7 - the fix seems to be 2.9 and above?! ...

build yaml:

apiVersion: v1
kind: BuildConfig
metadata:
  creationTimestamp: '2018-01-05T08:49:53Z'
  labels:
    build: bixjenkins
  name: bixjenkins
  namespace: bix-shared
  resourceVersion: '25618183'
  selfLink: /oapi/v1/namespaces/bix-shared/buildconfigs/bixjenkins
  uid: 65b79010-f1f5-11e7-8d02-0050569e2dbf
spec:
  nodeSelector: null
  output:
    to:
      kind: ImageStreamTag
      name: 'jenkins:latest'
  postCommit: {}
  resources: {}
  runPolicy: Serial
  source:
    contextDir: jenkins-customization
    git:
      uri: 'https://cd_user@bitbucket.bix-digital.com/scm/cicd/bixjenkins.git'
    sourceSecret:
      name: cd-user-pwd
    type: Git
  strategy:
    dockerStrategy:
      from:
        kind: ImageStreamTag
        name: 'jenkins:2'
        namespace: openshift
    type: Docker
  triggers: []
status:
  lastVersion: 8

2.9 is pretty old, so i would expect the openshift jenkins image to include the fix if indeed that is the release it was delivered in. I don't think we ever shipped anything older than 2.35 or so.

Cloning "https://bitbucket.bix-digital.com/scm/cicd/bixjenkins.git" ...
Commit: 7c335219ed3fdae83fd0b8f87cf0c69d669faf88 (kube-slave-common.sh - replace more IPs with services)
Author: Utschig-Utschig,Clemens (IT) BIG-AT-V <clemens.utschig-utschig>
Date: Fri Jan 5 15:29:14 2018 +0000
Step 1 : FROM registry.access.redhat.com/openshift3/jenkins-2-rhel7@sha256:c47b5d8c9ba8a57255e5191cbf0ed9e0cb998bc823846ba52c34cca11a3cf2a0
 ---> 8789fa88d268

interesting is that it does not grab latest (although there is no latest tag) .. i.e. no_proxy ...

yes Ben, I noted earlier that the lower case form was supposed to work, not the upper case form. And yeah, any Jenkins JIRAs around this that I found were supposed to have been resolved in earlier versions than the ones we have shipped in 3.6, which is why I didn't mention them.

I'll start working on recreates today. In addition to the end to end test, I should be able to pull out the regexp logic in org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.inNoProxyEnvVar(String) and put it in a simple test program, and then start feeding it Clemens' input of:

no_proxy=.svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.

for the regexp and http://jenkins.bns-cd.svc:80/tcpSlaveAgentListener/ for the candidate to match

looks like : openshift3/jenkins-2-rhel7 always resolves to https://access.redhat.com/containers/#/registry.access.redhat.com/openshift3/jenkins-2-rhel7/images/v3.6.173.0.21-17 which is really really weird ... ?!

slave image used: https://access.redhat.com/containers/#/registry.access.redhat.com/openshift3/jenkins-slave-base-rhel7/images/v3.6.173.0.113-3

OK, my experiment putting the jenkins no proxy regexp logic into a simple main program revealed what is going on, and allowed me to quickly experiment with various permutations.
First, turns out, the JENKINS_URL value of 'http://jenkins.bns-cd.svc:80' is *NOT* passing the regexp logic they employ when running against `no_proxy=.svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.`

I also confirmed that the same algorithm is used from the latest 2.107 code (and the remoting dependency of 3.14 that it has) to 2.46.3, etc. (and the remoting dependency of 3.7 that it has).

I was able to get the no_proxy match to work when I
1) stripped the "http://" prefix and ":80" suffix ... reducing the string to "jenkins.bns-cd.svc"
2) changed ".svc" to "bns-cd.svc" in the no_proxy setting ... it doesn't like a "single element" domain.

Here is my simple program that pulls the org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.inNoProxyEnvVar(String) logic into a main program:

public class TestNoProxyRegexp {

    public static void main(String[] args) {
        boolean rc = false;
        String host = "jenkins.bnd-cd.svc";
        //String host = "http://jenkins.bnd-cd.svc";
        //String host = "jenkins.bns-cd.svc:80";
        //String host = "http://jenkins.bns-cd.svc:80/tcpSlaveAgentListener/";
        //String host = "http://jenkins.bns-cd.svc:80/";
        //String host = "http://jenkins.bns-cd.svc";
        //String noProxy = ".svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.";
        String noProxy = "bnd-cd.svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.";

        noProxy = noProxy.trim()
                // Remove spaces
                .replaceAll("\\s+", "")
                // Convert .foobar.com to foobar.com
                .replaceAll("((?<=^|,)\\.)*(([a-z0-9]+(-[a-z0-9]+)*\\.)+[a-z]{2,})(?=($|,))", "$2");

        if (!noProxy.isEmpty()) {
            // IPV4 and IPV6
            if (host.matches("^(?:[0-9]{1,3}\\.){3}[0-9]{1,3}$")
                    || host.matches("^(?:[a-fA-F0-9]{1,4}:){7}[a-fA-F0-9]{1,4}$")) {
                rc = noProxy.matches(".*(^|,)\\Q" + host + "\\E($|,).*");
                System.out.println("GGM checkpoint 1 " + rc);
            } else {
                int depth = 0;
                // Loop while we have a valid domain name: acme.com
                // We add a safeguard to avoid a case where the host would always be valid because the regex would
                // for example fail to remove subdomains.
                // According to Wikipedia (no RFC defines it), 128 is the max number of subdivision for a valid
                // FQDN: https://en.wikipedia.org/wiki/Subdomain#Overview
                while (host.matches("^([a-z0-9]+(-[a-z0-9]+)*\\.)+[a-z]{2,}$") && depth < 128) {
                    ++depth;
                    // Check if the no_proxy contains the host
                    if (noProxy.matches(".*(^|,)\\Q" + host + "\\E($|,).*")) {
                        rc = true;
                        System.out.println("GGM checkpoint 2");
                        break;
                    }
                    // Remove first subdomain: master.jenkins.acme.com -> jenkins.acme.com
                    else {
                        host = host.replaceFirst("^[a-z0-9]+(-[a-z0-9]+)*\\.", "");
                        System.out.println("GGM checkpoint 3, host name " + host);
                    }
                }
            }
        }
        System.out.println("GGM " + rc);
    }
}

Oops ... forgot one point: they use java.net.URL.getHost() before calling the above logic, so that should strip the "http://" and ":80". But the change to no_proxy, changing .svc to bns-cd.svc, would be what is still needed.

Gabe - awesome that we are getting to a resolution - or at least know what's going wrong. As we set those env props globally, and they are injected into the build and people create jenkins instances on the fly - we have NO way of doing this (your workaround) scripted - (q1) as I don't think there is a way in openshift to reference in a DC an already existing ENV (or in our case prepend to the one that exists).

q2) Is CIDR working or not working (as in the no_proxy settings above) - I assume not, given that it's a regexp?!
That's a must fix as well - to allow dynamic addition of nodes.

q3) ENV is not honored on jenkins master as well (it should automatically populate the plugins / advanced / proxy section) - I assume.

So can we have this bug be the tracking bug to fix all this - clearly - right now jenkins is UNUSABLE behind a corporate proxy on kubernetes / openshift.

To address the earlier question about why :latest is pointing to a v3.6 image... after v3.6 we introduced version specific tags for all jenkins related images. This was to address compatibility issues in which the v3.7 images could not be used on v3.6 clusters. To ensure that v3.6 clusters which were already configured to use the "latest" tag would keep working, we decided to lock the "latest" tag to always point to v3.6, so that older clusters would never pick up the v3.7+ images and break (I think you were impacted by this break when it first occurred). So for a v3.7+ cluster, jenkins images (master+slave) should be referenced using a version specific tag that aligns w/ the cluster version.

regarding q1, no, you can't prepend, but you can certainly override the entire env value with a new one. Or of course you can rebuild your slave images w/ the new env value baked in. You can explicitly define env vars for slave pods via the slave pod template configuration also, and this would override the value that is baked into your images.

I'm not clear on what you mean by not being able to do it scripted. Are you concerned about the case where you need to change the proxy settings and roll them out after people have built their own jenkins slave images w/ the old values baked in?
regarding q2, to the extent that there's any standardization around it, no_proxy doesn't support CIDR in general: https://unix.stackexchange.com/questions/23452/set-a-network-range-in-the-no-proxy-environment-variable

So i'm not surprised the Jenkins implementation is also deficient in this respect. We can certainly open an issue against Jenkins to consider changing/supporting it.

I'll let Gabe speak to q3 but I know there are some complexities around proxy management in Jenkins and the recommended approach tends to be the proxy plugin, which carries its own limitations.

So all that said, clearly there are documentation deficiencies and less than ideal configuration steps, but I still believe we can provide you and your users with a working configuration either by:

1) baking the proper proxy related env vars into the image (now that Gabe has uncovered the nuances of the Jenkins no_proxy implementation)
2) baking the env vars into a slave pod template configuration that is installed as a configmap (which jenkins will then pick up automatically)
3) baking the env vars into a slave pod template configuration that is part of a custom jenkins configuration that is baked into a custom jenkins image

here is the setup we run. we have ONE image - that sits on top of the core openshift one - and people in our org create DCs in many different projects - whose names we don't know. so we cannot populate global settings with project names (e.g. bns-cd) as you suggested, because we don't know them. We also use the same base configuration (git repo & dockerfile) in a second cluster that is @AWS without proxy - so the only way I see on how to make this work is to get a code fix from you folks - that fixes this end 2 end for 3.6++ both for master and slave.

ok adding

  - name: no_proxy
    value: 'bns-cd.svc,default,localhost,local,17.0.0.1'

to the master jenkins DC - does NOT help.
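Ben's point that no_proxy has no general CIDR support can be illustrated: a suffix/equality match over comma-separated entries has no concept of an address range, so honoring an entry like 10.250.0.0/16 would require explicit CIDR arithmetic. An editorial sketch of what that arithmetic looks like (not code from any of the tools discussed here):

```java
public class CidrCheck {
    // True when ip falls inside cidr, e.g. inCidr("10.250.3.7", "10.250.0.0/16").
    static boolean inCidr(String ip, String cidr) {
        String[] parts = cidr.split("/");
        int prefix = Integer.parseInt(parts[1]);
        long ipBits = toBits(ip);
        long netBits = toBits(parts[0]);
        // Build a 32-bit mask with `prefix` leading ones.
        long mask = prefix == 0 ? 0 : (0xFFFFFFFFL << (32 - prefix)) & 0xFFFFFFFFL;
        return (ipBits & mask) == (netBits & mask);
    }

    // Pack dotted-quad notation into a 32-bit value.
    static long toBits(String ip) {
        long v = 0;
        for (String octet : ip.split("\\.")) v = (v << 8) | Integer.parseInt(octet);
        return v;
    }

    public static void main(String[] args) {
        System.out.println(inCidr("10.250.3.7", "10.250.0.0/16"));  // true
        System.out.println(inCidr("10.251.0.1", "10.250.0.0/16"));  // false
    }
}
```

A plain string comparison against "10.250.0.0/16" can never match a host like "10.250.3.7", which is why listing CIDR ranges in no_proxy works for OpenShift components that parse them but not for the Jenkins remoting code.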
We need an ETA for the codefix on this - we cannot introduce changing proxy settings in config maps all over the place, as it's a development cluster - and hence we will have potentially 100s of projects with a DC referencing the jenkins image.

Centered on the Jenkins aspects of this (ignoring the "how no_proxy and http_proxy get set" list of questions/concerns for the moment), I was able to set both http_proxy and no_proxy and get a slave based build to work (cumbersome as it was to do it). I also constructed a debug jenkins remoting.jar file that dumps out what is found for the env vars, what the regexp code decides when determining whether no_proxy should be applied, and ultimately whether a proxy connection is attempted for access to both the jenkins and jenkins-jnlp end points within the slave pod.

Quick details on my no_proxy setting:
- my pod template's jenkins url is set to the jenkins service IP ... i.e. the default when instantiating our vanilla template on an `oc cluster up`
- my no_proxy includes the IP of the jenkins and jenkins-jnlp services

**** those env vars have to be set on the slave pods ... either as env vars on the pod templates or env vars on the image used; neither Jenkins nor their K8s plugin does *ANYTHING* wrt taking those env vars, if they are set on master, and setting them on the slave pods ******** ... submitting a PR against the k8s plugin to allow for that somehow I think is warranted though.

And of course the use of IPs is not a long term solution for you Clemens, but some sort of baseline proof was needed to prove this stuff worked at all in Jenkins, however cumbersome it is. I played around with the value of the http_proxy setting to confirm it was getting used for URLs not in no_proxy, and I removed either the jenkins or jenkins-jnlp service IP from no_proxy and proved that the slave pod would fail with connection/comm errors.

The debug jar ....
- I updated a container running the registry.access.redhat.com/openshift3/jenkins-2-rhel7:v3.6.173.0.21-17, and oc rsync'ed my update of the remoting.jar 3.7 level
- I oc rsync'ed that jar to /tmp/target
- I then oc rsh'ed into the pod, went to /var/lib/jenkins/war/WEB-INF/lib, and removed the existing remoting-3.7.jar
- I copied over the remoting-3.7.jar I uploaded at /tmp/target to /var/lib/jenkins/war/WEB-INF/lib
- I restarted jenkins
- on my next build attempt, the slave pod logs had all my debug statements, showing whether the proxy was used, and whether the host in question was or was not in the no_proxy list
- I committed that container to an image, and pushed it to docker.io/gmontero/jenkins-2-rhel7-v3.6.173.0.21-17-with-debug:latest

Now, since /var/lib/jenkins is the image volume, none of my changes in that dir are included. But the /tmp/target contents are there.

Now for the request .... Clemens - I know this only addresses a subset of your concerns, and is a multi-step and manual task ... but to eliminate at least some of the concerns, is it possible for you to take the image I pushed to docker.io, launch your master with it, update the remoting jar and restart Jenkins, set the ENVs on either the pod template or slave image, and capture the slave POD logs (my debug statements have "GGM" in them) so we can at least confirm if the http_proxy / no_proxy settings are getting processed as expected? If so, and if we can sort out the Jenkins side of things at least, we can then zero in on the openshift side. thanks

hey gabe . super happy to do all that BUT .. can you please help me understand <set the ENVs on either the pod template> Where is this template hidden? - We cannot update the dockerfile we use to build the image as it's shared with a non proxy cluster @AWS. Let me know and I'll try the rest tmrw CET time.

hey gabe - question 2 - is there anything on the slave I need to do?
(except the template). or is this all based on the new remoting jar?

Hey Clemens,

Thanks for giving the repro a go.

As to the pod template, I am referring to the kubernetes plugin configuration within Jenkins. And yeah, I should describe the path to this in more detail for you.

So, via the "traditional" way, from the Jenkins Console's main screen, go to "Manage Jenkins", then "Configure System". The bottom of that screen will be the "Cloud" portion of the Jenkins configuration. You should see a "Kubernetes" cloud defined, with a set of global config settings for the kubernetes plugin.

You will then see an "Images" subsection, and a set of "Kubernetes Pod Template" settings. The openshift jenkins image ships 2 by default, one maven image and one nodejs image. The image that you have been referring to ... that should also be one of these images.

Under the "kubernetes pod templates" you will also see a set of "Containers" and then "Container template" entries. Within these "Container templates", you should see references to your specific image as well. Also, within the "Container template" you should see an "EnvVars" section, with a button to add env vars.

In my test, I injected the "http_proxy" and "no_proxy" env vars by setting those name / value pairs there, vs. building an image with those env vars baked in. Now, it should not matter ... I've seen jenkins pick up env vars from images and templates before..... *MAYBE* that is where the discrepancy in what we are seeing stems from. I'll go back on Monday and build my slave images with no_proxy and http_proxy baked in, and confirm that.
But in parallel, with your Jenkins restarted with the remoting jar from my image placed in /var/lib/jenkins/war/WEB-INF/lib, then, with either
a) your slave images running as is
b) or by setting the env vars in the kubernetes plugin configuration as described above
my new remoting jar will print out what it is finding in the JVM sys/env props, what decisions it has made wrt those props, and whether it is using proxies (those print statements will start with "GGM").

You should not need to do anything else with your slaves (i.e. your second question).

If you can get pod logs with the "GGM" prints, analyze them or upload them for me to look at as needed. I'll report back when I've obtained useful results from baking in no_proxy / http_proxy. Hopefully we can put the Jenkins side of things to rest after all this. thanks again

master : with your latest image - and NO changes - logs:

INFO: Jenkins agent is running in headless mode.
May 07, 2018 11:33:15 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://jenkins.bixpr-cd.svc:80]
May 07, 2018 11:33:15 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver openURLConnection
INFO: GGM openURLConnection http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ creds null proxy creds null prox http://x2inhocproxy:xxxxxxxx@inhproxy.eu.boehringer.com:80
May 07, 2018 11:33:15 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver inNoProxyEnvVar
INFO: GGM inNoProxyEnvVar host jenkins.bixpr-cd.svc no proxy .svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.4
May 07, 2018 11:33:15 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver inNoProxyEnvVar
INFO: GGM inNoProxyEnvVar host svc no proxy .svc,.default,.local,localhost,boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.4 returning(3) false
May 07, 2018 11:33:15 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver openURLConnection
INFO: GGM openURLConnection http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ opening conn with proxy HTTP @ inhproxy.eu.boehringer.com/10.183.157.6:80
May 07, 2018 11:33:15 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver openURLConnection
INFO: GGM openURLConnection http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ returning conn sun.net.www.protocol.http.HttpURLConnection:http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/
May 07, 2018 11:33:15 AM hudson.remoting.jnlp.Main$CuiListener error
SEVERE: http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ is invalid: 407 Proxy Authentication Required
java.io.IOException: http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ is invalid: 407 Proxy Authentication Required
 at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:166)
 at hudson.remoting.Engine.innerRun(Engine.java:335)
 at hudson.remoting.Engine.run(Engine.java:287)
May 07, 2018 11:33:18 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud$ProvisioningCallback call
SEVERE: Error in provisioning; slave=KubernetesSlave name: maven-ada5f716026, template=org.csanchez.jenkins.plugins.kubernetes.PodTemplate@4782b3
May 07, 2018 11:33:22 AM hudson.slaves.NodeProvisioner$2 run

change both jenkins service / jnlp to IP: slave log:

Using 64 bit Java since OPENSHIFT_JENKINS_JVM_ARCH is not set
Downloading http://10.250.108.230:80/jnlpJars/remoting.jar ...

and stops here - looks like it can't connect?!
service:
Selectors: name=jenkins
Type: ClusterIP
IP: 10.250.108.230
Hostname: jenkins.bixpr-cd.svc
Session affinity: None

May 07, 2018 11:40:02 AM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
INFO: Started provisioning Kubernetes Pod Template from openshift with 1 executors. Remaining excess workload: 0
May 07, 2018 11:40:02 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud$ProvisioningCallback call
INFO: Created Pod: maven-b39d5cdf92b
May 07, 2018 11:40:02 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud$ProvisioningCallback call
INFO: Waiting for Pod to be scheduled (0/100): maven-b39d5cdf92b
May 07, 2018 11:40:08 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud$ProvisioningCallback call
INFO: Waiting for slave to connect (0/100): maven-b39d5cdf92b

is there some magic miracle - naming convention or alike - that makes the download work?! ... because the jar is called *-3.7.jar

after ages - got the following err message (on the slave).

Using 64 bit Java since OPENSHIFT_JENKINS_JVM_ARCH is not set
Downloading http://10.250.108.230:80/jnlpJars/remoting.jar ...
Running java -XX:+UseParallelGC -XX:MaxPermSize=100m -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=40 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -XX:MaxMetaspaceSize=100m -cp /home/jenkins/remoting.jar hudson.remoting.jnlp.Main -headless -url http://10.250.108.230:80 -tunnel 10.250.28.120:50000 bea6648a2655796b8d956748b8d3e2ecb465daa68cdcd8fa544c47f1a4586b08 maven-c8dc38c2f98
Error: Could not find or load main class hudson.remoting.jnlp.Main

 1986 Fri May 04 16:46:32 UTC 2018 hudson/remoting/jnlp/GuiListener.class
 1852 Fri May 04 16:46:32 UTC 2018 hudson/remoting/jnlp/Main$CuiListener.class
 1765 Fri May 04 16:46:32 UTC 2018 hudson/remoting/jnlp/GuiListener$2.class
 1204 Fri May 04 16:46:32 UTC 2018 hudson/remoting/jnlp/MainMenu.class
  202 Fri May 04 16:46:32 UTC 2018 hudson/remoting/jnlp/Main$1.class
14370 Fri May 04 16:46:30 UTC 2018 hudson/remoting/jnlp/title.png
 3391 Fri May 04 16:46:32 UTC 2018 hudson/remoting/jnlp/MainDialog.class
 1260 Fri May 04 16:46:32 UTC 2018 hudson/remoting/jnlp/GuiListener$1.class
 2136 Fri May 04 16:46:32 UTC 2018 hudson/remoting/jnlp/GUI.class
 9585 Fri May 04 16:46:32 UTC 2018 hudson/remoting/jnlp/Main.class
  938 Fri May 04 16:46:32 UTC 2018 hudson/remoting/SocketInputStream.class
 1239 Fri May 04 16:46:32 UTC 2018 hudson/remoting/Request$Cancel.class

it's in there though ... but /home/jenkins is empty ... is there some naming convention .... ?!

if I point jenkins back to the CNAME instead of the IP address ... the jenkins slave comes up - and the well known error:

Using 64 bit Java since OPENSHIFT_JENKINS_JVM_ARCH is not set
Downloading http://jenkins.bixpr-cd.svc:80/jnlpJars/remoting.jar ...
Running java -XX:+UseParallelGC -XX:MaxPermSize=100m -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=40 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -XX:MaxMetaspaceSize=100m -cp /home/jenkins/remoting.jar hudson.remoting.jnlp.Main -headless -url http://jenkins.bixpr-cd.svc:80 -tunnel jenkins-jnlp.bixpr-cd.sv:50000 a9af87f068a79e7adcd69ee85bd006f55bc4c402c5f786a29dfeecc5448d0042 maven-d5602d52f71
May 07, 2018 12:18:45 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up slave: maven-d5602d52f71
May 07, 2018 12:18:45 PM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
May 07, 2018 12:18:45 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://jenkins.bixpr-cd.svc:80]
May 07, 2018 12:18:45 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver openURLConnection
INFO: GGM openURLConnection http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ creds null proxy creds null prox http://x2inhocproxy:xxxxxxxx@inhproxy.eu.boehringer.com:80
May 07, 2018 12:18:45 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver inNoProxyEnvVar
INFO: GGM inNoProxyEnvVar host jenkins.bixpr-cd.svc no proxy .svc,.default,.local,localhost,.boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.4
May 07, 2018 12:18:45 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver inNoProxyEnvVar
INFO: GGM inNoProxyEnvVar host svc no proxy .svc,.default,.local,localhost,boehringer.com,10.250.0.0/16,10.251.0.0/16,10.183.195.106,10.183.195.107,10.183.195.108,10.183.195.109,10.183.195.11,10.183.195.111,10.183.195.112,10.183.195.113,10.183.195.13,10.250.127.4 returning(3) false
May 07, 2018 12:18:45 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver openURLConnection
INFO: GGM openURLConnection http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ opening conn with proxy HTTP @ inhproxy.eu.boehringer.com/10.183.157.6:80
May 07, 2018 12:18:45 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver openURLConnection
INFO: GGM openURLConnection http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ returning conn sun.net.www.protocol.http.HttpURLConnection:http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/
May 07, 2018 12:18:46 PM hudson.remoting.jnlp.Main$CuiListener error
SEVERE: http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ is invalid: 407 Proxy Authentication Required
java.io.IOException: http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ is invalid: 407 Proxy Authentication Required
 at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:166)
 at hudson.remoting.Engine.innerRun(Engine.java:335)
 at hudson.remoting.Engine.run(Engine.java:287)

the only thing I did not try is to set the noproxy as env ... (here it's with the CIDR setting currently - which is supported by openshift).

Hi Clemens,

Both Ben and Gabe here (though Gabe is typing). So, we were able to make some inroads with the analysis. It is more and more looking like Jenkins' substandard regexp for no_proxy is the immediate road block.

1) As you probably saw, the debug in https://bugzilla.redhat.com/show_bug.cgi?id=1573648#c54 proved the analysis from last week. The Jenkins regexp for no_proxy does *NOT* account for a single segment domain like "svc", or ".svc" or "*svc". We clearly see it constructing a proxy URL.

2) Now ... why your use of IP failed to download the remoting.jar

This was insightful as well for us ... we use `curl` in our jenkins slave image to download the Jenkins remoting jar. As it turns out, it does honor both `http_proxy` and `no_proxy`.
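The divergence between the two matchers can be distilled into a small editorial sketch (simplified, not the actual Jenkins or curl source): a curl-style tail match accepts a single-label entry like ".svc", while a Jenkins-remoting-style strip-one-subdomain-and-compare loop never does, because a bare "svc" fails its FQDN regex before it can ever be compared:

```java
public class NoProxyMatchers {
    // curl-style: host is excluded when it ends with the entry
    // (a leading dot means "any subdomain of").
    static boolean curlStyle(String host, String entry) {
        return entry.startsWith(".")
                ? host.endsWith(entry)
                : host.equals(entry) || host.endsWith("." + entry);
    }

    // Jenkins-remoting-style (simplified): repeatedly strip the first label and
    // look for an exact entry match. The loop guard requires at least two labels,
    // so a single-label domain like "svc" is never compared against the list.
    static boolean jenkinsStyle(String host, String noProxy) {
        while (host.matches("^([a-z0-9]+(-[a-z0-9]+)*\\.)+[a-z]{2,}$")) {
            if (noProxy.matches(".*(^|,)\\Q" + host + "\\E($|,).*")) return true;
            host = host.replaceFirst("^[a-z0-9]+(-[a-z0-9]+)*\\.", "");
        }
        return false;
    }

    public static void main(String[] args) {
        String host = "jenkins.bns-cd.svc";
        System.out.println(curlStyle(host, ".svc"));          // true
        System.out.println(jenkinsStyle(host, ".svc"));       // false: ".svc" never matches
        System.out.println(jenkinsStyle(host, "bns-cd.svc")); // true: two-label entry works
    }
}
```

This is consistent with the workaround identified above: adding the two-label form "bns-cd.svc" satisfies the Jenkins loop, while curl was already happy with ".svc".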
When you switched to IPs, but did not update the `no_proxy` list (at least your comment https://bugzilla.redhat.com/show_bug.cgi?id=1573648#c55 did not include you saying you updated it), the curl download tries to go through the proxy and hangs/fails (the missing class is a red herring manifestation of that problem).

When I tried it with IPs, I included the precise IP addresses of both the jenkins and jenkins-jnlp services in no_proxy. When I do not include the service IPs in no_proxy on the slave pod, the download of the remoting jar fails in the same way as you saw.

Also, this revealed to us that `curl`'s regexp for the no_proxy env var is much better than Jenkins'. It *DOES* honor the use of ".svc" for example. We then went back and were in fact able to verify that with simple experiments from the command line.. we see curl handle no_proxy settings like ".svc" or ".com" through varying experiments setting those env vars in various fashion before executing the curl (including executing curl with verbose logging on).

Action items / resolution:

a) In your comment https://bugzilla.redhat.com/show_bug.cgi?id=1573648#c49 you said you added 'bns-cd.svc' to 'no_proxy' and it did *NOT* help. Did the project in fact change run to run, and it was no longer 'bns-cd'? Or is 'bns-cd' not the project name, and is it possible that the jenkins and jenkins-jnlp service hostnames end in different domain suffixes? I did NOT think that was the case based on prior data provided, but am asking now just in case. Or did you only add bns-cd.svc to the no_proxy for the master and it was not added to the no_proxy value in the slave image? It needs to be in the slave image. We see it set in the slave image in the output you provided today, but is there any chance it was not set when you tried https://bugzilla.redhat.com/show_bug.cgi?id=1573648#c49 ?

The jenkins slave boot up communicates twice with the jenkins service (i.e.
the "jenkins server url"), once to download the jar file and once to initiate communication, and then it initiates a connection to the jnlp service (or the "JENKINS_TUNNEL", as I saw in some of the data you provided). We want all three covered by "no_proxy". Seeing a run with the debug jar, using hostnames, where no_proxy is set on the slave and has the correct domain suffix like 'bns-cd.svc' for both the jenkins and jenkins-jnlp services, could provide closure on that element of this problem.

b) Assuming what is happening in a) is what we think is happening, and after reviewing the resolv.conf search patterns OpenShift specifies, Ben suggests changing the jenkins server url and jenkins jnlp url to end in "svc.cluster.local" and adding "svc.cluster.local" to the no_proxy value in the slave image (this can be done by updating the master-config and then rebuilding the slave images, based on what has been done previously). Or you can manipulate those settings from the kubernetes plugin configuration panel as I detailed before.

thanks

if I add

Using 64 bit Java since OPENSHIFT_JENKINS_JVM_ARCH is not set
Downloading http://jenkins.bixpr-cd.svc:80/jnlpJars/remoting.jar ...
Running java -XX:+UseParallelGC -XX:MaxPermSize=100m -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=40 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -XX:MaxMetaspaceSize=100m -cp /home/jenkins/remoting.jar hudson.remoting.jnlp.Main -headless -url http://jenkins.bixpr-cd.svc:80 -tunnel jenkins-jnlp.bixpr-cd.svc:50000 12781525397c8ceabb34632ba1d4d2394d340a7a4d83d834d92a8ad9586a4c3b maven-576cd988d3015
May 08, 2018 6:55:30 AM hudson.remoting.jnlp.Main createEngine
INFO: Setting up slave: maven-576cd988d3015
May 08, 2018 6:55:32 AM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
May 08, 2018 6:55:34 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://jenkins.bixpr-cd.svc:80]
May 08, 2018 6:55:34 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver openURLConnection
INFO: GGM openURLConnection http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ creds null proxy creds null prox http://x2inhocproxy:Am28UaKrTHpbqC:9HhAu@inhproxy.eu.boehringer.com:80
May 08, 2018 6:55:34 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver inNoProxyEnvVar
INFO: GGM inNoProxyEnvVar host jenkins.bixpr-cd.svc no proxy .bixpr-cd.svc,eu.boehringer.com
May 08, 2018 6:55:34 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver inNoProxyEnvVar
INFO: GGM inNoProxyEnvVar host bixpr-cd.svc no proxy bixpr-cd.svc,eu.boehringer.com returning(2) true
May 08, 2018 6:55:34 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver openURLConnection
INFO: GGM openURLConnection http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ opening without proxy
May 08, 2018 6:55:36 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver openURLConnection
INFO: GGM openURLConnection http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ returning conn sun.net.www.protocol.http.HttpURLConnection:http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/
May 08, 2018 6:56:36 AM hudson.remoting.jnlp.Main$CuiListener error
SEVERE: Read timed out
java.net.SocketTimeoutException: Read timed out
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
	at java.net.SocketInputStream.read(SocketInputStream.java:171)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
	at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
	at
	sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
	at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
	at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:164)
	at hudson.remoting.Engine.innerRun(Engine.java:335)
	at hudson.remoting.Engine.run(Engine.java:287)

Never mind - some network hiccup.

Using 64 bit Java since OPENSHIFT_JENKINS_JVM_ARCH is not set
Downloading http://jenkins.bixpr-cd.svc:80/jnlpJars/remoting.jar ...
Running java -XX:+UseParallelGC -XX:MaxPermSize=100m -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=40 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -XX:MaxMetaspaceSize=100m -cp /home/jenkins/remoting.jar hudson.remoting.jnlp.Main -headless -url http://jenkins.bixpr-cd.svc:80 -tunnel jenkins-jnlp.bixpr-cd.svc:50000 38297bf1d7dec5d085bbc31a8f3195d2e1390d249c96a7688e309f7aa5814cfc maven-576ee3c0457b1
May 08, 2018 6:57:14 AM hudson.remoting.jnlp.Main createEngine
INFO: Setting up slave: maven-576ee3c0457b1
May 08, 2018 6:57:14 AM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
May 08, 2018 6:57:15 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://jenkins.bixpr-cd.svc:80]
May 08, 2018 6:57:15 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver openURLConnection
INFO: GGM openURLConnection http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ creds null proxy creds null prox http://x2inhocproxy:Am28UaKrTHpbqC:9HhAu@inhproxy.eu.boehringer.com:80
May 08, 2018 6:57:15 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver inNoProxyEnvVar
INFO: GGM inNoProxyEnvVar host jenkins.bixpr-cd.svc no proxy .bixpr-cd.svc,eu.boehringer.com
May 08, 2018 6:57:15 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver inNoProxyEnvVar
INFO: GGM inNoProxyEnvVar host bixpr-cd.svc no proxy bixpr-cd.svc,eu.boehringer.com returning(2) true
May 08, 2018 6:57:15 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver openURLConnection
INFO: GGM openURLConnection http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ opening without proxy
May 08, 2018 6:57:15 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver openURLConnection
INFO: GGM openURLConnection http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/ returning conn sun.net.www.protocol.http.HttpURLConnection:http://jenkins.bixpr-cd.svc:80/tcpSlaveAgentListener/
May 08, 2018 6:57:16 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, CLI2-connect, JNLP-connect, Ping, CLI-connect, JNLP2-connect]
May 08, 2018 6:57:16 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful
  Agent address: jenkins-jnlp.bixpr-cd.svc
  Agent port: 50000
  Identity: e0:9d:d9:c0:02:78:a0:53:40:a5:d5:03:38:68:16:90
May 08, 2018 6:57:16 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
May 08, 2018 6:57:16 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to jenkins-jnlp.bixpr-cd.svc:50000
May 08, 2018 6:57:16 AM
org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver getResolvedHttpProxyAddress
INFO: GGM getResolvedHttpProxyAddress host jenkins-jnlp.bixpr-cd.svc port 50000 proxies java.util.ArrayList$Itr@71265072
May 08, 2018 6:57:16 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver getResolvedHttpProxyAddress
INFO: GGM getResolvedHttpProxyAddress host jenkins-jnlp.bixpr-cd.svc port 50000 proxy http://x2inhocproxy:Am28UaKrTHpbqC:9HhAu@inhproxy.eu.boehringer.com:80
May 08, 2018 6:57:16 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver inNoProxyEnvVar
INFO: GGM inNoProxyEnvVar host jenkins-jnlp.bixpr-cd.svc no proxy .bixpr-cd.svc,eu.boehringer.com
May 08, 2018 6:57:16 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver inNoProxyEnvVar
INFO: GGM inNoProxyEnvVar host bixpr-cd.svc no proxy bixpr-cd.svc,eu.boehringer.com returning(2) true
May 08, 2018 6:57:16 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver getResolvedHttpProxyAddress
INFO: GGM getResolvedHttpProxyAddress host jenkins-jnlp.bixpr-cd.svc port 50000 returning null
May 08, 2018 6:57:16 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
May 08, 2018 6:58:15 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: e0:9d:d9:c0:02:78:a0:53:40:a5:d5:03:38:68:16:90
May 08, 2018 6:58:15 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected

It's working now, it seems ... ;)

So - what have we learnt:
a) proxy settings are not propagated from the master
b) no_proxy needs to be set on the slave pod template / or as an env var
c) no_proxy does NOT honor a simple domain ending (.svc)
d) no_proxy does NOT honor CIDR notation
e) the curl of the remoting jar breaks when no_proxy is wrong (at least the error message does, if it does not work)

We cannot start changing global service settings (as we are migrating between two clusters and don't want to change stable cnames in the services).
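For lesson (b) above, the workaround that made this run succeed amounts to exporting explicit proxy exceptions in the agent environment itself. A minimal sketch, assuming the service hostnames from this bug report (substitute your own project and service names):

```shell
#!/bin/sh
# Minimal sketch for lesson (b): set explicit proxy exceptions in the agent
# (slave) environment. Hostnames are examples from this bug report.
no_proxy="jenkins.bns-cd.svc,jenkins-jnlp.bns-cd.svc,.bns-cd.svc"
NO_PROXY="$no_proxy"   # set both cases: different tools honor different ones
export no_proxy NO_PROXY

echo "no_proxy=$no_proxy"
```

Per lesson (c), the individual service hostnames must be listed explicitly here, since Jenkins remoting at this point cannot match on a bare ".svc" suffix the way curl can.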
So we need fixes for this whole thing (as we also cannot predict the project names that will be used - e.g. bixpr-cd).

I have some at least "partial" updates:

1) I dropped https://github.com/openshift/jenkins/commit/4d27b1113496a71a3a1420b28b0671f7826bc601 into our current release and upstream pipelines last week on May 9th. This change to our existing maven/nodejs agent/slave images will dynamically add whichever addresses the k8s plugin has seeded the slave pod with for the jenkins master and jnlp endpoints to the "no_proxy" env var. With that change, the slave bring-up works with the type of static no_proxy setting Clemens provided, in conjunction with the setting of http_proxy for access to other endpoints. In other words, the user will *NOT* have to manage the agent/master communication in their no_proxy setting. The image will do it.

The change has worked to date for the various consumers in those pipelines I mentioned. Ideally, I'd like to give it a full week of validation/testing in that pipeline, and if all is well, backport to our existing release streams (3.6, 3.7, 3.9) on Wednesday May 16 to go out as needed in errata, etc. Also note, this new behavior can be turned off by setting a documented env var on the slave image to any non-empty value. Our thought is this change could satisfy the "meets minimum" requirement for this scenario.

2) Last week I also submitted PR https://github.com/jenkinsci/kubernetes-plugin/pull/321 against the kubernetes plugin so that, if desired, it will propagate the http_proxy and no_proxy env vars set on the jenkins master to each of the jenkins agents/slaves it starts up. The maintainer of that plugin seems amenable to the change, though it will be off by default. So if we want our images to do it by default, after we bump to the appropriate version of the k8s plugin, we will have to set the appropriate flag to initiate the propagation in our images.
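The idea behind item 1) can be sketched as follows. This is a rough approximation, not the exact upstream run script: it derives the master and tunnel hosts from the `JENKINS_URL` and `JENKINS_TUNNEL` env vars that the kubernetes plugin seeds the agent pod with, and appends them to no_proxy automatically (the example hostname values are illustrative):

```shell
#!/bin/sh
# Rough sketch (not the exact upstream script) of the agent-image change in
# item 1): auto-append the Jenkins master and tunnel hosts to no_proxy.
JENKINS_URL="${JENKINS_URL:-http://jenkins.bns-cd.svc:80}"           # example default
JENKINS_TUNNEL="${JENKINS_TUNNEL:-jenkins-jnlp.bns-cd.svc:50000}"    # example default

# strip the scheme and any :port/path suffix to get the bare server host
server_host=$(echo "$JENKINS_URL" | sed -e 's|^[a-zA-Z]*://||' -e 's|[:/].*$||')
# the tunnel value is host:port; keep only the host
tunnel_host="${JENKINS_TUNNEL%%:*}"

# append both hosts, preserving any user-supplied no_proxy value
no_proxy="${no_proxy:+${no_proxy},}${server_host},${tunnel_host}"
NO_PROXY="$no_proxy"
export no_proxy NO_PROXY
echo "no_proxy=$no_proxy"
```

This is why the user no longer has to manage the agent/master endpoints in their own no_proxy setting: whatever the plugin configured, the image adds to the exception list before the agent starts.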
3) I opened issue https://issues.jenkins-ci.org/browse/JENKINS-51223 and provided PR https://github.com/jenkinsci/remoting/pull/269 to update Jenkins core to better process no_proxy. We've at least gotten acknowledgement from the upstream Jenkins maintainer, and will work with them to get to an amenable change. Though we won't be able to consume said change until upstream Jenkins / CloudBees includes it in an LTS release we can consume. And certainly 1), as well as 2), should be more of what is needed for the immediate scenario here.

I'll report back as the various threads make progress.

PR https://github.com/openshift/jenkins/pull/607 has merged, which brings in the defaulting of no_proxy discussed in https://bugzilla.redhat.com/show_bug.cgi?id=1573648#c63 to the 3.6.z stream. Per process, creating clones for the 3.7, 3.9, and 3.10 release streams to initiate testing for the PRs merged in each of those as well. Will report back when I see the 3.6 image in brew-pulp with these changes. We can then checkpoint to see if it makes sense to commence internal testing and initiate an errata update on successful tests.

OK, I'm going to send this to QA for verification against 3.6.

Tag v3.6.173.0.122.20180525.154052 for the brew-pulp images has the fix. So, for example, image brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-slave-maven-rhel7:v3.6.173.0.122.20180525.154052

See the testing instructions at https://bugzilla.redhat.com/show_bug.cgi?id=1578993#c1 but simply substitute the 3.6 image for the 3.10 image referenced, and run this against a 3.6 cluster. This verification ensures that slave-to-master / master-to-slave communication is not adversely affected by the http_proxy/no_proxy configuration.

Also, per other discussion points in this bugzilla, unrelated to what QA will verify:

1) v1.6.2 of the kubernetes plugin has my fix to allow that plugin to be configured such that any http_proxy/no_proxy settings on the master will be propagated to any slaves.
That version of the plugin was pulled into our master branch last week and is undergoing testing now. When we are sufficiently satisfied with the results, we will initiate backports to older releases as needed.

2) My PR to fix jenkins core/remoting so that it can tolerate no_proxy values like ".svc" merged last week as well. I am still waiting on CloudBees to provide a target as to which versions of Jenkins that change will land in.

Changing back to ON_QA to double confirm: @Gabe, I found that the release version below still does not show this issue when following steps like https://bugzilla.redhat.com/show_bug.cgi?id=1578993#c1 . Is there anything special about 3.6 when verifying this bug? Other release versions like 3.7 and 3.9 can reproduce this issue with the same steps.

registry.access.redhat.com/openshift3/jenkins-slave-maven-rhel7 latest 0c8695d0aa95 12 days ago 1.019 GB

You can use the same steps to verify 3.6 as well @Wenjing Zheng

OK, thanks for the reply! Per comment #67, will verify this bug now.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2007