Bug 807260 - Jenkins hangs forever when the slave app DNS cannot be resolved.
Summary: Jenkins hangs forever when the slave app DNS cannot be resolved.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OKD
Classification: Red Hat
Component: Containers
Version: 1.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Bill DeCoste
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-03-27 11:44 UTC by Johnny Liu
Modified: 2012-04-13 18:30 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-04-13 18:30:11 UTC
Target Upstream Version:
Embargoed:



Description Johnny Liu 2012-03-27 11:44:19 UTC
Description of problem:
Jenkins hangs forever when the slave app DNS cannot be resolved.

jenkins log:
<--snip-->
Mar 27, 2012 7:40:13 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Mar 27, 2012 7:40:14 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Mar 27, 2012 7:40:19 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Mar 27, 2012 7:40:19 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
<--snip-->

Version-Release number of selected component (if applicable):
jenkins-plugin-openshift-0.5.4-1.el6_2.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create an app with the jenkins client embedded.
2. Log into the instance and deliberately set PUBLIC_HOSTNAME to an invalid value to reproduce this issue.
# vi /etc/stickshift/stickshift-node.conf
PUBLIC_HOSTNAME=aa.bbbbius.com
# /usr/libexec/mcollective/update_yaml.rb > /etc/mcollective/facts.yaml
3. Make a change and run git push to trigger a Jenkins build.
  
Actual results:
The Jenkins build job hangs forever.

Expected results:
When a failure persists, the Jenkins build should fail rather than waste the user's time, and it should tell the user to check the Jenkins log to debug the issue.


Additional info:

Comment 1 Johnny Liu 2012-03-27 11:46:09 UTC
This issue was already covered by Bug 802686, but the fix for that bug was only partial and did not address this case. I am filing this new bug to track it.

Comment 2 Bill DeCoste 2012-03-27 19:54:58 UTC
Added a timeout to the node/slave. The default is 60s. The build will be terminated at the timeout if DNS does not resolve.
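
For readers following along, the behavior described above (retry the DNS check, then give up at a deadline) could look roughly like the sketch below. This is not the plugin's actual code: the class name DnsWaitSketch, the method waitForDns, the injectable resolver, and the retry interval are all assumptions made for illustration. Only the 60s default and the exception message come from this report.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.function.Predicate;

// Hypothetical sketch; names and structure are NOT taken from the OpenShift plugin.
final class DnsWaitSketch {

    // Real lookup: true when the host name resolves through the system resolver.
    static final Predicate<String> SYSTEM_DNS = host -> {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    };

    // Polls the resolver until the host resolves or the timeout elapses.
    // Returns the number of attempts made; throws IOException on timeout.
    static int waitForDns(String host, Predicate<String> resolver,
                          long timeoutMillis, long retryMillis)
            throws IOException, InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        int attempts = 0;
        while (true) {
            attempts++;
            if (resolver.test(host)) {
                return attempts;
            }
            // Overflow-safe deadline comparison for System.nanoTime().
            if (System.nanoTime() - deadline >= 0) {
                throw new IOException("Slave DNS not propagated. Timing out.");
            }
            Thread.sleep(retryMillis);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            // A resolver that never succeeds stands in for the invalid PUBLIC_HOSTNAME.
            waitForDns("aa.bbbbius.com", host -> false, 200, 50);
        } catch (IOException e) {
            System.out.println("Build terminated: " + e.getMessage());
        }
    }
}
```

With the default from this comment, a real check would be called as waitForDns(host, DnsWaitSketch.SYSTEM_DNS, 60_000, 5_000); the 5s retry interval is a guess based on the spacing of the log lines in the description.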

Comment 3 Johnny Liu 2012-03-29 05:36:57 UTC
Verified this bug with devenv_1679: PASS.

Jenkins log:
<--snip-->
Mar 29, 2012 1:34:00 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Mar 29, 2012 1:34:01 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Mar 29, 2012 1:33:04 AM hudson.plugins.openshift.OpenShiftSlave connect
WARNING: Slave DNS not propagated. Timing out.
Mar 29, 2012 1:33:04 AM hudson.plugins.openshift.OpenShiftCloud$2 call
WARNING: Unable to provision node java.io.IOException: Slave DNS not propagated. Timing out.
Mar 29, 2012 1:33:05 AM hudson.slaves.NodeProvisioner update
WARNING: Provisioned slave phptest-build failed to launch
java.io.IOException: Slave DNS not propagated. Timing out.
	at hudson.plugins.openshift.OpenShiftSlave.connect(OpenShiftSlave.java:198)
	at hudson.plugins.openshift.OpenShiftSlave.provision(OpenShiftSlave.java:210)
	at hudson.plugins.openshift.OpenShiftCloud$2.call(OpenShiftCloud.java:459)
	at hudson.plugins.openshift.OpenShiftCloud$2.call(OpenShiftCloud.java:451)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:679)
<--snip-->

