Bug 811509 - Need more timeout to resolve node/slave DNS
Status: CLOSED CURRENTRELEASE
Product: OpenShift Origin
Classification: Red Hat
Component: Containers
Version: 1.x
Hardware: Unspecified Unspecified
Priority: low Severity: low
Assigned To: Bill DeCoste
QA Contact: libra bugs
Keywords: Triaged
Depends On:
Blocks:
Reported: 2012-04-11 05:53 EDT by Johnny Liu
Modified: 2012-04-27 16:46 EDT (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-04-27 16:46:47 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description Johnny Liu 2012-04-11 05:53:54 EDT
Description of problem:
According to bug 807260, the default timeout for resolving slave DNS is 60s.
That may not be enough.
Recently, I have often hit Jenkins build failures because this timeout is too short.

$ git push
remote: You can track your build at https://jenkins-jialiu.dev.rhcloud.com/job/phptest-build
remote: 
remote: Waiting for build to schedule................................................................
remote: **BUILD FAILED/CANCELLED**
remote: Please see the Jenkins log for more details via rhc-tail-files
remote: !!!!!!!!
remote: Deployment Halted!
remote: If the build failed before the deploy step, your previous
remote: build is still running.  Otherwise, your application may be
remote: partially deployed or inaccessible.
remote: Fix the build and try again.
remote: !!!!!!!!
To ssh://0f9708805c19406082c4d126cb227f26@phptest-jialiu.dev.rhcloud.com/~/git/phptest.git/
   8f71a00..6b127b3  master -> master

However, the Jenkins log shows that the build succeeded.
<--jenkins log-->
Apr 11, 2012 5:36:39 AM hudson.plugins.openshift.OpenShiftSlave stopApp
INFO: Slave stopping application...
Apr 11, 2012 5:36:41 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Connecting to slave phptestbldr...
Apr 11, 2012 5:36:41 AM hudson.plugins.openshift.OpenShiftSlave stopApp
INFO: Slave stopping application...
Apr 11, 2012 5:36:42 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Established UUID = b4cdcaa6dfe64929bbc651be669ac2b7
Apr 11, 2012 5:36:44 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Connecting to slave phptestbldr...
Apr 11, 2012 5:36:44 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Established UUID = b4cdcaa6dfe64929bbc651be669ac2b7
Apr 11, 2012 5:36:47 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Apr 11, 2012 5:36:47 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Apr 11, 2012 5:36:49 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Apr 11, 2012 5:36:50 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Apr 11, 2012 5:36:52 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Apr 11, 2012 5:36:53 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Apr 11, 2012 5:36:55 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Apr 11, 2012 5:36:55 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Apr 11, 2012 5:36:58 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Apr 11, 2012 5:36:58 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Apr 11, 2012 5:37:00 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Apr 11, 2012 5:37:00 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Apr 11, 2012 5:37:03 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Apr 11, 2012 5:37:04 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Apr 11, 2012 5:37:05 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Apr 11, 2012 5:37:06 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Apr 11, 2012 5:37:09 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Apr 11, 2012 5:37:09 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Apr 11, 2012 5:37:11 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Apr 11, 2012 5:37:11 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Apr 11, 2012 5:37:14 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Apr 11, 2012 5:37:15 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Apr 11, 2012 5:37:16 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Apr 11, 2012 5:37:17 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Apr 11, 2012 5:37:20 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Apr 11, 2012 5:37:20 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS resolved - phptestbldr-jialiu.dev.rhcloud.com/10.62.5.140
Apr 11, 2012 5:37:20 AM hudson.plugins.openshift.OpenShiftComputer <init>
INFO: Creating Computer
Apr 11, 2012 5:37:20 AM hudson.plugins.openshift.OpenShiftComputerLauncher launch
INFO: Launching slave...
Apr 11, 2012 5:37:20 AM hudson.plugins.openshift.OpenShiftComputerLauncher launch
INFO: Checking availability of computer hudson.plugins.openshift.OpenShiftSlave@fb809362
Apr 11, 2012 5:37:20 AM hudson.plugins.openshift.OpenShiftComputerLauncher launch
INFO: Checking SSH access to application phptestbldr-jialiu.dev.rhcloud.com
Apr 11, 2012 5:37:21 AM hudson.slaves.NodeProvisioner update
INFO: phptest-build provisioning successfully completed. We have now 1 computer(s)
Apr 11, 2012 5:37:22 AM hudson.plugins.openshift.OpenShiftComputerLauncher launch
INFO: Connected via SSH.
Apr 11, 2012 5:37:22 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Apr 11, 2012 5:37:23 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS resolved - phptestbldr-jialiu.dev.rhcloud.com/10.62.5.140
Apr 11, 2012 5:37:23 AM hudson.slaves.NodeProvisioner update
INFO: phptest-build provisioning successfully completed. We have now 1 computer(s)
Apr 11, 2012 5:37:28 AM hudson.plugins.openshift.OpenShiftComputerLauncher launch
INFO: Slave connected.
Apr 11, 2012 5:37:52 AM hudson.model.Run run
INFO: phptest-build #1 main build action completed: SUCCESS
<--jenkins log-->

As a result, the application cannot be accessed, even though the Jenkins build actually completed successfully.

Version-Release number of selected component (if applicable):
devenv_1715

How reproducible:
Often

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:
Comment 1 Johnny Liu 2012-04-11 05:55:49 EDT
Personally, I think 360s would be better.
Comment 2 Bill DeCoste 2012-04-11 10:04:34 EDT
Increased the timeout to 5 minutes (300000 ms).
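
For readers following the thread, the wait-and-retry behaviour visible in the log above can be sketched roughly as follows. This is only an illustration in Python, not the plugin's actual Java code: the function name `wait_for_dns`, the 2.5 s retry interval, and the injectable `resolve`/`clock`/`sleep` parameters are all assumptions made for the sketch. Only the 300000 ms timeout and the log messages come from this report.

```python
import socket
import time


def wait_for_dns(hostname, timeout_ms=300_000, interval_s=2.5,
                 resolve=socket.gethostbyname,
                 clock=time.monotonic, sleep=time.sleep):
    """Poll DNS until `hostname` resolves or `timeout_ms` elapses.

    Hypothetical helper: the name, the retry interval, and the injectable
    resolve/clock/sleep arguments are illustrative assumptions, not taken
    from the OpenShift Jenkins plugin.
    """
    deadline = clock() + timeout_ms / 1000.0
    while True:
        try:
            # "Checking to see if slave DNS is resolvable..."
            return resolve(hostname)
        except OSError:
            # socket.gaierror is a subclass of OSError.
            if clock() >= deadline:
                # Mirrors the message seen in the Jenkins log.
                raise TimeoutError("Slave DNS not propagated. Timing out.")
            # "Slave DNS not propagated yet, retrying..."
            sleep(interval_s)
```

With the old 60000 ms value, a lookup that only succeeds after about a minute fails the build. Calling `wait_for_dns("phptestbldr-jialiu.dev.rhcloud.com")` with the default of 300000 ms would keep retrying for up to 5 minutes.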
Comment 3 Johnny Liu 2012-04-12 07:42:19 EDT
Retested this bug with jenkins-plugin-openshift-0.5.11-2.el6_2.x86_64 on devenv-stage_166; the timeout still appears to be 60s.

Check timestamps in jenkins log: 
<--snip-->
INFO: Connecting to slave phptestbldr...
Apr 12, 2012 7:36:04 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Established UUID = ac764dee3cd74acea181aa4aa314f149
Apr 12, 2012 7:36:04 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Established UUID = ac764dee3cd74acea181aa4aa314f149
Apr 12, 2012 7:36:09 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
Apr 12, 2012 7:36:09 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS is resolvable...
<--snip-->
Apr 12, 2012 7:37:11 AM hudson.plugins.openshift.OpenShiftSlave connect
WARNING: Slave DNS not propagated. Timing out.
Apr 12, 2012 7:37:11 AM hudson.plugins.openshift.OpenShiftCloud$2 call
WARNING: Unable to provision node java.io.IOException: Slave DNS not propagated. Timing out.
Apr 12, 2012 7:37:11 AM hudson.plugins.openshift.OpenShiftCloud cancelBuild
<--snip-->
Comment 4 Bill DeCoste 2012-04-12 08:43:24 EDT
I missed changing jenkins_job_template.xml in li. It should be good to go in the next build.
Comment 5 Johnny Liu 2012-04-16 03:33:55 EDT
Retested this bug with cartridge-jenkins-client-1.4-0.25.2-1.el6_2.noarch on devenv-1732; it still reproduces.

Check timestamps in jenkins log: 
<--snip-->
Apr 16, 2012 3:30:46 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Established UUID = c117b18230624c878527375d92eebbbd
Apr 16, 2012 3:30:51 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS for phptestbldr-jialiu.dev.rhcloud.com is resolvable ...
Apr 16, 2012 3:30:51 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
<--snip-->
Apr 16, 2012 3:31:51 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Apr 16, 2012 3:31:56 AM hudson.plugins.openshift.OpenShiftSlave connect
WARNING: Slave DNS not propagated. Timing out.
Apr 16, 2012 3:31:56 AM hudson.plugins.openshift.OpenShiftCloud$2 call
WARNING: Unable to provision node java.io.IOException: Slave DNS not propagated. Timing out.
Apr 16, 2012 3:31:56 AM hudson.plugins.openshift.OpenShiftCloud cancelBuild
INFO: Cancelling build
Apr 16, 2012 3:31:56 AM hudson.plugins.openshift.OpenShiftCloud cancelBuild
WARNING: Build for label phptest-build has been cancelled
Apr 16, 2012 3:31:56 AM hudson.slaves.NodeProvisioner update
WARNING: Provisioned slave phptest-build failed to launch
java.io.IOException: Slave DNS not propagated. Timing out
<--snip-->
Comment 6 Johnny Liu 2012-04-16 03:35:47 EDT
(In reply to comment #5)
> Re-test this bug with cartridge-jenkins-client-1.4-0.25.2-1.el6_2.noarch on
> devenv-1732, still reproduce.
> 
Correction: the testing was executed against devenv_1723.
Comment 7 Bill DeCoste 2012-04-16 10:17:51 EDT
One more time.
Comment 8 Johnny Liu 2012-04-18 02:49:40 EDT
The bug is fixed in jenkins_job_template.xml for every cartridge, but no newer cartridge build is available yet. Once a newer cartridge is released, I will verify this bug.
For example, cartridge-php-5.3-0.91.2-1.el6_2.noarch is the version currently installed on the latest instance.
Comment 9 Johnny Liu 2012-04-23 01:31:27 EDT
Verified this bug on devenv_1735: PASS. The DNS check now times out after about 5 minutes instead of 60s.

Check timestamps in jenkins log: 
<--snip-->
Apr 23, 2012 1:24:27 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Established UUID = 1592ec2d410c4e9bb425db6c828ac252
Apr 23, 2012 1:24:33 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Checking to see if slave DNS for wsgitestbldr-jialiu.dev.rhcloud.com is resolvable ...
Apr 23, 2012 1:24:37 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Apr 23, 2012 1:24:42 AM hudson.plugins.openshift.OpenShiftSlave connect
<--snip-->
Apr 23, 2012 1:29:31 AM hudson.plugins.openshift.OpenShiftSlave connect
INFO: Slave DNS not propagated yet, retrying...
Apr 23, 2012 1:29:36 AM hudson.plugins.openshift.OpenShiftSlave connect
WARNING: Slave DNS not propagated. Timing out.
Apr 23, 2012 1:29:36 AM hudson.plugins.openshift.OpenShiftCloud$2 call
WARNING: Unable to provision node java.io.IOException: Slave DNS not propagated. Timing out.
Apr 23, 2012 1:29:36 AM hudson.plugins.openshift.OpenShiftCloud cancelBuild
INFO: Cancelling build
<--snip-->
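
The elapsed times quoted in the log excerpts of comments 3, 5, and 9 can be checked with a short script. The sketch below is only a convenience for readers. The helper name `elapsed_seconds` is hypothetical, and it assumes the timestamp format seen in these logs ("Apr 12, 2012 7:36:09 AM") and an English locale for month names.

```python
from datetime import datetime

# Timestamp layout used by the Jenkins log lines in this report.
FMT = "%b %d, %Y %I:%M:%S %p"


def elapsed_seconds(start_line, end_line):
    """Return the seconds between the timestamps of two Jenkins log lines."""
    def stamp(line):
        # The timestamp is the first five whitespace-separated tokens.
        return datetime.strptime(" ".join(line.split()[:5]), FMT)
    return (stamp(end_line) - stamp(start_line)).total_seconds()


# First DNS check vs. "Timing out" warning, copied from the comments above.
runs = {
    "comment 3": ("Apr 12, 2012 7:36:09 AM hudson.plugins.openshift.OpenShiftSlave connect",
                  "Apr 12, 2012 7:37:11 AM hudson.plugins.openshift.OpenShiftSlave connect"),
    "comment 5": ("Apr 16, 2012 3:30:51 AM hudson.plugins.openshift.OpenShiftSlave connect",
                  "Apr 16, 2012 3:31:56 AM hudson.plugins.openshift.OpenShiftSlave connect"),
    "comment 9": ("Apr 23, 2012 1:24:33 AM hudson.plugins.openshift.OpenShiftSlave connect",
                  "Apr 23, 2012 1:29:36 AM hudson.plugins.openshift.OpenShiftSlave connect"),
}
for name, (start, end) in runs.items():
    print(name, elapsed_seconds(start, end))
# comment 3 62.0
# comment 5 65.0
# comment 9 303.0
```

The first two runs give up after roughly 60s, while the last one waits about 5 minutes, which matches the 300000 ms setting.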
