Bug 1150026

Summary: Jenkins doesn't reconnect to the slave
Product: OpenShift Container Platform
Reporter: Miheer Salunke <misalunk>
Component: ImageStreams
Assignee: John W. Lamb <jolamb>
Status: CLOSED ERRATA
QA Contact: libra bugs <libra-bugs>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 2.1.0
CC: adellape, bleanhar, bparees, erich, jialiu, jokerman, jolamb, libra-bugs, libra-onpremise-devel, lmeyer, mfojtik, misalunk, mmccomas, ofayans, pep
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-04-06 17:06:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1134206    
Bug Blocks:    

Description Miheer Salunke 2014-10-07 09:39:53 UTC
Created attachment 944486
Jenkins log

Description of problem:

The Jenkins connection to the builder is lost after 15 minutes and is not re-established when a new build is launched. Instead, Jenkins tries to re-provision the builder, which is not needed and is not supported by the broker.


Version-Release number of selected component (if applicable):
2.1

How reproducible:
Always

Steps to Reproduce:
1. Create a new domain.
2. Create a new application.
3. Enable Jenkins.
4. Change the time to live to 60 minutes.
5. Modify the build: add a "sleep 20m" to the script, which will block the build for 20 minutes.
6. Launch the build.


Actual results:
It is Jenkins that loses the connection. However, the OpenShift plugin does not seem to cope well with this situation: Jenkins requests a new slave, the broker detects that the slave still exists, and the build is cancelled.


Expected results:
When you detect that the slave already exists, the response should be to trigger a reconnect, since at that point you know that the slave exists and that Jenkins is not connected. This would at least fix the case where the disconnect happens after the work is done.
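
The requested behaviour can be written down as a small decision function. This is only a sketch of the logic described in this report, not code from the OpenShift Jenkins plugin; the names `Action` and `choose_action` and both boolean inputs are hypothetical.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"            # slave exists and Jenkins is connected
    RECONNECT = "reconnect"  # slave exists but Jenkins lost the connection
    PROVISION = "provision"  # no slave exists yet


def choose_action(slave_exists: bool, jenkins_connected: bool) -> Action:
    """Pick the next step when a build needs a slave.

    Hypothetical helper illustrating the expected result in this report;
    it does not reflect the plugin's real API.
    """
    if not slave_exists:
        # Nothing to reconnect to, so ask the broker for a new slave.
        return Action.PROVISION
    if jenkins_connected:
        # The existing slave is usable as it is.
        return Action.NONE
    # The slave is still there and only the connection was lost:
    # reconnect instead of requesting a slave the broker will refuse.
    return Action.RECONNECT
```

With these assumptions, the failing case in this report (slave exists, Jenkins not connected) maps to `Action.RECONNECT` rather than to a new provisioning request.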


Additional info:

Comment 6 Ben Parees 2014-10-21 16:14:33 UTC
Michal, this is the other jenkins bug I was referring to this morning.

Comment 18 Brenton Leanhardt 2015-03-18 13:44:39 UTC
Would you mind retesting this now that BZ 1134206 has been verified?

Comment 19 Johnny Liu 2015-03-19 08:06:26 UTC
Verified this bug with 2.2/2015-03-18.2 puddle, PASS.

In the initial reproduction steps, step 4 is actually not needed: while a Jenkins build is running, the Jenkins builder is never terminated, whether time-to-live is 15m or 60m, because time-to-live applies only to idle slaves. In older builds, in my case, the Jenkins slave went offline when the Jenkins build took longer than 5 minutes.
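
The time-to-live rule described above can be sketched as follows. This is an illustration of the stated behaviour only, not the plugin's implementation; the function name `should_terminate` and its parameters are hypothetical.

```python
def should_terminate(build_running: bool,
                     idle_minutes: float,
                     time_to_live_minutes: float) -> bool:
    """Return True if an idle slave has outlived its time-to-live.

    Hypothetical helper: a slave that is running a build is never
    terminated, whatever the time-to-live value is.
    """
    if build_running:
        # time-to-live only counts for idle slaves.
        return False
    return idle_minutes >= time_to_live_minutes
```

Under this reading, a 20-minute build keeps its slave with either a 15m or a 60m time-to-live, which is why step 4 has no effect on the reproduction.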

Steps:
1. Create an app with jenkins-client embedded.
2. Add a "build" action hook script to the app's git repo, with a "sleep 20m" in the script, which will block the build for 20 minutes.


Result:
After 20 minutes, the Jenkins build completes successfully.

Comment 21 errata-xmlrpc 2015-04-06 17:06:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0779.html