Bug 835778 - Jenkins build often fail aperiodically
Summary: Jenkins build often fail aperiodically
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OKD
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Dan McPherson
QA Contact: libra bugs
URL:
Whiteboard:
Duplicates: 955305
Depends On:
Blocks:
Reported: 2012-06-27 05:58 UTC by Meng Bo
Modified: 2015-05-14 22:55 UTC

Fixed In Version: jenkins-plugin-openshift 0.6.14
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-05-11 20:08:39 UTC
Target Upstream Version:
Embargoed:


Attachments
jenkins log (8.92 KB, text/x-log)
2013-02-06 07:39 UTC, Xiaoli Tian
The log of the jenkins server app while doing jenkins build (97.25 KB, text/x-log)
2013-02-06 12:29 UTC, jizhao

Description Meng Bo 2012-06-27 05:58:22 UTC
Description of problem:
Create an app with Jenkins embedded, make some changes to the app, and push them via Jenkins. Sometimes the client output shows 'remote: **BUILD FAILED/CANCELLED**', but logging in to Jenkins to check the build status shows that the build was successful.

Version-Release number of selected component (if applicable):
cartridge-jenkins-1.4-0.95.1-1.el6_3.noarch
cartridge-jenkins-client-1.4-0.29.1-1.el6_3.noarch

How reproducible:
sometimes

Steps to Reproduce:
1. Create an app with Jenkins embedded.
2. Make a change and push it to trigger the build.
  
Actual results:
The build sometimes succeeds, but the client output reports that it failed.

<------->
remote: Waiting for build to schedule...........................................................
remote: **BUILD FAILED/CANCELLED**
remote: Please see the Jenkins log for more details via rhc-tail-files
remote: !!!!!!!!
remote: Deployment Halted!
remote: If the build failed before the deploy step, your previous
remote: build is still running.  Otherwise, your application may be
remote: partially deployed or inaccessible.
remote: Fix the build and try again.
remote: !!!!!!!!
<-------->

Expected results:
The client output should not report a failed build when the build actually succeeded.

Additional info:
The Jenkins log is uploaded as an attachment.
This condition often occurs the first time the app is built via Jenkins.

Comment 1 Xiaoli Tian 2013-02-06 06:05:34 UTC
This has failed often recently. The version under test today is devenv_2779; I tried 3 times and it still failed. More of the Jenkins log will be attached below.

' >> jenkins_trigger.txt && git add . && git commit -amt && git push
[master 182bf66] t
2 files changed, 4 insertions(+), 2 deletions(-)
create mode 100644 jenkins_trigger.txt
remote: restart_on_add=true
remote: Executing Jenkins build.
remote:
remote: You can track your build at https://jenkinskgipl-9gyxksj2g0.dev.rhcloud.com/job/nodejsqg6asm1-build
remote:
remote: Waiting for build to schedule...........................
remote: **BUILD FAILED/CANCELLED**
remote: Please see the Jenkins log for more details via rhc-tail-files
remote: !!!!!!!!
remote: Deployment Halted!
remote: If the build failed before the deploy step, your previous
remote: build is still running. Otherwise, your application may be
remote: partially deployed or inaccessible.
remote: Fix the build and try again.
remote: !!!!!!!!
To ssh://5111ea295bcc8304c20003a4.rhcloud.com/~/git/nodejsqg6asm1.git/
c0c6887..182bf66 master -> master
Command Return: 0
--------------------------------------------------------------------------------
Trying to trigger jenkins build - 1
--------------------------------------------------------------------------------

Running Command - cd ./nodejsqg6asm1 && echo '1
' >> jenkins_trigger.txt && git add . && git commit -amt && git push
[master 9de0619] t
1 files changed, 2 insertions(+), 0 deletions(-)
remote: restart_on_add=true
remote: Executing Jenkins build.
remote:
remote: You can track your build at https://jenkinskgipl-9gyxksj2g0.dev.rhcloud.com/job/nodejsqg6asm1-build
remote:
remote: Waiting for build to schedule...........................
remote: **BUILD FAILED/CANCELLED**
remote: Please see the Jenkins log for more details via rhc-tail-files
remote: !!!!!!!!
remote: Deployment Halted!
remote: If the build failed before the deploy step, your previous
remote: build is still running. Otherwise, your application may be
remote: partially deployed or inaccessible.
remote: Fix the build and try again.
remote: !!!!!!!!
To ssh://5111ea295bcc8304c20003a4.rhcloud.com/~/git/nodejsqg6asm1.git/
182bf66..9de0619 master -> master
Command Return: 0
--------------------------------------------------------------------------------
Trying to trigger jenkins build - 2
--------------------------------------------------------------------------------

Running Command - cd ./nodejsqg6asm1 && echo '2
' >> jenkins_trigger.txt && git add . && git commit -amt && git push
[master 7d6dfc8] t
1 files changed, 2 insertions(+), 0 deletions(-)
remote: restart_on_add=true
remote: Executing Jenkins build.
remote:
remote: You can track your build at https://jenkinskgipl-9gyxksj2g0.dev.rhcloud.com/job/nodejsqg6asm1-build
remote:
remote: Waiting for build to schedule...................................................
remote: **BUILD FAILED/CANCELLED**
remote: Please see the Jenkins log for more details via rhc-tail-files
remote: !!!!!!!!
remote: Deployment Halted!
remote: If the build failed before the deploy step, your previous
remote: build is still running. Otherwise, your application may be
remote: partially deployed or inaccessible.
remote: Fix the build and try again.
remote: !!!!!!!!
To ssh://5111ea295bcc8304c20003a4.rhcloud.com/~/git/nodejsqg6asm1.git/
9de0619..7d6dfc8 master -> master
Command Return: 0

Comment 2 Xiaoli Tian 2013-02-06 07:39:22 UTC
Created attachment 693743
jenkins log

Attaching the Jenkins log for debugging.

Comment 3 jizhao 2013-02-06 12:29:30 UTC
Created attachment 693900
The log of the jenkins server app while doing jenkins build

Comment 4 Bill DeCoste 2013-02-06 17:16:41 UTC
This only occurs on medium images; I was unable to recreate it on large images.

Changed the GET read logic on reload: the available() call was returning 0.

Trimmed config.xml and added debugging.

Comment 5 Meng Bo 2013-02-07 09:38:01 UTC
Since this condition did not appear as frequently in today's testing, I am lowering the severity of the bug.

Before closing it, though, we need more time to observe it. Thanks.

Comment 6 Meng Bo 2013-02-08 10:32:27 UTC
I still hit the issue when building via Jenkins. Assigning it back for further investigation.

Comment 7 Bill DeCoste 2013-02-08 12:46:53 UTC
I'll retest

Comment 8 Bill DeCoste 2013-02-08 13:59:11 UTC
Meng, can you post the jenkins.log when you see a failure? There were 2 different exceptions reported in this bug. Thanks

Comment 9 Bill DeCoste 2013-02-08 14:56:23 UTC
I'm unable to recreate this on a medium instance. On a small instance I can consistently get timeouts in the REST API calls that result in build failures. I can increase the timeout settings in the Jenkins client, but I'm leery of doing this as the defaults are already pretty high (60s connect and 10s read). The increased timeouts fix the problem, and builds succeed consistently on small instances.

If QE can confirm via logs that we are seeing these timeouts in medium instances I'll up the timeouts.
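
For illustration only, the two timeouts mentioned above can be sketched with plain sockets. This is not the Jenkins client's code: the function names connect and read_reply are hypothetical, and only the 60s connect and 10s read values come from this comment. A slow instance that does not answer within the read timeout yields None here, which is the kind of condition a caller would then treat as a failed call.

```python
import socket

# Values quoted in the comment above; everything else here is an assumption.
CONNECT_TIMEOUT = 60.0  # seconds allowed for establishing the TCP connection
READ_TIMEOUT = 10.0     # seconds allowed for each read on the open connection


def connect(host, port, connect_timeout=CONNECT_TIMEOUT):
    """Open a TCP connection, giving up after connect_timeout seconds."""
    return socket.create_connection((host, port), timeout=connect_timeout)


def read_reply(sock, nbytes=4096, read_timeout=READ_TIMEOUT):
    """Return up to nbytes from sock, or None if nothing arrives in time."""
    sock.settimeout(read_timeout)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        # The peer was too slow; the caller decides whether this is a failure.
        return None
```

Raising read_timeout in this sketch corresponds to the "up the timeouts" option discussed above: the slow reply is then received instead of being reported as None.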

Comment 10 Chandrika Gole 2013-02-08 20:25:21 UTC
Couldn't reproduce this on a medium instance
Tried it on devenv_2794 (ami-ba53c5d3) and fork_ami_BZ894248_459 

Steps to reproduce -

1. Create an application (the command below uses jbossas-7)
# rhc-create-app -t  jbossas-7 -a app11 -p xx

2. Create a jenkins app
# rhc-create-app -t jenkins-1.4 -a app22 -p xx

3. Embed jenkins client to app11
# rhc-ctl-app -a app11 -e add-jenkins-client-1.4 -p xx

4. Make some change in app11's git repo, and git push


5. While the Jenkins build is executing, do the following:
1). Access the Jenkins build job URL and check that the build job is running.
http://app22-jialiu.dev.rhcloud.com/job/app11-build/
2). Access your app's URL to check that your application is still available.
http://app11-jialiu.dev.rhcloud.com/
3). Use rhc-user-info to check that the "build app" is listed.

6. After the build job has finished, do the following:
1). Access the Jenkins build job URL and check that the build job is completed and the build log is there.
http://app22-jialiu.dev.rhcloud.com/job/app11-build/
2). Access your app's URL to check that your application is still available and your change has taken effect.
http://app11-jialiu.dev.rhcloud.com/
3). Wait 15 minutes; the 'build app' will be destroyed and will no longer show up with the rhc-user-info command.

Comment 11 Dan McPherson 2013-04-24 04:47:54 UTC
*** Bug 955305 has been marked as a duplicate of this bug. ***

Comment 12 Dan McPherson 2013-04-24 05:06:38 UTC
It took me a couple of hours to recreate the error, which in the end gave me no additional debug info. I then looked at the code and think I spotted the logic flaw. We would check whether the new build had been launched. If it hadn't, we would check the queue to see whether it was still waiting to launch. If it wasn't in the queue, we reported failure. However, if the build was actually launched after we checked whether it was launched but before we checked the queue, it would look the same as if it had been cancelled. The fix is to recheck whether it was launched before declaring it cancelled.

https://github.com/openshift/origin-server/pull/2210

This fixes the problem for v1 and v2 carts.
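
A minimal sketch of the race described in this comment, assuming a simplified model rather than the real cartridge code. RacyJenkins, poll_once_buggy and poll_once_fixed are hypothetical names; the fake server deliberately moves the build from the queue to "launched" right after the first launched check, which is the window described above.

```python
class RacyJenkins:
    """Hypothetical stand-in for the Jenkins server, not a real client."""

    def __init__(self):
        self._launched = False
        self._queued = True

    def is_launched(self):
        result = self._launched
        # Simulate the race: the build leaves the queue and starts
        # immediately after it was observed as "not launched".
        if not result:
            self._launched = True
            self._queued = False
        return result

    def is_queued(self):
        return self._queued


def poll_once_buggy(jenkins):
    """Original logic: not launched and not queued is reported as cancelled."""
    if jenkins.is_launched():
        return "launched"
    if jenkins.is_queued():
        return "waiting"
    return "cancelled"


def poll_once_fixed(jenkins):
    """Fixed logic: recheck the launch state before declaring cancellation."""
    if jenkins.is_launched():
        return "launched"
    if jenkins.is_queued():
        return "waiting"
    # The build may have started between the two checks above.
    if jenkins.is_launched():
        return "launched"
    return "cancelled"


print(poll_once_buggy(RacyJenkins()))
print(poll_once_fixed(RacyJenkins()))
```

Against the same simulated timing, the buggy poll reports a cancellation for a build that is in fact running, while the poll with the recheck reports it as launched.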

Comment 14 Meng Bo 2013-05-07 06:49:37 UTC
Checked this case on devenv_3188 with many attempts and various cartridge types.

The issue did not appear. Moving the bug to verified.

