+++ This bug was initially created as a clone of Bug #906863 +++

Description of problem:

Created a new EAP application, made a change, and verified it. Then clicked
'Enable Jenkins Builds'. The Jenkins master was created successfully, as was
the Jenkins job. However, making another change in the application led to the
following failure:

mhicks@dhcp-185-24 eapdemo$ git push
Counting objects: 11, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (5/5), done.
Writing objects: 100% (6/6), 445 bytes, done.
Total 6 (delta 4), reused 0 (delta 0)
remote: restart_on_add=false
remote: Executing Jenkins build.
remote:
remote: You can track your build at https://jenkins-cloudydemo.rhcloud.com/job/eapdemo-build
remote:
remote: Waiting for build to schedule............................................../opt/rh/ruby193/root/usr/share/ruby/json/common.rb:155:in `parse': 757: unexpected token at '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> (JSON::ParserError)
remote: <html><head>
remote: <title>502 Proxy Error</title>
remote: </head><body>
remote: <h1>Proxy Error</h1>
remote: <p>The proxy server received an invalid
remote: response from an upstream server.<br />
remote: The proxy server could not handle the request <em><a href="/job/eapdemo-build/api/json">GET /job/eapdemo-build/api/json</a></em>.<p>
remote: Reason: <strong>Error reading from remote server</strong></p></p>
remote: <hr>
remote: <address>Apache/2.2.15 (Red Hat) Server at jenkins-cloudydemo.rhcloud.com Port 443</address>
remote: </body></html>'
remote:     from /opt/rh/ruby193/root/usr/share/ruby/json/common.rb:155:in `parse'
remote:     from /usr/libexec/openshift/cartridges/abstract/info/bin/jenkins_build:24:in `get_jobs_info'
remote:     from /usr/libexec/openshift/cartridges/abstract/info/bin/jenkins_build:38:in `get_build_num'
remote:     from /usr/libexec/openshift/cartridges/abstract/info/bin/jenkins_build:75:in `<main>'
remote: !!!!!!!!
remote: Deployment Halted!
remote: If the build failed before the deploy step, your previous
remote: build is still running. Otherwise, your application may be
remote: partially deployed or inaccessible.
remote: Fix the build and try again.
remote: !!!!!!!!
To ssh://94af85cf61cc4e28b95bfad484a218bd.com/~/git/eapdemo.git/
   b39d16f..9e66183  master -> master
mhicks@dhcp-185-24 eapdemo$

Version-Release number of selected component (if applicable):

Tested against Online on 2/1/2013

Actual results:

The proxy error is passed to the client, along with an error saying the
deployment was halted. The deployment WAS NOT halted, and the application was
successfully deployed.

Expected results:

The deployment is actually a success, so the client shouldn't be shown a
proxy error.

Additional Notes:

Sometimes an error is reported in the Jenkins server logs as below:

FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel

One possible cause for the above is that the watchman gear_state_plugin is
restarting the Jenkins gears. This happens because, during a build, a gear is
put into a 'building' state. The gear state plugin checks to ensure that the
state of the gear matches the gear's status. I believe disabling the
gear_state_plugin could help to resolve this issue.
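The state/status mismatch check described above can be sketched roughly as
follows. This is a hypothetical illustration only; the method name,
parameters, and logic are assumptions, not the actual watchman plugin code:

```ruby
# Hypothetical sketch of the gear_state_plugin mismatch check described
# above. Names and API are illustrative, not the real watchman code.
def restart_candidate?(gear_state, processes_running)
  case gear_state
  when 'started' then !processes_running # claims running, nothing is up
  when 'stopped' then processes_running  # claims stopped, processes remain
  when 'building'
    # A builder gear legitimately has its app processes down while Jenkins
    # works; treating that as a mismatch is what would trigger the spurious
    # restarts described above.
    !processes_running
  else
    false
  end
end
```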
I believe this issue was a symptom of
https://bugzilla.redhat.com/show_bug.cgi?id=1134686. I think I've verified
that the fix for #1134686 also fixes this bug:

0. Edited the watchman gear state plugin configuration to be aggressive (20s
   delay) in order to improve my chances of watchman finding the builder,
   restarted watchman
1. Created a jbosseap app via rhc
2. Added Jenkins via the web console (just to be as close to the reported
   scenario as possible)
3. Used a git push to invoke a build of jbosseap
4. Observed success after repeated attempts, no relevant watchman activity
5. Reverted the patch from #1134686
6. Used a git push to invoke a build of jbosseap
7. Observed the reported client failure coinciding with log entries
   indicating that watchman restarted the builder
8. Re-applied the patch from #1134686, restarted watchman
9. Manually deleted the now defunct/disconnected builder from the Jenkins
   console
10. Used a git push to invoke a build of jbosseap
11. Observed success after repeated attempts, no relevant watchman activity

I'm moving this issue to ON_QA. Please re-open it if the bug can be
reproduced in an environment also containing the fix for #1134686.
Checked on devenv-stage_1138 with the steps described in comment #1. Without
the patch, the Jenkins build shows an error in the output. After applying the
patch, the build completes without errors. Moving the bug to VERIFIED.
Jaspreet,

The fix for this was rolled out to production. Are you still experiencing the
issue?
Yes Dan,

The issue is still seen. I tried a git push today. The first push succeeded,
but the second resulted in the same failure as below:

[root@jkaur webapp]# git push
Counting objects: 11, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (6/6), 421 bytes, done.
Total 6 (delta 3), reused 0 (delta 0)
remote: Executing Jenkins build.
remote:
remote: You can track your build at https://jenkins-jasdomain.rhcloud.com/job/jbosseap-build
remote:
remote: Waiting for build to schedule......Done
remote: Waiting for job to complete................................................./opt/rh/ruby193/root/usr/share/ruby/json/common.rb:155:in `parse': 757: unexpected token at '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> (JSON::ParserError)
remote: <html><head>
remote: <title>502 Proxy Error</title>
remote: </head><body>
remote: <h1>Proxy Error</h1>
remote: <p>The proxy server received an invalid
remote: response from an upstream server.<br />
remote: The proxy server could not handle the request <em><a href="/job/jbosseap-build/2/api/json">GET /job/jbosseap-build/2/api/json</a></em>.<p>
remote: Reason: <strong>Error reading from remote server</strong></p></p>
remote: <hr>
remote: <address>Apache/2.2.15 (Red Hat) Server at jenkins-jasdomain.rhcloud.com Port 443</address>
remote: </body></html>'
remote:     from /opt/rh/ruby193/root/usr/share/ruby/json/common.rb:155:in `parse'
remote:     from /var/lib/openshift/55029cc3e0b8cd2ae200011c/jenkins-client//bin/jenkins_build:35:in `get_job_info'
remote:     from /var/lib/openshift/55029cc3e0b8cd2ae200011c/jenkins-client//bin/jenkins_build:91:in `<main>'
remote: !!!!!!!!
remote: Deployment Halted!
remote: If the build failed before the deploy step, your previous
remote: build is still running. Otherwise, your application may be
remote: partially deployed or inaccessible.
remote: Fix the build and try again.
remote: !!!!!!!!
remote: An error occurred executing 'gear postreceive' (exit code: 1)
remote: Error message: CLIENT_ERROR: Failed to execute: 'control post-receive' for /var/lib/openshift/55029cc3e0b8cd2ae200011c/jenkins-client
remote:
remote: For more details about the problem, try running the command again with the '--trace' option.
To ssh://55029cc3e0b8cd2ae200011c.com/~/git/jbosseap.git/
   6708c48..d85377c  master -> master
This is an entirely separate issue, and I believe it is related to
https://bz.apache.org/bugzilla/show_bug.cgi?id=37770. Unfortunately, the only
suggestions there involve turning off keepalives and connection pooling in
certain cases, and the performance benefits of those features were the
primary reason we switched back to vhosts in OpenShift Online.

I think a more reasonable solution would be to add some error handling and
retry logic to the jenkins-client code. Currently, the specific problem the
user sees is that get_jobs_info exec's curl and then expects the returned
data to be valid JSON, regardless of the HTTP status. Of course, the
limitation of using curl throughout this code instead of a proper Ruby HTTP
client library is that we cannot get both the HTTP status _and_ the content
back from the subshell; we'd have to write the content to a file and then
read it back in. So maybe we just rescue and pass (or emit a warning) on JSON
parsing errors, and leave it at that.
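The rescue-and-warn idea could look roughly like this. This is a minimal
sketch, not the actual jenkins-client code: `fetch_jobs_json` stands in for
the existing curl invocation, and the retry parameters are assumptions:

```ruby
require 'json'

# Sketch: tolerate non-JSON responses (e.g. an HTML 502 proxy page) instead
# of letting JSON::ParserError crash the deploy with a backtrace.
# `fetch_jobs_json` is a stand-in callable for the curl invocation.
def get_jobs_info(fetch_jobs_json, attempts: 3, delay: 1)
  attempts.times do |i|
    raw = fetch_jobs_json.call
    begin
      return JSON.parse(raw)
    rescue JSON::ParserError
      # A 502 proxy error page is HTML, not JSON; warn and retry rather
      # than aborting the whole deployment.
      warn "Jenkins returned non-JSON output (attempt #{i + 1}/#{attempts}); retrying..."
      sleep delay unless i == attempts - 1
    end
  end
  nil
end
```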
I agree with Andy's assessment in comment #6. The jenkins_build script needs
to be converted to use net/http so it can gracefully handle errors.
FYI, I'm nearly done with a significant refactor to jenkins_build to use the Ruby net/http library. Using net/http allows the program to easily retry requests and correctly interpret responses. There's a WIP PR for discussion: https://github.com/openshift/origin-server/pull/6098
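For illustration, a net/http-based fetch with status checking and retries
might look like the sketch below. The names and structure are assumptions for
discussion, not the PR's actual code; the point is that, unlike shelling out
to curl, the response object carries both the HTTP status and the body, so a
502 from the proxy can be detected and retried instead of being fed to
JSON.parse:

```ruby
require 'net/http'
require 'json'
require 'uri'

# Parse the body only when the request actually succeeded; a 502 proxy
# page is HTML and must never reach JSON.parse.
def parse_if_success(res)
  return nil unless res.is_a?(Net::HTTPSuccess)
  JSON.parse(res.body)
end

# Retry wrapper around a GET of a Jenkins JSON API endpoint.
def fetch_json(url, retries: 3, delay: 2)
  uri = URI(url)
  retries.times do |i|
    res = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
      http.get(uri.request_uri)
    end
    data = parse_if_success(res)
    return data if data
    warn "Jenkins returned HTTP #{res.code} (attempt #{i + 1}/#{retries})"
    sleep delay unless i == retries - 1
  end
  nil
end
```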
Checked on devenv_5475. The Jenkins build now returns a meaningful message
when the build fails. Modified the jenkins_build script to simulate the
failures:

1. Make get_jobs_info fail:

remote: Executing Jenkins build.
remote:
remote: You can track your build at https://jenkins-bmengdev.dev.rhcloud.com/job/perl1-build
remote:
remote: Couldn't look up the last build number. Jenkins may be inaccessible.
remote: Error: Couldn't list jobs: HTTP 200 OK

2. Make schedule_build fail:

remote: Executing Jenkins build.
remote:
remote: You can track your build at https://jenkins-bmengdev.dev.rhcloud.com/job/perl1-build
remote:
remote: Couldn't schedule build. Jenkins may be inaccessible.
remote: Error: Couldn't schedule build: HTTP 201 Created

Moving the bug to VERIFIED.
Also did regression testing for the Jenkins client; no issues found.
This should be released on 4/13.
Hello,

Is this issue resolved now, as of the above date, or will it take more time
to be released?

Regards,
Jaspreet