Description of problem:
When a git push to a gear is done, the new deployment is registered with the broker. If this fails for any reason, activation of the deployment fails and HAproxy is left in a state where it returns 503 errors until restarted.

How reproducible:
100% so far

Steps to Reproduce:
1. Create a scaled app.
2. On the broker: service httpd stop
3. git push a change to the app.
4. Try to access the app.

To clearly see what's happening, start httpd on the broker again, port-forward from the app, and curl -I each of the forwarded ports. The one from HAproxy will return 503 even though the framework cartridge itself returns 200.

Actual results:
App is unavailable, error 503.

Expected results:
App is available, even though the broker is not.

Additional info:
Haven't tested with non-scaled apps. Might be an issue there too. Need to test whether this occurs Online as well.

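The per-port check described above (curl -I against each forwarded port) can be scripted. This is only a rough sketch, not part of the original report: the function names are made up for illustration, and the actual port numbers depend on what rhc port-forward prints for the app.

```python
import http.client


def head_status(host, port, timeout=5.0):
    """Send a HEAD request (what `curl -I` does) and return the HTTP status.

    Returns None if the connection fails or the reply is not valid HTTP.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None
    finally:
        conn.close()


def unhealthy_ports(statuses):
    """Given {port: status}, return the sorted ports that did not answer 200."""
    return sorted(port for port, status in statuses.items() if status != 200)


def check_forwarded_ports(ports, host="127.0.0.1"):
    """Probe each forwarded port and return (statuses, unhealthy_ports)."""
    statuses = {port: head_status(host, port) for port in ports}
    return statuses, unhealthy_ports(statuses)
```

Calling check_forwarded_ports with the list of forwarded ports while the bug is present should report the HAproxy port as unhealthy (503) and the framework port as fine.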
This doesn't appear to be a problem with non-scaled apps. It's just HAproxy that doesn't survive the deployment; perhaps there's some haproxy reconfigure step that's supposed to complete after the deployment is registered?
It occurs with Online too. Fortunately our brokers are never down.
Fix by cherry-picking from origin-server:

commit 19e2995306bff7bea037823675f5cf279bafe880
Author: Paul Morie <pmorie>
Date:   Tue Jan 21 16:05:29 2014 -0500

    Fix bug 1055653 and improve post-receive output readability

commit 1fa84300ec27093f0f7f10643f4d46ecd1ba8eec
Author: Paul Morie <pmorie>
Date:   Thu Jan 23 11:06:51 2014 -0500

    Fix bug 1055653: handle exceptions from RestClient

commit 2a7ca5491b59bbcbbaa7504cd0c383215b28465a
Author: Paul Morie <pmorie>
Date:   Mon Jan 27 10:26:16 2014 -0500

    Fix bug 1055653 for cases when httpd is down

In the meantime, workarounds are:

1) Make sure the node can always communicate with the broker.
2) If a deployment to a scaled app fails activation with messages similar to the following, restart the haproxy cartridge (rhc cartridge restart haproxy):

remote: Activation status: failure
remote: Activation failed for the following gears:
remote: <uuid> (Error activating gear: Connection refused - connect(2))
remote: Deployment completed with status: failure
remote: postreceive failed

"Error activating gear:" may also be followed by other errors, e.g. connection timed out or status 401, 502, or 503, depending on the problem with reaching the broker.

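For anyone who wants to automate the second workaround, the failure can be detected from the captured git push output. The sketch below is an illustration only: it assumes the output looks like the sample quoted above, and the function names are hypothetical.

```python
import re

# Matches lines such as:
#   remote: <uuid> (Error activating gear: Connection refused - connect(2))
# The line layout is assumed from the sample output in this bug.
GEAR_ERROR = re.compile(
    r"^remote:\s+(?P<gear>\S+)\s+\(Error activating gear:\s*(?P<reason>.*)\)\s*$"
)


def failed_activations(push_output):
    """Return a list of (gear, reason) pairs for gears that failed activation."""
    failures = []
    for line in push_output.splitlines():
        match = GEAR_ERROR.match(line.strip())
        if match:
            failures.append((match.group("gear"), match.group("reason")))
    return failures


def needs_haproxy_restart(push_output):
    """True if the push reported an activation failure for at least one gear."""
    return (
        "Activation status: failure" in push_output
        and bool(failed_activations(push_output))
    )
```

If needs_haproxy_restart returns True for a scaled app, restarting the haproxy cartridge as described in workaround 2 is the next step. The reason string is kept so that cases other than "Connection refused" (timeouts, 401, 502, 503) remain visible.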
Verified with puddle-2014-01-30, with service openshift-broker down and rhc port-forward running:

[pruan@homer-linux <DEV> mynodejsapp1]# curl -I 127.0.0.1:8082
HTTP/1.1 200 OK
X-Powered-By: Express
Content-Type: text/html
Content-Length: 5235
Date: Fri, 31 Jan 2014 21:12:00 GMT
Connection: keep-alive

[pruan@homer-linux <DEV> mynodejsapp1]# curl -I 127.0.0.1:8082
HTTP/1.1 200 OK
X-Powered-By: Express
Content-Type: text/html
Content-Length: 5235
Date: Fri, 31 Jan 2014 21:12:36 GMT
Connection: keep-alive

[pruan@homer-linux <DEV> mynodejsapp1]# curl -I 127.0.0.1:8081
HTTP/1.0 200 OK
Cache-Control: no-cache
Connection: close
Content-Type: text/html

[pruan@homer-linux <DEV> mynodejsapp1]# curl -I 127.0.0.1:8080
HTTP/1.1 200 OK
X-Powered-By: Express
Content-Type: text/html
Content-Length: 5235
Date: Fri, 31 Jan 2014 21:12:42 GMT
Set-Cookie: GEAR=local-52eb49663eefa979ea000001; path=/
Cache-control: private

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2014-0209.html