Description of problem: See forum post: https://openshift.redhat.com/community/forums/openshift/not-able-to-access-app-getting-error-500#comment-22270

I worked with the user for a short while on IRC and wasn't able to get to his gear. He is having issues reaching his app in the following ways:

1) rhc app force-stop -d -a main
   RESULT: Node execution failure (invalid exit code from node). If the problem persists please contact Red Hat support.
   See http://pastebin.com/A4G9R7Mm

2) rhc app tidy -d -a main
   RESULT: Node execution failure (invalid exit code from node). If the problem persists please contact Red Hat support.
   See http://pastebin.com/h8082puk

3) ssh -v 96903524327743dfa45ff9a2f54df967.com
   It immediately exits.

4) https://main-pingbox.rhcloud.com/ is only responsive at the login splash screen. Subsequent db-related transactions fail with 500 errors.

Usually, in cases where we get a node execution failure, we try to stop the app and run tidy to clear resources before trying to ssh onto the machine to do some debugging (review app/db logs, etc.). We're not able to do any of that.

Please reach out to gbabun for more details. In the meantime, I've contacted Ops (rharrison/mmcgrath) to try to restart his app, and it was suggested to open a bug as well.
NEF
It's likely the app in question exceeded either the limit on the number of its processes or the memory limit for its gear size.
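For reference, the ceilings in question can be inspected from a shell running as the gear's user. This is a generic sketch, not OpenShift tooling; the actual limits come from the node's pam_limits/cgroup configuration:

```shell
#!/bin/sh
# Print the per-user resource limits that, once exhausted, produce
# exactly this failure mode: no new process can be spawned for the UID,
# so ssh sessions and the force-stop/tidy hooks all fail.
echo "max user processes: $(ulimit -u)"
echo "max virtual memory (kB): $(ulimit -v)"
```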
Followed up on the thread.
A useful fix on our end would be for the force-stop functionality to eventually kill all processes owned by the gear's user. That way, an app can be brought down to the point where it's manageable and diagnosable by the end user even if it has exhausted its resources.
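One way to sketch that behavior (not the actual crankcase implementation; the username argument is a placeholder):

```shell
#!/bin/sh
# Last-resort cleanup sketch: SIGKILL every process owned by the gear's
# UNIX user. Run as root on the node, this works even when the gear's
# UID has hit its process limit and can no longer exec anything itself.
gear_user="${1:-nonexistent_gear_user}"   # placeholder username
for pid in $(pgrep -u "$gear_user" 2>/dev/null); do
    kill -KILL "$pid" 2>/dev/null || true
done
```

Because the kill is issued from root's process table rather than the gear's, it sidesteps the "UID out of resources" condition entirely.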
*** Bug 839086 has been marked as a duplicate of this bug. ***
Crankcase commit 6f99f9015 changes the force-stop function so that it no longer fails if the gear UID is out of resources. It also makes setting the application state independent of being able to run as that user. Will submit a pull request after the STG cut.
Pull request #245.
Pull request accepted into Crankcase.
verified on devenv_1899

Steps:
1. Create an application:
   rhc app create -a app1 -t diy-0.1
   rhc app cartridge add -a app1 -c postgresql-8.4
2. Add the test script to the app and git push (according to case [US1155][rhc-cartridge] Implement Force Stop to kill apps).
3. ssh into the application; use ps -ef to note down the UID of all processes, then run the script to exceed the app's process limit:
   ./multifork.py -c 300 -D 600
4. On the node, monitor all processes of the UID from step 3:
   top -u 500
5. Force-stop and tidy the application:
   rhc app force-stop -a app1
   rhc app tidy -a app1

Results:
hjw@my app1$ rhc app force-stop -a app1
Password: ******
RESULT:
Success

hjw@my app1$ rhc app tidy -a app1
Password: ******
MESSAGES:
Stopping app...
Running 'git prune'
Running 'git gc --aggressive'
Emptying log dir: /var/lib/stickshift/2da3269bd9df4536b7fda900d5b0da39/app1/logs/
Emptying tmp dir: /tmp/
Emptying tmp dir: /var/lib/stickshift/2da3269bd9df4536b7fda900d5b0da39/app1/tmp/
Starting app...
RESULT:
Success

The application is stopped, and on the node all of its processes are terminated.

Additional info: I reproduced the originally reported problem on an older instance, and on devenv_1899 it is gone. Fixed in devenv_1899.
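The multifork.py script itself isn't attached to the bug; a minimal shell stand-in that exercises the same condition (spawn many long-lived children to push the gear toward its process limit) might look like the following, where the env vars loosely map to multifork's -c (count) and -D (duration) flags:

```shell
#!/bin/sh
# Hypothetical stand-in for multifork.py: fork $count background
# sleepers of $duration seconds each, then wait for them all to exit.
# The verification run above used count=300, duration=600.
count=${COUNT:-5}
duration=${DURATION:-1}
i=0
while [ "$i" -lt "$count" ]; do
    sleep "$duration" &
    i=$((i + 1))
done
echo "spawned $i children"
wait
```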
*** Bug 844736 has been marked as a duplicate of this bug. ***