Bug 838365 - "rhc app force-stop -a {appName}" throws error consistently for one of our users: Node execution failure (invalid exit code from node)
Product: OpenShift Origin
Classification: Red Hat
Component: Containers
Hardware: Unspecified Unspecified
Priority: high
Severity: low
Assigned To: Rob Millner
QA Contact: libra bugs
Keywords: Triaged
Duplicates: 839086 844736
Reported: 2012-07-08 16:49 EDT by Nam Duong
Modified: 2015-05-14 18:56 EDT
Doc Type: Bug Fix
Last Closed: 2012-08-07 16:42:23 EDT
Type: Bug

Attachments: None
Description Nam Duong 2012-07-08 16:49:45 EDT
Description of problem:
See forum post: 

I worked with the user for a short while on IRC and wasn't able to get to his gear.  He is having issues reaching his app in the following ways:
1) rhc app force-stop -d -a main
Node execution failure (invalid exit code from node).  If the problem persists please contact Red Hat support.
See http://pastebin.com/A4G9R7Mm

2) rhc app tidy -d -a main
Node execution failure (invalid exit code from node).  If the problem persists please contact Red Hat support.
See http://pastebin.com/h8082puk

3) ssh -v 96903524327743dfa45ff9a2f54df967@main-pingbox.rhcloud.com
The session exits immediately.
See ssh -v 96903524327743dfa45ff9a2f54df967@main-pingbox.rhcloud.com

4) https://main-pingbox.rhcloud.com/ responds only on the login splash screen.  Subsequent database-related transactions fail with 500 errors.

Usually, in cases where we get a node execution failure, we try to stop the app and run tidy to clear resources before trying to ssh onto the machine to do some debugging (review app/db logs, etc).  We're not able to do any of that.  Please reach out to gbabun for more details.  

In the meantime, I've contacted Ops (rharrison/mmcgrath) to try to restart his app and it was suggested to open a bug as well.
Comment 1 Clayton Coleman 2012-07-09 10:07:37 EDT
Comment 2 Rob Millner 2012-07-10 19:56:49 EDT
It's likely the app in question exceeded either the limit on the number of its processes or the memory limit for its gear size.
Comment 3 Rob Millner 2012-07-10 20:07:09 EDT
Followed up on the thread.
Comment 4 Rob Millner 2012-07-10 21:20:24 EDT
A useful fix on our end would be for the force-stop functionality to eventually kill all processes owned by the gear's user.

That way, an app can be brought down to the point where it's manageable and diagnosable by the end user even if it has swamped its resources.
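
A minimal sketch of the idea in this comment, not the Crankcase implementation: scan /proc for processes owned by the gear's UID and SIGKILL them in a few passes, since a runaway app may fork again between the scan and the kill. The function names, the pass count, and the delay are assumptions made for illustration. It has to run as root on the node, because the gear's own user may have no process slots left, and it assumes a Linux /proc.

```python
import os
import signal
import time


def pids_owned_by(uid, proc_root="/proc"):
    """Return the sorted PIDs under proc_root whose directory is owned by uid."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            owner = os.stat(os.path.join(proc_root, entry)).st_uid
        except OSError:
            continue  # the process exited between listdir() and stat()
        if owner == uid:
            pids.append(int(entry))
    return sorted(pids)


def signal_pids(pids, sig=signal.SIGKILL):
    """Send sig to each PID and return the PIDs that were actually signalled."""
    signalled = []
    for pid in pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            continue  # already gone
        signalled.append(pid)
    return signalled


def force_stop_uid(uid, passes=3, delay=0.2, proc_root="/proc"):
    """SIGKILL everything owned by uid, rescanning up to `passes` times.

    Returns the PIDs still present after the last pass (empty on success).
    """
    own_pid = os.getpid()
    for _ in range(passes):
        targets = [p for p in pids_owned_by(uid, proc_root) if p != own_pid]
        if not targets:
            return []
        signal_pids(targets)
        time.sleep(delay)  # give the kernel time to tear the processes down
    return [p for p in pids_owned_by(uid, proc_root) if p != own_pid]
```

On a node this would be called as, for example, force_stop_uid(500) for the gear UID seen in top. A non-empty return value means something is still respawning and the gear needs a closer look.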
Comment 5 Rob Millner 2012-07-10 21:22:01 EDT
*** Bug 839086 has been marked as a duplicate of this bug. ***
Comment 6 Rob Millner 2012-07-11 19:14:16 EDT
Crankcase commit 6f99f9015 changes the force-stop function so that it doesn't fail if the gear UID is out of resources.  It also makes setting the application state not rely on being able to run as the user.

Will submit a pull request after the STG cut.
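
One way to read the second change in this comment, sketched under assumptions: the state file is written by the privileged caller and then handed to the gear's user with chown, rather than by switching to the gear's user first (which needs a free process slot under that UID). The commit itself is not shown here, so the function name, the state file path, and the state strings below are hypothetical.

```python
import os
import tempfile


def set_app_state(state_path, state, uid, gid):
    """Atomically write `state` to state_path and give the file to uid:gid.

    Everything happens in the caller's process, so no process is ever
    started under the gear's UID.
    """
    directory = os.path.dirname(state_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(state + "\n")
        os.chown(tmp_path, uid, gid)       # hand ownership to the gear user
        os.replace(tmp_path, state_path)   # atomic rename over the old state
    except OSError:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)            # do not leave a stray temp file
        raise
```

Writing to a temporary file and renaming it means a reader never sees a half-written state, even if the node is under heavy load from the runaway gear.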
Comment 7 Rob Millner 2012-07-17 13:17:31 EDT
Pull request #245.
Comment 8 Rob Millner 2012-07-17 17:39:43 EDT
Pull request accepted into Crankcase.
Comment 9 Jianwei Hou 2012-07-18 03:52:35 EDT
Verified on devenv_1899.

1. Create an application
  rhc app create -a app1 -t diy-0.1
  rhc app cartridge add -a app1 -c postgresql-8.4
2. Add the test script to the app and git push (per test case [US1155][rhc-cartridge] Implement Force Stop to kill apps)
3. SSH into the application and run the script to exceed the app's process limit
  ps -ef
  (note down the UID of all processes)
  ./multifork.py -c 300 -D 600
4. On the node, monitor all processes of the UID noted in step 3
  top -u 500
5. Force-stop and tidy the application
  rhc app force-stop -a app1
  rhc app tidy -a app1
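
The multifork.py script used above is not included in this report. For readers who want to reproduce the condition, the following is a hypothetical stand-in, assuming that -c is a child count and -D a hold time in seconds (neither is confirmed by the report). It forks children that only sleep, and stops as soon as fork() fails, which is what happens once the per-user process limit is reached.

```python
import os
import time


def fork_sleepers(count, hold_seconds):
    """Fork up to `count` children that each sleep, and return their PIDs.

    Stops early if fork() fails, for example because the user's process
    limit has been reached.
    """
    children = []
    for _ in range(count):
        try:
            pid = os.fork()
        except OSError:
            break  # no more process slots for this user
        if pid == 0:
            # Child: occupy a process slot for a while, then exit quietly.
            time.sleep(hold_seconds)
            os._exit(0)
        children.append(pid)
    return children


def reap(children):
    """Wait for every child so none is left behind as a zombie."""
    for pid in children:
        os.waitpid(pid, 0)
```

Calling fork_sleepers(300, 600) inside the gear would correspond to the invocation above under those assumptions. A returned list shorter than 300 means the limit was reached.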

hjw@my app1$ rhc app force-stop -a app1
Password: ******


hjw@my app1$ rhc app tidy -a app1
Password: ******

Stopping app...
Running 'git prune'
Running 'git gc --aggressive'
Emptying log dir: /var/lib/stickshift/2da3269bd9df4536b7fda900d5b0da39/app1/logs/
Emptying tmp dir: /tmp/
Emptying tmp dir: /var/lib/stickshift/2da3269bd9df4536b7fda900d5b0da39/app1/tmp/
Starting app...


The application is stopped.
On the node, all processes are terminated.

Additional Info:
I reproduced the reported problem on an older instance; it no longer occurs on this build.

Fixed in devenv_1899
Comment 10 Rob Millner 2012-07-31 14:02:57 EDT
*** Bug 844736 has been marked as a duplicate of this bug. ***
