Bug 991177 - Scaled up gear stuck in 'new' state
Summary: Scaled up gear stuck in 'new' state
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Image
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Assignee: Dan Mace
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-08-01 19:11 UTC by Ben Browning
Modified: 2015-05-15 00:33 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-02-26 19:07:45 UTC
Target Upstream Version:
Embargoed:



Description Ben Browning 2013-08-01 19:11:14 UTC
Description of problem:

When testing a scaled-up JBoss AS7 application on OpenShift, I ran into an issue where HAProxy appears to have scaled the application up and then scaled it back down before the new gear entered the 'started' state. I recall seeing various 503 (or 504?) errors from haproxy in the tail_all output of the initial gear, but did not think to capture them at the time.

After this happened, 'rhc app show --gears' continued to list a second gear in the 'new' state, and that gear never transitioned to another state. Attempts to scale back down to one gear with rhc scale-cartridge failed with "Unable to complete the requested operation due to: Node execution failure (invalid exit code from node)".

Any rhc app <start|stop|force-stop> command failed with "Unable to complete the requested operation due to: Failed to correctly execute all parallel operations."

I tried to delete the application and could not, again getting the "invalid exit code from node" error. After stepping away for lunch, I tried the delete again and it succeeded, approximately 1.5 hours after the gear first entered this stuck state.
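
For anyone trying to catch this state as it happens, here is a minimal sketch of a watcher. It assumes a hypothetical `get_gear_states` callable that returns a mapping of gear id to state (for example, parsed from 'rhc app show --gears', whose output format is not shown in this report). The function name, timeout, and polling interval are illustrative and not part of rhc.

```python
import time
from typing import Callable, Dict, List


def find_stuck_gears(
    get_gear_states: Callable[[], Dict[str, str]],  # hypothetical data source
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    stuck_state: str = "new",
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until no gear reports `stuck_state` or the timeout expires.

    Returns the sorted ids of gears still in `stuck_state` at the deadline.
    An empty list means every gear left that state in time.
    """
    deadline = clock() + timeout_s
    while True:
        states = get_gear_states()
        stuck = sorted(g for g, s in states.items() if s == stuck_state)
        if not stuck or clock() >= deadline:
            return stuck
        sleep(interval_s)
```

A non-empty result after a generous timeout would be the point at which to capture the haproxy and node logs that were missed here.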



Comment 1 zhaozhanqi 2013-08-02 02:51:39 UTC
Tested this issue on STG (devenv-stage_435) three times and could not reproduce it.

These are my steps:

1) create one scalable jbossas-7 app
2) add the disable_auto_scaling marker and git push
 touch .openshift/markers/disable_auto_scaling
3) scale up this app using the REST API
4) scale down this app using the REST API
5) scale up this app again
6) check the state of all gears
  rhc app show $app -g
7) run some parallel operations on this app (stop/start, etc.)

All steps worked.
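
The marker and gear-check steps above can be sketched as a small dry-run helper. This is only an illustration: the function names and the commit message are hypothetical, the git commands are assumed to run inside the application's local repository, and the REST API scale-up and scale-down calls (steps 3 to 5) are left out because their endpoint is not given in this comment.

```python
import shlex
import subprocess
from typing import List


def marker_commands(app: str) -> List[List[str]]:
    """Commands for step 2 (add the marker and push) and step 6 (list gears)."""
    marker = ".openshift/markers/disable_auto_scaling"
    return [
        ["touch", marker],
        ["git", "add", marker],
        ["git", "commit", "-m", "Disable auto scaling"],  # illustrative message
        ["git", "push"],
        ["rhc", "app", "show", app, "-g"],
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; also execute it when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

Calling `run_all(marker_commands("myapp"))` only prints the commands; pass `dry_run=False` from inside the application checkout to execute them.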

Comment 2 Mrunal Patel 2013-08-19 02:06:17 UTC
Is this a stock JBoss application or was it a quickstart? Could you provide more details about the application, such as whether it has any user action hooks?

Thanks,
Mrunal

Comment 3 Ben Browning 2013-08-19 12:05:47 UTC
This is an application created from the TorqueBox quickstart at https://github.com/openshift-quickstart/torquebox-quickstart. It has pre_start_jbossas-7 and pre_restart_jbossas-7 action hooks, which you can find in the linked GitHub repository.

Comment 4 Mrunal Patel 2013-08-20 14:55:16 UTC
My comments are the same as on the other bug (994130). I wasn't able to reproduce the issue. There were timeouts on the client side, but the new gear did come up every time. I am lowering the severity, as this should not block the release.

Comment 5 Dan McPherson 2014-02-08 00:38:21 UTC
We made some significant fixes to the autoscaler last release that I believe fixed the root cause of this bug.  Please reopen if it can be recreated.

Comment 6 Meng Bo 2014-02-08 06:53:05 UTC
Checked on devenv_4348: scaled up the app created with the TorqueBox quickstart several times and did not hit the issue.

Moving bug to verified.

