Bug 991177 - Scaled up gear stuck in 'new' state
Status: CLOSED CURRENTRELEASE
Product: OpenShift Online
Classification: Red Hat
Component: Image
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Assigned To: Dan Mace
QA Contact: libra bugs
Reported: 2013-08-01 15:11 EDT by Ben Browning
Modified: 2015-05-14 20:33 EDT
Doc Type: Bug Fix
Last Closed: 2014-02-26 14:07:45 EST
Type: Bug
Description Ben Browning 2013-08-01 15:11:14 EDT
Description of problem:

When testing a scaled JBoss AS7 application on OpenShift, I ran into an issue where it looks like HAProxy tried to scale the application up and then scale it back down before the new gear entered the 'started' state. I recall seeing various 503 (or 504?) errors from HAProxy in the tail_all output of the initial gear but did not think to capture them at the time.

After this happened, 'rhc app show --gears' continued to list a second gear in the 'new' state, and it never transitioned to another state. Attempts to scale back down to one gear with rhc scale-cartridge resulted in "Unable to complete the requested operation due to: Node execution failure (invalid exit code from node)".

Any rhc app <start|stop|force-stop> command failed with "Unable to complete the requested operation due to: Failed to correctly execute all parallel operations."

I tried to delete the application and was unable to, again getting the "invalid exit code from node" error. After stepping away for lunch and coming back I tried to delete the application again and was able to succeed, approximately 1.5 hours after the gear first entered this stuck state.
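
For anyone trying to catch this state while it is happening, a minimal Python sketch of a watcher that reports gears still in 'new' after a timeout might look like the following. This is an illustration only, not part of OpenShift or rhc: fetch_states is a hypothetical caller-supplied function that returns a {gear_id: state} mapping (for example, built from the output of 'rhc app show --gears'), and the timeout and polling interval are arbitrary assumptions.

```python
import time
from typing import Callable, Dict, List


def find_stuck_gears(
    fetch_states: Callable[[], Dict[str, str]],
    timeout_s: float = 300.0,
    poll_s: float = 10.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Return the IDs of gears still in the 'new' state when the timeout expires.

    fetch_states is a hypothetical helper supplied by the caller; it is
    assumed to return a mapping of gear ID to state string.
    Returns an empty list as soon as no gear is in 'new'.
    """
    deadline = clock() + timeout_s
    while True:
        states = fetch_states()
        stuck = sorted(gear for gear, state in states.items() if state == "new")
        # Stop early when every gear has left 'new', or give up at the deadline.
        if not stuck or clock() >= deadline:
            return stuck
        sleep(poll_s)
```

The clock and sleep parameters are injectable only so the loop can be exercised without real waiting; in normal use the defaults apply.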


Comment 1 zhaozhanqi 2013-08-01 22:51:39 EDT
Tested this on STG (devenv-stage_435) three times and could not reproduce it.

My steps were:

1) create one scalable jbossas-7 app
2) add the disable_auto_scaling marker and git push
 touch .openshift/markers/disable_auto_scaling
3) scale up this app using the REST API
4) scale down this app using the REST API
5) scale up this app again
6) check the state of all gears
  rhc app show $app -g
7) run some parallel operations on this app (stop, start, etc.)

All steps worked as expected.
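
The manual scale-up and scale-down calls in steps 3 to 5 could be scripted roughly as in the Python sketch below. It only builds the HTTP request and does not send it. The endpoint path (/broker/rest/domains/<domain>/applications/<app>/events) and the event names scale-up and scale-down are my assumption about the broker REST API and should be checked against the API documentation for the broker version under test; the broker URL, domain, app name, and credentials are placeholders.

```python
import base64
import urllib.parse
import urllib.request


def scale_event_request(broker_url, domain, app, event, user, password):
    """Build (but do not send) a POST request for an application scale event.

    Assumption: the broker accepts form-encoded 'event=scale-up' or
    'event=scale-down' on the application's events resource with basic auth.
    """
    if event not in ("scale-up", "scale-down"):
        raise ValueError("event must be 'scale-up' or 'scale-down'")
    url = "{}/broker/rest/domains/{}/applications/{}/events".format(
        broker_url.rstrip("/"),
        urllib.parse.quote(domain, safe=""),
        urllib.parse.quote(app, safe=""),
    )
    body = urllib.parse.urlencode({"event": event}).encode("ascii")
    token = base64.b64encode("{}:{}".format(user, password).encode("utf-8"))
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": "Basic " + token.decode("ascii"),
            "Accept": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it; looping scale-up, scale-down, scale-up and then checking 'rhc app show $app -g' would mirror the steps above.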
Comment 2 Mrunal Patel 2013-08-18 22:06:17 EDT
Is this a stock JBoss application or was it a quickstart? Could you provide more details about the application, such as whether it had any user action hooks?

Thanks,
Mrunal
Comment 3 Ben Browning 2013-08-19 08:05:47 EDT
This is an application created from the TorqueBox quickstart at https://github.com/openshift-quickstart/torquebox-quickstart. There are pre_start_jbossas-7 and pre_restart_jbossas-7 action hooks, which you can find in the linked GitHub repository.
Comment 4 Mrunal Patel 2013-08-20 10:55:16 EDT
My comments are the same as on the other bug (994130). I wasn't able to reproduce the issue. There were timeouts on the client side, but the new gear did come up every time. I am lowering the severity, as this should not block the release.
Comment 5 Dan McPherson 2014-02-07 19:38:21 EST
We made some significant fixes to the autoscaler in the last release that I believe fixed the root cause of this bug. Please reopen if it can be recreated.
Comment 6 Meng Bo 2014-02-08 01:53:05 EST
Checked on devenv_4348: scaled up the app created from the TorqueBox quickstart several times and did not hit the issue.

Moving bug to verified.
