Bug 958514 - On the Web Console, changing the minimum gears setting of scalable app is responding unreliably
Status: CLOSED WONTFIX
Product: OpenShift Online
Classification: Red Hat
Component: Pod
Version: 2.x
Priority: unspecified
Severity: low
Assignee: Abhishek Gupta
QA Contact: libra bugs
Reported: 2013-05-01 17:32 UTC by Nam Duong
Modified: 2015-06-11 21:02 UTC (History)

Doc Type: Bug Fix
Last Closed: 2015-06-11 21:02:46 UTC

Description Nam Duong 2013-05-01 17:32:15 UTC
Description of problem:
During the internal beta, I'm asking users to create several scalable applications with a 3-gear minimum on the web framework tier. While trying it myself, I got 3 different results:
Trial 1:  Got a 504
Trial 2:  Web Console returned with success
Trial 3:  Got an error "Application is currently busy performing another operation. Please try again in a minute"

Note that in all 3 cases the app scaled up to the 3-gear minimum, but the Web Console returned a different response each time.

Comment 1 Rajat Chopra 2013-05-06 19:30:57 UTC
Changing the min scale setting results in a scale-up. The 504 is a timeout (apparently it took > 10 minutes?).
The app being busy (trial 3) is what it says: it must be doing something else. All legal. The current timeout is 10 minutes, so let's find out if you still see the message after 10 minutes.
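
Since the scale-up completes even when the console reports a 504 or the busy message, a client can check the resulting state instead of resubmitting the change. The following is a minimal Python sketch of that idea, not OpenShift code: `wait_for_min_gears` and the `get_gear_count` callable are hypothetical names, and how the gear count is actually fetched is left to the caller. The 600-second default mirrors the 10-minute timeout mentioned in this comment.

```python
import time
from typing import Callable


def wait_for_min_gears(
    get_gear_count: Callable[[], int],   # hypothetical: returns the app's current gear count
    target: int,
    timeout_s: float = 600.0,            # assumption: matches the 10-minute timeout above
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the app reports at least `target` gears or the deadline passes.

    Returns True if the target was reached, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if get_gear_count() >= target:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller would invoke this after any ambiguous response (504 or "busy") and only retry the scaling request if it returns False.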

Comment 2 Nam Duong 2013-05-06 22:45:45 UTC
For clarity, here are the definitions of the 3 trials where I got different results when trying to set a scaling limit of min=3 gears:
Trial 1:  Creating a scalable php-5.3 app on small gears.  Failed when setting min=3 gears (504 thrown).
Trial 2:  Creating a php-5.3 app on medium gears.  Successful at setting min=3 gears.
Trial 3:  Creating an EAP app on small gears.  Failed at setting min=3 gears ("Application is currently busy performing another operation. Please try again in a minute").


I'm on IRC with a beta user who is running the exact same test case (creating these apps) and is running into the same errors.

Comment 3 Nam Duong 2013-05-07 16:57:34 UTC
Feedback from a user that ran into this same scenario:

    If that operation is always taking a long time, I would give the heads up, like “this can take up to a couple of minutes, etc.”

    I had to try three times before getting the result. If that operation is unstable, what about having some system of queue/request?

    For example: When I click that I want a min of 3 gears… the web console tells me that the request will be processed ASAP and that I will receive an e-mail when it is done. That could be a stopgap or temporary solution. I normally do not try the same step three times.

Comment 4 Rajat Chopra 2013-05-08 16:10:31 UTC
There is a scheduler planned for the future which will run queued jobs whose status can be viewed/polled by the user. Upon completion, a notification can possibly be sent. Feedback taken.

Parallel creation of gears is also planned. We do not have a timeline for these features, but it is unlikely that they will happen before June 2013.
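
To make the queued-job idea from this comment concrete, here is a small Python sketch of a queue whose jobs expose a pollable status. It is an illustration only and says nothing about how the planned scheduler would be built: the `JobQueue` class, its method names, and the status strings are all hypothetical, and a single in-process worker thread stands in for a real scheduler.

```python
import queue
import threading
import uuid
from typing import Callable, Dict, Tuple


class JobQueue:
    """Hypothetical single-worker job queue with pollable job status."""

    def __init__(self) -> None:
        self._jobs: "queue.Queue[Tuple[str, Callable[[], None]]]" = queue.Queue()
        self._status: Dict[str, str] = {}
        self._lock = threading.Lock()
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, work: Callable[[], None]) -> str:
        """Queue a job and return immediately with an id the caller can poll."""
        job_id = uuid.uuid4().hex
        self._set(job_id, "queued")
        self._jobs.put((job_id, work))
        return job_id

    def status(self, job_id: str) -> str:
        """Return queued, running, done, failed, or unknown."""
        with self._lock:
            return self._status.get(job_id, "unknown")

    def wait_all(self) -> None:
        """Block until every submitted job has finished."""
        self._jobs.join()

    def _set(self, job_id: str, state: str) -> None:
        with self._lock:
            self._status[job_id] = state

    def _run(self) -> None:
        # Worker loop: take jobs one at a time and record the outcome.
        while True:
            job_id, work = self._jobs.get()
            self._set(job_id, "running")
            try:
                work()
                self._set(job_id, "done")
            except Exception:
                self._set(job_id, "failed")
            finally:
                self._jobs.task_done()
```

With something like this, the console could return right after `submit` and show the result of `status(job_id)`, rather than holding the HTTP request open for the whole scale-up.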

Comment 5 Abhishek Gupta 2013-05-21 21:44:27 UTC
We have increased the connector execute timeout from 60s to 220s. This is the timeout that is being hit when trying to execute connections between HAProxy and the newly created gears. With multiple gears, the connector has more work to do, so the chances of hitting the timeout increase as well.

In the mid term, we plan to call execute connections from the broker more frequently (rather than once at the end of all gear creations) to eliminate this issue.

Lowering the severity since the increased timeout should allow these applications to be successfully created.
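
The reasoning in this comment can be sketched with a toy model in Python. It assumes, purely for illustration, that connector work grows linearly with the number of new gears; the 40-second per-gear cost in the usage note is a made-up figure, not a measurement, and the function names are hypothetical. Only the 60s and 220s timeouts come from this bug.

```python
def batch_call_times_out(gears: int, seconds_per_gear: float, timeout_s: float) -> bool:
    """One execute-connections call at the end has to cover every new gear."""
    return gears * seconds_per_gear > timeout_s


def incremental_call_times_out(seconds_per_gear: float, timeout_s: float) -> bool:
    """One call after each gear creation only has to cover that gear."""
    return seconds_per_gear > timeout_s


def max_gears_per_batch(seconds_per_gear: float, timeout_s: float) -> int:
    """Largest number of gears a single batched call can handle in this model."""
    return int(timeout_s // seconds_per_gear)
```

Under the assumed 40 seconds per gear, a batched call for 2 new gears exceeds the old 60s timeout but fits within 220s, while per-gear calls stay under either limit. Raising the timeout only moves the ceiling (5 gears per batch in this model); calling more frequently removes the dependence on gear count.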

Comment 6 Abhishek Gupta 2013-05-21 21:46:12 UTC
Pull request pertaining to the increase of the connector execute timeout --> https://github.com/openshift/origin-server/pull/2578

Comment 7 Abhishek Gupta 2013-05-21 22:20:05 UTC
In the mid term, one of the options that we may consider is to parallelize the creation of new gears to reduce the time it takes to scale up.
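
As a rough illustration of the parallel-creation option, the Python sketch below contrasts creating gears one after another with creating them concurrently through a thread pool. It is not broker code: `fake_create_gear` is a stand-in that just sleeps, and the function names and the `max_workers` default are assumptions made for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def create_gears_sequential(create_gear: Callable[[int], str], count: int) -> List[str]:
    """Create gears one at a time; total time is roughly the sum of each creation."""
    return [create_gear(i) for i in range(count)]


def create_gears_parallel(
    create_gear: Callable[[int], str], count: int, max_workers: int = 4
) -> List[str]:
    """Create gears concurrently; results come back in submission order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(create_gear, range(count)))


def fake_create_gear(index: int) -> str:
    """Hypothetical stand-in for a slow gear creation call."""
    time.sleep(0.2)
    return f"gear-{index}"
```

For example, `create_gears_parallel(fake_create_gear, 3)` returns the same list as the sequential version, but the three simulated creations overlap instead of running back to back.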

Comment 8 manoj 2013-09-18 18:56:23 UTC
Please verify whether there are any other errors.

Comment 9 Jianwei Hou 2013-09-22 02:50:25 UTC
Tested on prod, tried 3 times:

1st time: Create scalable jbosseap app, set min gears to 3, web console shows 'Maintenance in progress' 
2nd time: Create scalable php app, set min gears to 3, web console shows success
3rd time: Create scalable jbosseap app, set min gears to 3, web console shows 'Maintenance in progress'

In all 3 cases, the app scaled up to 3 gears and its status is started; there are no other errors.

I'm assigning this back to see whether the timeout needs to be increased. Thanks.

