Bug 1257757 - Scaled application takes 4+mins to unidle
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Containers
Version: 2.2.0
Hardware: Unspecified Unspecified
Priority: high Severity: high
Assigned To: Timothy Williams
QA Contact: Anping Li
Duplicates: 1170040
Depends On:
Blocks:
 
Reported: 2015-08-27 17:41 EDT by Ryan Howe
Modified: 2015-10-12 14:06 EDT

See Also:
Fixed In Version: openshift-origin-cartridge-haproxy-1.30.1.1-1.el6op
Doc Type: Bug Fix
Doc Text:
When a scaled application is unidled, HAProxy is started first. Previously, HAProxy then made a blocking `curl` request to every gear in its configuration to unidle it, and only after HAProxy had finished did the remaining gears receive a 'start' from the broker. Because the broker was already handling the unidling, HAProxy's requests started a second unidling process for each gear, and the resulting loop could cause delays and timeouts. This bug fix removes the HAProxy logic that attempted to unidle all gears in the application. HAProxy now defers this process to the broker, and unidling a scaled application takes much less time.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-09-30 12:38:40 EDT
Type: Bug


Description Ryan Howe 2015-08-27 17:41:44 EDT
Description of problem:
A scaled application takes 4+ minutes to unidle after being stopped.

Version-Release number of selected component (if applicable): v2.2


How reproducible: 100%


Steps to Reproduce:
1. rhc app create nodejs -a idle -s
2. oo-app-info -a idle
3. ssh node
4. oo-admin-ctl-gears idlegear 55de3ba95a00089d70000641
5. oo-admin-ctl-gears unidlegear 55de3ba95a00089d70000641

Actual results:

The gear takes 4-5 minutes to move from stopped to running.

Expected results:


Comments:
The unidle takes exactly 4 minutes in every test.

- Added echo statements to haproxy/bin/control; they show a 4-minute gap when haproxy/bin/control start is called:
https://github.com/openshift/origin-server/blob/master/cartridges/openshift-origin-cartridge-haproxy/bin/control#L29-L34


function ping_server_gears() {
    #  Ping the server gears and wake 'em up on startup.
    echo "($(date)) - ping server gears" | tee -a $support_logs
    for geardns in $(web_gears | cut -f 3 -d ','); do
        echo "($(date)) - function ping_server_gears" | tee -a $support_logs
        [ -z "$geardns" ]  ||  curl "http://$geardns/" > /dev/null 2>&1  ||  :
        echo "($(date)) - pinging gears done" | tee -a $support_logs
    done
}
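
For illustration only, here is a minimal sketch of how the per-gear pings in the function above could be kept from blocking startup. This is not the fix that shipped (the fix removed the pings entirely and left unidling to the broker). The names ping_one, ping_gears_bounded, and PING_TIMEOUT are hypothetical and do not exist in the cartridge; the sketch assumes gear DNS names arrive one per line on stdin.

```shell
# Hypothetical variant, NOT the shipped fix: the real fix removed the pings.
# PING_TIMEOUT, ping_one and ping_gears_bounded are illustrative names only.
PING_TIMEOUT="${PING_TIMEOUT:-5}"

ping_one() {
    # --max-time caps the whole request, so one slow gear cannot hold
    # the caller for longer than PING_TIMEOUT seconds.
    curl --silent --output /dev/null --max-time "$PING_TIMEOUT" "http://$1/" || :
}

ping_gears_bounded() {
    # Read gear DNS names from stdin, one per line, skipping empty lines,
    # and ping each one in the background instead of one after another.
    while read -r geardns; do
        [ -n "$geardns" ] || continue
        ping_one "$geardns" &
    done
    # Wait for all background pings; each is capped by PING_TIMEOUT.
    wait
}
```

Under those assumptions it could be driven as: web_gears | cut -f 3 -d ',' | ping_gears_bounded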
Comment 14 Anping Li 2015-09-18 03:05:16 EDT
The fix wasn't included in this puddle. 

The bug can be reproduced as follows:
[root@broker ~]# time oo-admin-ctl-gears unidlegear anlidom-idle-1
Unidling gear anlidom-idle-1 ... [ OK ]

real	3m58.683s
user	0m0.749s
sys	0m0.196s
Comment 16 Anping Li 2015-09-21 21:09:14 EDT
Verified and passed. unidlegear now takes much less time.

[root@node2 ~]# time oo-admin-ctl-gears unidlegear  anlidom-sphp-1
Unidling gear anlidom-sphp-1 ... [ OK ]

real	0m2.918s
user	0m0.732s
sys	0m0.190s
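
To put the two "real" timings from Comment 14 and Comment 16 side by side, they can be converted to seconds and divided. The helpers below are a small sketch; to_seconds and speedup are hypothetical names introduced here, and the input is assumed to be in the NmS.SSSs form printed by the shell's time keyword.

```shell
to_seconds() {
    # Convert a time-style value such as 3m58.683s into seconds.
    # Splitting on "m" and "s" leaves minutes in $1 and seconds in $2.
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

speedup() {
    # Ratio of the "before" duration ($1) to the "after" duration ($2).
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

to_seconds 3m58.683s           # prints 238.683
to_seconds 0m2.918s            # prints 2.918
speedup 3m58.683s 0m2.918s     # prints 81.8
```

The two runs were made on different gears (anlidom-idle-1 and anlidom-sphp-1), so the ratio is only a rough indication of the improvement.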
Comment 17 Timothy Williams 2015-09-23 17:15:53 EDT
*** Bug 1170040 has been marked as a duplicate of this bug. ***
Comment 20 errata-xmlrpc 2015-09-30 12:38:40 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1844.html
