Bug 1257757 - Scaled application takes 4+mins to unidle
Summary: Scaled application takes 4+mins to unidle
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Containers
Version: 2.2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Timothy Williams
QA Contact: Anping Li
URL:
Whiteboard:
Duplicates: 1170040 (view as bug list)
Depends On:
Blocks:
Reported: 2015-08-27 21:41 UTC by Ryan Howe
Modified: 2019-08-15 05:15 UTC (History)

Fixed In Version: openshift-origin-cartridge-haproxy-1.30.1.1-1.el6op
Doc Type: Bug Fix
Doc Text:
When a scaled application is unidled, HAProxy is started first. Previously, HAProxy then made a blocking `curl` request to every gear in its configuration to unidle it, and only after HAProxy finished did the rest of the gears receive a 'start' from the broker. This created a loop when unidling a scaled application, which could cause delays and timeouts: HAProxy attempted to unidle all gears while the broker was already handling the unidling process, starting a second unidling process for each gear. This bug fix removes the HAProxy logic that attempted to unidle all gears in the application, because the broker already handles this operation. As a result, HAProxy defers unidling to the broker, and unidling a scaled application takes much less time.
Clone Of:
Environment:
Last Closed: 2015-09-30 16:38:40 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2015:1844 | 0 | normal | SHIPPED_LIVE | Important: Red Hat OpenShift Enterprise 2.2.7 security, bug fix and enhancement update | 2015-09-30 20:35:28 UTC

Description Ryan Howe 2015-08-27 21:41:44 UTC
Description of problem:
A scaled application takes more than 4 minutes to unidle after being stopped.

Version-Release number of selected component (if applicable): v2.2


How reproducible: 100%


Steps to Reproduce:
1. rhc app create nodejs -a idle -s
2. oo-app-info -a idle
3. ssh node
4. oo-admin-ctl-gears idlegear 55de3ba95a00089d70000641
5. oo-admin-ctl-gears unidlegear 55de3ba95a00089d70000641

Actual results:

It takes 4-5 minutes for the gear to move from stopped to running.

Expected results:


Comments:
In tests, the unidle consistently takes exactly 4 minutes.

- Added echo statements to haproxy/bin/control; they show a 4-minute gap when haproxy/bin/control start is called:
https://github.com/openshift/origin-server/blob/master/cartridges/openshift-origin-cartridge-haproxy/bin/control#L29-L34


function ping_server_gears() {
    #  Ping the server gears and wake 'em up on startup.
    echo "($(date)) - ping server gears" | tee -a $support_logs
    for geardns in $(web_gears | cut -f 3 -d ','); do
        echo "($(date)) - function ping_server_gears" | tee -a $support_logs
        [ -z "$geardns" ]  ||  curl "http://$geardns/" > /dev/null 2>&1  ||  :
        echo "($(date)) - pinging gears done" | tee -a $support_logs
    done
}
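
To narrow down which gear accounts for the gap, the loop above could be instrumented per gear. The following is only a minimal sketch under stated assumptions: `timed_ping`, `ping_gears_timed` and the `PING_CMD` variable are hypothetical names that do not exist in the cartridge, and the 5 and 10 second limits are arbitrary. It uses curl's `--connect-timeout` and `--max-time` options so that a single unresponsive gear cannot block for minutes. It is a diagnostic aid, not the shipped fix, which removes the ping loop entirely.

```shell
#!/bin/sh
# Hypothetical diagnostic sketch; none of these names exist in the cartridge.

# PING_CMD is an assumption added here so the HTTP client can be swapped out.
PING_CMD=${PING_CMD:-curl}

# Ping one gear with bounded timeouts and report how long the request took.
timed_ping() {
    local geardns="$1"
    local rc=0
    local start end
    start=$(date +%s)
    # Arbitrary limits: 5s to connect, 10s for the whole request.
    "$PING_CMD" --silent --output /dev/null \
        --connect-timeout 5 --max-time 10 "http://$geardns/" || rc=$?
    end=$(date +%s)
    echo "$geardns elapsed=$((end - start))s rc=$rc"
}

# Read gear DNS names from stdin, one per line, and ping each in turn.
# Empty lines are skipped, as in the original loop.
ping_gears_timed() {
    local geardns
    while read -r geardns; do
        [ -z "$geardns" ] || timed_ping "$geardns"
    done
}
```

Inside haproxy/bin/control this could be fed from the same pipeline the original function uses, for example `web_gears | cut -f 3 -d ',' | ping_gears_timed | tee -a $support_logs`.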

Comment 14 Anping Li 2015-09-18 07:05:16 UTC
The fix wasn't included in this puddle. 

The bug can be reproduced as follows.
[root@broker ~]# time oo-admin-ctl-gears unidlegear anlidom-idle-1
Unidling gear anlidom-idle-1 ... [ OK ]

real	3m58.683s
user	0m0.749s
sys	0m0.196s

Comment 16 Anping Li 2015-09-22 01:09:14 UTC
Verified and passed. unidlegear now takes much less time.

[root@node2 ~]# time oo-admin-ctl-gears unidlegear  anlidom-sphp-1
Unidling gear anlidom-sphp-1 ... [ OK ]

real	0m2.918s
user	0m0.732s
sys	0m0.190s

Comment 17 Timothy Williams 2015-09-23 21:15:53 UTC
*** Bug 1170040 has been marked as a duplicate of this bug. ***

Comment 20 errata-xmlrpc 2015-09-30 16:38:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1844.html

