Bug 1013512 - [deploy][origin_runtime_210] Restarting a scalable app with multiple gears takes too long while traffic is passing through it
Status: CLOSED CURRENTRELEASE
Product: OpenShift Online
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified OS: Unspecified
Priority: medium Severity: medium
Assigned To: Andy Goldstein
libra bugs
Depends On:
Blocks:
Reported: 2013-09-30 05:07 EDT by Meng Bo
Modified: 2015-05-14 19:29 EDT (History)
3 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-10-17 09:33:02 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Meng Bo 2013-09-30 05:07:38 EDT
Description of problem:
Create a scalable app and scale it up a few times so that it has multiple gears.
Restart the app while traffic is passing through it. The restart takes over 3 minutes, and the web page returns 503 for at least 2 minutes.

Version-Release number of selected component (if applicable):
fork_ami_deploy_873

How reproducible:
always

Steps to Reproduce:
1. Create a scalable app
2. Scale it up to 3 gears
3. Use ab to send traffic to the app
# ab -c 30 -n 100000 http://<app_url>
4. Restart the app while ab is running

Actual results:
The restart takes too long, and the app is down for about 2 minutes.

# time rhc app restart php1s
RESULT:
php1s restarted

real	3m22.264s
user	0m0.717s
sys	0m0.044s


Expected results:
It should not take so much time to restart the app.

Additional info:
The following entries appear in platform.log.
The haproxy restart started at 03:09:22 and finished at 03:11:24:

September 30 03:09:20 INFO 52491c761d6e88c4d8000006 disable-server against 'haproxy'
September 30 03:09:22 INFO Shell command '/sbin/runuser -s /bin/sh 52491c761d6e88c4d8000006 -c "exec /usr/bin/runcon 'unconfined_u:system_r:openshift_t:s0:c0,c1000' /bin/sh -c \"set -e; /var/lib/openshift/52491c761d6e88c4d8000006/haproxy/bin/control disable-server 52491c761d6e88c4d8000006\""' ran. rc=0 out=Disabling server 52491c761d6e88c4d8000006

September 30 03:09:22 INFO 52491c761d6e88c4d8000006 restart against 'haproxy'
September 30 03:11:24 INFO Shell command '/sbin/runuser -s /bin/sh 52491c761d6e88c4d8000006 -c "exec /usr/bin/runcon 'unconfined_u:system_r:openshift_t:s0:c0,c1000' /bin/sh -c \"set -e; /var/lib/openshift/52491c761d6e88c4d8000006/haproxy/bin/control restart \""' ran. rc=0 out=Restarted HAProxy instance

September 30 03:11:24 INFO 52491c761d6e88c4d8000006 enable-server against 'haproxy'
September 30 03:11:25 INFO Shell command '/sbin/runuser -s /bin/sh 52491c761d6e88c4d8000006 -c "exec /usr/bin/runcon 'unconfined_u:system_r:openshift_t:s0:c0,c1000' /bin/sh -c \"set -e; /var/lib/openshift/52491c761d6e88c4d8000006/haproxy/bin/control enable-server 52491c761d6e88c4d8000006\""' ran. rc=0 out=Enabling server 52491c761d6e88c4d8000006
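The gap between the restart's start and finish in the log above can be confirmed by subtracting the two timestamps (a quick sketch assuming GNU date, as found on the RHEL-based node hosts):

```shell
# Subtract the two platform.log timestamps to get the haproxy restart duration.
start=$(date -d "2013-09-30 03:09:22" +%s)
end=$(date -d "2013-09-30 03:11:24" +%s)
echo "haproxy restart took $((end - start)) seconds"   # 122 seconds, ~2 minutes
```

That 2-minute window matches the period during which the app returned 503s.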

For comparison, the same operation on devenv_3844 takes about 1 minute to restart a scalable app with 3 gears.
# time rhc app restart php1s
RESULT:
php1s restarted

real	1m1.822s
user	0m0.366s
sys	0m0.028s
Comment 1 Andy Goldstein 2013-10-07 11:18:36 EDT
I just tested on what will soon be fork_ami_deploy_888 (I tested on 887 after fixing the errors that made it fail) and see much better results. My app was using the php-5.3 cartridge, FYI:

time rhc app restart ps
ps restarted

real	0m42.282s
user	0m1.162s
sys	0m0.333s

Please retest when >= 888 is available and let me know the results. Thanks!

Also, please note that 'rhc app restart' will restart the entire application, including the haproxy cartridge, so there will be some # of tests in the 'ab' run that fail while the haproxy cartridge is restarting. If you want to do a rolling restart of just the web cartridge, you can do 'rhc cartridge restart -a <app> <cartridge>' (support for restarting all gears for a specific cartridge is still in development and will be ready soon - I'll update this when it's ready).
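The rolling restart described above can be sketched as a loop over the gears (a hypothetical illustration only; the gear names are placeholders, and the echo statements stand in for the real disable-server/restart/enable-server control calls seen in platform.log):

```shell
# Hypothetical sketch of a rolling restart: restart one gear at a time so
# haproxy keeps routing traffic to the gears that are still up.
rolling_restart() {
  for gear in "$@"; do
    echo "disable-server $gear"  # drain traffic away from this gear
    echo "restart $gear"         # restart only this gear's cartridge
    echo "enable-server $gear"   # put the gear back into rotation
  done
}

rolling_restart gear1 gear2 gear3
```

Because at most one gear is down at any moment, an ab run against the app should see few or no 503s, unlike a full 'rhc app restart', which also takes down the haproxy gear itself.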
Comment 2 Meng Bo 2013-10-08 07:21:09 EDT
I retested on fork_ami_889, and it still takes too much time to restart the app while ab is running.

# time rhc app restart php1s 
RESULT:
php1s restarted

real	2m50.007s
user	0m1.150s
sys	0m0.707s


This only happens while ab is running; restarting the app without benchmarking it is very fast.

# time rhc app restart php1s 
RESULT:
php1s restarted

real	0m26.580s
user	0m0.750s
sys	0m0.400s
Comment 3 Andy Goldstein 2013-10-08 10:18:04 EDT
Using devenv_3873, I am running the ab command you listed above, using a scalable php-5.3 application with 3 gears, and this is the result I get:

[root@ip-10-166-61-99 ~]# time rhc cartridge restart -a p3 php-5.3
Restarting php-5.3 ... done

real	0m22.864s
user	0m0.850s
sys	0m0.461s

Note that 'rhc cartridge restart' now supports rolling restarts of the cartridge across all the application's gears.

I'm running ab on the devenv itself - is that what you're doing?

I'm unable to reproduce the slowdown you're seeing. Is your application anything more than the default template that comes with the PHP cartridge?
Comment 4 Meng Bo 2013-10-08 23:03:28 EDT
Tested on devenv_3874, with php cartridge and 3 gears in the scalable app. The issue is gone.

[root@ip-10-145-147-198 ~]# time rhc app restart php1s
RESULT:
php1s restarted

real	0m53.123s
user	0m0.741s
sys	0m0.434s
[root@ip-10-145-147-198 ~]# time rhc cartridge-restart php-5.3 -a php1s 
Restarting php-5.3 ... done

real	0m53.056s
user	0m0.948s
sys	0m0.527s


I also just noticed that fork_ami_889 is not a deploy AMI, so my previous comment may not be relevant.

The current behaviour is acceptable. Please move the bug to ON_QA and I will close it. Thanks.
Comment 5 Meng Bo 2013-10-09 23:30:23 EDT
Checked on devenv_3880; the issue cannot be reproduced.

Moving the bug to VERIFIED.
