Bug 996948 - [origin_runtime_209] sessions_per_gear cannot reflect the actual sessions for each gear
Status: CLOSED CURRENTRELEASE
Product: OpenShift Online
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified Unspecified
Priority: high
Severity: high
Assigned To: Mrunal Patel
libra bugs
Depends On:
Blocks:
Reported: 2013-08-14 06:51 EDT by Meng Bo
Modified: 2015-05-14 19:26 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-08-29 08:50:52 EDT
Type: Bug
Description Meng Bo 2013-08-14 06:51:20 EDT
Description of problem:
The number of sessions each gear can hold before a scale-up is triggered decreases as the number of haproxy instances increases.

e.g.,
For an app with one haproxy, it does not auto-scale up until it reaches 15 sessions per gear,
For an app with two haproxies, it does not auto-scale up until it reaches 8 sessions per gear,
For an app with three haproxies, it does not auto-scale up until it reaches 5 sessions per gear.
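
The three figures above are consistent with a fixed per-gear session limit being divided by the number of haproxy instances before the 90% up-threshold is applied. The arithmetic can be sketched in Python as follows. The base of 16 sessions per gear and the function name are assumptions made for illustration (neither is stated in this report), and this is not the haproxy_ctld code itself.

```python
import math

# Assumption: a base limit of 16 sessions per gear. This value is not stated
# in the report; it is chosen because it reproduces the reported numbers.
ASSUMED_BASE_SESSIONS_PER_GEAR = 16
UP_THRESHOLD = 0.90  # matches "up_thresh: 90.0%" in scale_events.log


def trigger_sessions_per_gear(haproxy_count):
    """Smallest whole number of sessions per gear that reaches the up-threshold,
    if the per-gear limit is (wrongly) divided by the haproxy count."""
    per_gear_limit = ASSUMED_BASE_SESSIONS_PER_GEAR / haproxy_count
    return math.ceil(per_gear_limit * UP_THRESHOLD)


for n in (1, 2, 3, 5):
    print(n, "haproxy instance(s):", trigger_sessions_per_gear(n), "sessions per gear")
```

Under that assumption the sketch gives 15, 8 and 5 sessions per gear for one, two and three haproxies, and 3 for five haproxies, which is the behaviour described in this report.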

As a result, the app keeps scaling up even under a constant number of requests.
e.g.
# ab -n 10000000 -c 40 http://php1s-bmengdev.dev.rhcloud.com/

My scalable app scaled up to 20 gears under this benchmark, and the gear count is still increasing.


Version-Release number of selected component (if applicable):
devenv_3647

How reproducible:
always

Steps to Reproduce:
1. Set multiplier to 2 and max to 5 for the haproxy cartridge
2. Create a scalable app
3. Keep sending requests to the app via ab with the same concurrency
4. Check the gear count and check scale_events.log

Actual results:
The gear count keeps growing: once there are 5 haproxies, only 3 sessions per gear are enough to trigger a scale-up.

Expected results:
The gear capacity should not be affected when the number of haproxy instances increases.

Additional info:
The following is the scale_events.log of my app while I was sending requests with:
ab -n 10000000 -c 40 http://php1s-bmengdev.dev.rhcloud.com/


I, [2013-08-14T03:56:13.060325 #22457]  INFO -- : GEAR_UP - capacity: 137.5% gear_count: 10 sessions: 55 up_thresh: 90.0%
I, [2013-08-14T03:56:39.598820 #23563]  INFO -- : GEAR_UP - capacity: 105.0% gear_count: 10 sessions: 42 up_thresh: 90.0%
I, [2013-08-14T04:02:30.068274 #30878]  INFO -- : GEAR_UP - capacity: 97.72727272727273% gear_count: 11 sessions: 43 up_thresh: 90.0%
I, [2013-08-14T04:03:49.715540 #32465]  INFO -- : GEAR_UP - capacity: 115.90909090909092% gear_count: 11 sessions: 51 up_thresh: 90.0%
I, [2013-08-14T04:05:21.880374 #32465]  INFO -- : GEAR_UP - capacity: 104.54545454545455% gear_count: 11 sessions: 46 up_thresh: 90.0%
I, [2013-08-14T04:09:40.598243 #8020]  INFO -- : GEAR_UP - capacity: 91.66666666666666% gear_count: 12 sessions: 44 up_thresh: 90.0%
I, [2013-08-14T04:10:57.439684 #10434]  INFO -- : GEAR_UP - capacity: 91.14583333333333% gear_count: 12 sessions: 35 up_thresh: 90.0%
I, [2013-08-14T04:15:40.110510 #17818]  INFO -- : GEAR_UP - capacity: 103.3653846153846% gear_count: 13 sessions: 43 up_thresh: 90.0%
I, [2013-08-14T04:16:11.529953 #18380]  INFO -- : GEAR_UP - capacity: 100.96153846153845% gear_count: 13 sessions: 42 up_thresh: 90.0%
I, [2013-08-14T04:21:44.604161 #18380]  INFO -- : GEAR_UP - capacity: 136.16071428571428% gear_count: 14 sessions: 61 up_thresh: 90.0%
I, [2013-08-14T04:22:14.712717 #28045]  INFO -- : GEAR_UP - capacity: 102.67857142857142% gear_count: 14 sessions: 46 up_thresh: 90.0%
I, [2013-08-14T04:22:47.714179 #29164]  INFO -- : GEAR_UP - capacity: 91.51785714285714% gear_count: 14 sessions: 41 up_thresh: 90.0%
I, [2013-08-14T04:24:35.723918 #29164]  INFO -- : GEAR_UP - capacity: 109.375% gear_count: 14 sessions: 49 up_thresh: 90.0%
I, [2013-08-14T04:27:08.684542 #29164]  INFO -- : GEAR_UP - capacity: 91.51785714285714% gear_count: 14 sessions: 41 up_thresh: 90.0%
I, [2013-08-14T04:29:20.584283 #7912]  INFO -- : GEAR_UP - capacity: 118.75% gear_count: 15 sessions: 57 up_thresh: 90.0%
I, [2013-08-14T04:30:56.340399 #7912]  INFO -- : GEAR_UP - capacity: 120.83333333333333% gear_count: 15 sessions: 58 up_thresh: 90.0%
I, [2013-08-14T04:31:35.287587 #12671]  INFO -- : GEAR_UP - capacity: 91.66666666666666% gear_count: 15 sessions: 44 up_thresh: 90.0%
I, [2013-08-14T04:32:47.207483 #15141]  INFO -- : GEAR_UP - capacity: 93.75% gear_count: 15 sessions: 45 up_thresh: 90.0%
I, [2013-08-14T04:34:34.319028 #15141]  INFO -- : GEAR_UP - capacity: 100.0% gear_count: 15 sessions: 48 up_thresh: 90.0%
I, [2013-08-14T04:35:49.891008 #24351]  INFO -- : GEAR_UP - capacity: 99.609375% gear_count: 16 sessions: 51 up_thresh: 90.0%
I, [2013-08-14T04:37:40.834441 #28730]  INFO -- : GEAR_UP - capacity: 105.46875% gear_count: 16 sessions: 54 up_thresh: 90.0%
I, [2013-08-14T04:42:23.715628 #6858]  INFO -- : GEAR_UP - capacity: 108.45588235294117% gear_count: 17 sessions: 59 up_thresh: 90.0%
I, [2013-08-14T04:43:38.679241 #9700]  INFO -- : GEAR_UP - capacity: 101.10294117647058% gear_count: 17 sessions: 55 up_thresh: 90.0%
I, [2013-08-14T04:49:43.748512 #22109]  INFO -- : GEAR_UP - capacity: 93.75% gear_count: 18 sessions: 54 up_thresh: 90.0%
I, [2013-08-14T04:50:50.139817 #25519]  INFO -- : GEAR_UP - capacity: 98.95833333333333% gear_count: 18 sessions: 57 up_thresh: 90.0%
I, [2013-08-14T04:55:14.280379 #4976]  INFO -- : GEAR_UP - capacity: 101.9736842105263% gear_count: 19 sessions: 62 up_thresh: 90.0%
I, [2013-08-14T04:57:18.916116 #8999]  INFO -- : GEAR_UP - capacity: 100.32894736842107% gear_count: 19 sessions: 61 up_thresh: 90.0%
I, [2013-08-14T05:05:21.821723 #25786]  INFO -- : GEAR_UP - capacity: 95.3125% gear_count: 20 sessions: 61 up_thresh: 90.0%
I, [2013-08-14T05:21:35.283264 #32270]  INFO -- : GEAR_UP - capacity: 92.1875% gear_count: 20 sessions: 59 up_thresh: 90.0%
I, [2013-08-14T05:26:04.698287 #16988]  INFO -- : GEAR_UP - capacity: 100.0% gear_count: 20 sessions: 64 up_thresh: 90.0%
I, [2013-08-14T05:27:39.799378 #21782]  INFO -- : GEAR_UP - capacity: 93.75% gear_count: 21 sessions: 63 up_thresh: 90.0%
I, [2013-08-14T05:35:06.191186 #7855]  INFO -- : GEAR_UP - capacity: 90.625% gear_count: 20 sessions: 58 up_thresh: 90.0%
I, [2013-08-14T05:37:05.816136 #10852]  INFO -- : GEAR_UP - capacity: 93.75% gear_count: 20 sessions: 60 up_thresh: 90.0%
I, [2013-08-14T05:40:41.353614 #19275]  INFO -- : GEAR_UP - capacity: 95.3125% gear_count: 20 sessions: 61 up_thresh: 90.0%
I, [2013-08-14T05:42:14.874312 #21489]  INFO -- : GEAR_UP - capacity: 98.21428571428571% gear_count: 21 sessions: 66 up_thresh: 90.0%
I, [2013-08-14T05:44:03.978319 #21489]  INFO -- : GEAR_UP - capacity: 96.72619047619048% gear_count: 21 sessions: 65 up_thresh: 90.0%
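
The per-gear session limit the auto-scaler was using can be backed out of each log line. The Python sketch below does this under the assumption that capacity is computed as sessions / (gear_count * sessions_per_gear); that formula and the function name are assumptions for illustration, not something confirmed by the report.

```python
import re

# Pulls the three numeric fields out of a GEAR_UP / GEAR_DOWN log line.
LINE_RE = re.compile(
    r"capacity: (?P<capacity>[\d.]+)% gear_count: (?P<gears>\d+) sessions: (?P<sessions>\d+)"
)


def implied_sessions_per_gear(line):
    """Back out sessions_per_gear, assuming
    capacity = sessions / (gear_count * sessions_per_gear)."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    capacity = float(m.group("capacity")) / 100.0
    gears = int(m.group("gears"))
    sessions = int(m.group("sessions"))
    return sessions / (gears * capacity)


# Three lines copied from the log above.
samples = [
    "I, [2013-08-14T03:56:13.060325 #22457]  INFO -- : GEAR_UP - capacity: 137.5% gear_count: 10 sessions: 55 up_thresh: 90.0%",
    "I, [2013-08-14T04:10:57.439684 #10434]  INFO -- : GEAR_UP - capacity: 91.14583333333333% gear_count: 12 sessions: 35 up_thresh: 90.0%",
    "I, [2013-08-14T05:05:21.821723 #25786]  INFO -- : GEAR_UP - capacity: 95.3125% gear_count: 20 sessions: 61 up_thresh: 90.0%",
]

for line in samples:
    print(round(implied_sessions_per_gear(line), 3))
```

For these three lines the implied limit is 4.0, 3.2 and 3.2 sessions per gear, so the limit drops as the app grows instead of staying constant.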
Comment 2 Meng Bo 2013-08-16 08:59:18 EDT
Hi Mrunal,

I have retested this bug on devenv_3660. The original issue has been fixed.
But when my app has 2 haproxies, it never auto-scales up anymore, because maxconn is limited to 128 for haproxy.

Not sure if this is the expected result.


My haproxy setting:
min: 1
max: 5
multiplier: 2

When I have 5 gears and 2 haproxies, it never auto-scales up.

@session_per_gear = 128 / (5*2) = 12.8
This will never reach the 90% up_threshold of the auto-scaler.
Comment 3 Mrunal Patel 2013-08-16 11:56:28 EDT
Hi Meng Bo,
I will run some tests and raise that number if required. In this case,
however, the max number of sessions should be 128 per HAProxy, so it should be 128 * 2/ 2 * 5.

Thanks,
Mrunal
Comment 4 Mrunal Patel 2013-08-16 12:06:22 EDT
Forgot to ask -- are you testing by hitting both HA proxies or only the head gear?
Comment 5 Meng Bo 2013-08-18 22:03:09 EDT
Hi Mrunal,

I tested this with ab against both haproxies; it seems the traffic on the secondary haproxy is not counted.

Thanks.
Comment 6 Mrunal Patel 2013-08-19 14:46:32 EDT
https://github.com/openshift/origin-server/pull/3411

I have added a couple of debug statements to print the session count from each haproxy.
You can start the haproxy daemon in debug mode with:

haproxy_ctld_daemon stop
haproxy_ctld_daemon start -- --debug

I verified that we do get the session counts from all HAProxies.
I tested with a scale factor of 2 and a minimum of 4 gears, and then
hit both HAProxies with 80 concurrent requests to scale up (ab only manages
to keep the scur count at around 50 per HAProxy at most).
Comment 7 Meng Bo 2013-08-20 07:17:39 EDT
Checked on devenv_3678 with the haproxy daemon in debug mode.

The traffic on each haproxy is now counted by the auto-scaler,
and the up_threshold/down_threshold look reasonable now.

Moving the bug to verified.


D, [2013-08-20T07:14:16.738237 #20499] DEBUG -- : Local sessions 25
D, [2013-08-20T07:14:16.738380 #20499] DEBUG -- : Getting stats from http://a5e00c08098611e3af2f22000a93b1f9-bmengdev2.dev.rhcloud.com/haproxy-status/
D, [2013-08-20T07:14:16.815900 #20499] DEBUG -- : Remote sessions http://a5e00c08098611e3af2f22000a93b1f9-bmengdev2.dev.rhcloud.com/haproxy-status/ 32
D, [2013-08-20T07:14:16.816151 #20499] DEBUG -- : Getting stats from http://688892794230084382752768-bmengdev2.dev.rhcloud.com/haproxy-status/
D, [2013-08-20T07:14:16.860135 #20499] DEBUG -- : Remote sessions http://688892794230084382752768-bmengdev2.dev.rhcloud.com/haproxy-status/ 0
D, [2013-08-20T07:14:16.904399 #20499] DEBUG -- : Got stats from 2 remote proxies.
I, [2013-08-20T07:14:16.909678 #20499]  INFO -- : GEAR_DOWN - capacity: 59.375% gear_count: 6 sessions: 57 remove_thresh: 70.5%
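
The debug output above can be cross-checked by hand: the local and remote session counts should add up to the "sessions" value in the GEAR_DOWN line. The Python sketch below does that check and also backs out the per-gear limit, again assuming capacity = sessions / (gear_count * sessions_per_gear). The helper name and that formula are assumptions for illustration, not taken from haproxy_ctld.

```python
# Debug lines copied from the log above.
debug_lines = [
    "D, [2013-08-20T07:14:16.738237 #20499] DEBUG -- : Local sessions 25",
    "D, [2013-08-20T07:14:16.815900 #20499] DEBUG -- : Remote sessions http://a5e00c08098611e3af2f22000a93b1f9-bmengdev2.dev.rhcloud.com/haproxy-status/ 32",
    "D, [2013-08-20T07:14:16.860135 #20499] DEBUG -- : Remote sessions http://688892794230084382752768-bmengdev2.dev.rhcloud.com/haproxy-status/ 0",
]


def session_count(line):
    """Return the trailing session count of a 'Local sessions' or
    'Remote sessions' debug line, or None for any other line."""
    if "Local sessions" in line or "Remote sessions" in line:
        return int(line.split()[-1])
    return None


counts = [session_count(line) for line in debug_lines]
total = sum(c for c in counts if c is not None)

# Values from the GEAR_DOWN line: capacity 59.375%, gear_count 6, sessions 57.
implied_per_gear = 57 / (6 * 0.59375)

print(total)             # sum of local and remote sessions
print(implied_per_gear)  # per-gear limit implied by the GEAR_DOWN line
```

The sum is 57, matching the logged "sessions: 57", and the implied limit is 16 sessions per gear with 6 gears, which no longer depends on the number of haproxies.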
