Hide Forgot
Created attachment 830067 [details] jmeter test suite to simulate lots of users Description of problem: I have setup OSE 1.2.1 in my lab. For testing of scaling I have created a test application from the kitchensink example. rhc app create -a kitchenscale -t jbosseap --from-code git://github.com/openshift/kitchensink-example.git -s then I put some load on the gear using jmeter. I an attaching the test suite to this case. After simulating 5000 users hitting the site with a ramp up phase of 10 seconds, only one gear was created. If I understand the configuration of haproxy and the scaling code correctly, we have a maxconn setting of 256 meaning haproxy will not accept more connections than this. Within the scaling code we do have a MAX_SESSIONS_PER_GEAR setting of 16.0 (see https://github.com/openshift/origin-server/blob/openshift-origin-release-2/cartridges/openshift-origin-cartridge-haproxy/usr/bin/haproxy_ctld.rb#L99) Looking at https://github.com/openshift/origin-server/blob/openshift-origin-release-2/cartridges/openshift-origin-cartridge-haproxy/usr/bin/haproxy_ctld.rb#L159 it seems we are trying to distribute the load between all gears and make sure we have an additional gear if we have more than 16 sessions per gear. During my tests I monitored the scaling-events log which printed the correct usage but failed to finish the scale up events: I, [2013-11-28T02:51:08.109463 #25422] INFO -- : GEAR_UP - capacity: 212.5% gear_count: 1 sessions: 34 up_thresh: 90.0% D, [2013-11-28T02:51:13.314401 #25422] DEBUG -- : GEAR_UP - add-gear: exit: pid 20951 exit 1 stdout: [FATAL] Failed to create timer thread (errno: 11) D, [2013-11-28T02:51:13.413059 #25422] DEBUG -- : GEAR_INFO - capacity: 306.25% gear_count: 1 sessions: 49 up/remove_thresh: 90.0%/1.0% sec_left_til_remove: 115 gear _remove_thresh: 0/20 I, [2013-11-28T02:51:16.506160 #25422] INFO -- : GEAR_UP - capacity: 762.5% gear_count: 1 sessions: 122 up_thresh: 90.0% D, [2013-11-28T02:51:25.814215 #25422] DEBUG -- : GEAR_UP - add-gear: exit: pid 20980 exit 1 stdout: [FATAL] Failed to create timer thread (errno: 11) D, [2013-11-28T02:51:26.206974 #25422] DEBUG -- : GEAR_INFO - capacity: 131.25% gear_count: 1 sessions: 21 up/remove_thresh: 90.0%/1.0% sec_left_til_remove: 111 gear _remove_thresh: 0/20 I, [2013-11-28T02:51:29.515573 #25422] INFO -- : GEAR_UP - capacity: 650.0% gear_count: 1 sessions: 104 up_thresh: 90.0% D, [2013-11-28T02:51:35.305046 #25422] DEBUG -- : GEAR_UP - add-gear: exit: pid 21019 exit 1 stdout: [FATAL] Failed to create timer thread (errno: 11) D, [2013-11-28T02:51:35.517902 #25422] DEBUG -- : GEAR_INFO - capacity: 700.0% gear_count: 1 sessions: 112 up/remove_thresh: 90.0%/1.0% sec_left_til_remove: 114 gear _remove_thresh: 0/20 I, [2013-11-28T02:51:38.607511 #25422] INFO -- : GEAR_UP - capacity: 600.0% gear_count: 1 sessions: 96 up_thresh: 90.0% D, [2013-11-28T02:51:46.104945 #25422] DEBUG -- : GEAR_UP - add-gear: exit: pid 21039 exit 1 stdout: [FATAL] Failed to create timer thread (errno: 11) D, [2013-11-28T02:51:46.213241 #25422] DEBUG -- : GEAR_INFO - capacity: 406.25% gear_count: 1 sessions: 65 up/remove_thresh: 90.0%/1.0% sec_left_til_remove: 113 gear_remove_thresh: 0/20 I, [2013-11-28T02:51:49.412718 #25422] INFO -- : GEAR_UP - capacity: 550.0% gear_count: 1 sessions: 88 up_thresh: 90.0% D, [2013-11-28T02:52:00.501189 #25422] DEBUG -- : GEAR_UP - add-gear: exit: pid 21059 exit 1 stdout: [FATAL] Failed to create timer thread (errno: 11) D, [2013-11-28T02:52:00.703825 #25422] DEBUG -- : GEAR_INFO - capacity: 750.0% gear_count: 1 sessions: 120 up/remove_thresh: 90.0%/1.0% sec_left_til_remove: 109 gear_remove_thresh: 0/20 I, [2013-11-28T02:52:03.901944 #25422] INFO -- : GEAR_UP - capacity: 125.0% gear_count: 1 sessions: 20 up_thresh: 90.0% D, [2013-11-28T02:52:12.015385 #25422] DEBUG -- : GEAR_UP - add-gear: exit: pid 21542 exit 1 stdout: [FATAL] Failed to create timer thread (errno: 11) D, [2013-11-28T02:52:12.103658 #25422] DEBUG -- : GEAR_INFO - capacity: 231.25% gear_count: 1 sessions: 37 up/remove_thresh: 90.0%/1.0% sec_left_til_remove: 112 gear_remove_thresh: 0/20 I, [2013-11-28T02:52:15.217038 #25422] INFO -- : GEAR_UP - capacity: 706.25% gear_count: 1 sessions: 113 up_thresh: 90.0% D, [2013-11-28T02:52:23.312704 #25422] DEBUG -- : GEAR_UP - add-gear: exit: pid 21564 exit 1 stdout: [FATAL] Failed to create timer thread (errno: 11) D, [2013-11-28T02:52:23.407078 #25422] DEBUG -- : GEAR_INFO - capacity: 625.0% gear_count: 1 sessions: 100 up/remove_thresh: 90.0%/1.0% sec_left_til_remove: 112 gear_remove_thresh: 0/20 I, [2013-11-28T02:52:26.509022 #25422] INFO -- : GEAR_UP - capacity: 543.75% gear_count: 1 sessions: 87 up_thresh: 90.0% D, [2013-11-28T02:52:34.802143 #25422] DEBUG -- : GEAR_UP - add-gear: exit: pid 21588 exit 1 stdout: [FATAL] Failed to create timer thread (errno: 11) D, [2013-11-28T02:52:35.010723 #25422] DEBUG -- : GEAR_INFO - capacity: 393.75% gear_count: 1 sessions: 63 up/remove_thresh: 90.0%/1.0% sec_left_til_remove: 112 gear_remove_thresh: 0/20 I, [2013-11-28T02:52:38.105981 #25422] INFO -- : GEAR_UP - capacity: 381.25% gear_count: 1 sessions: 61 up_thresh: 90.0% D, [2013-11-28T02:52:45.217444 #25422] DEBUG -- : GEAR_UP - add-gear: exit: pid 21609 exit 1 stdout: [FATAL] Failed to create timer thread (errno: 11) D, [2013-11-28T02:52:45.218343 #25422] DEBUG -- : GEAR_INFO - capacity: 0.0% gear_count: 1 sessions: 0 up/remove_thresh: 90.0%/1.0% sec_left_til_remove: 113 gear_remove_thresh: 0/20 I, [2013-11-28T02:52:48.305904 #25422] INFO -- : GEAR_UP - capacity: 143.75% gear_count: 1 sessions: 23 up_thresh: 90.0% D, [2013-11-28T02:52:53.910140 #25422] DEBUG -- : GEAR_UP - add-gear: exit: pid 21628 exit 1 stdout: [FATAL] Failed to create timer thread (errno: 11) D, [2013-11-28T02:52:54.000785 #25422] DEBUG -- : GEAR_INFO - capacity: 206.25% gear_count: 1 sessions: 33 up/remove_thresh: 90.0%/1.0% sec_left_til_remove: 115 gear_remove_thresh: 0/20 I, [2013-11-28T02:53:03.424342 #25422] INFO -- : GEAR_UP - capacity: 693.75% gear_count: 1 sessions: 111 up_thresh: 90.0% D, [2013-11-28T02:53:09.609657 #25422] DEBUG -- : GEAR_UP - add-gear: exit: pid 22081 exit 1 stdout: [FATAL] Failed to create timer thread (errno: 11) D, [2013-11-28T02:53:09.707919 #25422] DEBUG -- : GEAR_INFO - capacity: 587.5% gear_count: 1 sessions: 94 up/remove_thresh: 90.0%/1.0% sec_left_til_remove: 114 gear_remove_thresh: 0/20 I, [2013-11-28T02:53:12.710972 #25422] INFO -- : GEAR_UP - capacity: 256.25% gear_count: 1 sessions: 41 up_thresh: 90.0% D, [2013-11-28T02:53:20.902993 #25422] DEBUG -- : GEAR_UP - add-gear: exit: pid 22103 exit 1 stdout: [FATAL] Failed to create timer thread (errno: 11) The gear count stayed at 1 although the command 09:15 $ rhc app show kitchenscale --gears ID State Cartridges Size SSH URL ------------------------ ------- ------------------------ ----- ------------------------------------------------------------------------------ 5295eb4fe3c9c3fee1000022 started jbosseap-6.0 haproxy-1.4 small 5295eb4fe3c9c3fee1000022.gsso.redhat.com 5296f1fae3c9c3fee1000043 new jbosseap-6.0 haproxy-1.4 small 5296f1fae3c9c3fee1000043.gsso.redhat.com showed a new gear. Version-Release number of selected component (if applicable): How reproducible (Steps to reproduce): in your OSE 1.2.1 installation run: 1. rhc app create -a kitchenscale -t jbosseap --from-code git://github.com/openshift/kitchensink-example.git -s 2. take the attached testsuite. Modify the url in HTTP Request Defaults run the tests 3. rhc ssh kitchenscale 4. cd haproxy/logs 5. tail -f scale-events.log Actual results: I, [2013-11-28T02:53:12.710972 #25422] INFO -- : GEAR_UP - capacity: 256.25% gear_count: 1 sessions: 41 up_thresh: 90.0% D, [2013-11-28T02:53:20.902993 #25422] DEBUG -- : GEAR_UP - add-gear: exit: pid 22103 exit 1 stdout: [FATAL] Failed to create timer thread (errno: 11) Expected results: Creation of new gears successfully Additional info:
additional information [kitchenscale-juhoffma.ose.gsso.redhat.com 5295eb4fe3c9c3fee1000022]\> ctl_all restart Stopping services Stopping gear... can't create Thread (11) Starting services [FATAL] Failed to create timer thread (errno: 11) [kitchenscale-juhoffma.ose.gsso.redhat.com 5295eb4fe3c9c3fee1000022]\> less haproxy/run/haproxy_ctld.rb.output [kitchenscale-juhoffma.ose.gsso.redhat.com 5295eb4fe3c9c3fee1000022]\> cat haproxy/run/haproxy_ctld.rb.output An error occurred; try again later: Failed to get information from haproxy
after upgrading my lab to the latest version it seems to work fine. After yum -y update on the broker and the node: [kitchenscale-juhoffma.ose.gsso.redhat.com logs]\> ctl_all restart Stopping services Stopping gear... CLIENT_RESULT: HAProxy instance is stopped Stopping jbosseap cart Sending SIGTERM to jboss:6957 ... Starting services Starting gear... CLIENT_RESULT: HAProxy instance is started Starting jbosseap cart Found 127.0.252.1:8080 listening port