Bug 1121139

Summary: Haproxy gear ratio is only considered when adding a gear, not when starting an existing application
Product: OpenShift Online
Reporter: Luke Meyer <lmeyer>
Component: Image
Assignee: Ben Parees <bparees>
Status: CLOSED CURRENTRELEASE
QA Contact: libra bugs <libra-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 2.x
CC: bleanhar, bmeng, bperkins, jokerman, libra-bugs, libra-onpremise-devel, mmccomas, tiwillia
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1120887
Environment:
Last Closed: 2014-10-10 00:49:08 UTC
Type: Bug
Bug Depends On:    
Bug Blocks: 1120887    

Description Luke Meyer 2014-07-18 13:16:23 UTC
+++ This bug was initially created as a clone of Bug #1120887 +++

Description of problem:

When setting OPENSHIFT_HAPROXY_GEAR_RATIO to one (in /etc/openshift/env/OPENSHIFT_HAPROXY_GEAR_RATIO), a 503 is seen when creating an application that is scaled to a single gear. This is expected behavior, as the framework cartridge will be disabled due to the haproxy ratio.

However, when this application is restarted, it works fine and no longer returns 503s. The haproxy-status page shows that the framework cartridge has been enabled.

It appears as though the only time OPENSHIFT_HAPROXY_GEAR_RATIO is considered is when a gear is added to the application. This is why we are seeing the behavior where the gear is disabled at first, but enabled on app-restart.

When a gear is added to the application, we go through 'update-cluster' and consider the gear ratio:

cartridges/openshift-origin-cartridge-haproxy/usr/bin/update-cluster
-=~~~~~~~~~~~~~~~~~~~~~~~~~~=-
echo "Web/Proxy gears ratio $ratio"
if [ "$ratio" -ge ${OPENSHIFT_HAPROXY_GEAR_RATIO-"3"} ]; then
    echo "Disabling colocated gears ${info[@]}"
    nohup $OPENSHIFT_HAPROXY_DIR/usr/bin/disable-colocated-gears ${info[@]} &
else
    echo "No disabling required"
fi
-=~~~~~~~~~~~~~~~~~~~~~~~~~~=-
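The comparison in this excerpt can be restated as a small standalone function, which makes the threshold semantics easier to see. This is a minimal sketch, not the cartridge's code: the function name should_disable_colocated is hypothetical, and only the -ge comparison and the default of 3 are taken from the excerpt above.

```shell
# Hypothetical helper (not part of the cartridge): succeeds (returns 0) when
# the web/proxy gear ratio has reached the configured threshold.
should_disable_colocated() {
    local ratio="$1"
    # ${VAR-3} falls back to 3 only when the variable is unset,
    # mirroring the expansion used in update-cluster.
    local threshold="${OPENSHIFT_HAPROXY_GEAR_RATIO-3}"
    [ "$ratio" -ge "$threshold" ]
}
```

With OPENSHIFT_HAPROXY_GEAR_RATIO set to 1, a ratio of 1 already meets the threshold, which matches the disabled colocated gear seen on creation. With the variable unset, the threshold is 3.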

We can see that on creation, with OPENSHIFT_HAPROXY_GEAR_RATIO = 1, the framework gear is disabled:

platform-trace.log
-=~~~~~~~~~~~~~~~~~~~~~~~~~~=-
July 17 10:54:07 INFO oo_spawn running /sbin/runuser -s /bin/sh 53c7e383e3c9c3a31c0000ba -c "exec /usr/bin/runcon 'unconfined_u:system_r:openshift_t:s0:c5,c976' /bin/sh -c \"set -e; /var/lib/openshift/53c7e383e3c9c3a31c0000ba/haproxy/bin/control update-cluster 01141730-admin.voyager.com\|node1.voyager.com:63411\"": {:unsetenv_others=>true, :close_others=>true, :in=>"/dev/null", :chdir=>"/var/lib/openshift/53c7e383e3c9c3a31c0000ba/haproxy", :out=>#<IO:fd 12>, :err=>#<IO:fd 8>}
July 17 10:54:10 INFO oo_spawn buffer(11/) Web/Proxy gears ratio 1
Disabling colocated gears 53c7e383e3c9c3a31c0000ba
-=~~~~~~~~~~~~~~~~~~~~~~~~~~=-

When the application is restarted, the bit of code that uses OPENSHIFT_HAPROXY_GEAR_RATIO is never touched. Instead, we see 'control restart' followed by 'control enable-server':

-=~~~~~~~~~~~~~~~~~~~~~~~~~~=-
July 17 10:54:56 INFO oo_spawn running /sbin/runuser -s /bin/sh 53c7e383e3c9c3a31c0000ba -c "exec /usr/bin/runcon 'unconfined_u:system_r:openshift_t:s0:c5,c976' /bin/sh -c \"set -e; /var/lib/openshift/53c7e383e3c9c3a31c0000ba/haproxy/bin/control restart \"": {:unsetenv_others=>true, :close_others=>true, :in=>"/dev/null", :chdir=>"/var/lib/openshift/53c7e383e3c9c3a31c0000ba/haproxy", :out=>#<IO:fd 12>, :err=>#<IO:fd 8>}
July 17 10:54:56 INFO oo_spawn buffer(11/) Restarted HAProxy instance

July 17 10:54:56 INFO oo_spawn running /sbin/runuser -s /bin/sh 53c7e383e3c9c3a31c0000ba -c "exec /usr/bin/runcon 'unconfined_u:system_r:openshift_t:s0:c5,c976' /bin/sh -c \"set -e; /var/lib/openshift/53c7e383e3c9c3a31c0000ba/haproxy/bin/control enable-server 53c7e383e3c9c3a31c0000ba\"": {:unsetenv_others=>true, :close_others=>true, :in=>"/dev/null", :chdir=>"/var/lib/openshift/53c7e383e3c9c3a31c0000ba/haproxy", :out=>#<IO:fd 12>, :err=>#<IO:fd 8>}
July 17 10:54:56 INFO oo_spawn buffer(11/) Enabling server 53c7e383e3c9c3a31c0000ba
-=~~~~~~~~~~~~~~~~~~~~~~~~~~=-

Then, the framework gear is enabled. 

Looking at the code, the only place we actually consider the OPENSHIFT_HAPROXY_GEAR_RATIO is in update-cluster:

$ grep -R HAPROXY_GEAR_RATIO /enterprise-server/cartridges/openshift-origin-cartridge-haproxy
./usr/bin/update-cluster:if [ "$ratio" -ge ${OPENSHIFT_HAPROXY_GEAR_RATIO-"3"} ]; then

Version-Release number of selected component (if applicable):
2.1.3

How reproducible:
Always

Steps to Reproduce:
1. On each node, create the file '/etc/openshift/env/OPENSHIFT_HAPROXY_GEAR_RATIO' containing just '1'.
2. Create a scaled application of any type.
3. Curl the application to ensure a 503 is returned (as expected)
4. Restart the application
5. Curl the application again

Actual results:
200 returned

Expected results:
503 returned
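
The status checks in steps 3 and 5 can be scripted roughly as follows. This is only a sketch: the helper names http_status and expect_status are hypothetical, the application URL is a placeholder to be supplied by the tester, and the only external command used is curl with its standard -s, -o and -w options.

```shell
# Hypothetical helpers for checking the HTTP status in steps 3 and 5.
http_status() {
    # Print only the HTTP status code returned for the given URL.
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

expect_status() {
    local url="$1"
    local expected="$2"
    local actual
    actual=$(http_status "$url")
    if [ "$actual" = "$expected" ]; then
        echo "OK: $url returned $actual"
    else
        echo "FAIL: $url returned $actual, expected $expected"
        return 1
    fi
}
```

For example, run expect_status with the application URL and 503 after creating the app, restart the app, then run the same check again. On an affected system the second check is expected to fail, reporting 200.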

Additional info:
Spun off from investigation in bug 1119338

Comment 2 Meng Bo 2014-07-22 11:10:10 UTC
Checked on devenv_4992: after restarting the scalable app, accessing the app no longer returns 200.

Moving bug to verified.