Bug 1377433

Summary: haproxy configuration in HA gears sets inconsistent cookie values, breaking session affinity
Product: OpenShift Container Platform Reporter: Josep 'Pep' Turro Mauri <pep>
Component: ImageStreams Assignee: Timothy Williams <tiwillia>
Status: CLOSED ERRATA QA Contact: Wang Haoran <haowang>
Severity: medium Docs Contact:
Priority: medium    
Version: 2.2.0 CC: aos-bugs, bperkins, jialiu, jokerman, mmccomas, rthrashe, tiwillia
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: openshift-origin-cartridge-haproxy-1.31.7.1-1.el6op Doc Type: Bug Fix
Doc Text:
Cause: Haproxy cookies were inconsistently named. Consequence: Requests to an HA app were not always being routed to the correct gear. Fix: Change the cookie naming logic to have the cookie name reflect which backend gear is handling the request. Result: All backend haproxy gears should return the same cookie name and the requests should be properly routed to the correct backend gear.
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-01-04 20:23:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Josep 'Pep' Turro Mauri 2016-09-19 16:45:00 UTC
Description of problem:

Using enable-ha, we have configured some of our applications to run two haproxies for resilience, with an external load balancer forwarding requests to them. In this configuration we have encountered issues with session affinity and scaling.

  1. If we enable session affinity at the load balancer so that the users are
     directed to a particular haproxy it is possible that all requests are
     forwarded to the secondary haproxy (i.e. non-head). This means that the
     application will not always scale.

  2. We disabled session affinity at the load balancers to distribute requests
     across the haproxies – assuming that the haproxy cookies would ensure the
     requests are always routed to the target gear. However, due to inconsistent
     cookie names allocated by the haproxies, this is not the case.

This report is about #2.

Version-Release number of selected component (if applicable):
OSE 2.2.10

How reproducible:
#2 can be reproduced always; it's still unclear how much #1 happens in real life.

Steps to Reproduce:
1. Create an HA application (i.e. with at least 2 gears including haproxy)
2. Send requests to both haproxy gears

Actual results:
The session cookies returned by each haproxy differ:

  Head gear:
    $ http -h http://test-pep.ose22.example.com | grep Cookie
    Set-Cookie: GEAR=5491f7f72fa4576d0c00076c-pep; path=/
    $ http -h http://test-pep.ose22.example.com | grep Cookie
    Set-Cookie: GEAR=local-5491f7b82fa4576d0c00074b; path=/

  Secondary gear:
    $ http -h http://5491f7f72fa4576d0c00076c-pep.ose22.example.com Host:test-pep.ose22.example.com | grep Cookie
    Set-Cookie: GEAR=local-5491f7f72fa4576d0c00076c; path=/
    $ http -h http://5491f7f72fa4576d0c00076c-pep.ose22.example.com Host:test-pep.ose22.example.com | grep Cookie
    Set-Cookie: GEAR=test-pep; path=/

Expected results:

All haproxies should return consistent values for the session cookie: same backend gear => same value
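
To compare the values without the surrounding header noise, a small helper along these lines may be useful. It is only a sketch: the function name gear_cookie is made up for illustration, and it assumes the cookie is always called GEAR, as in the output above.

```shell
# Print the value of the GEAR cookie found in HTTP response headers on stdin.
# Prints nothing if the response sets no GEAR cookie.
# gear_cookie is a hypothetical helper name, not part of any OpenShift tool.
gear_cookie() {
    # Strip the CR that terminates HTTP header lines, then keep only the
    # text between "GEAR=" and the first ";".
    tr -d '\r' | sed -n 's/^[Ss]et-[Cc]ookie: *GEAR=\([^;]*\).*/\1/p'
}
```

For example, `curl -sI http://test-pep.ose22.example.com | gear_cookie` would print just the cookie value; `-s` suppresses curl's progress meter.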

Additional info:
Pending additional details of a real-life scenario where #1 has been identified as a real problem - making it a non-viable workaround.

Comment 1 Josep 'Pep' Turro Mauri 2016-09-19 17:01:01 UTC
Proposed change: https://github.com/openshift/origin-server/pull/6421

Comment 5 Johnny Liu 2016-12-14 08:01:34 UTC
Reproduce steps:
$ rhc app create scaphp54app php-5.4 -p redhat -s
$ rhc app-enable-ha scaphp54app
Head gear:
$ curl -I scaphp54app-jialiu.ose-20161212.example.com|grep Cookie
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
Set-Cookie: GEAR=584fcfc4d1bd4dbab8000003-jialiu; path=/
$ curl -I scaphp54app-jialiu.ose-20161212.example.com|grep Cookie
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
Set-Cookie: GEAR=local-584fccd0d1bd4d7ca8000001; path=/
Secondary gear:
$ curl -I 584fcfc4d1bd4dbab8000003-jialiu.ose-20161212.example.com|grep Cookie
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
Set-Cookie: GEAR=scaphp54app-jialiu; path=/
$ curl -I 584fcfc4d1bd4dbab8000003-jialiu.ose-20161212.example.com|grep Cookie
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
Set-Cookie: GEAR=local-584fcfc4d1bd4dbab8000003; path=/


Verified this bug with openshift-origin-cartridge-haproxy-1.31.7.1-1.el6op.noarch, FAIL.

$ curl -I scaphp53app-jialiu.ose-20161212.example.com|grep Cookie
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
Set-Cookie: GEAR=5850f950d1bd4dbab8000113-jialiu; path=/
$ curl -I scaphp53app-jialiu.ose-20161212.example.com|grep Cookie
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
Set-Cookie: GEAR=scaphp53app-jialiu; path=/


After reviewing the PR, I think this is because the value of ${OPENSHIFT_GEAR_NAME} in the 2nd gear is its UUID (5850f950d1bd4dbab8000113). The PR should probably be updated along the following lines:
gear_name="${OPENSHIFT_APP_NAME}-${OPENSHIFT_NAMESPACE}"

Comment 6 Rory Thrasher 2016-12-14 21:55:28 UTC
So the fact that you are seeing "Set-Cookie: GEAR=local-584fcfc4d1bd4dbab8000003" means that the change made for this bug is not there.  The fix for this bug should have removed the creation of any cookies using "local-*".  I tested this change on a devenv and I was getting cookies of "appname-domain" and "secondarygearUUID-domain", which were consistent when curling either proxy.


Can you grab the update-cluster script from the gear and verify that lines 117-119 look like the following?

gear_name="${OPENSHIFT_GEAR_NAME}-${OPENSHIFT_NAMESPACE}"
sed -i "/\s*server\s*local-gear\s.*/d" /tmp/haproxy.cfg.$$
echo "    server local-gear $local_ep check fall 2 rise 3 inter 2000 cookie $gear_name" >> /tmp/haproxy.cfg.$$
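
To show what those three lines do in isolation, here is a rough, self-contained sketch. The function name set_local_gear_cookie and its positional arguments are made up for illustration; in the real script the values come from ${OPENSHIFT_GEAR_NAME}, ${OPENSHIFT_NAMESPACE} and $local_ep, and the file is /tmp/haproxy.cfg.$$. It assumes GNU sed (for -i and \s).

```shell
# Rewrite the "server local-gear" line of an haproxy config so that its
# cookie is "<gear name>-<namespace>" instead of a "local-*" value.
# Hypothetical wrapper for illustration only.
#   $1 = config file, $2 = local endpoint (host:port),
#   $3 = gear name,   $4 = namespace
set_local_gear_cookie() {
    cfg=$1
    local_ep=$2
    gear_name="$3-$4"
    # Remove any existing local-gear server line.
    sed -i "/\s*server\s*local-gear\s.*/d" "$cfg"
    # Append the replacement line carrying the new cookie value.
    echo "    server local-gear $local_ep check fall 2 rise 3 inter 2000 cookie $gear_name" >> "$cfg"
}
```

Running it against a copy of the gear's haproxy.cfg and grepping for "cookie local-" afterwards should find nothing if the logic behaves as intended.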


If the changes are NOT there, then the change possibly didn't make it into openshift-origin-cartridge-haproxy-1.31.7.1-1.el6op.noarch properly and I may need to rebuild it.

If the changes are there, then I'm not sure.  Could you provide the testing devenv so we could access the app/gears and see what's going on?

Comment 7 Johnny Liu 2016-12-15 02:48:36 UTC
Verified this bug with openshift-origin-cartridge-haproxy-1.31.7.1-1.el6op.noarch, and PASS.


In comment 5 I misunderstood this bug; it was actually already working correctly there. The output contained no cookies using "local-*".

$ curl -I scaphp53app-jialiu.ose-20161212.example.com|grep Cookie
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
Set-Cookie: GEAR=5850f950d1bd4dbab8000113-jialiu; path=/
$ curl -I scaphp53app-jialiu.ose-20161212.example.com|grep Cookie
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
Set-Cookie: GEAR=scaphp53app-jialiu; path=/


$ curl -I 5850f950d1bd4dbab8000113-jialiu.ose-20161212.example.com |grep Cookie
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
Set-Cookie: GEAR=scaphp53app-jialiu; path=/
$ curl -I 5850f950d1bd4dbab8000113-jialiu.ose-20161212.example.com |grep Cookie
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
Set-Cookie: GEAR=5850f950d1bd4dbab8000113-jialiu; path=/
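
The check above (both proxies hand out the same set of cookie values) could be scripted roughly as follows. This is a sketch only: same_cookie_set is a made-up name, and it assumes the cookie values seen from each proxy have already been saved one per line into two files.

```shell
# Succeed (exit 0) if two files contain the same set of GEAR cookie values,
# ignoring order and duplicates; fail (exit 1) otherwise.
# same_cookie_set is a hypothetical helper name.
#   $1 = values seen via the head gear, $2 = values seen via the secondary gear
same_cookie_set() {
    a=$(sort -u "$1")
    b=$(sort -u "$2")
    [ "$a" = "$b" ]
}
```

Because haproxy balances across the backends, each file needs values from several requests before the two sets can be expected to match.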

Comment 9 errata-xmlrpc 2017-01-04 20:23:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0017.html