| Summary: | haproxy configuration in HA gears sets inconsistent cookie values, breaking session affinity | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Josep 'Pep' Turro Mauri <pep> |
| Component: | ImageStreams | Assignee: | Timothy Williams <tiwillia> |
| Status: | CLOSED ERRATA | QA Contact: | Wang Haoran <haowang> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 2.2.0 | CC: | aos-bugs, bperkins, jialiu, jokerman, mmccomas, rthrashe, tiwillia |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | openshift-origin-cartridge-haproxy-1.31.7.1-1.el6op | Doc Type: | Bug Fix |
| Doc Text: |
Cause: Haproxy cookies were inconsistently named.
Consequence: Requests to an HA app were not always being routed to the correct gear.
Fix: Change the cookie naming logic to have the cookie name reflect which backend gear is handling the request.
Result: All backend haproxy gears should return the same cookie name and the requests should be properly routed to the correct backend gear.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-01-04 20:23:13 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Proposed change: https://github.com/openshift/origin-server/pull/6421
Reproduce steps:
$ rhc app create scaphp54app php-5.4 -p redhat -s
$ rhc app-enable-ha scaphp54app
Head gear:
$ curl -I scaphp54app-jialiu.ose-20161212.example.com|grep Cookie
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
Set-Cookie: GEAR=584fcfc4d1bd4dbab8000003-jialiu; path=/
$ curl -I scaphp54app-jialiu.ose-20161212.example.com|grep Cookie
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
Set-Cookie: GEAR=local-584fccd0d1bd4d7ca8000001; path=/
Secondary gear:
$ curl -I 584fcfc4d1bd4dbab8000003-jialiu.ose-20161212.example.com|grep Cookie
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
Set-Cookie: GEAR=scaphp54app-jialiu; path=/
$ curl -I 584fcfc4d1bd4dbab8000003-jialiu.ose-20161212.example.com|grep Cookie
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
Set-Cookie: GEAR=local-584fcfc4d1bd4dbab8000003; path=/
Verified this bug with openshift-origin-cartridge-haproxy-1.31.7.1-1.el6op.noarch, FAIL.
$ curl -I scaphp53app-jialiu.ose-20161212.example.com|grep Cookie
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
Set-Cookie: GEAR=5850f950d1bd4dbab8000113-jialiu; path=/
$ curl -I scaphp53app-jialiu.ose-20161212.example.com|grep Cookie
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
Set-Cookie: GEAR=scaphp53app-jialiu; path=/
After reviewing the PR, I think this is because the value of ${OPENSHIFT_GEAR_NAME} on the 2nd gear is its UUID (5850f950d1bd4dbab8000113). I think the PR should be updated along the following lines:
gear_name="${OPENSHIFT_APP_NAME}-${OPENSHIFT_NAMESPACE}"
So the fact that you are seeing "Set-Cookie: GEAR=local-584fcfc4d1bd4dbab8000003" means that the change made for this bug is not there. The fix for this bug should have removed the creation of any cookies using "local-*". I tested this change on a devenv and I was getting cookies of "appname-domain" and "secondarygearUUID-domain", which were consistent when curling either proxy.
Can you grab the update-cluster script from the gear and verify that lines 117-119 look like the following?
gear_name="${OPENSHIFT_GEAR_NAME}-${OPENSHIFT_NAMESPACE}"
sed -i "/\s*server\s*local-gear\s.*/d" /tmp/haproxy.cfg.$$
echo " server local-gear $local_ep check fall 2 rise 3 inter 2000 cookie $gear_name" >> /tmp/haproxy.cfg.$$
If the changes are NOT there, then the change possibly didn't make it into openshift-origin-cartridge-haproxy-1.31.7.1-1.el6op.noarch properly, and I may need to rebuild it.
If the changes are there, then I'm not sure. Could you provide the testing devenv so we can access the app/gears and see what's going on?
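For reference, the effect of those three lines can be tried out on a scratch copy of a config. This is only a minimal sketch, not the cartridge's update-cluster script: the sample haproxy.cfg contents, the `local_ep` value, and the use of `mktemp` in place of `/tmp/haproxy.cfg.$$` are assumptions made for illustration, and the environment values are copied from the gear names seen in this report.

```shell
#!/bin/bash
# Sketch only: sample values stand in for the gear's real environment.
OPENSHIFT_GEAR_NAME="5850f950d1bd4dbab8000113"   # gear name as reported for the 2nd gear
OPENSHIFT_NAMESPACE="jialiu"
local_ep="127.0.0.1:8080"                        # hypothetical local endpoint

# Scratch config (hypothetical contents) holding an old-style "local-*" cookie.
cfg="$(mktemp)"
trap 'rm -f "$cfg"' EXIT
cat > "$cfg" <<'EOF'
listen express 127.0.0.1:8080
    cookie GEAR insert indirect nocache
    server local-gear 127.0.0.1:8080 check fall 2 rise 3 inter 2000 cookie local-5850f950d1bd4dbab8000113
EOF

# The three lines under discussion, pointed at the scratch file:
gear_name="${OPENSHIFT_GEAR_NAME}-${OPENSHIFT_NAMESPACE}"
# Drop the existing local-gear entry (GNU sed), then append one whose cookie
# value is <gear name>-<namespace> instead of local-<uuid>.
sed -i "/\s*server\s*local-gear\s.*/d" "$cfg"
echo "    server local-gear $local_ep check fall 2 rise 3 inter 2000 cookie $gear_name" >> "$cfg"

# Show the resulting local-gear entry.
grep 'server local-gear' "$cfg"
```

With these sample values, the only remaining local-gear entry should carry `cookie 5850f950d1bd4dbab8000113-jialiu`, and no `local-*` cookie value should be left in the file.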
Verified this bug with openshift-origin-cartridge-haproxy-1.31.7.1-1.el6op.noarch, and PASS.
In comment 5 I misunderstood this bug; it was actually already working correctly there. From the output, no cookies use "local-*".
$ curl -I scaphp53app-jialiu.ose-20161212.example.com|grep Cookie
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
Set-Cookie: GEAR=5850f950d1bd4dbab8000113-jialiu; path=/
$ curl -I scaphp53app-jialiu.ose-20161212.example.com|grep Cookie
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
Set-Cookie: GEAR=scaphp53app-jialiu; path=/
$ curl -I 5850f950d1bd4dbab8000113-jialiu.ose-20161212.example.com |grep Cookie
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
Set-Cookie: GEAR=scaphp53app-jialiu; path=/
$ curl -I 5850f950d1bd4dbab8000113-jialiu.ose-20161212.example.com |grep Cookie
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
Set-Cookie: GEAR=5850f950d1bd4dbab8000113-jialiu; path=/
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2017-0017.html
Description of problem:
Using enable-ha, we have configured some of our applications to have two haproxies for resilience, with an external load balancer forwarding requests. Within this configuration we have encountered issues with session affinity and scaling.
1. If we enable session affinity at the load balancer so that users are directed to a particular haproxy, it is possible that all requests are forwarded to the secondary haproxy (i.e. non-head). This means that the application will not always scale.
2. We disabled session affinity at the load balancers to distribute requests across the haproxies, assuming that the haproxy cookies would ensure that requests are always routed to the target gear. However, due to inconsistent cookie names allocated by the haproxies, this is not the case.
This report is about #2.
Version-Release number of selected component (if applicable):
OSE 2.2.10
How reproducible:
#2 can be reproduced always; it's still unclear how often #1 happens in real life.
Steps to Reproduce:
1. Create an HA application (i.e. with at least 2 gears including haproxy)
2. Send requests to both haproxy gears
Actual results:
The session cookies returned by each haproxy differ:
Head gear:
$ http -h http://test-pep.ose22.example.com | grep Cookie
Set-Cookie: GEAR=5491f7f72fa4576d0c00076c-pep; path=/
$ http -h http://test-pep.ose22.example.com | grep Cookie
Set-Cookie: GEAR=local-5491f7b82fa4576d0c00074b; path=/
Secondary gear:
$ http -h http://5491f7f72fa4576d0c00076c-pep.ose22.example.com Host:test-pep.ose22.example.com | grep Cookie
Set-Cookie: GEAR=local-5491f7f72fa4576d0c00076c; path=/
$ http -h http://5491f7f72fa4576d0c00076c-pep.ose22.example.com Host:test-pep.ose22.example.com | grep Cookie
Set-Cookie: GEAR=test-pep; path=/
Expected results:
All haproxies should return consistent values for the session cookie: same backend gear => same value
Additional info:
Pending additional details of a real-life scenario where #1 has been identified as a real problem, making it a non-viable workaround.
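The expected result (same backend gear => same value) could be checked by collecting the distinct GEAR cookie values each haproxy hands out and comparing the two sets. The following is a minimal sketch under assumptions: the function names `gear_cookie` and `gear_cookies_for` are made up for illustration, four requests per proxy is an arbitrary choice, and `curl -s` is used only to suppress the progress meter.

```shell
#!/bin/bash
# Print the value of the GEAR cookie found in HTTP response headers on stdin.
gear_cookie() {
    # Strip CRs from the header lines, then keep only the GEAR cookie value.
    tr -d '\r' | sed -n 's/^Set-Cookie: GEAR=\([^;]*\);.*/\1/p'
}

# Collect the distinct GEAR values one proxy returns over a few HEAD requests.
gear_cookies_for() {
    local url="$1" i
    for i in 1 2 3 4; do
        curl -sI "$url" | gear_cookie
    done | sort -u
}

# Usage, with the hostnames from this report (requires access to the app):
#   diff <(gear_cookies_for http://test-pep.ose22.example.com) \
#        <(gear_cookies_for http://5491f7f72fa4576d0c00076c-pep.ose22.example.com) \
#     && echo "cookie values are consistent across both haproxies"
```

Comparing sorted, de-duplicated sets rather than single responses matters here because each haproxy alternates between backends, so two consecutive requests to the same proxy legitimately return different values.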