Bug 1309533 - [RFE] Heat template to configure quagga active/active/active HAproxy for OSP controllers
Status: CLOSED WONTFIX
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-haproxy
Version: 12.0 (Pike)
Hardware: x86_64 Linux
Priority: medium
Severity: medium
Assigned To: Fabio Massimo Di Nitto
QA Contact: Udi Shkalim
Keywords: FutureFeature
Depends On:
Blocks: 1419948 1458798
Reported: 2016-02-17 23:04 EST by Kyle Bader
Modified: 2017-10-19 11:25 EDT

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-10-19 11:25:22 EDT
Type: Feature Request
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Kyle Bader 2016-02-17 23:04:00 EST
Currently, HAProxy is configured in an active/passive way to load balance OSP API services. This is captured in this document:

https://github.com/beekhof/osp-ha-deploy/blob/master/HA-keepalived.md

This limits throughput to what a single node can deliver, leaving scale-up as the only option. A superior design would be to assign a router VIP to each HAProxy instance and place the API VIP on the loopback interface. Quagga could then be used on each OSP controller to advertise a route to the API VIP, via the distinct router VIP on that controller, using OSPF or BGP. The upstream router would peer with the OSP controller nodes, resulting in multiple routes to the API VIP, one via each router VIP, and would balance flows across all HAProxy instances using 5-tuple ECMP hashing. If HAProxy fails a heartbeat, Quagga should withdraw its route so that the upstream router redistributes flows across the surviving routes. If an OSP controller crashes completely, the upstream peer will reconverge to the surviving routes.
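
To illustrate the flow distribution described above, here is a minimal Python simulation of 5-tuple ECMP hashing across three router VIPs, followed by the withdrawal of one route. It is only a sketch: the addresses are hypothetical documentation-range values, SHA-256 with a modulo stands in for whatever hash the upstream router really uses, and no actual Quagga or HAProxy interaction is modeled.

```python
import hashlib
from collections import Counter


def pick_next_hop(flow, next_hops):
    """Hash the 5-tuple and map it onto one of the available next hops."""
    if not next_hops:
        raise ValueError("no surviving route to the API VIP")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Assumption: plain modulo selection; real routers use vendor-specific hashes.
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


def distribute(flows, next_hops):
    """Count how many flows land on each next hop."""
    return Counter(pick_next_hop(flow, next_hops) for flow in flows)


# Hypothetical addresses, not taken from the bug report.
API_VIP = "192.0.2.10"
ROUTER_VIPS = ["198.51.100.1", "198.51.100.2", "198.51.100.3"]

# 5-tuple: (source IP, source port, destination IP, destination port, protocol)
flows = [("203.0.113.7", 40000 + i, API_VIP, 443, "tcp") for i in range(3000)]

# All three controllers advertise a route to the API VIP.
before = distribute(flows, ROUTER_VIPS)

# HAProxy on the second controller fails its heartbeat, so its route is withdrawn.
surviving = [vip for vip in ROUTER_VIPS if vip != "198.51.100.2"]
after = distribute(flows, surviving)

print("before withdrawal:", dict(before))
print("after withdrawal: ", dict(after))
```

Note that with plain modulo selection, withdrawing one route can also remap flows that were already on a surviving next hop. Whether that matters depends on the upstream router's hashing implementation.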
Comment 2 Kyle Bader 2016-02-17 23:10:52 EST
Assuming single-digit millisecond latency between sites by means of DWDM or other private transit (for the Galera cluster), this could also provide a robust mechanism for exposing a single API endpoint for a control plane spanning multiple sites.
Comment 3 Kyle Bader 2016-02-18 00:27:08 EST
Can someone add me to 1261979?
Comment 4 Mike Burns 2016-04-07 17:11:06 EDT
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.
Comment 7 Fabio Massimo Di Nitto 2017-10-19 11:25:22 EDT
Engineering has been evaluating the request and while the proposed architecture might work, there are several aspects that are problematic.

First of all, there are several projects implementing routing protocols, all of them forks of Zebra or Quagga, but none has established itself as the de facto standard in the industry. This is problematic from a support perspective and might require substantial resources just to support this single solution.

Second, the added complexity of this architecture might be a big barrier to adoption for customers and would make any issue harder to debug. Routing protocols are not simple.

Third, while this solution might theoretically solve a bottleneck problem, the real question becomes: how often have we hit the limit? So far, we have never heard of any customer complaining about this specific issue.

Therefore, we agreed to close this request as WONTFIX; Red Hat will not implement this feature.
