Bug 1309533 - [RFE] Heat template to configure quagga active/active/active HAproxy for OSP controllers
Summary: [RFE] Heat template to configure quagga active/active/active HAproxy for OSP ...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-haproxy
Version: 12.0 (Pike)
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Fabio Massimo Di Nitto
QA Contact: Udi Shkalim
URL:
Whiteboard:
Depends On:
Blocks: 1419948 1458798
 
Reported: 2016-02-18 04:04 UTC by Kyle Bader
Modified: 2017-10-19 15:25 UTC
CC List: 32 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-10-19 15:25:22 UTC
Target Upstream Version:
Embargoed:



Description Kyle Bader 2016-02-18 04:04:00 UTC
Currently, HAproxy is configured in an active/passive way to load balance OSP API services. This is captured in this document:

https://github.com/beekhof/osp-ha-deploy/blob/master/HA-keepalived.md

This limits throughput to what a single node can deliver, leaving scale-up as the only option. A superior design would be to assign a distinct router VIP to each HAproxy instance and to place the API VIP on the loopback interface. Quagga could then run on each OSP controller to advertise a route to the API VIP, via that controller's router VIP, using OSPF/BGP. The upstream router would peer with the OSP controller nodes, learn multiple routes to the API VIP (one via each router VIP), and balance flows across all HAproxy instances using 5-tuple ECMP hashing. If HAproxy fails a heartbeat, quagga should withdraw its route so that the upstream router redistributes flows across the surviving routes. If an OSP controller crashes completely, the upstream peer will reconverge onto the surviving routes.
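
For illustration only, a minimal sketch of the per-controller Quagga/BGP piece could look like the following. The addresses, AS numbers and router IDs are hypothetical, and the health-check-driven route withdrawal is only described, not implemented:

  # Put the shared API VIP on the loopback so HAproxy can bind to it locally
  # (192.0.2.10/32 is a hypothetical API VIP):
  #   ip addr add 192.0.2.10/32 dev lo

  # /etc/quagga/bgpd.conf on one controller; each controller peers with the
  # upstream router (198.51.100.1, AS 64511) in the same way:
  router bgp 64512
   bgp router-id 203.0.113.11
   neighbor 198.51.100.1 remote-as 64511
   network 192.0.2.10/32

  # The upstream router then learns one route to 192.0.2.10/32 per controller
  # and spreads flows across them with 5-tuple ECMP hashing. A health-check
  # hook would still need to withdraw the prefix (e.g. remove the "network"
  # statement or stop bgpd) when the local HAproxy fails its heartbeat.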

Comment 2 Kyle Bader 2016-02-18 04:10:52 UTC
Assuming single-digit millisecond latency between sites, by means of DWDM or other private transit (for the Galera cluster), this could also provide a robust mechanism for exposing a single API endpoint for a control plane spanning multiple sites.

Comment 3 Kyle Bader 2016-02-18 05:27:08 UTC
Can someone add me to 1261979?

Comment 4 Mike Burns 2016-04-07 21:11:06 UTC
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.

Comment 7 Fabio Massimo Di Nitto 2017-10-19 15:25:22 UTC
Engineering has evaluated this request. While the proposed architecture might work, several aspects of it are problematic.

First, there are several projects covering routing protocols, all forks of zebra or quagga, but none of them has established itself as the de facto industry standard. This is problematic from a support perspective and could require considerable resources just to support this single solution.

Second, the added complexity of this architecture could be a significant adoption barrier for customers and would make any issue harder to debug. Routing protocols are not simple.

Third, while this solution might theoretically remove a bottleneck, the real question is how often that limit is actually hit. So far, we have never heard of any customer complaining about this specific issue.

Therefore, we agreed to close this request as WONTFIX; Red Hat will not implement this feature.

