Currently, HAProxy is configured in an active/passive way to load balance OSP API services, as captured in this document: https://github.com/beekhof/osp-ha-deploy/blob/master/HA-keepalived.md This limits throughput to what a single node can deliver, leaving scale-up as the only option. A superior design would be to assign a router VIP to each HAProxy instance and place the API VIP on the loopback interface. Quagga could then be used on each OSP controller to advertise a route to the API VIP, via the distinct router VIP on that controller, using OSPF/BGP. The upstream router would peer with the OSP controller nodes, see multiple routes to the API VIP (one via each router VIP), and balance flows across all HAProxy instances using 5-tuple ECMP hashing. If HAProxy fails a heartbeat, Quagga should withdraw its route so that the upstream router redistributes flows across the surviving routes. If an OSP controller crashes completely, the upstream peer reconverges onto the surviving routes.
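To make the proposal more concrete, here is a minimal sketch of what the per-controller Quagga configuration could look like when using BGP (one of the two options mentioned above). All addresses and AS numbers below (198.51.100.1 for the API VIP, 192.0.2.x for the router VIP and upstream peer, 64511/64512) are made-up examples, not values from any actual deployment:

  ! bgpd.conf fragment on one OSP controller (illustrative values only)
  ! 198.51.100.1/32 is assumed to be the API VIP configured on lo;
  ! 192.0.2.11 is this controller's router VIP; 192.0.2.1 is the upstream router.
  router bgp 64512
   bgp router-id 192.0.2.11
   network 198.51.100.1/32
   neighbor 192.0.2.1 remote-as 64511
  !
  ! The upstream router would need BGP multipath enabled so that the routes
  ! learned from each controller are installed together, e.g.:
  ! router bgp 64511
  !  maximum-paths 3

A local health-check watchdog could then remove the network statement (for example via vtysh) or simply stop bgpd when HAProxy fails its heartbeat, so that the prefix is withdrawn and the upstream router reconverges onto the remaining paths.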
Assuming single-digit millisecond latency between sites, by means of DWDM or other private transit (for the Galera cluster), this could also enable a robust mechanism to provide a single API endpoint for a control plane spanning multiple sites.
Can someone add me to 1261979?
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.
Engineering has been evaluating the request, and while the proposed architecture might work, several aspects are problematic. First, there are multiple projects covering routing protocols, all forks of Zebra or Quagga, but none of them has established itself as the de-facto standard in the industry. This is problematic from a support perspective and might require considerable resources just to support this single solution. Second, the added complexity of this architecture could be a significant adoption barrier for customers and would make any issue harder to debug; routing protocols are not simple. Third, while in theory this solution might solve a bottleneck problem, the real question is: how often do we actually hit that limit? So far, we have never heard of any customer complaining about this specific issue. We have therefore agreed to close this request as WONTFIX, as Red Hat will not implement this feature.