Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1326224

Summary: [RFE] [Neutron] [LBaaS v2] Add process monitor for haproxy
Product: Red Hat OpenStack
Reporter: Nir Magnezi <nmagnezi>
Component: openstack-neutron-lbaas
Assignee: Nir Magnezi <nmagnezi>
Status: CLOSED ERRATA
QA Contact: Toni Freger <tfreger>
Severity: medium
Docs Contact:
Priority: medium
Version: 9.0 (Mitaka)
CC: amedeo.salvati, amuller, aperotti, apevec, astafeye, bperkins, dmacpher, jschluet, jthomas, lhh, lpeer, nlevinki, nyechiel, oblaut, pablo.iranzo, sclewis, slinaber
Target Milestone: Upstream M3
Keywords: FutureFeature, OtherQA, Triaged
Target Release: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
This enhancement implements the 'ProcessMonitor' class in the 'HaproxyNSDriver' class (v2). This class utilizes the 'external_process' module to monitor and respawn HAProxy processes if and when needed. The LBaaS agent (v2) loads 'external_process' related options and takes a configured action when HAProxy dies unexpectedly.
Story Points: ---
Clone Of:
: 1415828 (view as bug list) Environment:
Last Closed: 2017-05-17 19:28:33 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1046780, 1273812, 1415828, 1431152    

Description Nir Magnezi 2016-04-12 08:19:07 UTC
Description of problem:
=======================
Bug 1325861 (launchpad #1565511) aims to solve cases where the lbaas agent goes offline.
To have a complete high-availability solution for the lbaas agent with haproxy running in a namespace, we also want to handle the case where the haproxy process itself has stopped.

This[1] neutron spec offers the following approach:
"We propose monitoring those processes, and taking a configurable action, making neutron more resilient to external failures."

[1] http://specs.openstack.org/openstack/neutron-specs/specs/juno/agent-child-processes-status.html
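
For illustration only, the "monitor and take a configurable action" idea quoted above can be sketched in a few lines of Python. This is not neutron's 'external_process' code: the class name ChildProcessMonitor, its methods, and the two action names used here ("respawn" and "exit") are assumptions made for this sketch, and it uses only the standard library.

```python
import subprocess


class ChildProcessMonitor:
    """Illustrative stand-in; NOT neutron's external_process.ProcessMonitor."""

    def __init__(self, cmd, action="respawn"):
        # The action names are assumptions chosen for this sketch.
        if action not in ("respawn", "exit"):
            raise ValueError("unknown action: %s" % action)
        self.cmd = cmd
        self.action = action
        self.respawns = 0
        self.proc = subprocess.Popen(cmd)

    def check(self):
        """Run one monitoring pass; return True if a dead child was respawned."""
        if self.proc.poll() is None:
            return False  # child is still running, nothing to do
        if self.action == "exit":
            # Give up and let a supervisor restart the whole agent.
            raise SystemExit("monitored process died; exiting")
        # Default action: start the same command again.
        self.proc = subprocess.Popen(self.cmd)
        self.respawns += 1
        return True

    def stop(self):
        """Terminate the monitored child and reap it."""
        self.proc.kill()
        self.proc.wait()
```

A real agent would call check() from a periodic loop; the sketch leaves the scheduling to the caller so that a single pass is easy to reason about.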

Comment 2 Assaf Muller 2016-06-04 01:21:01 UTC
*** Bug 1269981 has been marked as a duplicate of this bug. ***

Comment 10 Nir Yechiel 2016-09-01 14:53:17 UTC
Both patches look to be merged.

Comment 11 Assaf Muller 2016-09-01 19:55:32 UTC
https://review.openstack.org/#/c/327966/ was reverted, adding https://review.openstack.org/#/c/344658/ as an external tracker.

Comment 18 Assaf Muller 2017-01-13 20:15:51 UTC
Patch has been merged upstream.

Comment 23 Nir Magnezi 2017-02-15 13:40:32 UTC
How to test:
============
1. Create a Loadbalancer
2. Create a Listener
3. Create a Pool and members
4. Verify loadbalancing functionality.
5. Kill the haproxy process
6. Wait for ~30 sec and see if it respawns.
7. Redo step #4
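
Steps 5 and 6 could be automated roughly as follows. This is a hedged sketch, not part of the fix: read_pid and wait_for_respawn are hypothetical helpers, and the pid file location is deliberately left to the caller because it depends on the deployment.

```python
import time


def read_pid(path):
    """Return the integer PID stored in a pid file, or None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def wait_for_respawn(get_pid, old_pid, timeout=30.0, interval=1.0):
    """Poll get_pid() until it reports a PID different from old_pid.

    get_pid is a caller-supplied callable (for example a lambda wrapping
    read_pid). Returns the new PID, or None if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        pid = get_pid()
        if pid is not None and pid != old_pid:
            return pid  # a new process has taken over
        if time.monotonic() >= deadline:
            return None  # no respawn observed in time
        time.sleep(interval)
```

Record the haproxy PID before killing the process in step 5, then call wait_for_respawn with that value; a None result means no respawn was observed within the timeout.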

Comment 28 Alexander Stafeyev 2017-03-23 11:27:16 UTC
https://review.openstack.org/#/c/344658/21/neutron_lbaas/drivers/haproxy/namespace_driver.py@379 
fits what I have in my deployment. 

Verifying

Tnx

Comment 29 errata-xmlrpc 2017-05-17 19:28:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245