Bug 1253953

Summary: Update port bindings for master router when using l2pop
Product: Red Hat OpenStack
Reporter: Benjamin Schmaus <bschmaus>
Component: openstack-neutron
Assignee: Mike Kolesnik <mkolesni>
Status: CLOSED ERRATA
QA Contact: Alexander Stafeyev <astafeye>
Severity: high
Docs Contact:
Priority: high
Version: 7.0 (Kilo)
CC: amuller, chrisw, dmaley, johfulto, lpeer, mlopes, nyechiel, rhosp-bugs-internal, sclewis, tfreger, yeylon
Target Milestone: z3
Keywords: ZStream
Target Release: 7.0 (Kilo)
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: openstack-neutron-2015.1.2-1.el7ost
Doc Type: Bug Fix
Doc Text:
Previously, when HA routers were scheduled to multiple nodes, each replica of the router had its own copy of the router's internal and external ports; however, from neutron's perspective, each such port was bound to only a single host. With HA routers, only one replica of the router is active at any point in time, so a router port could be bound to a host that was in standby mode. l2pop uses the port binding information to configure flows. Because the neutron port for a replicated interface could be bound to the wrong host, l2pop could break connectivity by configuring tunnel endpoints to the wrong host, or by configuring unicast OpenFlow rules that pointed to a standby node. Additionally, some ML2 mechanism drivers rely on the port binding information to configure ToR switches or other network gear, which was consequently misconfigured. With this update, whenever keepalived performs a state transition, it notifies the L3 agent, which then notifies the neutron-server. The server then updates the port's binding information to point to the new active node. As a result, l2pop and other ML2 mechanism drivers now have a correct view of the external environment, with ports owned by HA routers always bound to the active node.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-21 16:58:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Benjamin Schmaus 2015-08-16 01:13:00 UTC
Description of problem:


An HA port needs to point to the correct host (where the master router
is running) in order for L2Population to work.

Hence, this patch introduces two fixes:
* When a port owned by an HA router comes up, we make sure it points to
  the node where the master is running, or to a random node if there is
  no master yet (this corner case is fixed by the 2nd bullet point).

* When an L3 agent reports that it is hosting a master, we update the
  port binding to the host the master is now running on. This fixes
  both routers with no elected master (yet) and failovers.
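
The two fixes above can be sketched roughly as follows. This is only an in-memory illustration, not the code from the upstream patch: the Port and HARouter classes, the "active"/"standby" state strings, and the function names bind_port_on_up and on_master_report are all hypothetical and do not correspond to Neutron's actual API.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Port:
    """Hypothetical stand-in for a neutron port owned by an HA router."""
    port_id: str
    router_id: str
    binding_host: Optional[str] = None


@dataclass
class HARouter:
    """Hypothetical stand-in for an HA router and its replicas."""
    router_id: str
    # host name -> last HA state reported by the L3 agent on that host
    ha_states: Dict[str, str] = field(default_factory=dict)

    def master_host(self) -> Optional[str]:
        """Return the host whose replica is active, or None if no master yet."""
        for host, state in self.ha_states.items():
            if state == "active":
                return host
        return None


def bind_port_on_up(port: Port, router: HARouter) -> str:
    """Fix 1: when the port comes up, bind it to the master's host.

    If no master has been elected yet, fall back to a random hosting node;
    on_master_report() corrects the binding once a master reports in.
    Assumes the router is scheduled to at least one host.
    """
    host = router.master_host()
    if host is None:
        host = random.choice(sorted(router.ha_states))
    port.binding_host = host
    return host


def on_master_report(router: HARouter, ports: List[Port], host: str) -> List[str]:
    """Fix 2: an L3 agent on `host` reports that it now hosts the master.

    Rebind every port of this router to `host` and return the IDs of the
    ports whose binding actually changed.
    """
    for other in router.ha_states:
        router.ha_states[other] = "standby"
    router.ha_states[host] = "active"

    moved = []
    for port in ports:
        if port.router_id == router.router_id and port.binding_host != host:
            port.binding_host = host
            moved.append(port.port_id)
    return moved
```

In this sketch, a failover is simply another call to on_master_report() with the new host, which is why the same code path covers both the "no elected master yet" case and failovers.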


Version-Release number of selected component (if applicable):
Red Hat OpenStack 7


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

The fix for this bug has been committed upstream in Liberty.

The upstream bug is: https://bugs.launchpad.net/neutron/+bug/1365476

The code review is located at: https://review.openstack.org/#/c/211166/

Comment 6 Assaf Muller 2015-08-17 13:15:36 UTC
Patch proposed to upstream stable/kilo.

Comment 10 Assaf Muller 2015-09-19 23:08:42 UTC
The patch was merged in upstream stable/kilo branch and will be available in the next OSP 7 release.

Comment 12 Assaf Muller 2015-12-10 15:17:37 UTC
The patch was merged in time to be packaged in 2015.1.2.

Comment 17 Assaf Muller 2015-12-16 17:16:33 UTC
*** Bug 1260298 has been marked as a duplicate of this bug. ***

Comment 18 Toni Freger 2015-12-17 12:41:15 UTC
Ale, please verify this bug on an environment installed with OSP-d.

Comment 19 Alexander Stafeyev 2015-12-21 06:52:50 UTC
openstack-neutron-ml2-2015.1.2-3.el7ost.noarch
openstack-neutron-bigswitch-lldp-2015.1.38-1.el7ost.noarch
python-neutronclient-2.4.0-2.el7ost.noarch
python-neutron-2015.1.2-3.el7ost.noarch
openstack-neutron-2015.1.2-3.el7ost.noarch
openstack-neutron-common-2015.1.2-3.el7ost.noarch
openstack-neutron-openvswitch-2015.1.2-3.el7ost.noarch
openstack-neutron-metering-agent-2015.1.2-3.el7ost.noarch
python-neutron-lbaas-2015.1.2-1.el7ost.noarch
openstack-neutron-lbaas-2015.1.2-1.el7ost.noarch

Comment 21 errata-xmlrpc 2015-12-21 16:58:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2015:2652

Comment 22 Alexander Stafeyev 2015-12-22 05:54:05 UTC
Details for comment 19: 

I have one VM with outside connectivity.
I have tunnels from the compute node to all 3 controllers (which tells us the fix is there).
I don't have a tunnel to the other compute host (there are no VMs there, so l2pop is active and functioning).

During failover there is no connectivity loss from the VM to 8.8.8.8.


Thanks

Comment 23 Assaf Muller 2017-05-08 13:47:11 UTC
*** Bug 1303769 has been marked as a duplicate of this bug. ***