Bug 1253953 - Update port bindings for master router when using l2pop
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 7.0 (Kilo)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: z3
Target Release: 7.0 (Kilo)
Assigned To: Mike Kolesnik
QA Contact: Alexander Stafeyev
Keywords: ZStream
Duplicates: 1260298 1303769
Reported: 2015-08-15 21:13 EDT by Benjamin Schmaus
Modified: 2017-05-08 09:47 EDT
Fixed In Version: openstack-neutron-2015.1.2-1.el7ost
Doc Type: Bug Fix
Doc Text:
Previously, when an HA router was scheduled to multiple nodes, each replica of the router had its own copy of the router's internal and external ports, but from neutron's perspective each such port was bound to a single host. With HA routers, only one replica is active at any point in time, so a router port could be bound to a host that was in standby mode. Because l2pop uses the port binding information to configure flows, it could break connectivity by configuring tunnel endpoints to the wrong host, or by configuring unicast OpenFlow rules that pointed to a standby node. Additionally, some ML2 mechanism drivers that rely on the port binding information to configure ToR switches or other network gear would misconfigure that gear. With this update, whenever keepalived performs a state transition, it notifies the L3 agent, which in turn notifies neutron-server. The server then updates the port's binding information to point to the new active node. As a result, l2pop and other ML2 mechanism drivers now have a correct view of the environment, with ports owned by HA routers always bound to the active node.
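
To illustrate why a stale binding matters, here is a minimal sketch of how a forwarding entry could be derived from a port binding. It is not neutron's l2pop code: the function build_fdb, the host names, the MAC address, and the documentation-range IP addresses are all hypothetical and exist only for this example.

```python
from typing import Dict


def build_fdb(port_hosts: Dict[str, str],
              port_macs: Dict[str, str],
              tunnel_ips: Dict[str, str]) -> Dict[str, str]:
    """Map each port's MAC to the tunnel IP of the host the port is bound to."""
    return {port_macs[port]: tunnel_ips[host]
            for port, host in port_hosts.items()}


# Hypothetical data: documentation-range IPs and a made-up MAC address.
TUNNEL_IPS = {"ctrl-0": "192.0.2.10", "ctrl-1": "192.0.2.11"}
PORT_MACS = {"router-port": "fa:16:3e:00:00:01"}

# The active replica runs on ctrl-1, but the port is still bound to ctrl-0,
# so traffic for the router MAC would be tunnelled to the standby node.
stale = build_fdb({"router-port": "ctrl-0"}, PORT_MACS, TUNNEL_IPS)

# Once the binding points at the active node, the entry is correct.
fixed = build_fdb({"router-port": "ctrl-1"}, PORT_MACS, TUNNEL_IPS)
```

In this simplified model the only input that changes between the broken and the working case is the host recorded in the binding, which is the piece of state the fix keeps up to date.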
Last Closed: 2015-12-21 11:58:28 EST
Type: Bug

External Trackers
Tracker           ID       Priority  Status  Summary  Last Updated
Launchpad         1365476  None      None    None     Never
OpenStack gerrit  211166   None      None    None     Never

Description Benjamin Schmaus 2015-08-15 21:13:00 EDT
Description of problem:

An HA port needs to point to the correct host (where the master router
is running) in order for L2Population to work.

Hence, this patch introduces two fixes:
* When a port owned by an HA router comes up, we make sure it points to
  the node where the master is running, or to a random node if there is
  no master yet (this corner case is handled by the second bullet point).

* When an L3 agent reports that it is hosting a master, we update the
  port binding to the host the master is now running on. This fixes
  both routers with no elected master (yet) and failovers.
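
The two fixes above can be sketched roughly as follows. This is a simplified model, not the merged patch: the Port class, pick_host, on_port_up, and on_master_reported are hypothetical names, and the real change operates on neutron's port binding records rather than on a plain attribute. The sketch also assumes the router is hosted on at least one node.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Port:
    """Hypothetical stand-in for a neutron port and its binding host."""
    port_id: str
    host: Optional[str] = None


def pick_host(ha_states: Dict[str, str]) -> str:
    """Return the host running the master replica, else a random hosting node."""
    for host, state in sorted(ha_states.items()):
        if state == "master":
            return host
    # No master elected yet: any hosting node will do for now.
    return random.choice(sorted(ha_states))


def on_port_up(port: Port, ha_states: Dict[str, str]) -> None:
    """First fix: bind a port that comes up to the master's host."""
    port.host = pick_host(ha_states)


def on_master_reported(ports: List[Port], master_host: str) -> None:
    """Second fix: rebind the router's ports when an agent reports a master."""
    for port in ports:
        port.host = master_host
```

For example, a port that comes up before any election is bound to an arbitrary hosting node by on_port_up, and a later call to on_master_reported moves it to the elected master. A failover is handled by the same second call.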

Version-Release number of selected component (if applicable):
Red Hat OpenStack 7

How reproducible:

Steps to Reproduce:

Actual results:

Expected results:

Additional info:

The fix for this bug has been committed upstream in Liberty.

Upstream bug: https://bugs.launchpad.net/neutron/+bug/1365476

Upstream review: https://review.openstack.org/#/c/211166/
Comment 6 Assaf Muller 2015-08-17 09:15:36 EDT
Patch proposed to upstream stable/kilo.
Comment 10 Assaf Muller 2015-09-19 19:08:42 EDT
The patch was merged in upstream stable/kilo branch and will be available in the next OSP 7 release.
Comment 12 Assaf Muller 2015-12-10 10:17:37 EST
The patch was merged in time to be packaged in 2015.1.2.
Comment 17 Assaf Muller 2015-12-16 12:16:33 EST
*** Bug 1260298 has been marked as a duplicate of this bug. ***
Comment 18 Toni Freger 2015-12-17 07:41:15 EST
Ale, please verify this bug on an environment installed with OSP-d.
Comment 19 Alexander Stafeyev 2015-12-21 01:52:50 EST
Comment 21 errata-xmlrpc 2015-12-21 11:58:28 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

Comment 22 Alexander Stafeyev 2015-12-22 00:54:05 EST
Details for comment 19: 

I have one VM with outside 
I have tunnels from the compute node to all 3 controllers (which tells us the fix is there).
I don't have a tunnel to the other compute host (there are no VMs there, so l2pop is active and functioning).

During failover there is no connectivity loss from the VM to

Comment 23 Assaf Muller 2017-05-08 09:47:11 EDT
*** Bug 1303769 has been marked as a duplicate of this bug. ***
