Bug 1525520

Summary: OVN L3 plugin schedules routers to chassis without external connectivity regardless of bridge mappings
Product: Red Hat OpenStack
Reporter: Miguel Angel Ajo <majopela>
Component: python-networking-ovn
Assignee: Daniel Alvarez Sanchez <dalvarez>
Status: CLOSED ERRATA
QA Contact: Eran Kuris <ekuris>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 12.0 (Pike)
CC: abregman, apevec, ccollett, lbopf, lhh, majopela, mariel, mlopes, nyechiel, sclewis
Target Milestone: z1
Keywords: AutomationBlocker, Triaged, ZStream
Target Release: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: python-networking-ovn-3.0.0-3.el7ost
Doc Type: Known Issue
Doc Text:
For deployments using OVN as the ML2 mechanism driver, only nodes with connectivity to the external networks should be eligible to host router gateway ports. However, a known issue marks all nodes as eligible, which is a problem when the Compute nodes have no external connectivity. If a router gateway port is scheduled on a Compute node without external connectivity, ingress and egress connections for the external networks do not work, and the router gateway port has to be rescheduled to a Controller node. As a workaround, you can provide external connectivity on all your Compute nodes, or you can consider deleting NeutronBridgeMappings or setting it to datacentre:br-ex. For more information, see https://bugzilla.redhat.com/show_bug.cgi?id=1525520 and https://bugzilla.redhat.com/show_bug.cgi?id=1510879.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-01-30 20:02:19 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1510879

Description Miguel Angel Ajo 2017-12-13 13:39:14 UTC
Description of problem:

Due to a missing patch in OSP12/Pike, networking-ovn schedules
gateways on chassis that may not have external connectivity
(which can be determined by looking at the bridge mappings in the
southbound database).

This is an issue for compute nodes with no external connectivity:
routers whose gateways land on such nodes fail, seemingly at random.
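
The bridge-mapping check described above can be sketched roughly as follows. This is a minimal illustration, not the networking-ovn scheduler code: it assumes each chassis exposes its mappings as a comma-separated "physnet:bridge" string under the ovn-bridge-mappings key of its external_ids, and the function names, chassis names, and sample data are hypothetical.

```python
def parse_bridge_mappings(raw):
    """Parse a 'physnet1:br-ex,physnet2:br-foo' string into a dict."""
    mappings = {}
    for item in raw.split(","):
        physnet, sep, bridge = item.strip().partition(":")
        # Skip empty or malformed entries instead of failing.
        if sep and physnet.strip() and bridge.strip():
            mappings[physnet.strip()] = bridge.strip()
    return mappings


def eligible_chassis(chassis_external_ids, physnet):
    """Return the names of chassis that map the given physical network.

    chassis_external_ids: dict of chassis name -> external_ids dict
    (hypothetical in-memory stand-in for the southbound Chassis table).
    """
    eligible = []
    for name, external_ids in chassis_external_ids.items():
        raw = external_ids.get("ovn-bridge-mappings", "")
        if physnet in parse_bridge_mappings(raw):
            eligible.append(name)
    return sorted(eligible)


# Hypothetical deployment: controllers have br-ex, the compute node does not.
chassis = {
    "controller-0": {"ovn-bridge-mappings": "datacentre:br-ex"},
    "controller-1": {"ovn-bridge-mappings": "datacentre:br-ex"},
    "compute-0": {},
}

print(eligible_chassis(chassis, "datacentre"))
```

With a filter like this, a compute node that has no bridge mapping for the external network would never be offered as a gateway candidate.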


How reproducible:

Very likely

Steps to Reproduce:

1. Set up a multinode deployment where the computes have no bridge mappings
and no external connectivity (typically no br-ex).

2. Create a bunch of routers and test connectivity.

3. Some of them won't respond to pings because they are on an isolated chassis.

Actual results:

Routers won't ping/work from outside because they are isolated.

Expected results:

Routers ping and work because they are scheduled to the right hosts.

Comment 4 Martin Lopes 2017-12-15 06:29:44 UTC
Adding release notes entry.

Comment 5 Daniel Alvarez Sanchez 2018-01-15 16:42:03 UTC
*** Bug 1530191 has been marked as a duplicate of this bug. ***

Comment 6 Daniel Alvarez Sanchez 2018-01-15 16:43:32 UTC
*** Bug 1530201 has been marked as a duplicate of this bug. ***

Comment 10 Eran Kuris 2018-01-28 13:03:35 UTC
Fix verified.
(overcloud) [stack@undercloud-0 ~]$ cat /etc/yum.repos.d/latest-installed 
12   -p 2018-01-26.2
(overcloud) [root@controller-1 ~]# rpm -qa |grep python-networking-ovn-
python-networking-ovn-3.0.0-3.el7ost.noarch

After deployment, verified that there is no br-ex on the compute nodes.
Created a bunch of routers with external connectivity.
When accessibility to the external network was checked, it worked.

Also, there are no CI failures related to this issue.

Comment 14 errata-xmlrpc 2018-01-30 20:02:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0245