Bug 1510879 - OVN deployments shouldn't have bridge-mappings on compute nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
high
urgent
Target Milestone: z1
Target Release: 12.0 (Pike)
Assignee: Daniel Alvarez Sanchez
QA Contact: Eran Kuris
URL:
Whiteboard:
Duplicates: 1511493
Depends On: 1525520
Blocks: 1482694
Reported: 2017-11-08 11:35 UTC by Daniel Alvarez Sanchez
Modified: 2018-01-30 21:24 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-7.0.3-20.el7ost
Doc Type: Bug Fix
Doc Text:
Previously, all Compute and Controller nodes had bridge-mappings configured and were therefore eligible to schedule routers. However, if a router was scheduled on a Compute node that had no connection to an external network, connectivity with the external network failed. This fix adds the ability to configure bridge-mappings in TripleO and in the director according to roles. This means that you can now exclude Compute nodes from router scheduling and maintain external network connectivity.
Clone Of:
Environment:
Last Closed: 2018-01-30 21:24:32 UTC
Target Upstream Version:



Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:0253 normal SHIPPED_LIVE Red Hat OpenStack Platform 12.0 director Bug Fix Advisory 2018-02-16 03:41:33 UTC
OpenStack gerrit 518440 None master: MERGED tripleo-heat-templates: OVN: Provide the option to define NeutronBridgeMappings as a role parameter (I6a00b8dc1ff387cc5e... 2018-01-09 18:25:31 UTC
Launchpad 1730711 None None None 2017-11-08 11:35:35 UTC

Description Daniel Alvarez Sanchez 2017-11-08 11:35:35 UTC
When deploying setups with OVN as the ML2 mechanism driver, TripleO/Director configures ovn-bridge-mappings on all nodes. This makes every node eligible for scheduling router gateway ports, but compute nodes don't have external connectivity. As a consequence, if a router gateway port is scheduled on a compute node, connections to/from external networks won't work.

We can either:

1. Avoid setting bridge-mappings on compute nodes.
2. Set up external connectivity on compute nodes (like we do in DVR).

We reported this upstream [0] and Numan proposed a solution in T-H-T [1] to address the issue by following 1).

Right now, some tempest tests fail randomly in our CI because routers sometimes get scheduled on compute nodes and connections to FIPs fail.
We need to fix this bug to make our CI green.

[0] https://bugs.launchpad.net/tripleo/+bug/1730711
[1] https://review.openstack.org/#/c/518440/
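
The eligibility problem described above can be illustrated with a small model. This is only a sketch, not the actual networking-ovn scheduler: the `Chassis` class and `gateway_candidates` function are hypothetical names, and the "before" mappings on the compute node are assumed to match the controller's. It shows that any node advertising a mapping for the external physical network becomes a candidate for hosting a router gateway port, and that dropping the mapping on compute nodes (option 1) removes them from the candidate set.

```python
from dataclasses import dataclass, field


@dataclass
class Chassis:
    """A node as seen by OVN; external_ids mirrors the Open_vSwitch table column."""
    hostname: str
    external_ids: dict = field(default_factory=dict)


def gateway_candidates(chassis_list, physnet):
    """Return hostnames whose ovn-bridge-mappings include the given physical network."""
    candidates = []
    for chassis in chassis_list:
        mappings = chassis.external_ids.get("ovn-bridge-mappings", "")
        # Each mapping has the form "physnet:bridge"; several are comma-separated.
        physnets = {m.split(":", 1)[0] for m in mappings.split(",") if m}
        if physnet in physnets:
            candidates.append(chassis.hostname)
    return candidates


# Assumed state before the fix: every node carries the same mappings.
before_fix = [
    Chassis("controller-0", {"ovn-bridge-mappings": "datacentre:br-ex,tenant:br-isolated"}),
    Chassis("compute-0", {"ovn-bridge-mappings": "datacentre:br-ex,tenant:br-isolated"}),
]
# State after the fix: compute nodes have no ovn-bridge-mappings key at all.
after_fix = [
    Chassis("controller-0", {"ovn-bridge-mappings": "datacentre:br-ex,tenant:br-isolated"}),
    Chassis("compute-0", {}),
]

print(gateway_candidates(before_fix, "datacentre"))  # ['controller-0', 'compute-0']
print(gateway_candidates(after_fix, "datacentre"))   # ['controller-0']
```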

Comment 2 Numan Siddique 2017-11-10 09:01:10 UTC
Submitted the patch to fix this - https://review.openstack.org/#/c/518440/
and upstream bug link - https://bugs.launchpad.net/tripleo/+bug/1730711

Comment 3 Daniel Alvarez Sanchez 2017-11-10 13:23:18 UTC
(In reply to Numan Siddique from comment #2)
> Submitted the patch to fix this - https://review.openstack.org/#/c/518440/
> and upstream bug link - https://bugs.launchpad.net/tripleo/+bug/1730711

I already added those as external trackers when I opened the BZ. Thanks Numan!
I think the initial direction to fix this was to set up a DVR-like environment for our OVN jobs. However, in my opinion it'd be better to apply the fix, since that would keep us aligned with upstream TripleO environments.

Thanks,
Daniel

Comment 4 Jakub Libosvar 2017-11-13 14:55:00 UTC
*** Bug 1511493 has been marked as a duplicate of this bug. ***

Comment 5 Eran Kuris 2017-11-27 13:50:17 UTC
Can you please provide the package version where I can find the fix, so I can verify I have the correct package?

Comment 6 Daniel Alvarez Sanchez 2017-11-27 14:06:25 UTC
Hi Eran, the fix is not yet merged [0]. Maybe we can use a mock build?
[0] https://code.engineering.redhat.com/gerrit/#/c/123217/

Comment 7 Eran Kuris 2017-11-27 14:54:14 UTC
(In reply to Daniel Alvarez Sanchez from comment #6)
> Hi Eran, the fix is not yet merged [0]. Maybe we can use a mock build?
> [0] https://code.engineering.redhat.com/gerrit/#/c/123217/

Oh, I thought it was already merged. So yes, when we get a new puddle of OSP12 I will add the fix manually and verify it.

Comment 8 Eran Kuris 2017-11-28 15:02:37 UTC
(In reply to Daniel Alvarez Sanchez from comment #6)
> Hi Eran, the fix is not yet merged [0]. Maybe we can use a mock build?
> [0] https://code.engineering.redhat.com/gerrit/#/c/123217/

Tested it with a mock build and it looks OK.

Comment 9 Eran Kuris 2017-11-28 15:02:38 UTC
(In reply to Daniel Alvarez Sanchez from comment #6)
> Hi Eran, the fix is not yet merged [0]. Maybe we can use a mock build?
> [0] https://code.engineering.redhat.com/gerrit/#/c/123217/

Tested it with a mock build and it looks OK.

Comment 10 Jon Schlueter 2018-01-09 18:43:44 UTC
Build openstack-tripleo-heat-templates-7.0.3-20.el7ost includes a patch from this bug, please update BZ state accordingly

Comment 14 Eran Kuris 2018-01-22 07:13:33 UTC
The bug is fixed on:
(undercloud) [stack@undercloud-0 ~]$ cat /etc/yum.repos.d/latest-installed 
12   -p 2018-01-16.2

(overcloud) [stack@undercloud-0 ~]$ rpm -qa |grep openstack-tripleo-heat-templates-7.0.
openstack-tripleo-heat-templates-7.0.3-21.el7ost.noarch

[root@controller-0 ~]# ovs-vsctl get open . external_ids  
{hostname="controller-0.localdomain", ovn-bridge-mappings="datacentre:br-ex,tenant:br-isolated", ovn-encap-ip="172.17.1.10", ovn-encap-type=geneve, ovn-remote="tcp:172.17.1.20:6642", system-id="bd4321c3-3ef8-40e6-bc19-5d2411da4fd8"}
[root@controller-1 ~]# ovs-vsctl get open . external_ids  
{hostname="controller-1.localdomain", ovn-bridge-mappings="datacentre:br-ex,tenant:br-isolated", ovn-encap-ip="172.17.1.17", ovn-encap-type=geneve, ovn-remote="tcp:172.17.1.20:6642", system-id="7cc86fdb-1570-4657-8ceb-e60a726be236"}
[root@controller-2 ~]# ovs-vsctl get open . external_ids  
{hostname="controller-2.localdomain", ovn-bridge-mappings="datacentre:br-ex,tenant:br-isolated", ovn-encap-ip="172.17.1.16", ovn-encap-type=geneve, ovn-remote="tcp:172.17.1.20:6642", system-id="2be430af-09c7-4970-8bf9-d8c7eb77e75c"}
[root@compute-0 ~]# ovs-vsctl get open . external_ids  
{hostname="compute-0.localdomain", ovn-encap-ip="172.17.1.24", ovn-encap-type=geneve, ovn-remote="tcp:172.17.1.20:6642", system-id="ea8526b8-b8b6-4a00-8dd9-2dbc97bc69b6"}
[root@compute-1 ~]# ovs-vsctl get open . external_ids  
{hostname="compute-1.localdomain", ovn-encap-ip="172.17.1.12", ovn-encap-type=geneve, ovn-remote="tcp:172.17.1.20:6642", system-id="5662f345-6ffd-4531-94de-fdb76eb045ff"}
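
The manual check above could be scripted. The following is a minimal sketch, assuming the map is printed in the `{key="value", key=value}` form shown in this comment; `parse_external_ids` and `has_bridge_mappings` are hypothetical helper names, and the parser does not unescape backslash sequences inside quoted values. The two sample strings are copied from the controller-0 and compute-0 output above.

```python
import re

# Matches key=value pairs where the value is either double-quoted or a bare token.
_PAIR = re.compile(r'([\w-]+)=("(?:[^"\\]|\\.)*"|[^,}\s]+)')


def parse_external_ids(text):
    """Parse the map printed by `ovs-vsctl get open . external_ids` into a dict."""
    result = {}
    for key, value in _PAIR.findall(text):
        if value.startswith('"'):
            value = value[1:-1]  # strip the surrounding quotes
        result[key] = value
    return result


def has_bridge_mappings(text):
    """True if the node advertises a non-empty ovn-bridge-mappings entry."""
    return bool(parse_external_ids(text).get("ovn-bridge-mappings"))


controller_0 = (
    '{hostname="controller-0.localdomain", '
    'ovn-bridge-mappings="datacentre:br-ex,tenant:br-isolated", '
    'ovn-encap-ip="172.17.1.10", ovn-encap-type=geneve, '
    'ovn-remote="tcp:172.17.1.20:6642", '
    'system-id="bd4321c3-3ef8-40e6-bc19-5d2411da4fd8"}'
)
compute_0 = (
    '{hostname="compute-0.localdomain", ovn-encap-ip="172.17.1.24", '
    'ovn-encap-type=geneve, ovn-remote="tcp:172.17.1.20:6642", '
    'system-id="ea8526b8-b8b6-4a00-8dd9-2dbc97bc69b6"}'
)

print(has_bridge_mappings(controller_0))  # True
print(has_bridge_mappings(compute_0))     # False
```

With the fix applied, the expectation is that the check passes on Controller nodes and fails on Compute nodes, matching the output pasted in this comment.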

Comment 20 errata-xmlrpc 2018-01-30 21:24:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0253

