Bug 1570499

Summary: OVN - gateways are scheduled on compute nodes and SNAT traffic goes through them
Product: Red Hat OpenStack Reporter: Eran Kuris <ekuris>
Component: openstack-tripleo-heat-templates Assignee: anil venkata <vkommadi>
Status: CLOSED ERRATA QA Contact: Eran Kuris <ekuris>
Severity: high Docs Contact:
Priority: high    
Version: 13.0 (Queens) CC: amuller, bcafarel, dalvarez, jschluet, mburns, nyechiel, vkommadi
Target Milestone: beta Keywords: Triaged
Target Release: 13.0 (Queens)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-8.0.2-6.el7ost Doc Type: If docs needed, set a value
Last Closed: 2018-06-27 13:52:04 UTC Type: Bug
Bug Blocks: 1566050    

Description Eran Kuris 2018-04-23 05:58:36 UTC
Description of problem:
After a fresh install of an OVN DVR environment, I noticed that TripleO does not enable ovn-cms-options by default.

https://review.openstack.org/#/c/559806/2

The admin sets ovn-cms-options in external_ids with

ovs-vsctl set open . external_ids:ovn-cms-options="enable-chassis-as-gw"

to enable a chassis as a candidate for scheduling gateway routers.
Networking-ovn will parse ovn-cms-options and select this chassis
if it has proper bridge mappings.

This lets the admin exclude compute nodes from hosting gateway routers, as
they are more likely to be restarted for maintenance operations.

Candidates are selected in this order:
1) chassis with ovn-cms-options set and proper bridge mappings
2) if 1) yields no candidates, chassis with proper bridge mappings
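
The selection order above can be sketched as follows. This is a minimal illustration, not the networking-ovn implementation: the `Chassis` class, its fields, and `gateway_candidates` are hypothetical names, and "proper bridge mappings" is assumed here to mean that the chassis maps the physical network in question.

```python
from dataclasses import dataclass, field


@dataclass
class Chassis:
    # Hypothetical, simplified view of a chassis record.
    name: str
    bridge_mappings: dict = field(default_factory=dict)  # physnet -> bridge
    cms_options: set = field(default_factory=set)        # parsed ovn-cms-options


def gateway_candidates(chassis_list, physnet):
    """Return the names of chassis eligible to host a gateway for physnet."""
    # Only chassis that map the physical network are eligible at all.
    mapped = [c for c in chassis_list if physnet in c.bridge_mappings]
    # 1) Prefer chassis explicitly flagged via ovn-cms-options.
    preferred = [c for c in mapped if "enable-chassis-as-gw" in c.cms_options]
    # 2) Fall back to every mapped chassis when none is flagged.
    return [c.name for c in (preferred or mapped)]


chassis = [
    Chassis("controller-0", {"datacentre": "br-ex"}, {"enable-chassis-as-gw"}),
    Chassis("compute-0", {"datacentre": "br-ex"}),
]
print(gateway_candidates(chassis, "datacentre"))
```

With no chassis flagged, the fallback in step 2 returns both nodes, which matches the behaviour reported in this bug: compute nodes become gateway candidates.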

Version-Release number of selected component (if applicable):
(overcloud) [stack@undercloud-0 ~]$ rpm -qa |grep triple
ansible-tripleo-ipsec-8.1.1-0.20180308133440.8f5369a.el7ost.noarch
openstack-tripleo-heat-templates-8.0.2-0.20180414062830.5f869f2.el7ost.noarch
puppet-tripleo-8.3.2-0.20180411174307.el7ost.noarch
python-tripleoclient-9.2.0-4.el7ost.noarch
openstack-tripleo-puppet-elements-8.0.0-1.el7ost.noarch
openstack-tripleo-image-elements-8.0.0-3.el7ost.noarch
openstack-tripleo-ui-8.3.1-2.el7ost.noarch
openstack-tripleo-validations-8.4.0-2.el7ost.noarch
openstack-tripleo-common-containers-8.6.1-0.20180410165748.4d8ca16.el7ost.noarch
openstack-tripleo-common-8.6.1-0.20180410165748.4d8ca16.el7ost.noarch
(overcloud) [stack@undercloud-0 ~]$ cat core_puddle_version 


How reproducible:


Steps to Reproduce:
1. Run the deployment and check with this command: sudo ovs-vsctl list open
2. Check external_ids in the output:
external_ids        : {hostname="controller-0.localdomain", ovn-bridge=br-int, ovn-bridge-mappings="datacentre:br-ex,tenant:br-isolated", ovn-cms-options=enable-chassis-as-gw, ovn-encap-ip="172.17.2.12", ovn-encap-type=geneve, ovn-remote="tcp:172.17.1.18:6642", rundir="/var/run/openvswitch", system-id="c89a8e64-a565-4ec9-83fe-7dd8aa04813f"}
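
The check in the steps above can be scripted. The sketch below parses an `external_ids` map as printed by `ovs-vsctl list open` and reports whether the chassis is flagged as a gateway candidate. The helper names are hypothetical, the sample string is a shortened copy of the output above, and treating ovn-cms-options as a comma-separated list is an assumption.

```python
import re

# key=value pairs; values are either double-quoted or bare tokens.
_PAIR = re.compile(r'([\w-]+)=("[^"]*"|[^,}\s]+)')


def parse_external_ids(text):
    """Parse the {key=value, ...} map printed by ovs-vsctl into a dict."""
    return {key: value.strip('"') for key, value in _PAIR.findall(text)}


def is_gateway_chassis(external_ids):
    """True if enable-chassis-as-gw appears in ovn-cms-options."""
    # Assumption: ovn-cms-options holds comma-separated option names.
    options = external_ids.get("ovn-cms-options", "")
    return "enable-chassis-as-gw" in options.split(",")


sample = ('{hostname="controller-0.localdomain", ovn-bridge=br-int, '
          'ovn-bridge-mappings="datacentre:br-ex,tenant:br-isolated", '
          'ovn-cms-options=enable-chassis-as-gw, ovn-encap-type=geneve}')
ids = parse_external_ids(sample)
print(is_gateway_chassis(ids))
```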

Actual results:


Expected results:


Additional info:

Comment 1 anil venkata 2018-04-23 06:34:36 UTC
Thanks Eran.

In overcloud_deploy.sh we use template file [1], but this file does not contain

OVNCMSOptions: "enable-chassis-as-gw"

as per the commit https://github.com/openstack/tripleo-heat-templates/commit/71d59bb0a34349f3ed2b95d70452b771cc8039d2#diff-bdb9af0031906100cdbc39af8dfa1e6e

Adding this to [1] may resolve the issue. Alternatively, since this must be enabled by default on controller nodes
in an OVN deployment, can we add it to a generic template that applies to controller nodes?

[1] /usr/share/openstack-tripleo-heat-templates/environments/services-docker/neutron-ovn-dvr-ha.yaml

Comment 3 Daniel Alvarez Sanchez 2018-04-25 14:55:14 UTC
*** Bug 1571097 has been marked as a duplicate of this bug. ***

Comment 5 Daniel Alvarez Sanchez 2018-04-27 16:50:30 UTC
*** Bug 1566050 has been marked as a duplicate of this bug. ***

Comment 12 Eran Kuris 2018-05-06 12:16:21 UTC
(overcloud) [stack@undercloud-0 ~]$ rpm -qa |grep openstack-tripleo-heat-templates
openstack-tripleo-heat-templates-8.0.2-9.el7ost.noarch
(overcloud) [stack@undercloud-0 ~]$ cat core_puddle_version 
2018-05-02.5

(overcloud) [stack@undercloud-0 ~]$ cat /usr/share/openstack-tripleo-heat-templates/environments/services-docker/neutron-ovn-dvr-ha.yaml
parameter_defaults:
    ControllerParameters:
        OVNCMSOptions: enable-chassis-as-gw
    NeutronEnableDHCPAgent: false
    NeutronEnableDVR: true
    NeutronMechanismDrivers: ovn
    NeutronNetworkType: geneve
    NeutronServicePlugins: qos,ovn-router,trunk
    NeutronTypeDrivers: geneve,vlan,flat
    NeutronVniRanges:
    - 1:65536
    OVNNeutronSyncMode: log
    OVNQosDriver: ovn-qos
    OVNTunnelEncapType: geneve
    OVNVifType: ovs
resource_registry:
    OS::TripleO::Docker::NeutronMl2PluginBase: ../../puppet/services/neutron-plugin-ml2-ovn.yaml
    OS::TripleO::Services::ComputeNeutronCorePlugin: OS::Heat::None
    OS::TripleO::Services::ComputeNeutronOvsAgent: OS::Heat::None
    OS::TripleO::Services::NeutronDhcpAgent: OS::Heat::None
    OS::TripleO::Services::NeutronL3Agent: OS::Heat::None
    OS::TripleO::Services::NeutronMetadataAgent: OS::Heat::None
    OS::TripleO::Services::NeutronOvsAgent: OS::Heat::None
    OS::TripleO::Services::OVNController: ../../docker/services/ovn-controller.yaml
    OS::TripleO::Services::OVNDBs: ../../docker/services/pacemaker/ovn-dbs.yaml
    OS::TripleO::Services::OVNMetadataAgent: ../../docker/services/ovn-metadata.yaml



Verified that SNAT traffic goes through the controller node.
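
The environment file shown above scopes OVNCMSOptions to the Controller role through ControllerParameters, so compute nodes are not flagged. A minimal sketch of checking an already-parsed environment for that setting follows. The function name and the two sample dictionaries are hypothetical, and loading the YAML file into a dictionary is assumed to happen elsewhere.

```python
def controller_cms_options(env):
    """Return OVNCMSOptions set for the Controller role, or None if absent."""
    defaults = env.get("parameter_defaults", {})
    return defaults.get("ControllerParameters", {}).get("OVNCMSOptions")


# Hypothetical parsed environments: one without the setting, one with it.
without_option = {"parameter_defaults": {"NeutronEnableDVR": True}}
with_option = {
    "parameter_defaults": {
        "ControllerParameters": {"OVNCMSOptions": "enable-chassis-as-gw"},
        "NeutronEnableDVR": True,
    }
}
print(controller_cms_options(without_option))
print(controller_cms_options(with_option))
```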

Comment 16 errata-xmlrpc 2018-06-27 13:52:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086