Bug 1714691 - [OVN ] there is no IP for external_ids:ovn-encap-ip on compute nodes
Summary: [OVN ] there is no IP for external_ids:ovn-encap-ip on compute nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z8
Target Release: 13.0 (Queens)
Assignee: RHOS Maint
QA Contact: Yuri Obshansky
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-05-28 15:30 UTC by Yuri Obshansky
Modified: 2019-09-03 16:55 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-8.3.1-55.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-03 16:55:32 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 662237 0 None MERGED Convert ServiceNetMap evals to hiera interpolation 2020-05-07 15:58:07 UTC
Red Hat Product Errata RHBA-2019:2624 0 None None None 2019-09-03 16:55:53 UTC

Description Yuri Obshansky 2019-05-28 15:30:30 UTC
Description of problem:
OSP 13 with OVN.
Compute nodes are deployed without an IP in external_ids:ovn-encap-ip.
As a result, booting an instance on such a Compute node fails.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Deploy OSP 13 with OVN
(undercloud) [stack@site-undercloud-0 ~]$ cat overcloud_deploy.sh
#!/bin/bash

openstack overcloud deploy \
--timeout 100 \
--templates /usr/share/openstack-tripleo-heat-templates \
--stack overcloud \
--libvirt-type kvm \
--ntp-server 192.168.24.1 \
-e /home/stack/osp-13-spine-leaf/config_lvm.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-n /home/stack/osp-13-spine-leaf/network/network_data.yaml \
-r /home/stack/osp-13-spine-leaf/roles/roles_data.yaml \
-e /home/stack/osp-13-spine-leaf/network/network-environment.yaml \
-e /home/stack/osp-13-spine-leaf/enable-tls.yaml \
-e /home/stack/osp-13-spine-leaf/inject-trust-anchor.yaml \
-e /home/stack/osp-13-spine-leaf/public_vip.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-ip.yaml \
-e /home/stack/osp-13-spine-leaf/hostnames.yml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services-docker/neutron-ovn-ha.yaml \
-e /home/stack/osp-13-spine-leaf/nodes_data.yaml \
-e /home/stack/osp-13-spine-leaf/debug.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/services/octavia.yaml \
-e /home/stack/osp-13-spine-leaf/ovn-extras.yaml \
-e /home/stack/osp-13-spine-leaf/l3_fip_qos.yaml \
-e /home/stack/osp-13-spine-leaf/docker-images.yaml \
--log-file overcloud_deployment_52.log
2. A Spine-Leaf network topology is probably not required to reproduce this
(the Spine-Leaf topology templates are under
https://gitlab.cee.redhat.com/yobshans/rhos-qe-edge-stuff/tree/master/osp15 )
3. Boot an instance


Actual results:
| fault                               | {u'message': u'Binding failed for port 69625cb0-06f7-4be8-a89d-58ffb2db5d55, please check neutron logs for more information.', u'code': 500, u'details': u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1862, in _do_build_and_run_instance\n    filter_properties, request_spec)\n  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2142, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'created': u'2019-05-27T16:36:59Z'} |

Expected results:
Instance booted OK

Additional info:
Hi,

I checked the compute node that was having trouble and saw issues in
ovn-controller.
The encap-ip was missing, so I set it manually there, and now I can
boot instances on that node:

[heat-admin@overcloud-compute1-0 ~]$ sudo ovs-vsctl set open . external_ids:ovn-encap-ip="172.19.2.19"

This is how it was set in TripleO queens [0] and how we set it now in
master [1]. I'm not a TripleO expert, but it looks like fetching it from
hiera may be the right way to do it. I found the patch that changed
this [2].

I would suggest opening a BZ with the info I shared here and gathering
input from the TripleO folks on backporting [2] into 13.

[0] https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/queens/puppet/services/ovn-controller.yaml#L97
[1] https://opendev.org/openstack/tripleo-heat-templates/src/branch/master/deployment/ovn/ovn-controller-container-puppet.yaml#L113
[2] https://opendev.org/openstack/tripleo-heat-templates/commit/3a7baa8fa6fa8dd6735f38d6236e8a2cb5d34659

Comment 1 Daniel Alvarez Sanchez 2019-05-28 16:00:30 UTC
Just a clarification on the Additional info, which I did not include when I wrote it. When I debugged this setup, I found that only compute1-0 was missing the encap-ip, while the rest of the nodes were properly configured. I believe the patch I linked there would fix it, but Kamil will take a look to confirm.

Comment 2 Yuri Obshansky 2019-05-28 16:42:23 UTC
Update from a fresh deployment:
ovn-encap-ip is set only on the Leaf0 Compute node,
which is running with the controllers.

[heat-admin@overcloud-compute0-0 ~]$ sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute0-0.localdomain", ovn-bridge=br-int, ovn-bridge-mappings="leaf0:br-ex", ovn-encap-ip="172.19.1.5", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.10:6642", rundir="/var/run/openvswitch", system-id="8dfc9815-8c13-4bf0-9cfc-2138f5df0b8d"}

[heat-admin@overcloud-compute1-0 ~]$ sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute1-0.localdomain", ovn-bridge=br-int, ovn-bridge-mappings="leaf1:br-ex", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.10:6642", rundir="/var/run/openvswitch", system-id="bbf1b7b0-4bf4-4048-8e74-feb619003ab7"}

[heat-admin@overcloud-compute2-0 ~]$ sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute2-0.localdomain", ovn-bridge=br-int, ovn-bridge-mappings="leaf2:br-ex", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.10:6642", rundir="/var/run/openvswitch", system-id="c80fd4db-ff95-4bf6-9262-ea3cb06ff4c9"}
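
The per-node comparison above can be scripted. The following is a minimal sketch, assuming the external_ids output format shown in this comment; the function name encap_ip_from_external_ids is hypothetical and is not part of OVS or TripleO.

```shell
# Hypothetical helper: reads the output of
# `ovs-vsctl get open . external_ids` on stdin and prints the value of
# ovn-encap-ip. Prints nothing if the key is absent.
encap_ip_from_external_ids() {
    sed -n 's/.*ovn-encap-ip="\([^"]*\)".*/\1/p'
}

# Hypothetical wrapper: reports whether the key is present in the
# output read on stdin, and returns non-zero when it is missing.
check_encap_ip() {
    ip=$(encap_ip_from_external_ids)
    if [ -n "$ip" ]; then
        echo "ovn-encap-ip is set: $ip"
    else
        echo "ovn-encap-ip is MISSING"
        return 1
    fi
}

# Usage on a compute node (requires ovs-vsctl, so it is left commented out):
#   sudo ovs-vsctl get open . external_ids | check_encap_ip
```

Parsing the full external_ids map keeps the check independent of how ovs-vsctl reports a lookup of a single missing key.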

Comment 3 Yuri Obshansky 2019-05-29 13:52:45 UTC
As a workaround suggested by hjensas,
I added the following parameters to nodes_data.yaml:
    ovn::controller::ovn_encap_ip: "%{hiera('tenant1')}"
    ovn::controller::ovn_encap_ip: "%{hiera('tenant2')}"

The deployment then set ovn_encap_ip values on the Compute nodes:

[heat-admin@overcloud-compute0-0 ~]$  sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute0-0.localdomain", ovn-bridge=br-int, ovn-bridge-mappings="leaf0:br-ex", ovn-encap-ip="172.19.1.20", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.15:6642", rundir="/var/run/openvswitch", system-id="e46ddc38-8335-4765-8877-7b9e0590ec64"}

[heat-admin@overcloud-compute1-0 ~]$  sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute1-0.localdomain", ovn-bridge=br-int, ovn-bridge-mappings="leaf1:br-ex", ovn-encap-ip="172.19.2.12", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.15:6642", rundir="/var/run/openvswitch", system-id="16fc6b95-08d0-4f1a-964f-a1104481b318"}

[heat-admin@overcloud-compute2-0 ~]$  sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute2-0.localdomain", ovn-bridge=br-int, ovn-bridge-mappings="leaf2:br-ex", ovn-encap-ip="172.19.3.12", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.15:6642", rundir="/var/run/openvswitch", system-id="90080e54-0b64-4bae-bde6-704867384d6b"}

Instances booted successfully on different networks and are pingable.
FIP traffic is OK.

(overcloud) [stack@site-undercloud-0 ~]$ openstack server list
+--------------------------------------+----------+--------+--------------------------------------+--------+---------+
| ID                                   | Name     | Status | Networks                             | Image  | Flavor  |
+--------------------------------------+----------+--------+--------------------------------------+--------+---------+
| 0a88a67a-2460-434f-93e2-157bf4979073 | vm-leaf1 | ACTIVE | private-leaf1=192.0.20.20, 10.0.20.7 | cirros | m1.tiny |
| bc8aee5c-e870-487d-adf7-a12063dadef9 | vm-leaf0 | ACTIVE | private-leaf0=192.0.10.4, 10.0.10.4  | cirros | m1.tiny |
+--------------------------------------+----------+--------+--------------------------------------+--------+---------+

(overcloud) [stack@site-undercloud-0 ~]$ ping 10.0.10.4
PING 10.0.10.4 (10.0.10.4) 56(84) bytes of data.
64 bytes from 10.0.10.4: icmp_seq=1 ttl=63 time=2.63 ms

(overcloud) [stack@site-undercloud-0 ~]$ ping 10.0.20.7
PING 10.0.20.7 (10.0.20.7) 56(84) bytes of data.
64 bytes from 10.0.20.7: icmp_seq=1 ttl=62 time=1.40 ms

Comment 4 Kamil Sambor 2019-05-30 14:59:29 UTC
The patch that you pointed to [0] should solve the issue. I am proposing the backport upstream; once it is merged, I will do the same downstream.

[0] https://opendev.org/openstack/tripleo-heat-templates/commit/3a7baa8fa6fa8dd6735f38d6236e8a2cb5d34659

Comment 15 Yuri Obshansky 2019-08-22 19:46:42 UTC
Issue has been fixed. Tested on OSP 13 2019-08-19.2
openstack-tripleo-heat-templates-8.3.1-76.el7ost.noarch

Overcloud deployed without changes in nodes_data.yaml
    ovn::controller::ovn_encap_ip: "%{hiera('tenant1')}"
    ovn::controller::ovn_encap_ip: "%{hiera('tenant2')}"

[heat-admin@overcloud-compute0-0 ~]$ sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute0-0.redhat.local", ovn-bridge=br-int, ovn-bridge-mappings="leaf0:br-ex", ovn-encap-ip="172.19.1.4", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.11:6642", rundir="/var/run/openvswitch", system-id="1a4dcb97-6517-4011-bc34-4e5f4ae75a34"}

[heat-admin@overcloud-compute1-0 ~]$ sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute1-0.redhat.local", ovn-bridge=br-int, ovn-bridge-mappings="leaf1:br-ex", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.11:6642", rundir="/var/run/openvswitch", system-id="235f5361-3211-4429-b55d-e892cbd376e1"}

[heat-admin@overcloud-compute2-0 ~]$ sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute2-0.redhat.local", ovn-bridge=br-int, ovn-bridge-mappings="leaf2:br-ex", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.11:6642", rundir="/var/run/openvswitch", system-id="56990f9f-4491-4893-89ef-11367d23b7f0"}

Status changed to verified

Comment 18 errata-xmlrpc 2019-09-03 16:55:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2624

