Description of problem:
In OSP 13 with OVN, Compute nodes are deployed without an IP in external_ids:ovn-encap-ip. As a result, booting an instance fails on such a Compute node.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Deploy OSP 13 with OVN

(undercloud) [stack@site-undercloud-0 ~]$ cat overcloud_deploy.sh
#!/bin/bash

openstack overcloud deploy \
--timeout 100 \
--templates /usr/share/openstack-tripleo-heat-templates \
--stack overcloud \
--libvirt-type kvm \
--ntp-server 192.168.24.1 \
-e /home/stack/osp-13-spine-leaf/config_lvm.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-n /home/stack/osp-13-spine-leaf/network/network_data.yaml \
-r /home/stack/osp-13-spine-leaf/roles/roles_data.yaml \
-e /home/stack/osp-13-spine-leaf/network/network-environment.yaml \
-e /home/stack/osp-13-spine-leaf/enable-tls.yaml \
-e /home/stack/osp-13-spine-leaf/inject-trust-anchor.yaml \
-e /home/stack/osp-13-spine-leaf/public_vip.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-ip.yaml \
-e /home/stack/osp-13-spine-leaf/hostnames.yml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services-docker/neutron-ovn-ha.yaml \
-e /home/stack/osp-13-spine-leaf/nodes_data.yaml \
-e /home/stack/osp-13-spine-leaf/debug.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/services/octavia.yaml \
-e /home/stack/osp-13-spine-leaf/ovn-extras.yaml \
-e /home/stack/osp-13-spine-leaf/l3_fip_qos.yaml \
-e /home/stack/osp-13-spine-leaf/docker-images.yaml \
--log-file overcloud_deployment_52.log

2. A Spine-Leaf network topology is probably not required (Spine-Leaf topology templates are under https://gitlab.cee.redhat.com/yobshans/rhos-qe-edge-stuff/tree/master/osp15 )
3. Boot an instance

Actual results:
| fault | {u'message': u'Binding failed for port 69625cb0-06f7-4be8-a89d-58ffb2db5d55, please check neutron logs for more information.', u'code': 500, u'details': u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1862, in _do_build_and_run_instance\n filter_properties, request_spec)\n File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2142, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'created': u'2019-05-27T16:36:59Z'} |

Expected results:
Instance booted OK

Additional info:
Hi,

I checked the compute node that was having trouble and saw issues in ovn-controller. The encap-ip was missing, so I set it manually there, and now I can boot instances on that node:

[heat-admin@overcloud-compute1-0 ~]$ sudo ovs-vsctl set open . external_ids:ovn-encap-ip="172.19.2.19"

This is how it was set in TripleO queens [0] and how we set it now in master [1]. I'm not a TripleO expert, but it looks like fetching it from hiera may be the right way to do it. I found the patch that changed this [2].

I would suggest opening a BZ with the info I shared here and gathering input from TripleO folks to consider backporting [2] into 13.

[0] https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/queens/puppet/services/ovn-controller.yaml#L97
[1] https://opendev.org/openstack/tripleo-heat-templates/src/branch/master/deployment/ovn/ovn-controller-container-puppet.yaml#L113
[2] https://opendev.org/openstack/tripleo-heat-templates/commit/3a7baa8fa6fa8dd6735f38d6236e8a2cb5d34659
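For anyone who needs to apply the manual fix on several nodes, here is a minimal Python sketch that builds the same ovs-vsctl invocation shown above. The helper name build_set_encap_ip_cmd is hypothetical, and the IP validation is an added safeguard that is not part of the original workaround. The sketch only constructs the argument list and does not run anything.

```python
import ipaddress


def build_set_encap_ip_cmd(encap_ip):
    """Return the argv that sets external_ids:ovn-encap-ip on the local OVS.

    Hypothetical helper: it mirrors the manual command from this report and
    validates the address first so a typo is rejected before touching OVS.
    """
    # Raises ValueError if encap_ip is not a valid IPv4/IPv6 address.
    ipaddress.ip_address(encap_ip)
    # No shell is involved when passing an argv list, so the value is given
    # without the surrounding double quotes used on the command line.
    return [
        "sudo", "ovs-vsctl", "set", "open", ".",
        "external_ids:ovn-encap-ip={}".format(encap_ip),
    ]


if __name__ == "__main__":
    # Address taken from the manual fix on overcloud-compute1-0.
    print(" ".join(build_set_encap_ip_cmd("172.19.2.19")))
```

The returned list could be passed to subprocess.run on the Compute node itself. This is only a stopgap for a node that is already deployed, not a replacement for fixing the templates.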
A clarification on the Additional info that I left out when I wrote it: when I debugged this setup, I found that only compute1-0 was missing the encap-ip, while the rest of the nodes were properly configured. I believe the patch I linked there would fix it, but Kamil will take a look to confirm.
Update from a fresh deployment:

ovn-encap-ip is set only on the Leaf0 Compute node, which is running with the controllers.

[heat-admin@overcloud-compute0-0 ~]$ sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute0-0.localdomain", ovn-bridge=br-int, ovn-bridge-mappings="leaf0:br-ex", ovn-encap-ip="172.19.1.5", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.10:6642", rundir="/var/run/openvswitch", system-id="8dfc9815-8c13-4bf0-9cfc-2138f5df0b8d"}

[heat-admin@overcloud-compute1-0 ~]$ sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute1-0.localdomain", ovn-bridge=br-int, ovn-bridge-mappings="leaf1:br-ex", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.10:6642", rundir="/var/run/openvswitch", system-id="bbf1b7b0-4bf4-4048-8e74-feb619003ab7"}

[heat-admin@overcloud-compute2-0 ~]$ sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute2-0.localdomain", ovn-bridge=br-int, ovn-bridge-mappings="leaf2:br-ex", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.10:6642", rundir="/var/run/openvswitch", system-id="c80fd4db-ff95-4bf6-9262-ea3cb06ff4c9"}
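Comparing these maps by eye gets tedious with more nodes. Here is a minimal Python sketch that parses the output of `ovs-vsctl get open . external_ids` and reports which required keys are absent. The function names and the default list of required keys are assumptions made for illustration; the sample string is the compute1-0 output from this comment.

```python
import re

# One key=value pair: the value is either a double-quoted string
# (which may contain commas) or a bare token such as br-int or geneve.
_PAIR = re.compile(r'([\w-]+)=("(?:[^"\\]|\\.)*"|[^,}\s]+)')


def parse_external_ids(text):
    """Parse the map printed by `ovs-vsctl get open . external_ids` into a dict."""
    return {key: value.strip('"') for key, value in _PAIR.findall(text)}


def missing_keys(text, required=("ovn-encap-ip", "ovn-encap-type", "ovn-remote")):
    """Return the required keys (an assumed list) that are absent from the map."""
    ids = parse_external_ids(text)
    return [key for key in required if key not in ids]


if __name__ == "__main__":
    # Output captured on overcloud-compute1-0 in the fresh deployment.
    compute1 = (
        '{hostname="overcloud-compute1-0.localdomain", ovn-bridge=br-int, '
        'ovn-bridge-mappings="leaf1:br-ex", ovn-encap-type=geneve, '
        'ovn-remote="tcp:172.25.1.10:6642", rundir="/var/run/openvswitch", '
        'system-id="bbf1b7b0-4bf4-4048-8e74-feb619003ab7"}'
    )
    print(missing_keys(compute1))  # prints ['ovn-encap-ip']
```

Running the same check against the compute0-0 output returns an empty list, which matches the observation that only the Leaf0 node was configured correctly.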
As a workaround suggested by hjensas, I added the following parameters to nodes_data.yaml:

ovn::controller::ovn_encap_ip: "%{hiera('tenant1')}"
ovn::controller::ovn_encap_ip: "%{hiera('tenant2')}"

With these, the deployment set ovn-encap-ip on all Compute nodes:

[heat-admin@overcloud-compute0-0 ~]$ sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute0-0.localdomain", ovn-bridge=br-int, ovn-bridge-mappings="leaf0:br-ex", ovn-encap-ip="172.19.1.20", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.15:6642", rundir="/var/run/openvswitch", system-id="e46ddc38-8335-4765-8877-7b9e0590ec64"}

[heat-admin@overcloud-compute1-0 ~]$ sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute1-0.localdomain", ovn-bridge=br-int, ovn-bridge-mappings="leaf1:br-ex", ovn-encap-ip="172.19.2.12", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.15:6642", rundir="/var/run/openvswitch", system-id="16fc6b95-08d0-4f1a-964f-a1104481b318"}

[heat-admin@overcloud-compute2-0 ~]$ sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute2-0.localdomain", ovn-bridge=br-int, ovn-bridge-mappings="leaf2:br-ex", ovn-encap-ip="172.19.3.12", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.15:6642", rundir="/var/run/openvswitch", system-id="90080e54-0b64-4bae-bde6-704867384d6b"}

Instances on different networks booted successfully and are pingable.
FIP traffic is OK:

(overcloud) [stack@site-undercloud-0 ~]$ openstack server list
+--------------------------------------+----------+--------+--------------------------------------+--------+---------+
| ID                                   | Name     | Status | Networks                             | Image  | Flavor  |
+--------------------------------------+----------+--------+--------------------------------------+--------+---------+
| 0a88a67a-2460-434f-93e2-157bf4979073 | vm-leaf1 | ACTIVE | private-leaf1=192.0.20.20, 10.0.20.7 | cirros | m1.tiny |
| bc8aee5c-e870-487d-adf7-a12063dadef9 | vm-leaf0 | ACTIVE | private-leaf0=192.0.10.4, 10.0.10.4  | cirros | m1.tiny |
+--------------------------------------+----------+--------+--------------------------------------+--------+---------+

(overcloud) [stack@site-undercloud-0 ~]$ ping 10.0.10.4
PING 10.0.10.4 (10.0.10.4) 56(84) bytes of data.
64 bytes from 10.0.10.4: icmp_seq=1 ttl=63 time=2.63 ms

(overcloud) [stack@site-undercloud-0 ~]$ ping 10.0.20.7
PING 10.0.20.7 (10.0.20.7) 56(84) bytes of data.
64 bytes from 10.0.20.7: icmp_seq=1 ttl=62 time=1.40 ms
The patch that you pointed to [0] should solve the issue. I propose backporting it upstream; once that is merged, I will do the same downstream.

[0] https://opendev.org/openstack/tripleo-heat-templates/commit/3a7baa8fa6fa8dd6735f38d6236e8a2cb5d34659
Issue has been fixed.

Tested on OSP 13 2019-08-19.2
openstack-tripleo-heat-templates-8.3.1-76.el7ost.noarch

Overcloud deployed without the following changes in nodes_data.yaml:

ovn::controller::ovn_encap_ip: "%{hiera('tenant1')}"
ovn::controller::ovn_encap_ip: "%{hiera('tenant2')}"

[heat-admin@overcloud-compute0-0 ~]$ sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute0-0.redhat.local", ovn-bridge=br-int, ovn-bridge-mappings="leaf0:br-ex", ovn-encap-ip="172.19.1.4", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.11:6642", rundir="/var/run/openvswitch", system-id="1a4dcb97-6517-4011-bc34-4e5f4ae75a34"}

[heat-admin@overcloud-compute1-0 ~]$ sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute1-0.redhat.local", ovn-bridge=br-int, ovn-bridge-mappings="leaf1:br-ex", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.11:6642", rundir="/var/run/openvswitch", system-id="235f5361-3211-4429-b55d-e892cbd376e1"}

[heat-admin@overcloud-compute2-0 ~]$ sudo ovs-vsctl get open . external_ids
{hostname="overcloud-compute2-0.redhat.local", ovn-bridge=br-int, ovn-bridge-mappings="leaf2:br-ex", ovn-encap-type=geneve, ovn-remote="tcp:172.25.1.11:6642", rundir="/var/run/openvswitch", system-id="56990f9f-4491-4893-89ef-11367d23b7f0"}

Status changed to verified
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2624