Created attachment 1569623 [details]
neutron_logs

Description of problem:
Despite "gre" being explicitly set in network/network-environment.yaml before overcloud deployment, "vxlan" seems to be set in the neutron config anyway.

# server.log#controller-0
2019-05-15 17:37:18.019 8 DEBUG oslo_service.service [-] ml2.extension_drivers = ['qos', 'port_security'] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2579
2019-05-15 17:37:18.020 8 DEBUG oslo_service.service [-] ml2.external_network_type = None log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2579
2019-05-15 17:37:18.020 8 DEBUG oslo_service.service [-] ml2.mechanism_drivers = ['openvswitch'] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2579
2019-05-15 17:37:18.020 8 DEBUG oslo_service.service [-] ml2.overlay_ip_version = 4 log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2579
2019-05-15 17:37:18.020 8 DEBUG oslo_service.service [-] ml2.path_mtu = 0 log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2579
2019-05-15 17:37:18.021 8 DEBUG oslo_service.service [-] ml2.physical_network_mtus = [] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2579
2019-05-15 17:37:18.021 8 DEBUG oslo_service.service [-] ml2.tenant_network_types = ['vxlan'] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2579
2019-05-15 17:37:18.021 8 DEBUG oslo_service.service [-] ml2.type_drivers = ['vxlan', 'vlan', 'flat', 'gre'] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2579
2019-05-15 17:37:18.021 8 DEBUG oslo_service.service [-] ml2_type_flat.flat_networks = ['datacentre'] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2579
2019-05-15 17:37:18.022 8 DEBUG oslo_service.service [-] ml2_type_gre.tunnel_id_ranges = ['1:4094'] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2579
2019-05-15 17:37:18.022 8 DEBUG oslo_service.service [-] ml2_type_vlan.network_vlan_ranges = ['tenant:1000:2000'] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2579
2019-05-15 17:37:18.022 8 DEBUG oslo_service.service [-] ml2_type_vxlan.vni_ranges = ['1:4094'] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2579
2019-05-15 17:37:18.023 8 DEBUG oslo_service.service [-] ml2_type_vxlan.vxlan_group = 224.0.0.1 log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2579
2019-05-15 17:37:18.023 8 DEBUG oslo_service.service [-] OVS_DRIVER.vnic_type_blacklist = [] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2579

# Could not bind port on controller-1
2019-05-15 17:50:43.504 30 DEBUG neutron.plugins.ml2.drivers.mech_agent [req-66064cdf-1845-45dc-91d0-0b9e5c956977 - - - - -] Checking segment: {'id': '8f15b7ff-ac97-4763-94c8-230ee3797cd5', 'network_type': 'flat', 'physical_network': 'datacentre', 'segmentation_id': None, 'network_id': '76be908b-9084-42a5-b84a-f8822da2a22b'} for mappings: {} with network types: ['gre', 'local', 'flat', 'vlan'] check_segment_for_agent /usr/lib/python3.6/site-packages/neutron/plugins/ml2/drivers/mech_agent.py:334
2019-05-15 17:50:43.505 30 DEBUG neutron.plugins.ml2.drivers.mech_agent [req-66064cdf-1845-45dc-91d0-0b9e5c956977 - - - - -] Network 76be908b-9084-42a5-b84a-f8822da2a22b with segment 8f15b7ff-ac97-4763-94c8-230ee3797cd5 is connected to physical network datacentre, but agent controller-1.localdomain reported physical networks {}. The physical network must be configured on the agent if binding is to succeed. check_segment_for_agent /usr/lib/python3.6/site-packages/neutron/plugins/ml2/drivers/mech_agent.py:362
2019-05-15 17:50:43.505 30 ERROR neutron.plugins.ml2.managers [req-66064cdf-1845-45dc-91d0-0b9e5c956977 - - - - -] Failed to bind port 2efa0986-ee1f-48cd-a774-c1e23bfb07b0 on host controller-1.localdomain for vnic_type normal using segments [{'id': '8f15b7ff-ac97-4763-94c8-230ee3797cd5', 'network_type': 'flat', 'physical_network': 'datacentre', 'segmentation_id': None, 'network_id': '76be908b-9084-42a5-b84a-f8822da2a22b'}]

# Vxlan support doesn't seem to be included (controller-2)
2019-05-15 17:50:27.697 25 DEBUG neutron.plugins.ml2.drivers.mech_agent [req-5e3c8a2d-af2c-4296-9b13-bda0e50de49c 95facc015ab140bba1948c6ee99dedde f1a98eec448e4abc8cba1e4b537f76cc - default default] Network 759fe946-8d57-469e-838d-aea34632f223 with segment 144f1f86-093d-4953-9ecd-59ca3ca7284f is type of vxlan but agent controller-2.localdomain or mechanism driver only support ['gre', 'local', 'flat', 'vlan']. check_segment_for_agent /usr/lib/python3.6/site-packages/neutron/plugins/ml2/drivers/mech_agent.py:346

# Script used for deployment
#!/bin/bash
openstack overcloud deploy \
--timeout 100 \
--templates /usr/share/openstack-tripleo-heat-templates \
--stack overcloud \
--libvirt-type kvm \
--ntp-server 10.35.255.6 \
-e /home/stack/virt/config_lvm.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/virt/network/network-environment.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovs.yaml \
-e /home/stack/virt/inject-trust-anchor.yaml \
-e /home/stack/virt/hostnames.yml \
-e /home/stack/virt/debug.yaml \
-e /home/stack/virt/nodes_data.yaml \
-e ~/containers-prepare-parameter.yaml \
--log-file overcloud_deployment_80.log

$ cat virt/network/network-environment.yaml | grep gre
NeutronNetworkType: gre
NeutronTunnelTypes: gre

# section in /etc/neutron/plugins/ml2/ml2_conf.ini on controller-0 (podman exec -it neutron_api /bin/bash)
[ml2]
type_drivers=vxlan,vlan,flat,gre
tenant_network_types=vxlan
mechanism_drivers=openvswitch,l2population
path_mtu=0
extension_drivers=qos,port_security
overlay_ip_version=4

Version-Release number of selected component (if applicable):
OSP15, RHOS_TRUNK-15.0-RHEL-8-20190509.n.1

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP15 using InfraRed, topology 1 UC, 3 Controllers, 2 Compute nodes, LVM, OVS, GRE, IPV4
2. Run full Tempest after deployment

Actual results:
Tempest is not able to pass setupClass for several tempest.api.compute.* and tempest.scenario.* tests. It usually ends with this error:
tempest.exceptions.BuildErrorException: Server 112e1a58-b0af-4e52-8386-8f3548fb856b failed to build and is in ERROR status
Details: {'code': 500, 'created': '2019-05-15T18:08:20Z', 'message': 'Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 112e1a58-b0af-4e52-8386-8f3548fb856b.'}

Expected results:
Tempest passes 100%

Additional info:
Thanks to afazekas for initially figuring out the root cause.
Neutron logs from the controllers are attached.

Workaround (on each controller node):
1) Change tenant_network_types manually to "gre" in /var/lib/config-data/puppet-generated/neutron/etc/neutron/plugins/ml2/ml2_conf.ini
2) Restart all neutron-related podman containers:
$ podman restart neutron_metadata_agent neutron_l3_agent neutron_dhcp neutron_api neutron_ovs_agent neutron-haproxy-qrouter*
3) Rerun failing Tempest tests
NeutronNetworkType in your network-environment.yaml file is probably being overridden by the NeutronNetworkType definition in environments/services/neutron-ovs.yaml, because environment files passed later on the command line take precedence over earlier ones. Try changing the order of the files on the command line. Generally speaking, user environment files should appear after the packaged environment files, so that packaged defaults do not override site- or deployment-specific settings.
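As a rough sketch of that reordering, assuming the same files and options as in the original deployment script: only network-environment.yaml is moved, so that it comes after the packaged neutron-ovs.yaml. The wrapper function name deploy_overcloud is made up here for illustration and is not part of any tooling.

```shell
#!/bin/bash
# Sketch only: same options as the reported script, with the site-specific
# network-environment.yaml moved to the end of the -e list so that its
# NeutronNetworkType / NeutronTunnelTypes values are applied last.
# "deploy_overcloud" is a hypothetical wrapper name.
deploy_overcloud() {
  openstack overcloud deploy \
    --timeout 100 \
    --templates /usr/share/openstack-tripleo-heat-templates \
    --stack overcloud \
    --libvirt-type kvm \
    --ntp-server 10.35.255.6 \
    -e /home/stack/virt/config_lvm.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovs.yaml \
    -e /home/stack/virt/inject-trust-anchor.yaml \
    -e /home/stack/virt/hostnames.yml \
    -e /home/stack/virt/debug.yaml \
    -e /home/stack/virt/nodes_data.yaml \
    -e ~/containers-prepare-parameter.yaml \
    -e /home/stack/virt/network/network-environment.yaml \
    --log-file overcloud_deployment_80.log
}
```

Nothing is run until deploy_overcloud is called. The other files under /home/stack/virt are left where they were; whether any of them also needs to move depends on which parameters they set.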
I can confirm that moving "-e /home/stack/virt/network/network-environment.yaml" to the end of the list of YAML files in the overcloud deployment script correctly configures "tenant_network_types=gre" in /var/lib/config-data/puppet-generated/neutron/etc/neutron/plugins/ml2/ml2_conf.ini on the controllers, and the setting is then propagated into the neutron container.
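For anyone who wants to check the same thing on their own controllers, here is a minimal sketch of a helper that reads one option from the [ml2] section of ml2_conf.ini. The function name ml2_option is hypothetical, and the parsing assumes plain key=value lines under an exact "[ml2]" header, as in the excerpt from the report.

```shell
#!/bin/bash
# Print the value of one option from the [ml2] section of an ini file.
# Usage: ml2_option <path-to-ml2_conf.ini> <option-name>
# "ml2_option" is a hypothetical helper name, not part of neutron or tripleo.
ml2_option() {
  awk -v key="$2" '
    # Track whether the current section header is exactly [ml2].
    /^\[/ { in_ml2 = ($0 == "[ml2]"); next }
    # Inside [ml2], match "key = value" lines and print the trimmed value.
    in_ml2 && index($0, "=") > 0 {
      k = substr($0, 1, index($0, "=") - 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      if (k == key) {
        v = substr($0, index($0, "=") + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", v)
        print v
      }
    }
  ' "$1"
}
```

On a controller it could be called as: ml2_option /var/lib/config-data/puppet-generated/neutron/etc/neutron/plugins/ml2/ml2_conf.ini tenant_network_types. If the reordered deployment took effect, this should print gre rather than vxlan.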
*** Bug 1710442 has been marked as a duplicate of this bug. ***
*** Bug 1710453 has been marked as a duplicate of this bug. ***