Bug 1238217

Summary: Overcloud floating IPs not working when deploying overcloud with external network on tagged vlan
Product: Red Hat OpenStack
Reporter: Marius Cornea <mcornea>
Component: rhosp-director
Assignee: Dan Sneddon <dsneddon>
Status: CLOSED ERRATA
QA Contact: Marius Cornea <mcornea>
Severity: high
Docs Contact:
Priority: high
Version: 7.0 (Kilo)
CC: amuller, dmacpher, dprince, dsneddon, gfidente, gkeegan, hbrock, mburns, mcornea, rhel-osp-director-maint, rhos-flags, rrosa
Target Milestone: ga
Keywords: Triaged
Target Release: Director
Flags: dmacpher: needinfo+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-0.8.6-40.el7ost python-rdomanager-oscplugin-0.0.8-37.el7ost
Doc Type: Known Issue
Doc Text:
The lack of a CLI parameter to set NeutronExternalNetworkBridge caused problems when assigning floating IPs. This means the only way to set this parameter is through a custom environment file for network isolation. For example:

    parameter_defaults:
      # Set to "br-ex" if External is on native VLAN
      NeutronExternalNetworkBridge: "''"

Set this parameter to '' if the floating IP network is on a VLAN and to 'br-ex' if on a native VLAN on the br-ex bridge. This configuration allows the Neutron bridge mapping to work correctly for the environment. This is documented in the Red Hat Enterprise Linux OpenStack Platform 7 Director Installation and Usage guide.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-08-05 13:58:06 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Description | Flags
env setup | none
openvswitch_agent.log from controller where floating IP router is active | none
output of "ovs-vsctl list port" | none
output of "ip address" on active controller | none
plugin.ini from active controller | none
l3_agent.ini from active controller | none

Description Marius Cornea 2015-07-01 12:19:42 UTC
Created attachment 1045036
env setup

Description of problem:
I'm doing a deployment with 1 controller and 1 compute node with network isolation; the external network is set up on a tagged VLAN. The resulting overcloud ext-net network shows provider:network_type gre, and I believe it should be vlan. It also shows no physical_network name, and segmentation_id is set to 2.

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Deploy overcloud with this:
overcloud deploy --plan-uuid 97b1bce8-e497-4ae6-ad56-ddd5e070b0af  --control-scale 1 --compute-scale 1 --ceph-storage-scale 0 --block-storage-scale 0 --swift-storage-scale 0 -e /home/stack/network-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml --network-cidr 172.16.17.0/24 --floating-ip-cidr 10.35.173.0/24 --floating-ip-start=10.35.173.110 --floating-ip-end=10.35.173.150 --bm-network-gateway=10.35.173.254

Actual results:
I cannot reach the IP set on the external facing interface of the router namespace on the controller.

Expected results:
I should be able to reach the router external IP address from outside.

Additional info:
I believe we need to provide the type, physical network name and segmentation ID for the external net via the deploy command, but I'm not sure if it supports that. Attaching the environment file, NIC template and resulting overcloud network that describe my setup.

Comment 3 Hugh Brock 2015-07-01 13:35:44 UTC
Actually, I'm not sure we should be configuring overcloud Neutron automatically *at all* -- seems to me we should leave that to the cloud admin to do post deployment. 

Alternatively, if it's *easy* to provide the above information and have it run in the post-deploy config stage, that's OK -- but I'd rather take it out entirely than have it be wrong.

--Hugh

Comment 4 chris alfonso 2015-07-01 17:39:07 UTC
Do you have a doc link handy that we can point QE to for configuring the overcloud network? We can open an RFE for UCLI to do this at a later time, perhaps.

Comment 5 Dan Sneddon 2015-07-01 18:20:32 UTC
Marius, I'd like to investigate this further. Can you give me an example of the neutron network settings that would work in your environment?

I think we may end up just documenting how to create the network after the fact for GA, but I still want to make sure we are documenting the right thing and that we have a good idea how to automate it (if need be) in the future.

Comment 6 Dan Sneddon 2015-07-01 19:48:51 UTC
Looking more closely at the environment, it appears that the external net is not being created correctly. The ext-net that the installer sets up is being set up as a gre network. We actually want the ext-net to be connected to a VLAN on br-ex.

I may not be up to date on the latest neutron features, but this used to be set up by creating a network with provider:network_type=vlan and provider:segmentation_id set to the VLAN ID of the external network (see https://www.rdoproject.org/Neutron_with_OVS_and_VLANs). Alternately, the same guide recommends using a flat network when using a dedicated interface or the native VLAN.

One interesting data point is that with the network set up this way, it actually works with br-ex either on a dedicated interface or the native VLAN. I think this is just because the segmentation ID doesn't get used so the end result is an untagged OVS port. I think that's why this wasn't caught before, because we weren't doing testing with VLAN trunking, but instead using dedicated interfaces or putting external on the native VLAN.

We may be able to implement this by always creating an external net with type VLAN, but leave the segmentation_id blank if we are using the native VLAN.

Can someone look at these networks and provide an assessment?


    [stack@rhos-compute-node-13 ~]$ neutron net-show ext-net
    +---------------------------+--------------------------------------+
    | Field                     | Value                                |
    +---------------------------+--------------------------------------+
    | admin_state_up            | True                                 |
    | id                        | 00f07737-4bf8-4c51-b8cb-61f56d30c3c5 |
    | mtu                       | 0                                    |
    | name                      | ext-net                              |
    | provider:network_type     | gre                                  |
    | provider:physical_network |                                      |
    | provider:segmentation_id  | 3                                    |
    | router:external           | True                                 |
    | shared                    | False                                |
    | status                    | ACTIVE                               |
    | subnets                   | e849d632-485a-463e-8c13-63d590792e81 |
    | tenant_id                 | 941391fe92d946c3bd63acb37dd49029     |
    +---------------------------+--------------------------------------+
    [stack@rhos-compute-node-13 ~]$ neutron net-show default-net
    +---------------------------+--------------------------------------+
    | Field                     | Value                                |
    +---------------------------+--------------------------------------+
    | admin_state_up            | True                                 |
    | id                        | 1c7bbff4-d44d-4f35-b895-de319343e7f5 |
    | mtu                       | 0                                    |
    | name                      | default-net                          |
    | provider:network_type     | gre                                  |
    | provider:physical_network |                                      |
    | provider:segmentation_id  | 1                                    |
    | router:external           | False                                |
    | shared                    | True                                 |
    | status                    | ACTIVE                               |
    | subnets                   | c28449a5-6fee-449b-b25b-a222ef14a2ec |
    | tenant_id                 | 941391fe92d946c3bd63acb37dd49029     |
    +---------------------------+--------------------------------------+
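
A check like the one done by eye on the tables above can be scripted. The following is only a sketch: it assumes the `neutron net-show` output has been saved as text, and the function names are made up for illustration. It flags an external network whose provider type is neither vlan nor flat, or whose physical network is empty.

```python
def parse_neutron_table(text):
    """Turn the ASCII table printed by `neutron net-show` into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Data rows look like "| key | value |"; skip borders and prompts.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if len(cells) != 2 or cells[0] == "Field":
            continue
        fields[cells[0]] = cells[1]
    return fields


def external_net_problems(fields):
    """List reasons why a network would not work as a VLAN or flat external net."""
    problems = []
    if fields.get("router:external") != "True":
        problems.append("router:external is not True")
    net_type = fields.get("provider:network_type")
    if net_type not in ("vlan", "flat"):
        problems.append("provider:network_type is %r, expected vlan or flat" % net_type)
    if not fields.get("provider:physical_network"):
        problems.append("provider:physical_network is empty")
    return problems


# Rows copied from the ext-net table above (borders omitted).
EXT_NET = """
| Field                     | Value                                |
| name                      | ext-net                              |
| provider:network_type     | gre                                  |
| provider:physical_network |                                      |
| provider:segmentation_id  | 3                                    |
| router:external           | True                                 |
"""

for problem in external_net_problems(parse_neutron_table(EXT_NET)):
    print(problem)
```

For the ext-net shown here, this reports the gre network type and the empty physical network.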

Comment 7 Assaf Muller 2015-07-01 21:52:24 UTC
You can skip to 'Summary' if you aren't interested in the details.

If you set 'external_network_bridge' in l3_agent.ini to the empty string '' (this is NOT the default), external networks will behave the same as provider bridge networks. This means that external router interfaces will NOT be plugged into br-ex; rather, they will plug into br-int. The installer will then create a bridge (you can call it br-ex, but for clarity's sake let's call it br-eth). The OVS agent will interconnect br-int with br-eth automatically. In ovs_plugin.ini you set up bridge mappings between the physical network (physnet) that the external network will use and that bridge. Let's call the physnet external_physnet. The bridge mappings will look like: external_physnet:br-eth.

Now the installer GUI will (or already does, who knows!) ask the user how to set up the external network: the NIC name, of course, whether this network should be VLAN tagged, and if so, with what VLAN ID. The installer will then hook up br-eth to the physical NIC that was chosen by the user. The installer will then do a post-installation step (when all APIs are available) and create the external network either flat or VLAN tagged, and Neutron will take care of it for you.

Flat:
neutron net-create --router:external=True --provider:network_type flat --provider:physical_network external_physnet public

VLAN:
neutron net-create --router:external=True --provider:network_type vlan --provider:physical_network external_physnet --provider:segmentation_id 42 public

Summary:
l3_agent.ini:
external_network_bridge = ''

ovs_plugin.ini:
bridge_mappings = external_physnet:br-eth

Installer:
* Create OVS bridge called 'br-eth', connect it to the physical NIC the user chose.
* Create the external network either flat or VLAN as detailed above.
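
The choice between the flat and VLAN variants could be sketched as a small helper for a post-installation step. This is an illustration only: the function name is hypothetical, it only assembles the argument list and does not run anything, and it uses the `--provider:network_type` option spelling.

```python
def external_net_create_args(name, physical_network, segmentation_id=None):
    """Build the `neutron net-create` argument list for an external network.

    With a segmentation ID the network is created as type vlan; without
    one it is created as type flat.
    """
    args = [
        "neutron", "net-create",
        "--router:external=True",
        "--provider:physical_network", physical_network,
    ]
    if segmentation_id is None:
        args += ["--provider:network_type", "flat"]
    else:
        # 802.1Q VLAN IDs usable for tagging are 1..4094.
        if not 1 <= segmentation_id <= 4094:
            raise ValueError("VLAN ID must be between 1 and 4094")
        args += ["--provider:network_type", "vlan",
                 "--provider:segmentation_id", str(segmentation_id)]
    args.append(name)
    return args


print(" ".join(external_net_create_args("public", "external_physnet")))
print(" ".join(external_net_create_args("public", "external_physnet", 42)))
```

The physical network name passed in has to match the left-hand side of the bridge_mappings entry (external_physnet in this summary).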

Comment 8 Marius Cornea 2015-07-02 09:37:29 UTC
Here are the neutron settings in the config files. I'm trying to add a VLAN provider network, create a router, and set it to use the VLAN network as its gateway, but without success. The steps are described below. Let me know if I'm missing something. Thanks

[root@overcloud-controller-0 neutron]# grep external_network_bridge l3_agent.ini  | grep -v ^#
external_network_bridge = br-ex

[root@overcloud-controller-0 neutron]# grep bridge_mappings  plugins/openvswitch/ovs_neutron_plugin.ini | grep -v ^#
bridge_mappings =datacentre:br-ex

neutron net-create --provider:physical_network=datacentre --provider:network_type=vlan --provider:segmentation_id=188 --shared --router:external --shared ext-net
neutron subnet-create ext-net 10.35.173.0/24 --name ext-net-subnet --gateway=10.35.173.254 --allocation-pool start=10.35.173.120,end=10.35.173.130 --disable-dhcp
neutron router-create router
neutron router-gateway-set router ext-net

[root@overcloud-controller-0 ~]# ip netns exec qrouter-9abb764c-ae44-43a8-9d9e-de81dd536f93 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
16: qg-cad8fc68-43: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether fa:16:3e:9d:aa:65 brd ff:ff:ff:ff:ff:ff
    inet 10.35.173.120/24 brd 10.35.173.255 scope global qg-cad8fc68-43
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe9d:aa65/64 scope link 
       valid_lft forever preferred_lft forever
[root@overcloud-controller-0 ~]# ip netns exec qrouter-9abb764c-ae44-43a8-9d9e-de81dd536f93 ping 10.35.173.254
PING 10.35.173.254 (10.35.173.254) 56(84) bytes of data.
^C
--- 10.35.173.254 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms

[root@overcloud-controller-0 ~]# ovs-vsctl show
0cbb8301-7887-42b7-b1ee-c107d41728c4
    Bridge br-tun
        fail_mode: secure
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port "gre-ac101c0b"
            Interface "gre-ac101c0b"
                type: gre
                options: {df_default="true", in_key=flow, local_ip="172.16.28.10", out_key=flow, remote_ip="172.16.28.11"}
        Port br-tun
            Interface br-tun
                type: internal
    Bridge br-int
        fail_mode: secure
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port br-int
            Interface br-int
                type: internal
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
    Bridge br-ex
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}
        Port "p1p2"
            Interface "p1p2"
        Port "vlan188"
            tag: 188
            Interface "vlan188"
                type: internal
        Port "vlan184"
            tag: 184
            Interface "vlan184"
                type: internal
        Port br-ex
            Interface br-ex
                type: internal
        Port "qg-cad8fc68-43"
            Interface "qg-cad8fc68-43"
                type: internal
    ovs_version: "2.3.1-git3282e51"

[root@overcloud-controller-0 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 84:8f:69:fb:c6:43 brd ff:ff:ff:ff:ff:ff
    inet 192.0.2.21/24 brd 192.0.2.255 scope global dynamic em1
       valid_lft 79155sec preferred_lft 79155sec
    inet 192.0.2.19/32 brd 192.0.2.255 scope global em1
       valid_lft forever preferred_lft forever
    inet 192.0.2.20/32 brd 192.0.2.255 scope global em1
       valid_lft forever preferred_lft forever
    inet6 fe80::868f:69ff:fefb:c643/64 scope link 
       valid_lft forever preferred_lft forever
3: em2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
    link/ether 84:8f:69:fb:c6:44 brd ff:ff:ff:ff:ff:ff
4: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether a0:36:9f:22:e7:00 brd ff:ff:ff:ff:ff:ff
    inet 172.16.28.10/24 brd 172.16.28.255 scope global p1p1
       valid_lft forever preferred_lft forever
    inet6 fe80::a236:9fff:fe22:e700/64 scope link 
       valid_lft forever preferred_lft forever
5: p1p2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP qlen 1000
    link/ether a0:36:9f:22:e7:02 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::a236:9fff:fe22:e702/64 scope link 
       valid_lft forever preferred_lft forever
6: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN 
    link/ether 2a:9c:b1:93:f6:1b brd ff:ff:ff:ff:ff:ff
7: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether a0:36:9f:22:e7:02 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::a236:9fff:fe22:e702/64 scope link 
       valid_lft forever preferred_lft forever
8: vlan188: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 8e:f1:dc:d5:a3:85 brd ff:ff:ff:ff:ff:ff
    inet 10.35.173.11/24 brd 10.35.173.255 scope global vlan188
       valid_lft forever preferred_lft forever
    inet 10.35.173.10/32 brd 10.35.173.255 scope global vlan188
       valid_lft forever preferred_lft forever
    inet6 fe80::8cf1:dcff:fed5:a385/64 scope link 
       valid_lft forever preferred_lft forever
9: vlan184: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 32:79:c7:ee:32:36 brd ff:ff:ff:ff:ff:ff
    inet 10.35.169.11/24 brd 10.35.169.255 scope global vlan184
       valid_lft forever preferred_lft forever
    inet 10.35.169.10/32 brd 10.35.169.255 scope global vlan184
       valid_lft forever preferred_lft forever
    inet6 fe80::3079:c7ff:feee:3236/64 scope link 
       valid_lft forever preferred_lft forever
10: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN 
    link/ether 2e:1e:ed:79:00:4a brd ff:ff:ff:ff:ff:ff
11: br-tun: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN 
    link/ether 8a:d0:4d:fc:a9:43 brd ff:ff:ff:ff:ff:ff

Comment 9 Dan Prince 2015-07-05 18:33:46 UTC
Good information in this ticket. I was able to reproduce the initial bug, and the suggested Neutron configuration changes seem to resolve the issues.

From an installer standpoint I think we need two patches to help resolve this. One patch for tripleo-heat-templates, and then another for python-rdomanager-oscplugin:

1) https://review.openstack.org/#/c/198557/ (Add NeutronExternalNetworkBridge parameter)

This patch will effectively set the external_network_bridge to '' when network isolation is enabled via the installer.
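
For reference, the value this parameter takes in an environment file depends on where the external network lives, as the Doc Text of this bug describes. A minimal sketch that renders the snippet, with a hypothetical helper name and plain string formatting rather than a YAML library:

```python
def external_bridge_environment(external_on_tagged_vlan):
    """Return environment-file text that sets NeutronExternalNetworkBridge.

    Per the Doc Text: '' when the floating IP network is on a tagged VLAN,
    br-ex when it is on the native VLAN of the br-ex bridge.
    """
    value = "''" if external_on_tagged_vlan else "br-ex"
    return 'parameter_defaults:\n  NeutronExternalNetworkBridge: "{}"\n'.format(value)


print(external_bridge_environment(external_on_tagged_vlan=True))
```

The resulting file would be passed to the deploy command with -e, in the same way as the network-environment.yaml used elsewhere in this report.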

2) https://review.gerrithub.io/238578 (Add --external-net-segmentation-id to deploy).

This patch adds the ability to set the segmentation ID to oscplugin. Simply adding the segmentation ID to the external network settings should automatically flip the network type to 'vlan' and things should get configured properly thereafter. See also: http://git.openstack.org/cgit/openstack/os-cloud-config/tree/os_cloud_config/neutron.py#n89

Comment 11 Dan Sneddon 2015-07-08 22:21:04 UTC
Patch to the T-H-T has been merged: https://review.openstack.org/#/c/198557

Comment 13 Marius Cornea 2015-07-14 21:10:11 UTC
The patch doesn't work when using Tuskar. I ran the following command and external_network_bridge was still set to br-ex in l3_agent.ini.

openstack overcloud deploy --plan-uuid 48066c3d-1a9a-4311-9000-c1eda3aa19c0  --control-scale 3 --compute-scale 1 --ceph-storage-scale 0 --block-storage-scale 0 --swift-storage-scale 0 -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /home/stack/network-environment.yaml --network-cidr 192.168.0.0/24 --floating-ip-cidr=172.16.23.0/24 --floating-ip-start=172.16.23.100 --floating-ip-end=172.16.23.150 --bm-network-gateway=172.16.23.251 --external-net-segmentation-id 10

When running deploy with heat templates (--use-tripleo-heat-templates) I still couldn't reach the l3 agent:

[root@overcloud-controller-2 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovs-system state UP qlen 1000
    link/ether 00:54:87:1b:7f:72 brd ff:ff:ff:ff:ff:ff
3: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN 
    link/ether f6:51:87:b4:eb:9a brd ff:ff:ff:ff:ff:ff
4: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 00:54:87:1b:7f:72 brd ff:ff:ff:ff:ff:ff
    inet 192.0.2.19/24 brd 192.0.2.255 scope global dynamic br-ex
       valid_lft 80853sec preferred_lft 80853sec
    inet 192.0.2.15/32 brd 192.0.2.255 scope global br-ex
       valid_lft forever preferred_lft forever
    inet6 fe80::254:87ff:fe1b:7f72/64 scope link 
       valid_lft forever preferred_lft forever
5: vlan10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 56:ff:07:8b:91:83 brd ff:ff:ff:ff:ff:ff
    inet 172.16.23.13/24 brd 172.16.23.255 scope global vlan10
       valid_lft forever preferred_lft forever
    inet6 fe80::54ff:7ff:fe8b:9183/64 scope link 
       valid_lft forever preferred_lft forever
6: vlan20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 16:8c:df:31:34:29 brd ff:ff:ff:ff:ff:ff
    inet 172.16.20.15/24 brd 172.16.20.255 scope global vlan20
       valid_lft forever preferred_lft forever
    inet 172.16.20.10/32 brd 172.16.20.255 scope global vlan20
       valid_lft forever preferred_lft forever
    inet6 fe80::148c:dfff:fe31:3429/64 scope link 
       valid_lft forever preferred_lft forever
7: vlan30: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 66:fa:8b:71:de:a7 brd ff:ff:ff:ff:ff:ff
    inet 172.16.21.14/24 brd 172.16.21.255 scope global vlan30
       valid_lft forever preferred_lft forever
    inet6 fe80::64fa:8bff:fe71:dea7/64 scope link 
       valid_lft forever preferred_lft forever
8: vlan40: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 86:12:b7:c8:56:a1 brd ff:ff:ff:ff:ff:ff
    inet 172.16.19.13/24 brd 172.16.19.255 scope global vlan40
       valid_lft forever preferred_lft forever
    inet6 fe80::8412:b7ff:fec8:56a1/64 scope link 
       valid_lft forever preferred_lft forever
9: vlan50: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether c6:27:40:e9:3e:2c brd ff:ff:ff:ff:ff:ff
    inet 172.16.22.13/24 brd 172.16.22.255 scope global vlan50
       valid_lft forever preferred_lft forever
    inet6 fe80::c427:40ff:fee9:3e2c/64 scope link 
       valid_lft forever preferred_lft forever
10: br-int: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 5a:e6:07:4f:39:41 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::58e6:7ff:fe4f:3941/64 scope link 
       valid_lft forever preferred_lft forever
11: br-tun: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether ca:6d:a9:46:95:4e brd ff:ff:ff:ff:ff:ff
    inet6 fe80::c86d:a9ff:fe46:954e/64 scope link 
       valid_lft forever preferred_lft forever

[root@overcloud-controller-2 ~]# ip netns list
qrouter-c0213f3e-0677-4b02-be49-921360fec0ca
qdhcp-0225016f-556d-41d6-88cc-a3280664bcdf

[root@overcloud-controller-2 ~]# ip netns exec qrouter-c0213f3e-0677-4b02-be49-921360fec0ca ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
13: qr-ef845bb1-9a: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether fa:16:3e:c4:15:64 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.1/24 brd 192.168.0.255 scope global qr-ef845bb1-9a
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fec4:1564/64 scope link 
       valid_lft forever preferred_lft forever
14: qg-b1baab17-25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether fa:16:3e:6c:26:69 brd ff:ff:ff:ff:ff:ff
    inet 172.16.23.100/24 brd 172.16.23.255 scope global qg-b1baab17-25
       valid_lft forever preferred_lft forever
    inet 172.16.23.101/32 brd 172.16.23.101 scope global qg-b1baab17-25
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe6c:2669/64 scope link 
       valid_lft forever preferred_lft forever

[root@overcloud-controller-2 ~]# ip netns exec qrouter-c0213f3e-0677-4b02-be49-921360fec0ca ping 172.16.23.251
PING 172.16.23.251 (172.16.23.251) 56(84) bytes of data.
From 172.16.23.100 icmp_seq=1 Destination Host Unreachable
From 172.16.23.100 icmp_seq=2 Destination Host Unreachable
From 172.16.23.100 icmp_seq=3 Destination Host Unreachable
From 172.16.23.100 icmp_seq=4 Destination Host Unreachable
^C
--- 172.16.23.251 ping statistics ---
5 packets transmitted, 0 received, +4 errors, 100% packet loss, time 4000ms
pipe 4
[root@overcloud-controller-2 ~]# ovs-vsctl show 
f02d0970-55c9-4f66-8865-b19aced52d55
    Bridge br-ex
        Port "vlan20"
            tag: 20
            Interface "vlan20"
                type: internal
        Port "eth0"
            Interface "eth0"
        Port "vlan30"
            tag: 30
            Interface "vlan30"
                type: internal
        Port br-ex
            Interface br-ex
                type: internal
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}
        Port "vlan10"
            tag: 10
            Interface "vlan10"
                type: internal
        Port "vlan40"
            tag: 40
            Interface "vlan40"
                type: internal
        Port "vlan50"
            tag: 50
            Interface "vlan50"
                type: internal
    Bridge br-int
        fail_mode: secure
        Port br-int
            Interface br-int
                type: internal
        Port "tapbc9c7e3c-6e"
            tag: 1
            Interface "tapbc9c7e3c-6e"
                type: internal
        Port "qr-ef845bb1-9a"
            tag: 4095
            Interface "qr-ef845bb1-9a"
                type: internal
        Port "qg-b1baab17-25"
            tag: 4095
            Interface "qg-b1baab17-25"
                type: internal
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
    Bridge br-tun
        fail_mode: secure
        Port "gre-ac10160b"
            Interface "gre-ac10160b"
                type: gre
                options: {df_default="true", in_key=flow, local_ip="172.16.22.13", out_key=flow, remote_ip="172.16.22.11"}
        Port "gre-ac10160a"
            Interface "gre-ac10160a"
                type: gre
                options: {df_default="true", in_key=flow, local_ip="172.16.22.13", out_key=flow, remote_ip="172.16.22.10"}
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port br-tun
            Interface br-tun
                type: internal
        Port "gre-ac10160c"
            Interface "gre-ac10160c"
                type: gre
                options: {df_default="true", in_key=flow, local_ip="172.16.22.13", out_key=flow, remote_ip="172.16.22.12"}
    ovs_version: "2.3.1-git3282e51"

Comment 14 Dan Sneddon 2015-07-14 22:25:06 UTC
Created attachment 1052083
openvswitch_agent.log from controller where floating IP router is active

This was taken from the active controller where the external network is on a VLAN. The bridge mappings were set to datacentre:br-vlan, and the vlans are on top of br-vlan.
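
When comparing configs, it can help to read the bridge_mappings value as a physnet-to-bridge table and confirm that the physical network of the external net is present in it. A small sketch (the helper name is made up; the value format is a comma-separated list of physnet:bridge pairs):

```python
def parse_bridge_mappings(value):
    """Parse a bridge_mappings value into {physical_network: bridge}."""
    mappings = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        physnet, sep, bridge = entry.partition(":")
        physnet, bridge = physnet.strip(), bridge.strip()
        if not sep or not physnet or not bridge:
            raise ValueError("malformed bridge mapping: %r" % entry)
        mappings[physnet] = bridge
    return mappings


mappings = parse_bridge_mappings("datacentre:br-vlan")
print(mappings.get("datacentre"))
```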

Comment 15 Dan Sneddon 2015-07-14 22:25:43 UTC
Created attachment 1052089
output of "ovs-vsctl list port"

Comment 16 Dan Sneddon 2015-07-14 22:27:18 UTC
Created attachment 1052090
output of "ip address" on active controller

Comment 17 Dan Sneddon 2015-07-14 22:33:32 UTC
Created attachment 1052091
plugin.ini from active controller

Comment 18 Dan Sneddon 2015-07-14 22:33:53 UTC
Created attachment 1052092
l3_agent.ini from active controller

Comment 19 Dan Sneddon 2015-07-14 22:39:26 UTC
I worked with Marius today, and we determined that the CLI is now working as expected. When you use a segmentation ID, it will create the ext-net as a VLAN network with the correct segmentation ID. However, the floating IPs aren't usable. The qg- and qr- ports in OVS have a VLAN tag of 4095.

It seems that the neutron bridge mappings are correct, and I can't see anything wrong with the resulting configs.

I have attached the following logs from the active controller to help with troubleshooting:

plugin.ini
l3_agent.ini
openvswitch_agent.log
output of "ip address"
output of "ovs-vsctl list port"

I would like to have someone with extensive Neutron troubleshooting skills have a look at these logs and offer any suggestions. I think Marius can make the environment available. Adding a NEEDINFO for Assaf Muller.

Comment 20 Assaf Muller 2015-07-15 16:18:38 UTC
I need access to the environment to resolve this ASAP. Also pinged Marius on IRC.

Comment 21 Dan Sneddon 2015-07-15 16:19:54 UTC
I had a theory that this is the issue:

tenant_network_types = gre

And that it needed to be changed to:

tenant_network_types = gre,vlan

In order for the VLAN IDs to be set correctly on the patch ports.

Marius tested this, however, and got the same result.

Comment 23 Assaf Muller 2015-07-16 14:49:32 UTC
First update:

The qr port (the internal router port) having VLAN tag 4095 looked suspicious to me, because that port has nothing to do with external connectivity; it's just a normal port, and it was failing to bind.
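
Spotting such ports in a long "ovs-vsctl show" dump can be scripted. The sketch below assumes the output has been captured as text; the function and constant names are illustrative, and the sample is trimmed from the br-int section posted in Comment 13.

```python
import re

# Tag value that this comment associates with a port failing to bind.
SUSPECT_TAG = 4095


def ports_with_tag(ovs_vsctl_show, tag=SUSPECT_TAG):
    """Return the names of ports whose `tag:` line equals `tag`."""
    matches = []
    current_port = None
    for line in ovs_vsctl_show.splitlines():
        stripped = line.strip()
        port = re.match(r'Port "?([^"]+)"?$', stripped)
        if port:
            current_port = port.group(1)
            continue
        tag_line = re.match(r"tag: (\d+)$", stripped)
        if tag_line and current_port and int(tag_line.group(1)) == tag:
            matches.append(current_port)
    return matches


SAMPLE = '''
    Bridge br-int
        fail_mode: secure
        Port "tapbc9c7e3c-6e"
            tag: 1
            Interface "tapbc9c7e3c-6e"
                type: internal
        Port "qr-ef845bb1-9a"
            tag: 4095
            Interface "qr-ef845bb1-9a"
                type: internal
        Port "qg-b1baab17-25"
            tag: 4095
            Interface "qg-b1baab17-25"
                type: internal
        Port int-br-ex
            Interface int-br-ex
                type: patch
'''

print(ports_with_tag(SAMPLE))
```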

Partial neutron agent-list output:
http://pastebin.test.redhat.com/297963

It shows that on a single machine, the OVS agent and the L3/DHCP agents are reporting different hostnames, which means that binding DHCP and router ports won't work. Per an IRC conversation, this is already being handled.

I'm removing the neutron-n-* hostname in-place currently so I can troubleshoot the external connectivity issue.

Comment 24 Assaf Muller 2015-07-16 15:06:37 UTC
Alright that was the issue. Once I removed the host value from all of the Neutron conf files and used neutron agent-delete on the "old" agents using the "neutron-n-X" hostname, I could create a router, add an internal and external interface successfully and ping the router's external device from other nodes.
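
A quick way to catch this kind of mismatch is to group the agents by the machine they run on and look for more than one reported host. This is a sketch under assumptions: the input tuples would have to be collected by hand (for example from "neutron agent-list" plus knowledge of which node runs which agent), and the sample data is hypothetical, not taken from the pastebin above.

```python
from collections import defaultdict


def nodes_with_mixed_hosts(agents):
    """Return {node: sorted hosts} for nodes whose agents report more than one host.

    `agents` is an iterable of (node, agent_type, reported_host) tuples.
    """
    hosts_by_node = defaultdict(set)
    for node, _agent_type, reported_host in agents:
        hosts_by_node[node].add(reported_host)
    return {node: sorted(hosts)
            for node, hosts in hosts_by_node.items() if len(hosts) > 1}


# Hypothetical sample data for illustration only.
SAMPLE = [
    ("controller-a", "Open vSwitch agent", "host-name-1"),
    ("controller-a", "L3 agent", "host-name-2"),
    ("controller-a", "DHCP agent", "host-name-2"),
    ("compute-a", "Open vSwitch agent", "host-name-3"),
]

print(nodes_with_mixed_hosts(SAMPLE))
```

Any node that shows up in the result has agents registered under different host values, which is the condition that prevented the router and DHCP ports from binding here.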

Comment 25 Marius Cornea 2015-07-16 15:22:52 UTC
I confirm it's working after the changes. It looks like we need to remove NeutronScale in order to get consistent hostnames.

[stack@instack ~]$ neutron router-show tmp
+-----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field                 | Value                                                                                                                                                                                     |
+-----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| admin_state_up        | True                                                                                                                                                                                      |
| distributed           | False                                                                                                                                                                                     |
| external_gateway_info | {"network_id": "097a8151-a00f-43de-8775-d75183de3f2d", "enable_snat": true, "external_fixed_ips": [{"subnet_id": "4ab154d3-50b6-4afa-9087-1b6983daef9a", "ip_address": "172.16.23.101"}]} |
| ha                    | False                                                                                                                                                                                     |
| id                    | 754c72b2-81b4-4e81-9128-c864d6f13453                                                                                                                                                      |
| name                  | tmp                                                                                                                                                                                       |
| routes                |                                                                                                                                                                                           |
| status                | ACTIVE                                                                                                                                                                                    |
| tenant_id             | c243d62651c54451b165eaa34942dac8                                                                                                                                                          |
+-----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
[root@overcloud-controller-1 ~]# ip netns exec qrouter-754c72b2-81b4-4e81-9128-c864d6f13453 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
14: qr-d12b8561-40: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether fa:16:3e:f5:10:c1 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.5/24 brd 192.168.0.255 scope global qr-d12b8561-40
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fef5:10c1/64 scope link 
       valid_lft forever preferred_lft forever
15: qg-8693ab55-a5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether fa:16:3e:ef:61:e7 brd ff:ff:ff:ff:ff:ff
    inet 172.16.23.101/24 brd 172.16.23.255 scope global qg-8693ab55-a5
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:feef:61e7/64 scope link 
       valid_lft forever preferred_lft forever
[root@overcloud-controller-1 ~]# ip netns exec qrouter-754c72b2-81b4-4e81-9128-c864d6f13453 ping 172.16.23.251
PING 172.16.23.251 (172.16.23.251) 56(84) bytes of data.
64 bytes from 172.16.23.251: icmp_seq=1 ttl=64 time=1.36 ms
64 bytes from 172.16.23.251: icmp_seq=2 ttl=64 time=0.421 ms

Comment 27 Marius Cornea 2015-07-21 09:38:31 UTC
Deployed with Tuskar, then created a tenant network and a VLAN external network, added a router, and was able to reach floating IPs:

[stack@instack ~]$ cat network-environment.yaml 
parameters:
 Controller-1::NeutronExternalNetworkBridge: "''"

parameter_defaults:
  InternalApiNetCidr: 172.16.20.0/24
  StorageNetCidr: 172.16.21.0/24
  StorageMgmtNetCidr: 172.16.19.0/24
  TenantNetCidr: 172.16.22.0/24
  ExternalNetCidr: 172.16.23.0/24
  InternalApiAllocationPools: [{'start': '172.16.20.10', 'end': '172.16.20.100'}]
  StorageAllocationPools: [{'start': '172.16.21.10', 'end': '172.16.21.100'}]
  StorageMgmtAllocationPools: [{'start': '172.16.19.10', 'end': '172.16.19.100'}]
  TenantAllocationPools: [{'start': '172.16.22.10', 'end': '172.16.22.100'}]
  ExternalAllocationPools: [{'start': '172.16.23.10', 'end': '172.16.23.100'}]
  ExternalInterfaceDefaultRoute: 172.16.23.251

openstack overcloud deploy --control-scale 3 --compute-scale 1 -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml  --plan overcloud

neutron net-create ext_net --provider:network_type vlan  --provider:physical_network  datacentre  --provider:segmentation_id 10  --router:external
neutron  subnet-create ext_net --allocation-pool  start=172.16.23.110,end=172.16.23.150  --gateway 172.16.23.251 --cidr 172.16.23.0/24 --enable_dhcp=False
neutron net-create tenant-net
neutron subnet-create tenant-net 192.168.0.0/24 --name tenant-subnet --gateway 192.168.0.1
neutron router-create  tenant-router
neutron router-interface-add tenant-router tenant-subnet
neutron router-gateway-set tenant-router ext_net
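
The address plan used above can be sanity-checked with a short Python sketch. It only confirms that the deployment pool (ExternalAllocationPools) and the floating IP pool passed to subnet-create do not collide, and that the gateway sits inside the external CIDR but outside both pools. The helper names (to_range, pools_overlap, in_pool, check_plan) are hypothetical and not part of any OpenStack tool; the values are copied from the environment file and the commands in this comment.

```python
import ipaddress

# Values copied from network-environment.yaml and the subnet-create command.
EXTERNAL_CIDR = ipaddress.ip_network("172.16.23.0/24")   # ExternalNetCidr
DEPLOY_POOL = ("172.16.23.10", "172.16.23.100")          # ExternalAllocationPools
FLOATING_POOL = ("172.16.23.110", "172.16.23.150")       # --allocation-pool
GATEWAY = ipaddress.ip_address("172.16.23.251")          # ExternalInterfaceDefaultRoute


def to_range(pool):
    """Convert an inclusive (start, end) pool into a pair of integers."""
    start, end = (int(ipaddress.ip_address(addr)) for addr in pool)
    if start > end:
        raise ValueError("pool start is after pool end: %s-%s" % pool)
    return start, end


def pools_overlap(a, b):
    """Return True if two inclusive pools share at least one address."""
    a_start, a_end = to_range(a)
    b_start, b_end = to_range(b)
    return a_start <= b_end and b_start <= a_end


def in_pool(addr, pool):
    """Return True if addr falls inside the inclusive pool."""
    start, end = to_range(pool)
    return start <= int(addr) <= end


def check_plan():
    """Return a list of problems in the address plan (empty if none)."""
    problems = []
    if GATEWAY not in EXTERNAL_CIDR:
        problems.append("gateway is outside the external CIDR")
    for name, pool in (("deploy", DEPLOY_POOL), ("floating", FLOATING_POOL)):
        for addr in pool:
            if ipaddress.ip_address(addr) not in EXTERNAL_CIDR:
                problems.append("%s pool address %s is outside the CIDR" % (name, addr))
        if in_pool(GATEWAY, pool):
            problems.append("gateway falls inside the %s pool" % name)
    if pools_overlap(DEPLOY_POOL, FLOATING_POOL):
        problems.append("deploy and floating pools overlap")
    return problems


if __name__ == "__main__":
    print(check_plan() or "address plan is consistent")
```

This is an offline check only; it does not query Neutron or verify the VLAN tagging that this bug is about.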

Comment 28 Dan Sneddon 2015-07-29 06:21:14 UTC
I updated the doc text to reflect that the external bridge parameter, when prefixed with Controller-1::, must be in the parameters: section, not the parameter_defaults: section.

Adding a needinfo for Dan Macpherson, just to make sure he sees this.

Comment 30 errata-xmlrpc 2015-08-05 13:58:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2015:1549