Bug 1561880 - OVN does not work with VLAN tenant network as expected [Regression- from ml2/ovs] [NEEDINFO]
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openvswitch
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z6
Target Release: 15.0 (Stein)
Assignee: Numan Siddique
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On: 1766930 1673027 1683568
Blocks: 1701963 1508449 1570834 1570843 1571653 1613384
 
Reported: 2018-03-29 06:03 UTC by Eran Kuris
Modified: 2020-06-30 11:31 UTC (History)

Fixed In Version: openvswitch-2.9.0-97.el7fdp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-06-30 11:29:40 UTC
Target Upstream Version:
ekuris: needinfo? (nusiddiq)



Links
System ID Priority Status Summary Last Updated
Launchpad 1765691 None None None 2018-04-20 12:23:37 UTC
OpenStack gerrit 563048 None ABANDONED Use geneve mtu for vlan 2020-09-26 13:51:53 UTC

Internal Links: 1570834

Description Eran Kuris 2018-03-29 06:03:44 UTC
Description of problem:
OVN lacks support for VLAN tenant networks.
It is a regression from ML2/OVS.
It affects SRIOV customers that can't 

Version-Release number of selected component (if applicable):
OSP13 -p 2018-03-02.2

How reproducible:
always

Steps to Reproduce:
1. Deploy an OSP13 HA OVN setup.
2. Verify that VLAN is set as the tenant network type.
3. Create an internal network with the VLAN tenant network type.
4. Boot a VM with a floating IP.

Actual results:
Connectivity does not work.

Expected results:


Additional info:

Comment 5 anil venkata 2018-04-10 07:35:21 UTC
Hi Eran

I have created tenant network like below
neutron net-create --provider:network_type vlan --provider:physical_network datacentre --provider:segmentation_id 20 net2

(overcloud) [stack@undercloud ~]$ neutron net-show net2
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | True                                 |
| availability_zone_hints   |                                      |
| availability_zones        |                                      |
| created_at                | 2018-04-09T06:50:20Z                 |
| description               |                                      |
| dns_domain                |                                      |
| id                        | 16fa5757-a44e-472a-a2f8-997107b378c7 |
| ipv4_address_scope        |                                      |
| ipv6_address_scope        |                                      |
| mtu                       | 1500                                 |
| name                      | net2                                 |
| port_security_enabled     | True                                 |
| project_id                | e8b9795c7ab346d5bc0143bebf61bb00     |
| provider:network_type     | vlan                                 |
| provider:physical_network | datacentre                           |
| provider:segmentation_id  | 20                                   |
| qos_policy_id             |                                      |
| revision_number           | 4                                    |
| router:external           | False                                |
| shared                    | False                                |
| status                    | ACTIVE                               |
| subnets                   | 4bfbf751-e7f1-42dc-be0a-3eea317600b2 |
| tags                      |                                      |
| tenant_id                 | e8b9795c7ab346d5bc0143bebf61bb00     |
| updated_at                | 2018-04-09T06:51:48Z                 |
+---------------------------+--------------------------------------+


I am able to spawn a VM, associate a floating IP with it, and ping that floating IP. Everything is working fine.

In bug description you said - 
Steps to Reproduce:
1. deploy OSP13HA- OVN setup 
2. verify that VLAN set as tenant network type 
3. create an internal network with VLAN tenant network
4. boot VM with floating IP 


How do I verify steps 2 and 3? i.e.
2. verify that VLAN set as tenant network type 
3. create an internal network with VLAN tenant network

Comment 6 Eran Kuris 2018-04-10 08:19:18 UTC
(In reply to anil venkata from comment #5)
> Hi Eran
> 
> I have created tenant network like below
> neutron net-create --provider:network_type vlan --provider:physical_network
> datacentre --provider:segmentation_id 20 net2
> 
> (overcloud) [stack@undercloud ~]$ neutron net-show net2
> neutron CLI is deprecated and will be removed in the future. Use openstack
> CLI instead.
> +---------------------------+--------------------------------------+
> | Field                     | Value                                |
> +---------------------------+--------------------------------------+
> | admin_state_up            | True                                 |
> | availability_zone_hints   |                                      |
> | availability_zones        |                                      |
> | created_at                | 2018-04-09T06:50:20Z                 |
> | description               |                                      |
> | dns_domain                |                                      |
> | id                        | 16fa5757-a44e-472a-a2f8-997107b378c7 |
> | ipv4_address_scope        |                                      |
> | ipv6_address_scope        |                                      |
> | mtu                       | 1500                                 |
> | name                      | net2                                 |
> | port_security_enabled     | True                                 |
> | project_id                | e8b9795c7ab346d5bc0143bebf61bb00     |
> | provider:network_type     | vlan                                 |
> | provider:physical_network | datacentre                           |
> | provider:segmentation_id  | 20                                   |
> | qos_policy_id             |                                      |
> | revision_number           | 4                                    |
> | router:external           | False                                |
> | shared                    | False                                |
> | status                    | ACTIVE                               |
> | subnets                   | 4bfbf751-e7f1-42dc-be0a-3eea317600b2 |
> | tags                      |                                      |
> | tenant_id                 | e8b9795c7ab346d5bc0143bebf61bb00     |
> | updated_at                | 2018-04-09T06:51:48Z                 |
> +---------------------------+--------------------------------------+
> 
> 
> I am able to spawn a vm and associate floatingip and ping this floatingip.
> Everything is working fine.
> 
> In bug description you said - 
> Steps to Reproduce:
> 1. deploy OSP13HA- OVN setup 
> 2. verify that VLAN set as tenant network type 
> 3. create an internal network with VLAN tenant network
> 4. boot VM with floating IP 
> 
> 
> How to verify step 2 and 3? i.e
> 2. verify that VLAN set as tenant network type 
You did, with the network show command.

> 3. create an internal network with VLAN tenant network

When I created the internal network I did not provide: --provider:network_type vlan --provider:physical_network datacentre --provider:segmentation_id 20 net2

It takes this information from the neutron config files.
I am deploying a setup now so you can take a look at it.
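
One way to check step 2 is to read the ML2 configuration directly. The sketch below is illustrative only: it assumes the standard `tenant_network_types` option in the `[ml2]` section, and the sample values and the helper name `tenant_network_types()` are hypothetical rather than taken from the setup in this bug.

```python
import configparser

# Sample ML2 configuration text. The values are illustrative assumptions,
# not copied from the deployment discussed in this bug.
SAMPLE_ML2_CONF = """
[ml2]
tenant_network_types = vlan
mechanism_drivers = ovn

[ml2_type_vlan]
network_vlan_ranges = datacentre:1:1000
"""


def tenant_network_types(conf_text):
    """Return the ordered list of tenant network types from ML2 config text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    raw = parser.get("ml2", "tenant_network_types", fallback="")
    return [item.strip() for item in raw.split(",") if item.strip()]


types = tenant_network_types(SAMPLE_ML2_CONF)
# The first entry is the type used when a tenant creates a network
# without provider attributes.
print("vlan is the default tenant network type:", types[:1] == ["vlan"])
```

With a setting like this, a network created without any `--provider:*` arguments would be allocated as a VLAN network, which matches the behaviour described above.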

Comment 7 anil venkata 2018-04-10 08:51:46 UTC
Thanks Eran, please let me know once the setup is ready. I will look at that.

Comment 8 anil venkata 2018-04-10 11:54:12 UTC
In Eran's setup I see nova failing to bind the neutron port with the error below:

55d647008528b25a38ffc9c6 - default default] Refusing to bind port 425e9063-a428-4f78-9369-ac9865ae2a04 on host compute-0.localdomain due to the OVN chassis bridge mapping physical networks [] not supporting physical network: datacentre

This is because there are no bridge mappings on the compute node (I logged into the compute node and checked for bridge mappings).

I see bridge mappings configured for controller but not for computes.
We might be missing some configuration in THT?
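
The check behind that error can be sketched as follows. This is a simplified illustration, not the networking-ovn code: it assumes the chassis bridge mappings are a comma-separated `physnet:bridge` string, and the function names and the bridge name `br-ex` are hypothetical.

```python
def parse_bridge_mappings(value):
    """Parse 'physnet1:br-a,physnet2:br-b' into a {physnet: bridge} dict."""
    mappings = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        physnet, sep, bridge = item.partition(":")
        if not sep or not physnet or not bridge:
            raise ValueError(f"malformed bridge mapping: {item!r}")
        mappings[physnet] = bridge
    return mappings


def can_bind(physical_network, bridge_mappings_value):
    """Return True if the chassis maps the given physical network to a bridge."""
    return physical_network in parse_bridge_mappings(bridge_mappings_value)


# A compute node with no bridge mappings cannot host a port on 'datacentre'.
print(can_bind("datacentre", ""))
# Once the physical network is mapped to a bridge, binding can proceed.
print(can_bind("datacentre", "datacentre:br-ex"))
```

An empty mapping string corresponds to the `physical networks []` part of the log message above.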

Eran says he used the same templates for OSP12 without OVN and they worked.

Fixing the THT may fix these jobs?
https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/networking-ovn/job/DFG-network-networking-ovn-13_director-rhel-virthost-3cont_2comp-ipv4-vlan/
https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/neutron/job/DFG-network-neutron-12-director-hybrid-3cont-2comp-ipv4-vlan-multiple-nic-sriov-hybrid-ha/

Comment 9 anil venkata 2018-04-10 11:59:14 UTC
@Eran

Please use neutron-ovn-dvr.yaml to create bridge mappings on the compute nodes.

Comment 10 Numan Siddique 2018-04-10 14:42:31 UTC
Also, please note that for VLAN provider/tenant networks in OVN we need to configure ovn-bridge-mappings on the compute nodes. So, as Anil has suggested, we need to use the neutron-ovn-dvr.yaml file for OVN VLAN jobs.

Comment 16 Daniel Alvarez Sanchez 2018-04-17 13:45:08 UTC
A possible solution that came up during our meeting today is to check the network type during the postcommit [0] at network creation time and adjust the MTU accordingly if it's VLAN.

[0] https://github.com/openstack/networking-ovn/blob/master/networking_ovn/ml2/mech_driver.py#L296
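
A minimal sketch of that idea, assuming the adjustment means giving VLAN networks the same MTU as Geneve networks (as the title of the linked gerrit change, "Use geneve mtu for vlan", suggests). The function name is hypothetical and the 58-byte overhead is an assumed placeholder that depends on the deployment; this is not the mech_driver code.

```python
# Assumed encapsulation overhead in bytes; the real value is deployment dependent.
ASSUMED_GENEVE_OVERHEAD = 58


def mtu_for_network(network_type, physnet_mtu, geneve_overhead=ASSUMED_GENEVE_OVERHEAD):
    """Pick the MTU to advertise for a network at creation time."""
    if geneve_overhead < 0 or physnet_mtu <= geneve_overhead:
        raise ValueError("overhead must be non-negative and smaller than the MTU")
    geneve_mtu = physnet_mtu - geneve_overhead
    # VLAN networks are treated like Geneve networks so that their traffic
    # still fits if it ends up being tunnelled.
    if network_type in ("geneve", "vlan"):
        return geneve_mtu
    return physnet_mtu


print(mtu_for_network("vlan", 1500))
print(mtu_for_network("flat", 1500))
```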

Comment 18 Assaf Muller 2018-04-23 13:39:22 UTC
From Network DFG triage call:

We should add a validation that blocks green field deployments that try to use OVN with VLAN tenant networks. Anil will send a patch.
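
Such a validation could look roughly like the sketch below. It is a hypothetical illustration of the rule only: the exception class, the function name, and the way the settings are passed in are assumptions, not the patch that was eventually written.

```python
class UnsupportedConfiguration(Exception):
    """Raised when a deployment combines settings that are not supported."""


def validate_tenant_network_types(mechanism_drivers, tenant_network_types):
    """Reject deployments that use OVN together with VLAN tenant networks."""
    if "ovn" in mechanism_drivers and "vlan" in tenant_network_types:
        raise UnsupportedConfiguration(
            "VLAN tenant networks are not supported with the OVN mechanism driver"
        )


# ML2/OVS with VLAN tenant networks passes; OVN with VLAN is rejected.
validate_tenant_network_types(["openvswitch"], ["vlan"])
try:
    validate_tenant_network_types(["ovn"], ["vlan", "geneve"])
except UnsupportedConfiguration as exc:
    print("deployment blocked:", exc)
```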

Comment 23 anil venkata 2018-04-25 13:00:11 UTC
Reported https://bugzilla.redhat.com/show_bug.cgi?id=1571653 to disable vlan tenant networks.

Comment 25 anil venkata 2018-04-27 10:19:26 UTC
Assaf already removed the blocker flag from this bug. As this is not a blocker bug, I am removing the Regression keyword to make it a non-blocker.

Comment 30 anil venkata 2018-06-26 07:23:24 UTC
VLAN tenant networks are not allowed in OSP13. The tripleo package openstack-tripleo-heat-templates-8.0.2-7.el7ost has been released with the gerrit change https://code.engineering.redhat.com/gerrit/#/c/137225/ which doesn't allow users to create VLAN tenant networks (see bug https://bugzilla.redhat.com/show_bug.cgi?id=1571653).

As tenant vlans are not supported in osp13, we are closing this bug.

Comment 32 Assaf Muller 2018-07-16 12:54:40 UTC
Reopening to track the ongoing work in core OVN to fix the VLAN tenant networks issue so we can test and deliver it on OSP 13 z-stream.

Comment 47 Eran Kuris 2019-02-27 14:34:04 UTC
Blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1683568

Comment 50 Rashid Khan 2019-03-08 15:30:25 UTC
The OVS side is fixed and available in 2.9.
Can we now move this to OSP?

Comment 53 Eran Kuris 2019-03-12 06:46:56 UTC
Blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1683568

Comment 58 Lon Hohberger 2019-03-15 10:33:58 UTC
According to our records, this should be resolved by openvswitch-2.9.0-97.el7fdp.  This build is available now.

Comment 60 Eran Kuris 2019-03-28 07:00:05 UTC
The target release is OSP13 z6; moving to MODIFIED until we get a puddle.

Comment 64 Eran Kuris 2019-04-22 12:12:40 UTC
After debugging the issue with Numan on my setup, I am re-opening the issue.
The finding is related to centralized routing.
Basically, the switch sends the packet to controller-0.
When controller-1 becomes master (when we bring down the Geneve interface on controller-0), I think controller-1 should send a GARP announcing that 10.0.1.1 is now with it. The switch will forward the packet to controller-1 from then on.
Since that doesn't happen, the traffic gets blocked.
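
For reference, the kind of GARP frame meant here can be sketched in a few lines. This is only an illustration of the packet layout (an ARP request whose sender and target IP are both the announced address), not OVN's implementation; the MAC address and the function name `build_garp()` are hypothetical, and 10.0.1.1 is the address from this setup.

```python
import socket
import struct


def build_garp(mac, ip):
    """Build a gratuitous ARP request frame announcing that `ip` is at `mac`."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    if len(mac_bytes) != 6:
        raise ValueError(f"invalid MAC address: {mac!r}")
    ip_bytes = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our MAC as source, EtherType ARP.
    ethernet = b"\xff" * 6 + mac_bytes + struct.pack("!H", 0x0806)
    # ARP header: Ethernet/IPv4, 6-byte MAC, 4-byte IP, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Sender is the announcing node; the target IP equals the sender IP.
    arp += mac_bytes + ip_bytes + b"\x00" * 6 + ip_bytes
    return ethernet + arp


# Hypothetical MAC for the router port that moved to the new master.
frame = build_garp("fa:16:3e:00:00:01", "10.0.1.1")
print(len(frame), frame.hex())
```

Because the frame is sent to the broadcast address with the new node's MAC as the source, a physical switch that receives it relearns which port the MAC lives on.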

Numan will probably add more info.

Thanks, Numan, for helping find the specific problem.

Comment 67 Jakub Libosvar 2019-10-08 13:49:36 UTC
The work on this bug is being tracked for OSP 15. Please don't reschedule the target release; however, feel free to clone this bug and ask for a backport once it's fixed.

