Bug 1907779 - VLAN Transparency: packets dropped with flat provider network
Summary: VLAN Transparency: packets dropped with flat provider network
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-networking-ovn
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: z6
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Slawek Kaplonski
QA Contact: Eduardo Olivares
URL:
Whiteboard:
Depends On:
Blocks: 1846019 1914816
Reported: 2020-12-15 08:26 UTC by Eduardo Olivares
Modified: 2022-10-03 14:47 UTC (History)
CC List: 9 users

Fixed In Version: python-networking-ovn-7.3.1-1.20201114024055.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1914816 (view as bug list)
Environment:
Last Closed: 2021-05-26 13:50:32 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 770345 0 None MERGED Change how ovn mech driver checks if vlan transparency is supported 2021-02-17 07:18:36 UTC
Red Hat Issue Tracker OSP-553 0 None None None 2022-10-03 14:47:05 UTC
Red Hat Product Errata RHBA-2021:2097 0 None None None 2021-05-26 13:51:21 UTC

Description Eduardo Olivares 2020-12-15 08:26:29 UTC
Description of problem:
A flat external provider network is configured with VLAN transparency, and port security is disabled. A subnet is created with default gateway 10.0.0.1, an IP address configured on the 'external' interface of the hypervisor server (I am using an OSP virt environment for this test).

A RHEL VM is created with a port connected to that network. Once that VM starts, a VLAN interface (vlan100) is created on its interface eth0. The IP from eth0 (10.0.0.242/24) is moved to vlan100.

The hypervisor's network configuration is modified:
1) add external.100 VLAN interface on external
2) add 10.0.0.250/24 on external.100
3) remove route to 10.0.0.0/24 from external (now traffic to 10.0.0.0/24 is routed through external.100)
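The three hypervisor steps above can be sketched as a list of iproute2 commands. This is only an illustration: the report does not give the exact commands that were run, the helper names hypervisor_vlan_commands and apply are hypothetical, and the "ip link set ... up" step is an assumption added so that the new interface is usable. By default the sketch only prints the commands instead of executing them.

```python
import subprocess


def hypervisor_vlan_commands(parent="external", vid=100,
                             address="10.0.0.250/24", subnet="10.0.0.0/24"):
    """Return the iproute2 commands (as argv lists) for steps 1-3 above."""
    vlan_if = f"{parent}.{vid}"
    return [
        # 1) add the VLAN interface on top of the parent interface
        ["ip", "link", "add", "link", parent, "name", vlan_if,
         "type", "vlan", "id", str(vid)],
        # assumption: bring the new interface up (not listed in the report)
        ["ip", "link", "set", "dev", vlan_if, "up"],
        # 2) add the address on the VLAN interface
        ["ip", "addr", "add", address, "dev", vlan_if],
        # 3) remove the subnet route that still points at the parent
        ["ip", "route", "del", subnet, "dev", parent],
    ]


def apply(commands, dry_run=True):
    """Print the commands, or run them as root when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    apply(hypervisor_vlan_commands())
```

Once the address is on external.100 and the old route is removed, traffic to 10.0.0.0/24 leaves the hypervisor tagged with VLAN 100, which is the state the test needs.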


Run the following from the VM (no responses are received):
# arping -I vlan100 10.0.0.250


On the compute node, both ARP requests and replies are captured on the ens5 interface (packets are tagged with VLAN 100):
[root@compute-0 ~]# tcpdump -vne -i ens5
tcpdump: listening on ens5, link-type EN10MB (Ethernet), capture size 262144 bytes
08:21:09.442069 fa:16:3e:44:ad:84 > Broadcast, ethertype 802.1Q (0x8100), length 46: vlan 100, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.0.250 (Broadcast) tell 10.0.0.242, length 28                         
08:21:09.442235 52:54:00:9a:8d:95 > fa:16:3e:44:ad:84, ethertype 802.1Q (0x8100), length 46: vlan 100, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.0.0.250 is-at 52:54:00:9a:8d:95, length 28                               


On the compute node, only ARP requests are captured on the br-ex interface (packets are tagged with VLAN 100):
[root@compute-0 ~]# tcpdump -vne -i br-ex 
tcpdump: listening on br-ex, link-type EN10MB (Ethernet), capture size 262144 bytes
08:22:56.473129 fa:16:3e:44:ad:84 > Broadcast, ethertype 802.1Q (0x8100), length 46: vlan 100, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.0.250 (Broadcast) tell 10.0.0.242, length 28
08:22:57.473341 fa:16:3e:44:ad:84 > Broadcast, ethertype 802.1Q (0x8100), length 46: vlan 100, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.0.250 (Broadcast) tell 10.0.0.242, length 28


It looks like the following rule is dropping packets:
 cookie=0x0, duration=54139.119s, table=34, n_packets=52568, n_bytes=2426347, idle_age=0, priority=100,reg10=0/0x1,reg14=0x4,reg15=0x4,metadata=0x4 actions=drop                                                                             




Version-Release number of selected component (if applicable):
RHOS-16.1-RHEL-8-20201203.n.0
ovn2.13-20.09.0-17.el8fdp.x86_64
python3-networking-ovn-7.3.1-1.20201114024043.el8ost.noarch <- this package was installed manually, it is part of the original OSP compose

How reproducible:
100%

Comment 1 Eduardo Olivares 2020-12-16 16:14:29 UTC
A similar test does not fail with VLAN provider networks.

Comment 2 Slawek Kaplonski 2021-01-08 16:04:33 UTC
After investigating, this looks to me like an OVN issue. For a flat Neutron network, the OpenFlow rule that OVN creates in table=0 looks like this:

 cookie=0x779c3409, duration=5186.275s, table=0, n_packets=316, n_bytes=37480, priority=100,in_port="patch-br-int-to",vlan_tci=0x0000/0x1000 actions=load:0x4->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x2->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)


The vlan_tci=0x0000/0x1000 match only accepts frames without an 802.1Q tag. So if VLAN-tagged traffic arrives on that network, it doesn't match that rule (or any other rule in table=0), and such packets aren't processed by the other rules.

When I added a rule like this:

 cookie=0x0, duration=450.544s, table=0, n_packets=70, n_bytes=7194, in_port="patch-br-int-to" actions=load:0x4->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x2->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)

Connectivity was working fine.
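
The difference between the two table=0 rules in this comment comes down to the masked vlan_tci match. A minimal sketch of that comparison, assuming the usual Open vSwitch convention that bit 0x1000 of vlan_tci is set whenever a frame carries an 802.1Q header; the helper names vlan_tci and tci_matches are hypothetical and are not part of OVN or OVS:

```python
VLAN_CFI = 0x1000  # bit set in vlan_tci when an 802.1Q header is present


def vlan_tci(vid, pcp=0):
    """Build the vlan_tci value seen for a frame tagged with the given VLAN ID."""
    return (pcp << 13) | VLAN_CFI | (vid & 0x0FFF)


def tci_matches(tci, value, mask):
    """Masked comparison, as in an OpenFlow 'vlan_tci=value/mask' match."""
    return (tci & mask) == (value & mask)


UNTAGGED = 0x0000
TAGGED_100 = vlan_tci(100)  # the ARP traffic in this report uses VLAN 100

# Rule created by OVN for the flat network: vlan_tci=0x0000/0x1000
print(tci_matches(UNTAGGED, 0x0000, 0x1000))    # untagged frame matches
print(tci_matches(TAGGED_100, 0x0000, 0x1000))  # tagged frame does not match

# Manually added rule has no vlan_tci match, which behaves like mask 0
print(tci_matches(TAGGED_100, 0x0000, 0x0000))  # tagged frame matches
```

Under that assumption, the VLAN 100 frames fall through the OVN-generated rule and are only accepted once the vlan_tci condition is dropped.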

Comment 25 errata-xmlrpc 2021-05-26 13:50:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.6 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2097

