Bug 1819055 - Flow rule missing for DVR trunk port
Summary: Flow rule missing for DVR trunk port
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z12
Target Release: 13.0 (Queens)
Assignee: Slawek Kaplonski
QA Contact: Alex Katz
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-03-31 05:21 UTC by Brendan Shephard
Modified: 2023-10-06 19:36 UTC (History)

Fixed In Version: openstack-neutron-12.1.1-17.el7ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-06-24 11:53:08 UTC
Target Upstream Version:
Embargoed:




Links
System                  ID              Private  Priority  Status  Summary                                                     Last Updated
Launchpad               1870114         0        None      None    None                                                        2020-04-01 11:09:11 UTC
OpenStack gerrit        716642          0        None      MERGED  Add trunk subports to be one of dvr serviced device owners  2021-02-05 10:37:29 UTC
Red Hat Issue Tracker   OSP-29420       0        None      None    None                                                        2023-10-06 19:36:08 UTC
Red Hat Product Errata  RHBA-2020:2724  0        None      None    None                                                        2020-06-24 11:53:32 UTC

Description Brendan Shephard 2020-03-31 05:21:43 UTC
Description of problem:
When using DVR with Neutron trunk ports, such as with OCP and Kuryr, no flow rule is created on the br-int bridge for the pods.

Version-Release number of selected component (if applicable):
openstack-neutron-12.1.1-6.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy OpenStack with DVR
2. Try to deploy OCP3 or 4 with Kuryr


Actual results:
The pod IPs and MAC addresses are missing from the flow rules in table 1 of br-int.

We can see the SYN sent from the master to the LB, and we can see the ACK come back from the LB and land on the physical interface of the hypervisor. However, the packet never makes it through br-int because the OpenFlow rule is missing.
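One way to confirm this symptom is to compare the pod MAC addresses against the `dl_dst` matches present in table 1 of br-int. The following is a minimal sketch, assuming the output of `ovs-ofctl dump-flows br-int` has been captured as text. The helper names are hypothetical, and the sample dump reuses two flow lines from this report.

```python
import re

# Patterns for the table number and the destination MAC in a flow's match fields.
TABLE_RE = re.compile(r"\btable=(\d+)")
DL_DST_RE = re.compile(r"\bdl_dst=([0-9a-f]{2}(?::[0-9a-f]{2}){5})", re.IGNORECASE)


def macs_with_table1_flow(dump_text):
    """Return the set of dl_dst MACs that are matched by a flow in table 1."""
    macs = set()
    for line in dump_text.splitlines():
        table = TABLE_RE.search(line)
        dst = DL_DST_RE.search(line)
        if table and dst and table.group(1) == "1":
            macs.add(dst.group(1).lower())
    return macs


def missing_macs(dump_text, expected_macs):
    """Return the expected MACs that have no dl_dst flow in table 1, sorted."""
    present = macs_with_table1_flow(dump_text)
    return sorted(m.lower() for m in expected_macs if m.lower() not in present)


# Two flow lines copied from this report: one per-port rule and the default drop.
SAMPLE_DUMP = """\
 cookie=0x8dfffa79c8ad5fd8, duration=6535.165s, table=1, n_packets=127740, n_bytes=82727443, idle_age=0, priority=4,dl_vlan=35,dl_dst=fa:16:3e:67:5f:f3 actions=mod_dl_src:fa:16:3e:d4:43:1e,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6565.061s, table=1, n_packets=231411, n_bytes=18103710, idle_age=0, priority=1 actions=drop
"""

if __name__ == "__main__":
    # The second MAC is one of the pod MACs for which a flow had to be added by hand.
    pods = ["fa:16:3e:67:5f:f3", "fa:16:3e:76:6a:72"]
    print(missing_macs(SAMPLE_DUMP, pods))
```

Any MAC reported as missing would fall through to the `priority=1 actions=drop` rule in table 1, which matches the behaviour described above.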



Expected results:
OpenFlow rules are added for the pods so that they work with DVR.


Additional info:
We can fix this temporarily like this:

diff --git a/neutron/common/utils.py b/neutron/common/utils.py
index ce95bae766..ddbd25d36b 100644
--- a/neutron/common/utils.py
+++ b/neutron/common/utils.py
@@ -175,7 +175,7 @@ def get_other_dvr_serviced_device_owners():
     """
     return [n_const.DEVICE_OWNER_LOADBALANCER,
             n_const.DEVICE_OWNER_LOADBALANCERV2,
-            n_const.DEVICE_OWNER_DHCP]
+            n_const.DEVICE_OWNER_DHCP, "trunk:subport"]
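To illustrate why adding one entry to this list changes the outcome, here is a simplified, self-contained stand-in for the device-owner check. It is not Neutron's code: the constant values are assumed to mirror those in neutron-lib, and the `include_trunk_subports` parameter is hypothetical, used only to show the behaviour before and after the change.

```python
# Values assumed to mirror neutron-lib constants; illustrative only.
DEVICE_OWNER_COMPUTE_PREFIX = "compute:"
DEVICE_OWNER_LOADBALANCER = "neutron:LOADBALANCER"
DEVICE_OWNER_LOADBALANCERV2 = "neutron:LOADBALANCERV2"
DEVICE_OWNER_DHCP = "network:dhcp"
TRUNK_SUBPORT_OWNER = "trunk:subport"


def get_other_dvr_serviced_device_owners(include_trunk_subports):
    """Return non-compute device owners that are treated as DVR serviced.

    include_trunk_subports is a hypothetical switch that models the patch.
    """
    owners = [DEVICE_OWNER_LOADBALANCER,
              DEVICE_OWNER_LOADBALANCERV2,
              DEVICE_OWNER_DHCP]
    if include_trunk_subports:
        owners.append(TRUNK_SUBPORT_OWNER)
    return owners


def is_dvr_serviced(device_owner, include_trunk_subports):
    """Return True if a port with this device owner is serviced by DVR."""
    return (device_owner.startswith(DEVICE_OWNER_COMPUTE_PREFIX) or
            device_owner in get_other_dvr_serviced_device_owners(
                include_trunk_subports))


if __name__ == "__main__":
    # A Kuryr pod port is a trunk subport, so its device owner is "trunk:subport".
    print(is_dvr_serviced("trunk:subport", include_trunk_subports=False))
    print(is_dvr_serviced("trunk:subport", include_trunk_subports=True))
```

Under these assumptions, a subport is not recognised as DVR serviced until its device owner is in the list, which is consistent with the missing per-port flows in table 1.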

After applying this fix, we can see the flow rules are created and the PODs start working:

 cookie=0xb4e096cd7cd41ef2, duration=6589.071s, table=4, n_packets=40236, n_bytes=3006552, idle_age=0, priority=1,tun_id=0x4f actions=mod_vlan_vid:36,resubmit(,9)
 
cookie=0xb4e096cd7cd41ef2, duration=6503.529s, table=9, n_packets=2018471, n_bytes=824557716, idle_age=0, priority=1,dl_src=fa:16:3f:86:aa:62 actions=output:1
 
 cookie=0x8dfffa79c8ad5fd8, duration=6532.523s, table=0, n_packets=2019192, n_bytes=824921989, idle_age=0, priority=2,in_port=2,dl_src=fa:16:3f:86:aa:62 actions=resubmit(,1)
 
 cookie=0x8dfffa79c8ad5fd8, duration=6535.165s, table=1, n_packets=127740, n_bytes=82727443, idle_age=0, priority=4,dl_vlan=35,dl_dst=fa:16:3e:67:5f:f3 actions=mod_dl_src:fa:16:3e:d4:43:1e,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6535.150s, table=1, n_packets=0, n_bytes=0, idle_age=59929, priority=4,dl_vlan=35,dl_dst=fa:16:3e:5f:39:0c actions=mod_dl_src:fa:16:3e:d4:43:1e,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6535.007s, table=1, n_packets=0, n_bytes=0, idle_age=58397, priority=4,dl_vlan=37,dl_dst=fa:16:3e:8c:f6:5f actions=mod_dl_src:fa:16:3e:a1:88:fe,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6535.005s, table=1, n_packets=0, n_bytes=0, idle_age=58211, priority=4,dl_vlan=37,dl_dst=fa:16:3e:8c:87:18 actions=mod_dl_src:fa:16:3e:a1:88:fe,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6535.002s, table=1, n_packets=0, n_bytes=0, idle_age=58285, priority=4,dl_vlan=37,dl_dst=fa:16:3e:59:46:08 actions=mod_dl_src:fa:16:3e:a1:88:fe,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6534.992s, table=1, n_packets=0, n_bytes=0, idle_age=58071, priority=4,dl_vlan=37,dl_dst=fa:16:3e:a8:bc:2a actions=mod_dl_src:fa:16:3e:a1:88:fe,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6534.938s, table=1, n_packets=0, n_bytes=0, idle_age=60294, priority=4,dl_vlan=33,dl_dst=fa:16:3e:00:46:56 actions=mod_dl_src:fa:16:3e:60:8b:2e,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6534.932s, table=1, n_packets=0, n_bytes=0, idle_age=58269, priority=4,dl_vlan=37,dl_dst=fa:16:3e:e0:ff:63 actions=mod_dl_src:fa:16:3e:a1:88:fe,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6534.929s, table=1, n_packets=996185, n_bytes=393594000, idle_age=1, priority=4,dl_vlan=35,dl_dst=fa:16:3e:10:d0:8f actions=mod_dl_src:fa:16:3e:d4:43:1e,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6534.915s, table=1, n_packets=0, n_bytes=0, idle_age=58394, priority=4,dl_vlan=37,dl_dst=fa:16:3e:17:22:f7 actions=mod_dl_src:fa:16:3e:a1:88:fe,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6565.061s, table=1, n_packets=231411, n_bytes=18103710, idle_age=0, priority=1 actions=drop
 
cookie=0xc3829487a610f91c, duration=63152.193s, table=1, n_packets=1831803, n_bytes=832269353, idle_age=0, priority=1,dl_vlan=36,dl_src=fa:16:3e:d4:43:1e actions=mod_dl_src:fa:16:3f:86:aa:62,resubmit(,2)
 cookie=0xc3829487a610f91c, duration=63152.102s, table=1, n_packets=45295, n_bytes=3446743, idle_age=6, priority=1,dl_vlan=37,dl_src=fa:16:3e:6c:0c:97 actions=mod_dl_src:fa:16:3f:86:aa:62,resubmit(,2)
 cookie=0xc3829487a610f91c, duration=63151.930s, table=1, n_packets=0, n_bytes=0, idle_age=63223, priority=1,dl_vlan=38,dl_src=fa:16:3e:a1:88:fe actions=mod_dl_src:fa:16:3f:86:aa:62,resubmit(,2)
 
 
[root@bm-chci-0 ~]# ovs-ofctl add-flow br-int 'cookie=0x8dfffa79c8ad5fd8, duration=8100.536s, table=1, priority=4,dl_vlan=36,dl_dst=fa:16:3e:76:6a:72 actions=mod_dl_src:fa:16:3e:6c:0c:97,resubmit(,60)'
[root@bm-chci-0 ~]# ovs-ofctl add-flow br-int 'cookie=0x8dfffa79c8ad5fd8, duration=8100.536s, table=1, priority=4,dl_vlan=36,dl_dst=fa:16:3e:9f:79:d6 actions=mod_dl_src:fa:16:3e:6c:0c:97,resubmit(,60)'



Before the fix, pods end up in CrashLoopBackOff:
webconsole-b5c499c98-bjj4q   1/1       Running            0          10s       10.11.0.29   master-2.ocp311.lab.diktio.net   <none>
webconsole-b5c499c98-ssldr   0/1       CrashLoopBackOff   6          12m       10.11.0.28   master-0.ocp311.lab.diktio.net   <none>
webconsole-b5c499c98-tvww5   1/1       Running            0          41m       10.11.0.7    master-1.ocp311.lab.diktio.net   <none>



After fixing the OpenFlow rules:

[root@master-2 ~]# oc delete pod webconsole-b5c499c98-ssldr
pod "webconsole-b5c499c98-ssldr" deleted
[root@master-2 ~]# oc get pods -o wide
NAME                         READY     STATUS    RESTARTS   AGE       IP           NODE                             NOMINATED NODE
webconsole-b5c499c98-bjj4q   1/1       Running   0          30s       10.11.0.29   master-2.ocp311.lab.diktio.net   <none>
webconsole-b5c499c98-tvww5   1/1       Running   0          41m       10.11.0.7    master-1.ocp311.lab.diktio.net   <none>
webconsole-b5c499c98-w4c56   1/1       Running   0          11s       10.11.0.42   master-0.ocp311.lab.diktio.net   <none>

Comment 2 Brendan Shephard 2020-03-31 21:20:42 UTC
Sorry, my description was slightly misleading. I will re-phrase. 

Prior to the fix, these are the OpenFlow rules that exist:

br-tun:
cookie=0xb4e096cd7cd41ef2, duration=6589.071s, table=4, n_packets=40236, n_bytes=3006552, idle_age=0, priority=1,tun_id=0x4f actions=mod_vlan_vid:36,resubmit(,9)
 
cookie=0xb4e096cd7cd41ef2, duration=6503.529s, table=9, n_packets=2018471, n_bytes=824557716, idle_age=0, priority=1,dl_src=fa:16:3f:86:aa:62 actions=output:1
 

br-int:
 cookie=0x8dfffa79c8ad5fd8, duration=6532.523s, table=0, n_packets=2019192, n_bytes=824921989, idle_age=0, priority=2,in_port=2,dl_src=fa:16:3f:86:aa:62 actions=resubmit(,1)
 
 cookie=0x8dfffa79c8ad5fd8, duration=6535.165s, table=1, n_packets=127740, n_bytes=82727443, idle_age=0, priority=4,dl_vlan=35,dl_dst=fa:16:3e:67:5f:f3 actions=mod_dl_src:fa:16:3e:d4:43:1e,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6535.150s, table=1, n_packets=0, n_bytes=0, idle_age=59929, priority=4,dl_vlan=35,dl_dst=fa:16:3e:5f:39:0c actions=mod_dl_src:fa:16:3e:d4:43:1e,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6535.007s, table=1, n_packets=0, n_bytes=0, idle_age=58397, priority=4,dl_vlan=37,dl_dst=fa:16:3e:8c:f6:5f actions=mod_dl_src:fa:16:3e:a1:88:fe,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6535.005s, table=1, n_packets=0, n_bytes=0, idle_age=58211, priority=4,dl_vlan=37,dl_dst=fa:16:3e:8c:87:18 actions=mod_dl_src:fa:16:3e:a1:88:fe,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6535.002s, table=1, n_packets=0, n_bytes=0, idle_age=58285, priority=4,dl_vlan=37,dl_dst=fa:16:3e:59:46:08 actions=mod_dl_src:fa:16:3e:a1:88:fe,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6534.992s, table=1, n_packets=0, n_bytes=0, idle_age=58071, priority=4,dl_vlan=37,dl_dst=fa:16:3e:a8:bc:2a actions=mod_dl_src:fa:16:3e:a1:88:fe,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6534.938s, table=1, n_packets=0, n_bytes=0, idle_age=60294, priority=4,dl_vlan=33,dl_dst=fa:16:3e:00:46:56 actions=mod_dl_src:fa:16:3e:60:8b:2e,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6534.932s, table=1, n_packets=0, n_bytes=0, idle_age=58269, priority=4,dl_vlan=37,dl_dst=fa:16:3e:e0:ff:63 actions=mod_dl_src:fa:16:3e:a1:88:fe,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6534.929s, table=1, n_packets=996185, n_bytes=393594000, idle_age=1, priority=4,dl_vlan=35,dl_dst=fa:16:3e:10:d0:8f actions=mod_dl_src:fa:16:3e:d4:43:1e,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6534.915s, table=1, n_packets=0, n_bytes=0, idle_age=58394, priority=4,dl_vlan=37,dl_dst=fa:16:3e:17:22:f7 actions=mod_dl_src:fa:16:3e:a1:88:fe,resubmit(,60)
 cookie=0x8dfffa79c8ad5fd8, duration=6565.061s, table=1, n_packets=231411, n_bytes=18103710, idle_age=0, priority=1 actions=drop
 
 cookie=0xc3829487a610f91c, duration=63152.193s, table=1, n_packets=1831803, n_bytes=832269353, idle_age=0, priority=1,dl_vlan=36,dl_src=fa:16:3e:d4:43:1e actions=mod_dl_src:fa:16:3f:86:aa:62,resubmit(,2)
 cookie=0xc3829487a610f91c, duration=63152.102s, table=1, n_packets=45295, n_bytes=3446743, idle_age=6, priority=1,dl_vlan=37,dl_src=fa:16:3e:6c:0c:97 actions=mod_dl_src:fa:16:3f:86:aa:62,resubmit(,2)
 cookie=0xc3829487a610f91c, duration=63151.930s, table=1, n_packets=0, n_bytes=0, idle_age=63223, priority=1,dl_vlan=38,dl_src=fa:16:3e:a1:88:fe actions=mod_dl_src:fa:16:3f:86:aa:62,resubmit(,2)
 

These OpenFlow rules do not cover our trunk port traffic, so we need to add the missing rules manually:
[root@bm-chci-0 ~]# ovs-ofctl add-flow br-int 'cookie=0x8dfffa79c8ad5fd8, duration=8100.536s, table=1, priority=4,dl_vlan=36,dl_dst=fa:16:3e:76:6a:72 actions=mod_dl_src:fa:16:3e:6c:0c:97,resubmit(,60)'
[root@bm-chci-0 ~]# ovs-ofctl add-flow br-int 'cookie=0x8dfffa79c8ad5fd8, duration=8100.536s, table=1, priority=4,dl_vlan=36,dl_dst=fa:16:3e:9f:79:d6 actions=mod_dl_src:fa:16:3e:6c:0c:97,resubmit(,60)'
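When several pods are affected, the manual commands above can be generated rather than typed. This is a rough sketch only: the function and parameter names are hypothetical, and `src_mac` is simply the address written by `mod_dl_src` in the commands above (it appears to be the router interface MAC for that network). The script builds the command lines and prints them; it does not run `ovs-ofctl`.

```python
import shlex


def build_add_flow(bridge, vlan, pod_mac, src_mac, cookie=None):
    """Build an argv list for one table 1 rule, shaped like the manual commands."""
    match = f"table=1,priority=4,dl_vlan={vlan},dl_dst={pod_mac}"
    if cookie is not None:
        match = f"cookie={cookie},{match}"
    actions = f"mod_dl_src:{src_mac},resubmit(,60)"
    return ["ovs-ofctl", "add-flow", bridge, f"{match} actions={actions}"]


if __name__ == "__main__":
    # Pod MACs, VLAN, source MAC, and cookie taken from the two commands in this comment.
    pod_macs = ["fa:16:3e:76:6a:72", "fa:16:3e:9f:79:d6"]
    for mac in pod_macs:
        argv = build_add_flow("br-int", 36, mac, "fa:16:3e:6c:0c:97",
                              cookie="0x8dfffa79c8ad5fd8")
        # shlex.join (Python 3.8+) quotes the flow spec for safe pasting into a shell.
        print(shlex.join(argv))
```

Printing the commands instead of executing them leaves room to review each rule before it is applied on the hypervisor.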




By applying this fix in the utils.py file, we can resolve this issue:
diff --git a/neutron/common/utils.py b/neutron/common/utils.py
index ce95bae766..ddbd25d36b 100644
--- a/neutron/common/utils.py
+++ b/neutron/common/utils.py
@@ -175,7 +175,7 @@ def get_other_dvr_serviced_device_owners():
     """
     return [n_const.DEVICE_OWNER_LOADBALANCER,
             n_const.DEVICE_OWNER_LOADBALANCERV2,
-            n_const.DEVICE_OWNER_DHCP]
+            n_const.DEVICE_OWNER_DHCP, "trunk:subport"]

Comment 20 errata-xmlrpc 2020-06-24 11:53:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2724

