Bug 1654840 - Routers hosted on one of two networker nodes unable to access external network
Summary: Routers hosted on one of two networker nodes unable to access external network
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 13.0 (Queens)
Assignee: Rodolfo Alonso
QA Contact: Roee Agiman
URL:
Whiteboard:
Duplicates: 1654836
Depends On:
Blocks:
 
Reported: 2018-11-29 19:42 UTC by Lars Kellogg-Stedman
Modified: 2022-07-09 13:49 UTC (History)
CC List: 8 users

Fixed In Version: openstack-neutron-12.0.5-5.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-04-30 17:23:34 UTC
Target Upstream Version:
Embargoed:


Attachments
openvswitch-agent.log after the restart (55.75 KB, text/plain)
2018-11-30 14:57 UTC, Lars Kellogg-Stedman
ovs-agent log (51.84 KB, text/plain)
2018-12-05 18:31 UTC, Lars Kellogg-Stedman


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1697243 0 None None None 2018-12-07 17:57:20 UTC
OpenStack gerrit 587244 0 None None None 2018-12-11 13:54:42 UTC
Red Hat Issue Tracker OSP-17450 0 None None None 2022-07-09 13:49:14 UTC
Red Hat Product Errata RHSA-2019:0935 0 None None None 2019-04-30 17:23:46 UTC

Description Lars Kellogg-Stedman 2018-11-29 19:42:57 UTC
We have two networker nodes in our OSP 13 deployment.  We have an external
network attached to br-ex that works fine on one node, but routers attached to
that network on the second networker are unable to pass packets.  Looking at
the OpenFlow rules on the working node, we see the following, which
looks approximately correct:

    [root@neu-17-11-nc2 ~]# ovs-ofctl dump-flows br-ex
     cookie=0xb2514f67fe7d2fc1, duration=87362.294s, table=0, n_packets=2563737, n_bytes=251925611, priority=4,in_port="phy-br-ex",dl_vlan=12 actions=strip_vlan,NORMAL
     cookie=0x79c8c5bf077fd19e, duration=87362.244s, table=0, n_packets=47, n_bytes=2814, priority=4,in_port="phy-br-ex",dl_vlan=14 actions=strip_vlan,NORMAL
     cookie=0xb2514f67fe7d2fc1, duration=87378.996s, table=0, n_packets=221465, n_bytes=12102660, priority=2,in_port="phy-br-ex" actions=drop
     cookie=0xb2514f67fe7d2fc1, duration=87379.025s, table=0, n_packets=5451790, n_bytes=18038331531, priority=0 actions=NORMAL

This is consistent with `ovs-vsctl show`, which shows ports tagged with VLAN 12
associated with that external network:

            Port "qg-f921a854-20"
                tag: 12
                Interface "qg-f921a854-20"
                    type: internal

On the other networker, the OpenFlow rules for br-ex look wrong:

    [root@neu-19-11-nc1 ~]# ovs-ofctl dump-flows br-ex
     cookie=0x951ea463c71bfcae, duration=111477.400s, table=0, n_packets=816, n_bytes=83579, priority=2,in_port="phy-br-ex" actions=drop
     cookie=0x951ea463c71bfcae, duration=111477.407s, table=0, n_packets=0, n_bytes=0, priority=0 actions=NORMAL

As you might expect given the above rule, routers on this host that
are attached to the external network are unable to pass packets.

If we power off one of the networker nodes so that everything fails
back to a single host, the problem goes away and all routers attached
to the external network work correctly.

Comment 1 Lars Kellogg-Stedman 2018-11-29 20:29:22 UTC
After restarting the node, I've been able to reproduce the problem.

Here's a router on nc1:

    [root@neu-19-11-nc1 ~]# ip netns exec qrouter-4cb73b3c-d4d1-4add-9191-3d60e3abb0f7 ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    72: ha-3f3d5c15-a8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UNKNOWN group default qlen 1000
        link/ether fa:16:3e:2a:ce:1e brd ff:ff:ff:ff:ff:ff
        inet 169.254.192.7/18 brd 169.254.255.255 scope global ha-3f3d5c15-a8
           valid_lft forever preferred_lft forever
        inet 169.254.0.4/24 scope global ha-3f3d5c15-a8
           valid_lft forever preferred_lft forever
        inet6 fe80::f816:3eff:fe2a:ce1e/64 scope link
           valid_lft forever preferred_lft forever
    73: qg-824a6859-ae: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9050 qdisc noqueue state UNKNOWN group default qlen 1000
        link/ether fa:16:3e:d4:2e:b0 brd ff:ff:ff:ff:ff:ff
        inet 128.31.27.33/22 scope global qg-824a6859-ae
           valid_lft forever preferred_lft forever
        inet6 fe80::f816:3eff:fed4:2eb0/64 scope link nodad
           valid_lft forever preferred_lft forever
    
    
    [root@neu-19-11-nc1 ~]# ip netns exec qrouter-4cb73b3c-d4d1-4add-9191-3d60e3abb0f7 ip route
    default via 128.31.24.1 dev qg-824a6859-ae
    128.31.24.0/22 dev qg-824a6859-ae proto kernel scope link src 128.31.27.33
    169.254.0.0/24 dev ha-3f3d5c15-a8 proto kernel scope link src 169.254.0.4
    169.254.192.0/18 dev ha-3f3d5c15-a8 proto kernel scope link src 169.254.192.7

That's plumbed into ovs:

    [root@neu-19-11-nc1 ~]# ovs-vsctl show
    0e32ae2f-6c78-49bd-8045-8fc4ac32f425
        Manager "ptcp:6640:127.0.0.1"
            is_connected: true
        Bridge br-sahara
            Controller "tcp:127.0.0.1:6633"
            fail_mode: secure
            Port br-sahara
                Interface br-sahara
                    type: internal
            Port "p3p1.207"
                Interface "p3p1.207"
            Port phy-br-sahara
                Interface phy-br-sahara
                    type: patch
                    options: {peer=int-br-sahara}
        Bridge br-int
            Controller "tcp:127.0.0.1:6633"
                is_connected: true
            fail_mode: secure
            Port "qr-0e9a51d1-49"
                tag: 1
                Interface "qr-0e9a51d1-49"
                    type: internal
            Port "qg-f4ff7324-e2"
                tag: 11
                Interface "qg-f4ff7324-e2"
                    type: internal
            Port "ha-a7a451bc-cc"
                tag: 5
                Interface "ha-a7a451bc-cc"
                    type: internal
            Port "qr-225e951a-b4"
                tag: 9
                Interface "qr-225e951a-b4"
                    type: internal
            Port "ha-3fda299f-48"
                tag: 10
                Interface "ha-3fda299f-48"
                    type: internal
            Port "qr-ae359120-fd"
                tag: 8
                Interface "qr-ae359120-fd"
                    type: internal
            Port "tapae6b4ee5-26"
                tag: 1
                Interface "tapae6b4ee5-26"
                    type: internal
            Port "qg-b1c270f1-e3"
                tag: 11
                Interface "qg-b1c270f1-e3"
                    type: internal
            Port "tap5f0515ca-50"
                tag: 3
                Interface "tap5f0515ca-50"
                    type: internal
            Port "qg-26f86c24-c2"
                tag: 12
                Interface "qg-26f86c24-c2"
                    type: internal
            Port "ha-590394df-3e"
                tag: 6
                Interface "ha-590394df-3e"
                    type: internal
            Port "qg-f921a854-20"
                tag: 11
                Interface "qg-f921a854-20"
                    type: internal
            Port "ha-72270a12-02"
                tag: 6
                Interface "ha-72270a12-02"
                    type: internal
            Port "tapc99ad948-26"
                tag: 2
                Interface "tapc99ad948-26"
                    type: internal
            Port "qr-c024d5f4-bd"
                tag: 8
                Interface "qr-c024d5f4-bd"
                    type: internal
            Port patch-tun
                Interface patch-tun
                    type: patch
                    options: {peer=patch-int}
            Port int-br-sahara
                Interface int-br-sahara
                    type: patch
                    options: {peer=phy-br-sahara}
            Port "qg-824a6859-ae"
                tag: 11
                Interface "qg-824a6859-ae"
                    type: internal
            Port "ha-06b73a2b-dd"
                tag: 6
                Interface "ha-06b73a2b-dd"
                    type: internal
            Port "ha-3f3d5c15-a8"
                tag: 6
                Interface "ha-3f3d5c15-a8"
                    type: internal
            Port "qg-438c3ad9-6c"
                tag: 11
                Interface "qg-438c3ad9-6c"
                    type: internal
            Port "tap007e5914-de"
                tag: 9
                Interface "tap007e5914-de"
                    type: internal
            Port "qg-265ebbb0-63"
                tag: 11
                Interface "qg-265ebbb0-63"
                    type: internal
            Port "qr-0ef11b01-d8"
                tag: 2
                Interface "qr-0ef11b01-d8"
                    type: internal
            Port "qr-860aa65b-20"
                tag: 4
                Interface "qr-860aa65b-20"
                    type: internal
            Port int-br-ex
                Interface int-br-ex
                    type: patch
                    options: {peer=phy-br-ex}
            Port "qr-34b1985c-e5"
                tag: 7
                Interface "qr-34b1985c-e5"
                    type: internal
            Port "tapaf7d1ae1-37"
                tag: 4
                Interface "tapaf7d1ae1-37"
                    type: internal
            Port "tap37377660-0d"
                tag: 8
                Interface "tap37377660-0d"
                    type: internal
            Port "tapb219b9ce-aa"
                tag: 7
                Interface "tapb219b9ce-aa"
                    type: internal
            Port br-int
                Interface br-int
                    type: internal
            Port "ha-01711c4c-72"
                tag: 5
                Interface "ha-01711c4c-72"
                    type: internal
        Bridge br-ex
            Controller "tcp:127.0.0.1:6633"
                is_connected: true
            fail_mode: secure
            Port phy-br-ex
                Interface phy-br-ex
                    type: patch
                    options: {peer=int-br-ex}
            Port "p3p1.3802"
                Interface "p3p1.3802"
            Port br-ex
                Interface br-ex
                    type: internal
        Bridge br-tun
            Controller "tcp:127.0.0.1:6633"
                is_connected: true
            fail_mode: secure
            Port "vxlan-ac104011"
                Interface "vxlan-ac104011"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.17"}
            Port "vxlan-ac104010"
                Interface "vxlan-ac104010"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.16"}
            Port "vxlan-ac104026"
                Interface "vxlan-ac104026"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.38"}
            Port "vxlan-ac104017"
                Interface "vxlan-ac104017"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.23"}
            Port "vxlan-ac10401d"
                Interface "vxlan-ac10401d"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.29"}
            Port "vxlan-ac104022"
                Interface "vxlan-ac104022"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.34"}
            Port "vxlan-ac10400a"
                Interface "vxlan-ac10400a"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.10"}
            Port "vxlan-ac10400b"
                Interface "vxlan-ac10400b"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.11"}
            Port "vxlan-ac10400c"
                Interface "vxlan-ac10400c"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.12"}
            Port patch-int
                Interface patch-int
                    type: patch
                    options: {peer=patch-tun}
            Port "vxlan-ac104013"
                Interface "vxlan-ac104013"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.19"}
            Port "vxlan-ac104020"
                Interface "vxlan-ac104020"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.32"}
            Port "vxlan-ac10400d"
                Interface "vxlan-ac10400d"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.13"}
            Port "vxlan-ac104015"
                Interface "vxlan-ac104015"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.21"}
            Port br-tun
                Interface br-tun
                    type: internal
            Port "vxlan-ac104014"
                Interface "vxlan-ac104014"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.20"}
            Port "vxlan-ac104016"
                Interface "vxlan-ac104016"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.22"}
            Port "vxlan-ac10400e"
                Interface "vxlan-ac10400e"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.14"}
            Port "vxlan-ac10401a"
                Interface "vxlan-ac10401a"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.26"}
            Port "vxlan-ac104019"
                Interface "vxlan-ac104019"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.25"}
            Port "vxlan-ac104012"
                Interface "vxlan-ac104012"
                    type: vxlan
                    options: {df_default="true", in_key=flow, local_ip="172.16.64.31", out_key=flow, remote_ip="172.16.64.18"}
        ovs_version: "2.9.0"

But from inside that namespace I'm unable to ping the default gateway:

    [root@neu-19-11-nc1 ~]# ip netns exec qrouter-4cb73b3c-d4d1-4add-9191-3d60e3abb0f7 ping 128.31.24.1
    PING 128.31.24.1 (128.31.24.1) 56(84) bytes of data.
    From 128.31.27.33 icmp_seq=1 Destination Host Unreachable
    From 128.31.27.33 icmp_seq=2 Destination Host Unreachable
    From 128.31.27.33 icmp_seq=3 Destination Host Unreachable

Tracing on the physical interface (p3p1.3802), I see:

    [root@neu-19-11-nc1 ~]# tcpdump -n -i p3p1.3802 host 128.31.27.33
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on p3p1.3802, link-type EN10MB (Ethernet), capture size 262144 bytes
    15:27:18.848713 ARP, Request who-has 128.31.24.1 tell 128.31.27.33, length 28
    15:27:18.851510 ARP, Reply 128.31.24.1 is-at 54:1e:56:d9:6a:c0, length 46
    15:27:19.851643 ARP, Request who-has 128.31.24.1 tell 128.31.27.33, length 28
    15:27:19.852550 ARP, Reply 128.31.24.1 is-at 54:1e:56:d9:6a:c0, length 46
    15:27:20.853643 ARP, Request who-has 128.31.24.1 tell 128.31.27.33, length 28
    15:27:20.854105 ARP, Reply 128.31.24.1 is-at 54:1e:56:d9:6a:c0, length 46

It looks like the ARP replies aren't getting delivered to the
namespace (and a tcpdump on qg-824a6859-ae inside the namespace
confirms that).
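
For reference, a sketch of that check (namespace and interface names taken from the output above; not part of the original comment):

    # capture ARP inside the router namespace; only the requests show up here,
    # the replies seen on p3p1.3802 never arrive
    ip netns exec qrouter-4cb73b3c-d4d1-4add-9191-3d60e3abb0f7 \
        tcpdump -n -i qg-824a6859-ae arp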

Comment 2 Lars Kellogg-Stedman 2018-11-29 22:16:41 UTC
At this point, `dump-flows br-ex` shows:

    [root@neu-19-11-nc1 ~]# ovs-ofctl dump-flows br-ex
     cookie=0x387b6837449cb19f, duration=8004.916s, table=0, n_packets=70, n_bytes=3780, priority=4,in_port="phy-br-ex",dl_vlan=11 actions=strip_vlan,NORMAL
     cookie=0xe27a7150f56cfc50, duration=8004.869s, table=0, n_packets=0, n_bytes=0, priority=4,in_port="phy-br-ex",dl_vlan=12 actions=strip_vlan,NORMAL

Comment 3 Lars Kellogg-Stedman 2018-11-30 12:28:47 UTC
It's because br-ex is missing the final rule:

    priority=0 actions=NORMAL

If I manually add that:

    ovs-ofctl add-flow br-ex priority=0,actions=NORMAL

Then I am able to successfully ping out from the namespace:

    [root@neu-19-11-nc1 ~]# ip netns exec qrouter-4cb73b3c-d4d1-4add-9191-3d60e3abb0f7 ping 128.31.24.1
    PING 128.31.24.1 (128.31.24.1) 56(84) bytes of data.
    64 bytes from 128.31.24.1: icmp_seq=1 ttl=64 time=26.6 ms
    64 bytes from 128.31.24.1: icmp_seq=2 ttl=64 time=0.572 ms
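
A minimal way to check for and restore the missing fallback flow (illustrative commands, not from the original comment):

    # an empty result means the priority=0 fallback is missing
    ovs-ofctl dump-flows br-ex table=0 | grep "priority=0"
    # re-add the NORMAL fallback
    ovs-ofctl add-flow br-ex "priority=0,actions=NORMAL"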

Comment 4 Lars Kellogg-Stedman 2018-11-30 14:44:03 UTC
haleyb asked if restarting the ovs agent would restore the rule.  I cleared the rules on br-ex like this:

    ovs-vsctl del-flows br-ex

And then restarted the ovs agent:

    docker restart neutron_ovs_agent

And after that the rules were not restored:

    [root@neu-19-11-nc1 ~]# ovs-ofctl dump-flows br-ex
    [root@neu-19-11-nc1 ~]#

Comment 5 Lars Kellogg-Stedman 2018-11-30 14:50:34 UTC
(In that previous comment, it should read "ovs-ofctl del-flows br-ex")

Comment 6 Lars Kellogg-Stedman 2018-11-30 14:57:42 UTC
Created attachment 1510177 [details]
openvswitch-agent.log after the restart

I've attached the contents of /var/log/containers/neutron/openvswitch-agent.log

Comment 7 Rodolfo Alonso 2018-11-30 16:03:02 UTC
Hello Lars:

The physical bridges are reconfigured when a new port is added. Every OVS agent should receive an event informing it about the change. Then, during the RPC loop, the following nested call chain is executed:
- neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent:OVSNeutronAgent --> process_network_ports --> treat_vif_port --> port_bound --> provision_local_vlan --> (net type == FLAT [1]) phys_br.provision_local_vlan.

That is where the new set of rules is applied. For the sake of clarity, here is the ovs_ofctl code:
  table_id = constants.LOCAL_VLAN_TRANSLATION if distributed else 0
  if segmentation_id is None:
     self.add_flow(table=table_id,
                   priority=4,
                   in_port=port,
                   dl_vlan=lvid,
                   actions="strip_vlan,normal")
  else:
     self.add_flow(table=table_id,
                   priority=4,
                   in_port=port,
                   dl_vlan=lvid,
                   actions="mod_vlan_vid:%s,normal" % segmentation_id)

So by default, br-ex will have only the default "drop" rule. Once you add a new port (to be precise, once you bind a port on the integration bridge), the agent will be informed and will create the rules in the physical bridges.

Can you please add a new port and see what the rules are?


[1] As far as I see in the flows, your internal network is flat.
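
For illustration (not part of the original comment), the segmentation_id-is-None branch quoted above produces a flow equivalent to the first rule seen on the working node, roughly:

    # lvid 12 is the local VLAN assigned to the external network on that host
    ovs-ofctl add-flow br-ex \
        "table=0,priority=4,in_port=phy-br-ex,dl_vlan=12,actions=strip_vlan,NORMAL"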

Comment 8 Lars Kellogg-Stedman 2018-11-30 16:14:45 UTC
Rodolfo,

I've created a new VM, but I wouldn't expect that to update the rules on br-ex, since the VM isn't connected directly to the external network.  In any case, it didn't seem to have any impact.

Just for kicks, I also tried:

- Creating new tenant network
- Connecting that to the router using 'openstack router add subnet'
- Creating a new floating ip and binding it to the new vm

I also tried resetting the external gateway on the router.

Nothing had any effect; br-ex still has no rules since I cleared them manually:

    [root@neu-19-11-nc1 ~]# ovs-ofctl dump-flows br-ex
    [root@neu-19-11-nc1 ~]#

Comment 9 Lars Kellogg-Stedman 2018-11-30 18:17:07 UTC
While Rodolfo was investigating, we somehow managed to reproduce the problem on nc2.  I powered off nc1 again and restarted the ovs-agent container on nc2, and this seemed to restore network access for people.

This time, when nc1 came back up, the flow rules on br-ex seem correct:

    [root@neu-19-11-nc1 ~]# ovs-ofctl dump-flows br-ex
     cookie=0xf2b17fdd4638f215, duration=220.728s, table=0, n_packets=9, n_bytes=546, priority=4,in_port="phy-br-ex",dl_vlan=12 actions=strip_vlan,NORMAL
     cookie=0xeb7d6f5281b073d5, duration=220.679s, table=0, n_packets=0, n_bytes=0, priority=4,in_port="phy-br-ex",dl_vlan=13 actions=strip_vlan,NORMAL
     cookie=0xf2b17fdd4638f215, duration=234.111s, table=0, n_packets=793, n_bytes=47678, priority=2,in_port="phy-br-ex" actions=drop
     cookie=0xf2b17fdd4638f215, duration=234.128s, table=0, n_packets=3538, n_bytes=213749, priority=0 actions=NORMAL

Comment 10 Rodolfo Alonso 2018-12-05 18:13:06 UTC
Hello Lars:

Can I access your system again?

I've been testing in a development environment. Every time I restart an OVS agent, all ports on the switch are processed during the first loop. In this first loop, the polling_manager reports all the ports present on the switch. In this process, the VLAN segments are provisioned on all bridges, br-int and the physical ones.

I've tested several times and I always see the flows created in the physical bridge.

Did you stop/restart OVS on either network controller?

Do you have different provider networks in each network controller?

Now you have nc2 and nc1 online again, is that correct? If you:
- stop the agent on either nc
- delete the OF rules
- start the agent again
--> are the OF rules restored?

Regards.

Comment 11 Lars Kellogg-Stedman 2018-12-05 18:31:12 UTC
Created attachment 1511862 [details]
ovs-agent log

Rodolfo,

Since these systems are now live, our ability to restart services is somewhat constrained.  However, due to the failures last week, we have all routers running on nc2 right now, and a single test router live on nc1.  That means I should be able to muck around on nc1 without causing an interruption.

If I stop the agent on nc1:

    # docker stop neutron_ovs_agent

Delete the OF rules:

    # ovs-ofctl del-flows br-ex
    # ovs-ofctl dump-flows br-ex
    #

Restart the agent:

    # docker start neutron_ovs_agent

And wait a bit, I end up with no flows on br-ex:

    # ovs-ofctl dump-flows br-ex
    #

I've attached the resulting openvswitch-agent.log

Comment 12 Rodolfo Alonso 2018-12-07 15:09:55 UTC
Hello Lars:

I think I missed one step: restarting the openvswitch service. When the OVS agent detects that OVS has been restarted, it tries to provision the local VLAN again [1], and that is when the physical bridge OpenFlow rules are provisioned again [2]. Note that "provisioning_needed" in [1] will be True if OVS has been restarted.

I don't know why the OF rules in the physical bridge (br-ex) were deleted or never set the first time, but at least those OF rules will be set after restarting the OVS service.

Can you do this?

Thank you in advance.

[1] https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L822-L824
[2] https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L652

Comment 13 Lars Kellogg-Stedman 2018-12-07 15:43:08 UTC
If I start with no flows on br-ex:


    # ovs-ofctl dump-flows br-ex
    #

And restart openvswitch:

    # systemctl restart openvswitch

Then the VLAN-specific rules appear to get re-created on the bridge:

    # ovs-ofctl dump-flows br-ex
     cookie=0xfd5505371947e8d6, duration=22.335s, table=0, n_packets=0, n_bytes=0, priority=4,in_port="phy-br-ex",dl_vlan=17 actions=strip_vlan,NORMAL
     cookie=0xef1d5f842d3c95c4, duration=22.306s, table=0, n_packets=0, n_bytes=0, priority=4,in_port="phy-br-ex",dl_vlan=16 actions=strip_vlan,NORMAL

However, the bridge is still missing the final rule that would permit
inbound traffic to work correctly:

     cookie=0xe1da5be8a5448d27, duration=596287.858s, table=0, n_packets=22959775, n_bytes=12405178755, priority=0 actions=NORMAL

Comment 14 Lars Kellogg-Stedman 2018-12-07 16:11:36 UTC
After also restarting the neutron_ovs_agent container, the rules on br-ex still look the same.

Comment 15 Lars Kellogg-Stedman 2018-12-07 17:08:26 UTC
Rodolfo found this:

  https://ask.openstack.org/en/question/110544/ovs-connection-to-local-of-controller-failed-when-having-two-flat-provider-networks/

"This turned out to be an issue with Ryu (native mode) OF controller. When configured with multiple provider flat network, the controller seems rejecting connection from the two external br's.

So I switch the of_interface mode back to ovs-ofctl, the br connections works and flows are what I would expect."

Comment 16 Rodolfo Alonso 2018-12-07 17:57:21 UTC
According to [1], the Ryu native controller does not work well with two physical bridges. We switched back to the ovs-ofctl controller and it's working well.

This problem is also related to the OpenFlow controller connection resets seen earlier: both bridges, br-sahara and br-ex, were resetting the connection every second.

Reviewing the ovs-agent logs, I can see there are four bridges:
- br-int
- br-tun
- br-ex
- br-sahara

br-ex and br-sahara have the same datapath_id. This was reported in [2].


[1] https://ask.openstack.org/en/question/110544/ovs-connection-to-local-of-controller-failed-when-having-two-flat-provider-networks/
[2] https://bugs.launchpad.net/neutron/+bug/1697243
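
To check for the duplicate datapath IDs mentioned above (illustrative commands, not from the original comment):

    # each bridge should normally report a distinct datapath ID
    ovs-vsctl get Bridge br-ex datapath_id
    ovs-vsctl get Bridge br-sahara datapath_id
    # a distinct ID can be forced on one bridge if they collide, e.g.:
    # ovs-vsctl set Bridge br-sahara other-config:datapath-id=0000aabbccddeeff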

Comment 17 Lars Kellogg-Stedman 2018-12-07 20:07:26 UTC
*** Bug 1654836 has been marked as a duplicate of this bug. ***

Comment 18 Lars Kellogg-Stedman 2018-12-07 20:08:47 UTC
The workaround for us was to include the following in our deployment configuration:

  NetworkerDeployedServerExtraConfig:
    neutron::agents::ml2::ovs::of_interface: ovs-ofctl

This causes openvswitch-agent to use the ovs-ofctl CLI for programming OVS, rather than the native OpenFlow interface implemented via python-ryu.
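
For reference, this maps to the following OVS agent setting (illustrative excerpt; the exact configuration file path inside the container may vary), e.g. in openvswitch_agent.ini:

    [ovs]
    of_interface = ovs-ofctl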

Comment 21 Daniel Alvarez Sanchez 2019-01-29 14:21:25 UTC
Patch merged upstream: https://review.openstack.org/#/c/587244/

Comment 32 errata-xmlrpc 2019-04-30 17:23:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:0935

