I can see two ways to handle the initial performance issue that was observed:
- Put the bridge interfaces down. We then need to provide two building blocks (neighbour and routing information) for tunnels over IP, like VXLAN. OVS currently relies on the Linux kernel to provide both: an internal cache is filled with information coming from netlink. The cache can be populated manually too, but any update coming from netlink flushes the whole cache.
- Have the PMD threads "offload" the syscall/transmission of a packet on the bridge interface to a non-PMD thread. This offload requires a non-blocking communication channel.
Talked to Christophe, focusing on the first proposed solution. We agreed on a debugging session on his platform, most likely this week.
Writing up my current notes following debugging sessions with Christophe. We can make use of a "dummy" netdev:
- This netdev carries the IP address of the tunnel endpoint and is put in the bridge receiving the encapsulated traffic. Either os-net-config or neutron must configure this IP address by calling ovs-appctl netdev-dummy/ip4addr, and enable the VXLAN listener by configuring a route for the listening IP.
# ovs-appctl netdev-dummy/ip4addr dummy0 16.0.0.2/24
# ovs-appctl ovs/route/add 16.0.0.2/32 dummy0
# ovs-appctl ovs/route/add 16.0.0.0/24 dummy0
- No IP address is put on the bridge netdev itself, which is an issue for neutron, which checks for this IP address. So a command has been added in OVS to dump IP addresses:
# ovs-appctl ovs/ip/show
dummy0 16.0.0.2/24
- The netdev itself replies to ARP requests. ICMPv6 (neighbour discovery) is not handled; I wrote a patch for it.
- One additional problem identified during these sessions is that OVS 2.11 is missing the upstream change "userspace: Enable non-bridge port as tunnel endpoint.".
- By default, the dummy netdev generates a MAC address with a fixed format; I will investigate this before submitting all those changes upstream.
Christophe wants to test an rpm I provided him with the current changes.
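For completeness, the userspace caches that replace the kernel's routing and neighbour tables can be inspected with existing appctl commands. A possible sanity check on the same setup (16.0.0.3 as the remote tunnel endpoint is my assumption, not taken from the sessions):

```shell
# Assuming 16.0.0.3 is the remote tunnel endpoint:
# ovs-appctl ovs/route/lookup 16.0.0.3    -> should resolve via the dummy0 routes added above
# ovs-appctl tnl/neigh/show               -> neighbour entries used to build the outer header
```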
Revisiting the problem and trying to summarize. This issue comes from the use of NORMAL actions in the OVS pipeline, which have a negative impact on performance: on a bridge, a NORMAL action means that packets for unknown destinations are flooded and end up on a tap iface. The workaround is to put the tap ifaces down.
In OVS-DPDK + VXLAN setups, putting the tap iface down is a problem, as the kernel is used to fulfill functions needed by OVS:
- provide routing information, to know how to route packets into an IP tunnel,
- provide neighbour information, to know how to build the outer IP header when encapsulating packets,
- reply to ARP when the remote tunnel endpoint tries to resolve this side of the IP tunnel.
Now, let's reconsider this in the light of OVN getting into RHOSP and configuring OVS. To route packets through tunnels, OVN writes fully described OF rules, so there would be no need to keep the tap iface up. Besides, OVN does not rely on NORMAL actions in its pipeline, so the performance issue should not occur anyway.
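To make the contrast concrete, here is roughly what the two pipeline styles look like in OpenFlow terms. This is my own illustration: the bridge name, MAC, endpoint address and tunnel port name are made up, and real OVN rules are far more detailed.

```shell
# NORMAL action: MAC learning plus flooding of unknown destinations (the costly path)
# ovs-ofctl add-flow br-int "priority=0,actions=NORMAL"
# Fully described rule, in the spirit of what OVN installs: explicit match and
# explicit tunnel output, so nothing is ever flooded to the tap iface
# ovs-ofctl add-flow br-int "priority=100,dl_dst=fa:16:3e:00:00:01,actions=set_field:16.0.0.3->tun_dst,output:vxlan0"
```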
As we move to OVN, we won't pursue fixing this bug, and we will reopen it if OVN turns out to require tap interfaces to be up in the future.