This feature [0], which is being implemented in core OVN, needs a counterpart in networking-ovn. Given how that patch is designed, here is a summary of what needs to be done in networking-ovn:

- When a SRIOV port is created in Neutron, networking-ovn has to create an OVN 'external' port.
- At this point, networking-ovn has to figure out whether the port belongs to a subnet that is connected to a router with a gw port:
  - If so, set the requested-chassis option to the chassis where the gw port is scheduled with the highest priority.
  - Else, schedule it to any network/controller node (it would be better if we could pin a subnet to a certain node).
- If the chassis hosting a gw port as master goes down, core OVN will automatically fail the port over (BFD monitoring) to the next highest-priority chassis available. However, the external port is not moved, so we need to monitor the event and move all external ports scheduled there (and do the same when the chassis comes back).
- If a subnet that was not connected to a gw router gets connected to one, move all of its 'external' ports to the chassis where the gw router has the highest priority.

The reason we want the external port to live on the same chassis as the associated gw port is that otherwise the MAC address of the router port will flap in the ToR, as it will be advertised both by the chassis hosting the gw port and by the chassis hosting the external port for the SRIOV instance.

This is not a trivial change, and it would be best if the HA/scheduling could happen at the core OVN level.

[0] https://patchwork.ozlabs.org/patch/1025421/
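To make the scheduling rules in the description concrete, here is a minimal sketch of the decision logic in plain Python. It is not networking-ovn code: the function names, the dict of gateway chassis priorities, the fallback node list, and the `alive` set are all hypothetical stand-ins for whatever the driver would actually read from the OVN databases. Falling back to a network/controller node when no gateway chassis is alive is also an assumption of this sketch.

```python
from typing import Dict, List, Optional, Set


def pick_chassis_for_external_port(
    gw_chassis_priorities: Optional[Dict[str, int]],
    fallback_chassis: List[str],
    alive: Optional[Set[str]] = None,
) -> Optional[str]:
    """Pick the chassis to use as requested-chassis for an external port.

    gw_chassis_priorities: hypothetical mapping of chassis name to priority
        for the gw port of the router attached to the port's subnet, or
        None/empty if the subnet has no such router.
    fallback_chassis: hypothetical list of network/controller nodes.
    alive: optional set of chassis currently up; None means "assume all up".
    """
    def is_alive(name: str) -> bool:
        return alive is None or name in alive

    if gw_chassis_priorities:
        candidates = [c for c in gw_chassis_priorities if is_alive(c)]
        if candidates:
            # Follow the gw port: highest-priority chassis that is still up.
            return max(candidates, key=lambda c: gw_chassis_priorities[c])

    # No gw port (or none reachable): any network/controller node will do.
    for name in fallback_chassis:
        if is_alive(name):
            return name
    return None


def ports_to_move(external_ports: Dict[str, str], target: str) -> List[str]:
    """Return the external ports whose current chassis differs from target.

    external_ports: hypothetical mapping of port name to current chassis.
    """
    return sorted(p for p, chassis in external_ports.items() if chassis != target)
```

On a chassis failure or recovery event, the driver would recompute the target with the updated `alive` set and then update requested-chassis for every port returned by `ports_to_move`.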
Is there anything we could do in core OVN itself to make sure the requested chassis follows the router master, instead of manually tracking it from networking-ovn? This is equivalent to the old Neutron dhcp-agent failover, but worse: if the controller fails to move the requested chassis along with the router, the router MAC could flap as well. ovn-controller is able to determine by itself whether it is the master or the slave for a specific router/router port.
This issue has conditional approval for the 16.1 Z1 release. It must be in the first compose and tested before the release of 16.1.1; if not, we will move it to TM=Z2.
Bot should be giving this the rhos-16.1 flag as it has devel/QE/PM acks?
Eran, 16.1 GA is closed to Blockers Only now with strict criteria. To have something included it has to go through the blockers Process. 16.1.1 is targeted for mid/late Aug.
According to our records, this should be resolved by python-networking-ovn-7.2.1-0.20200611111150.18fabca.el8ost. This build is available now.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (openstack-neutron bug fix advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:3568