Description of problem:
When a neutron HA router fails over, the L3 agent on which the router is now master informs the neutron server of the transition. The router port bindings (each router port's binding:host_id) are then updated to point to the correct server. The networking-vpp mechanism driver (https://github.com/openstack/networking-vpp) depends on this update to provide the L2 plumbing for the router interface on the new control server.

There is a bug in the _update_router_port_bindings(self, context, states, host) method in neutron/db/l3_hamode_db.py that prevents gateway ports from being identified and updated. A fix has been applied to upstream neutron and is present in:
stable/newton: https://github.com/openstack/neutron/commit/f6b3d25c6ef335ff891030b8e34c1d27f45b896c
master: https://github.com/openstack/neutron/commit/d8334b41d2c5bcd4916347d20008b1538d48b0ef

The current version of neutron in OSP10 is openstack-neutron.noarch 1:9.2.0-6.el7ost, which does not include the fix. This bug is a request to make the fix available.

Version-Release number of selected component (if applicable):
openstack-neutron.noarch 1:9.2.0-6.el7ost
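To illustrate the filtering problem described above, here is a minimal, self-contained sketch. It is not the neutron code: plain dicts stand in for port records, and rebind_active_router_ports, make_ports, the "active" state value, the host names, and the exact owner sets passed in are all assumptions made for illustration. The only point it shows is that a port whose device_owner is missing from the filter never gets its binding:host_id updated.

```python
# Illustrative model only; plain dicts stand in for neutron port records.
# The device_owner strings below are assumed to match neutron's constants.
ROUTER_GW = "network:router_gateway"
HA_REPLICATED_INT = "network:ha_router_replicated_interface"
ROUTER_SNAT = "network:router_centralized_snat"


def rebind_active_router_ports(ports, states, host, owners):
    """Set binding:host_id to `host` for ports of routers now active there.

    Hypothetical helper: only ports whose device_owner is in `owners`
    are considered. Returns the ids of the ports that were updated.
    """
    updated = []
    for port in ports:
        if port["device_id"] not in states:
            continue  # port belongs to a router not reported by this agent
        if port["device_owner"] not in owners:
            continue  # port type is not covered by the filter
        if states[port["device_id"]] != "active":
            continue  # router is not master on this host
        port["binding:host_id"] = host
        updated.append(port["id"])
    return updated


def make_ports():
    """Two ports of router r1, both still bound to the old host ctl-1."""
    return [
        {"id": "p-int", "device_id": "r1",
         "device_owner": HA_REPLICATED_INT, "binding:host_id": "ctl-1"},
        {"id": "p-gw", "device_id": "r1",
         "device_owner": ROUTER_GW, "binding:host_id": "ctl-1"},
    ]


states = {"r1": "active"}  # r1 has just become master on ctl-2

# Filter without the gateway owner: the gateway port keeps its stale host.
buggy = make_ports()
rebind_active_router_ports(buggy, states, "ctl-2",
                           {HA_REPLICATED_INT, ROUTER_SNAT})

# Filter including the gateway owner: both ports follow the failover.
fixed = make_ports()
rebind_active_router_ports(fixed, states, "ctl-2",
                           {HA_REPLICATED_INT, ROUTER_SNAT, ROUTER_GW})
```

In the first call the gateway port p-gw stays bound to ctl-1, so a mechanism driver that reacts only to port updates would never be asked to plumb it on ctl-2.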
This will be automatically included in the next OSP 10 rebase and minor release.
I see that you've attached a sev1 case. Can you explain why it is a sev1? What is the impact of the issue? As far as I can tell the API will return an out of date host for the external router port binding, but there is no other effect. In any case, if it is indeed an urgent issue for the customer and they cannot wait for the next OSP 10 release (estimated to be around 1 month away), you can always ask for a hotfix.
Hi Assaf,
This is a sev1 bug because it breaks the HA implementation of neutron routers when VPP is used as the mechanism driver (https://github.com/openstack/networking-vpp).
"I can tell the API will return an out of date host for the external router port binding, but there is no other effect"
It is not only an API update: an ML2 update_port_precommit() call is also made as part of it. When an HA failover event happens on the neutron router, VPP depends on ML2 making this call to rebind all the router ports to the new, correct host. With this bug, router gateway interfaces are not rebound correctly and the new (master) neutron router cannot forward packets to the external world. We therefore need this fix to be backported as soon as possible.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2663