Description of problem:
l3 agent does fullsync multiple times a day causing high load on neutron node. This is approximately 300 times a day:

logs/neutron/l3-agent.log | head -n1 ; grep fullsync:True neutron1_neutron_logs/neutron/l3-agent.log | tail -n1 ; done

neutron1_neutron_logs/neutron/l3-agent.log
280
2016-03-28 03:40:35.033 861646 DEBUG neutron.agent.l3.agent [req-34991c5d-1d4a-4470-bfcd-e96ce8d0ae7d ] Starting periodic_sync_routers_task - fullsync:True periodic_sync_routers_task /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:497
2016-03-29 00:50:17.588 322713 DEBUG neutron.agent.l3.agent [req-f8b44eb0-0497-425c-b742-e7b765d2df41 ] Starting periodic_sync_routers_task - fullsync:True periodic_sync_routers_task /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:497

neutron2_neutron_logs/neutron/l3-agent.log
280
2016-03-28 03:40:35.033 861646 DEBUG neutron.agent.l3.agent [req-34991c5d-1d4a-4470-bfcd-e96ce8d0ae7d ] Starting periodic_sync_routers_task - fullsync:True periodic_sync_routers_task /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:497
2016-03-29 00:50:17.588 322713 DEBUG neutron.agent.l3.agent [req-f8b44eb0-0497-425c-b742-e7b765d2df41 ] Starting periodic_sync_routers_task - fullsync:True periodic_sync_routers_task /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:497

neutron3_neutron_logs/neutron/l3-agent.log
280
2016-03-28 03:40:35.033 861646 DEBUG neutron.agent.l3.agent [req-34991c5d-1d4a-4470-bfcd-e96ce8d0ae7d ] Starting periodic_sync_routers_task - fullsync:True periodic_sync_routers_task /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:497
2016-03-29 00:50:17.588 322713 DEBUG neutron.agent.l3.agent [req-f8b44eb0-0497-425c-b742-e7b765d2df41 ] Starting periodic_sync_routers_task - fullsync:True periodic_sync_routers_task /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:497

Version-Release number of selected component (if applicable):
python-openvswitch-2.4.0-1.el7.noarch
openstack-neutron-common-2015.1.2-9.el7ost.noarch
openstack-neutron-lbaas-2015.1.2-1.el7ost.noarch
python-neutronclient-2.4.0-2.el7ost.noarch
openstack-neutron-openvswitch-2015.1.2-9.el7ost.noarch
openstack-neutron-ml2-2015.1.2-9.el7ost.noarch
openvswitch-2.4.0-1.el7.x86_64
python-neutron-lbaas-2015.1.2-1.el7ost.noarch
python-neutron-2015.1.2-9.el7ost.noarch
openstack-neutron-2015.1.2-9.el7ost.noarch

How reproducible:
Always at customer end.

Steps to Reproduce:
1.
2.
3.

Actual results:
l3 agent does fullsync of router & the neutron node load is high

Expected results:
Unnecessary fullsync is avoided so as to reduce load on neutron node.

Additional info:
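The per-node counts quoted in the description can be cross-checked with a short script. What follows is a minimal sketch, not the command used in the original report: it assumes the l3-agent.log line format shown in the excerpts above, and the function name count_fullsyncs_per_day is hypothetical.

```python
import re
from collections import Counter

# Matches DEBUG lines such as:
# 2016-03-28 03:40:35.033 861646 DEBUG neutron.agent.l3.agent [...]
#   Starting periodic_sync_routers_task - fullsync:True ...
# The timestamp layout is an assumption based on the excerpts in this report.
FULLSYNC_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2}\.\d+ .*"
    r"Starting periodic_sync_routers_task - fullsync:True"
)


def count_fullsyncs_per_day(lines):
    """Return a Counter mapping 'YYYY-MM-DD' to the number of full syncs started."""
    counts = Counter()
    for line in lines:
        match = FULLSYNC_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

To use it on a real log, pass an open file object, for example count_fullsyncs_per_day(open("l3-agent.log")), and compare the per-day totals across the neutron nodes.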
A patch that could at least mitigate the performance hit is: https://review.openstack.org/#/c/259510/
Note that if we go cherry-picking the patch mentioned in comment 3, we need to follow up with https://review.openstack.org/#/q/Id0565e11b3023a639589f2734488029f194e2f9d
Another patch, which should solve the issue of the metadata driver callback notification raising an exception when trying to clean up firewall rules for a non-existent router namespace, is: https://review.openstack.org/#/c/274053/5
I believe that with those three patches in OSP7, we should see the issue gone.
The relevant patches have been backported and have merged. I'm making final confirmations that these patches indeed fix the issue, and if they do, I'll package the fixes ASAP. John.
The package 'openstack-neutron-2015.1.2-12.el7ost' contains the patches. Please let us know if this mitigates the problems encountered. John.
Hi John, Could you please suggest reproduction steps so that we can verify the fix? I assume it is an HA topology. BR
The code updates don't exist in the files. I checked the following:
/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py
/usr/lib/python2.7/site-packages/neutron/agent/metadata/driver.py

[root@overcloud-controller-1 ~]# rpm -qa | grep neutron-2
python-neutron-2015.1.2-13.el7ost.noarch
openstack-neutron-2015.1.2-13.el7ost.noarch

BR
Hi Alex, The 4th patch in the chain (which contains the files you mentioned) was not yet merged (it was not necessary for the hotfix), which is why the changes to the files you mentioned didn't appear in the code. I confirmed that the other 3 hotfixes did indeed appear in the version you specified. I'll get back to you ASAP with reproduction steps. John.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-1062.html