Bug 1322348 - l3 agent does fullsync multiple times a day causing high load on neutron node.
Summary: l3 agent does fullsync multiple times a day causing high load on neutron node.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 7.0 (Kilo)
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: async
Target Release: 7.0 (Kilo)
Assignee: John Schwarz
QA Contact: Alexander Stafeyev
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-03-30 11:22 UTC by Jaison Raju
Modified: 2019-10-10 11:43 UTC

Fixed In Version: openstack-neutron-2015.1.2-12.el7ost
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-05-12 16:02:47 UTC
Target Upstream Version:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2016:1062 | 0 | normal | SHIPPED_LIVE | openstack-neutron bug fix advisory | 2016-05-12 20:02:27 UTC

Description Jaison Raju 2016-03-30 11:22:22 UTC
Description of problem:
The l3 agent performs a fullsync multiple times a day, causing high load on the neutron node.

This happens approximately 300 times a day:

logs/neutron/l3-agent.log | head -n1 ; grep fullsync:True neutron1_neutron_logs/neutron/l3-agent.log | tail -n1 ; done
neutron1_neutron_logs/neutron/l3-agent.log
280
2016-03-28 03:40:35.033 861646 DEBUG neutron.agent.l3.agent [req-34991c5d-1d4a-4470-bfcd-e96ce8d0ae7d ] Starting periodic_sync_routers_task - fullsync:True periodic_sync_routers_task /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:497
2016-03-29 00:50:17.588 322713 DEBUG neutron.agent.l3.agent [req-f8b44eb0-0497-425c-b742-e7b765d2df41 ] Starting periodic_sync_routers_task - fullsync:True periodic_sync_routers_task /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:497
neutron2_neutron_logs/neutron/l3-agent.log
280
2016-03-28 03:40:35.033 861646 DEBUG neutron.agent.l3.agent [req-34991c5d-1d4a-4470-bfcd-e96ce8d0ae7d ] Starting periodic_sync_routers_task - fullsync:True periodic_sync_routers_task /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:497
2016-03-29 00:50:17.588 322713 DEBUG neutron.agent.l3.agent [req-f8b44eb0-0497-425c-b742-e7b765d2df41 ] Starting periodic_sync_routers_task - fullsync:True periodic_sync_routers_task /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:497
neutron3_neutron_logs/neutron/l3-agent.log
280
2016-03-28 03:40:35.033 861646 DEBUG neutron.agent.l3.agent [req-34991c5d-1d4a-4470-bfcd-e96ce8d0ae7d ] Starting periodic_sync_routers_task - fullsync:True periodic_sync_routers_task /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:497
2016-03-29 00:50:17.588 322713 DEBUG neutron.agent.l3.agent [req-f8b44eb0-0497-425c-b742-e7b765d2df41 ] Starting periodic_sync_routers_task - fullsync:True periodic_sync_routers_task /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:497
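
The count above can also be broken down per day. The following is a minimal sketch, not part of the original report, assuming the log line format shown above (a leading YYYY-MM-DD date and the literal "fullsync:True" marker). The function name count_fullsyncs_per_day is hypothetical.

```python
import re
from collections import Counter

# Matches the leading "YYYY-MM-DD" date of an l3-agent log line.
DATE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}) ")


def count_fullsyncs_per_day(lines):
    """Return a Counter mapping each date to its number of fullsync:True lines."""
    counts = Counter()
    for line in lines:
        # Same marker the grep in the report searches for.
        if "fullsync:True" not in line:
            continue
        match = DATE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open l3-agent.log file object to the function and printing the sorted items gives one count per day, which makes it easier to see whether the rate is steady or bursty.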

Version-Release number of selected component (if applicable):
python-openvswitch-2.4.0-1.el7.noarch
openstack-neutron-common-2015.1.2-9.el7ost.noarch
openstack-neutron-lbaas-2015.1.2-1.el7ost.noarch
python-neutronclient-2.4.0-2.el7ost.noarch
openstack-neutron-openvswitch-2015.1.2-9.el7ost.noarch
openstack-neutron-ml2-2015.1.2-9.el7ost.noarch
openvswitch-2.4.0-1.el7.x86_64
python-neutron-lbaas-2015.1.2-1.el7ost.noarch
python-neutron-2015.1.2-9.el7ost.noarch
openstack-neutron-2015.1.2-9.el7ost.noarch


How reproducible:
Always, at the customer's end.

Steps to Reproduce:
1.
2.
3.

Actual results:
The l3 agent performs a fullsync of routers and the load on the neutron node is high.

Expected results:
Unnecessary fullsyncs are avoided, reducing the load on the neutron node.

Additional info:

Comment 3 Ihar Hrachyshka 2016-03-30 15:01:51 UTC
A patch that could at least mitigate the performance hit is: https://review.openstack.org/#/c/259510/

Comment 4 Ihar Hrachyshka 2016-03-30 15:03:29 UTC
Note that if we go cherry-picking the patch mentioned in comment 3, we need to follow up with https://review.openstack.org/#/q/Id0565e11b3023a639589f2734488029f194e2f9d

Comment 5 Ihar Hrachyshka 2016-03-30 15:24:38 UTC
Another patch that should solve the issue of metadata driver callback notification raising an exception when trying to clean up firewall rules for a non-existent router namespace is: https://review.openstack.org/#/c/274053/5

I believe with those three patches in OSP7, we should see the issue gone.

Comment 9 John Schwarz 2016-04-05 14:13:49 UTC
The relevant patches have been backported and have merged. I'm making final confirmations that these patches indeed fix the issue and if they do I'll package the fixes ASAP.

John.

Comment 10 John Schwarz 2016-04-06 10:22:38 UTC
The package 'openstack-neutron-2015.1.2-12.el7ost' contains the patches. Please let us know if this mitigates the problems encountered.

John.

Comment 20 Alexander Stafeyev 2016-04-19 06:09:26 UTC
Hi John,
Can you please suggest reproduction steps so that I can verify the fix?

I assume it is an HA topology. 

BR

Comment 21 Alexander Stafeyev 2016-04-26 06:15:46 UTC
The code updates don't exist in the files.

I checked the following:
/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py 
/usr/lib/python2.7/site-packages/neutron/agent/metadata/driver.py

[root@overcloud-controller-1 ~]# rpm -qa | grep neutron-2
python-neutron-2015.1.2-13.el7ost.noarch
openstack-neutron-2015.1.2-13.el7ost.noarch



BR

Comment 22 John Schwarz 2016-04-26 15:14:00 UTC
Hi Alex,

The 4th patch in the chain (which contains the files you mentioned) was not yet merged (it was not necessary for the hotfix), which is why the changes to the files you mentioned didn't appear in the code. I confirmed that the other 3 hotfixes did indeed appear in the version you specified.

I'll get back to you ASAP with reproduction steps.

John.
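
The package check discussed in the comments above (is the installed build at least the "Fixed In Version" build?) can be sketched as follows. This is a simplified illustration, not RPM's actual version comparison algorithm: it assumes a dotted numeric version and a release that starts with an integer, as in the package strings quoted in this report. The helper names parse_version_release and has_fix are hypothetical.

```python
import re


def parse_version_release(package):
    """Split e.g. 'openstack-neutron-2015.1.2-13.el7ost.noarch' into
    ((2015, 1, 2), 13).

    Simplified assumption: dotted numeric version, integer-leading release.
    """
    match = re.search(r"-(\d+(?:\.\d+)*)-(\d+)", package)
    if match is None:
        raise ValueError("unrecognised package string: %r" % package)
    version = tuple(int(part) for part in match.group(1).split("."))
    return version, int(match.group(2))


def has_fix(installed, fixed_in="openstack-neutron-2015.1.2-12.el7ost"):
    """True if the installed build is at or above the Fixed In Version build."""
    return parse_version_release(installed) >= parse_version_release(fixed_in)
```

Under these assumptions, the -13 build listed in comment 21 passes the check, while the -9 build from the original description does not.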

Comment 25 errata-xmlrpc 2016-05-12 16:02:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1062.html

