Bug 1528325

Summary: neutron-ovs-cleanup failing when there are too many ports
Product: Red Hat OpenStack Reporter: Daniel Alvarez Sanchez <dalvarez>
Component: openstack-neutron Assignee: Terry Wilson <twilson>
Status: CLOSED WONTFIX QA Contact: Toni Freger <tfreger>
Severity: high Docs Contact:
Priority: high    
Version: 11.0 (Ocata) CC: akaris, amuller, chrisw, jlibosva, nyechiel, pmorey, ragiman, slinaber, srevivo, twilson
Target Milestone: --- Keywords: Triaged, ZStream
Target Release: 11.0 (Ocata)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-neutron-10.0.4-5.el7ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1541494 (view as bug list) Environment:
Last Closed: 2018-05-24 15:52:37 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1541494, 1541496    

Description Daniel Alvarez Sanchez 2017-12-21 14:44:58 UTC
Description of problem:

When the OVS database is large, the neutron-ovs-cleanup script times out and fails to clean up the ports. In large environments this is a problem because any leftover ports never get removed.


Version-Release number of selected component (if applicable):
openstack-neutron-10.0.3-1.el7ost.noarch 


Actual results:

2017-12-12 16:14:18.114 104948 INFO neutron.common.config [-] Logging enabled!
2017-12-12 16:14:18.114 104948 INFO neutron.common.config [-] /usr/bin/neutron-ovs-cleanup version 10.0.3
2017-12-12 16:14:18.314 104948 INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connecting...
2017-12-12 16:14:18.314 104948 INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connected
2017-12-12 16:14:37.369 104948 ERROR neutron.agent.ovsdb.native.commands [-] Error executing command
2017-12-12 16:14:37.369 104948 ERROR neutron.agent.ovsdb.native.commands Traceback (most recent call last):
2017-12-12 16:14:37.369 104948 ERROR neutron.agent.ovsdb.native.commands   File "/usr/lib/python2.7/site-packages/neutron/agent/ovsdb/native/commands.py", line 36, in execute
2017-12-12 16:14:37.369 104948 ERROR neutron.agent.ovsdb.native.commands     txn.add(self)
2017-12-12 16:14:37.369 104948 ERROR neutron.agent.ovsdb.native.commands   File "/usr/lib/python2.7/site-packages/neutron/agent/ovsdb/api.py", line 79, in __exit__
2017-12-12 16:14:37.369 104948 ERROR neutron.agent.ovsdb.native.commands     self.result = self.commit()
2017-12-12 16:14:37.369 104948 ERROR neutron.agent.ovsdb.native.commands   File "/usr/lib/python2.7/site-packages/neutron/agent/ovsdb/impl_idl.py", line 73, in commit
2017-12-12 16:14:37.369 104948 ERROR neutron.agent.ovsdb.native.commands     'timeout': self.timeout})
2017-12-12 16:14:37.369 104948 ERROR neutron.agent.ovsdb.native.commands TimeoutException: Commands [DbListCommand(if_exists=True, records=[u'ha-f9af3f28-c8', u'tap6f51366b-7f', u'tap9d6b4ac9-ea', u'tapc9bccd94-00', u'tap29ea8fb2-9d', 
.....
88aacf8d-2a', u'tap54e8b635-fc', u'tapa46a4411-41', u'ha-abc0de78-ae', u'tap48d691ea-f1', u'tap52afe9d6-62'], table=Interface, columns=['name', 'external_ids', 'ofport'])] exceeded timeout 10 seconds
2017-12-12 16:14:38.696 104948 ERROR neutron 


Additional info:

* Total ports are 4965:
$ cat ovs-vsctl_-t_5_show  | grep Port | wc -l
4965

* qr ports are 260:
$ cat ovs-vsctl_-t_5_show  | grep Port | grep "qr-" | wc -l
260

* qg ports are 263:
$ cat ovs-vsctl_-t_5_show  | grep Port | grep "qg-" | wc -l
263

* tap ports (DHCP) are 3312:
$ cat ovs-vsctl_-t_5_show  | grep Port | grep "tap" | wc -l
3312

* ha ports are 1107:
$ cat ovs-vsctl_-t_5_show  | grep Port | grep "ha-" | wc -l
1107

* vxlan ports are 15:
$ cat ovs-vsctl_-t_5_show  | grep Port | grep "vxlan" | wc -l
15
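
The per-type counts above can also be collected in a single pass. The following is only a minimal sketch, not part of neutron or of the reporter's tooling: the function name count_ports_by_prefix and the "other" bucket are hypothetical, and it assumes the saved file holds standard `ovs-vsctl show` output in which each port appears on its own `Port <name>` line. Unlike the grep pipelines, it matches each prefix at the start of the port name rather than anywhere in the line, so the numbers can differ slightly from the ones reported here.

```python
import re
from collections import Counter

# Matches lines such as `        Port "tap6f51366b-7f"` or `        Port br-int`.
PORT_LINE = re.compile(r'^\s*Port\s+"?([^"\s]+)"?\s*$')

# Prefixes taken from the counts listed in this report.
PREFIXES = ("qr-", "qg-", "tap", "ha-", "vxlan")


def count_ports_by_prefix(show_output, prefixes=PREFIXES):
    """Count ports per name prefix in the text of `ovs-vsctl show`."""
    counts = Counter()
    for line in show_output.splitlines():
        match = PORT_LINE.match(line)
        if not match:
            continue  # Bridge, Interface, type and options lines are skipped
        name = match.group(1)
        counts["total"] += 1
        for prefix in prefixes:
            if name.startswith(prefix):
                counts[prefix] += 1
                break
        else:
            # Ports matching none of the prefixes (hypothetical bucket name).
            counts["other"] += 1
    return counts


if __name__ == "__main__":
    # File name as used in the commands above.
    with open("ovs-vsctl_-t_5_show") as handle:
        for key, value in sorted(count_ports_by_prefix(handle.read()).items()):
            print(key, value)
```

The "other" bucket makes it easy to see the ports that none of the greps above account for, such as the bridge-internal ports.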

Comment 1 Jakub Libosvar 2018-02-01 15:08:07 UTC
Terry already has a patch up for review upstream.

Comment 14 Assaf Muller 2018-05-24 15:52:37 UTC
Closing as OSP 11 is EOL, there will be no more updates.