Description of problem:
"We just ran into a case where the openvswitch agent (local dev devstack, current master branch) eats 100% of CPU time. Pyflame profiling shows the time being largely spent in neutron.agent.linux.ip_conntrack, line 95.
https://github.com/openstack/neutron/blob/master/neutron/agent/linux/ip_conntrack.py#L95

The code around this line is:

    while True:
        pool.spawn_n(self._process_queue)

The documentation of eventlet.spawn_n says: "The same as spawn(), but it's not possible to know how the function terminated (i.e. no return value or exceptions). This makes execution faster. See spawn_n for more details."

I suspect that GreenPool.spawn_n may behave similarly. It seems plausible that spawn_n is returning very quickly because of some error, and then all time is quickly spent in a short-circuited while loop."
https://bugs.launchpad.net/neutron/+bug/1750777

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Deploy an RDO openstack cloud (queens)
2. execute the "top" command
3. neutron-openvswitch will use 100% of cpu time (compute and nodes)

Actual results:
neutron-openvswitch uses 100% cpu time

Expected results:
neutron-openvswitch does not use 100% cpu time

Additional info:
https://bugs.launchpad.net/neutron/+bug/1750777
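To illustrate the suspicion quoted above, here is a minimal, stdlib-only sketch of why a `while True: pool.spawn_n(...)` loop can spin when the spawned function returns right away. It does not use eventlet or neutron code: `TinyPool`, `count_spawns`, `returns_immediately` and `waits_for_work` are hypothetical names invented for this example, and the sketch assumes that a bounded pool's spawn_n blocks only while every slot is occupied. It is not the upstream patch.

```python
import queue
import threading
import time


class TinyPool:
    """Hypothetical stand-in for a bounded worker pool (not eventlet's GreenPool)."""

    def __init__(self, size):
        self._slots = threading.BoundedSemaphore(size)

    def spawn_n(self, fn):
        # Blocks only while every slot is taken; otherwise returns at once.
        self._slots.acquire()

        def run():
            try:
                fn()
            except Exception:
                pass  # as with spawn_n, the caller never sees the failure
            finally:
                self._slots.release()

        threading.Thread(target=run, daemon=True).start()


def count_spawns(worker, window=0.1, pool_size=2):
    """Run the `while True: pool.spawn_n(...)` pattern for roughly `window` seconds."""
    pool = TinyPool(pool_size)
    spawns = 0
    deadline = time.monotonic() + window
    while time.monotonic() < deadline:
        pool.spawn_n(worker)
        spawns += 1
    return spawns


work = queue.Queue()  # stays empty, as on an idle agent


def returns_immediately():
    # Non-blocking read: finishes right away when there is nothing to do,
    # so its pool slot is freed at once and the loop can spawn again.
    try:
        work.get_nowait()
    except queue.Empty:
        return


def waits_for_work():
    # Blocking read: holds its pool slot until work arrives or the timeout expires.
    try:
        work.get(timeout=0.3)
    except queue.Empty:
        return


busy = count_spawns(returns_immediately)
idle = count_spawns(waits_for_work)
print(f"spawns with non-blocking worker: {busy}")
print(f"spawns with blocking worker:     {idle}")
```

With the worker that returns immediately, the loop is never throttled by the pool and keeps spawning for the whole window, which is the kind of behavior that would show up as 100% CPU in "top". With the worker that blocks on the queue, the pool fills up and the loop waits in spawn_n instead.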
The fix is in upstream stable/queens as of April 5th: https://review.openstack.org/#/c/554258/