Observed at journal logs a repetition of executing a kill operation coming utils.py and a exemption triggered when invoking greenthread.sleep(0) ending with neutron AssertionError: do not call blocking functions from the mainloop. Neutron rootwrap daemon is recorded as succesfully stopped. 4147595 Oct 15 14:21:37 esclos041019.eecloud.nsn-net.net neutron-keepalived-state-change[457714]: 2017-10-15 14:21:37.818 457714 INFO oslo_rootwrap.client [-] Spawned new rootwrap daemon process with pid=600551 4147596 Oct 15 14:21:37 esclos041019.eecloud.nsn-net.net neutron-keepalived-state-change[457714]: 2017-10-15 14:21:37.832 457714 DEBUG neutron.agent.linux.utils [-] Exit code: 0 execute /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:158 4147597 Oct 15 14:21:37 esclos041019.eecloud.nsn-net.net neutron-keepalived-state-change[457714]: 2017-10-15 14:21:37.833 457714 CRITICAL neutron [-] AssertionError: do not call blocking functions from the mainloop 4147598 2017-10-15 14:21:37.833 457714 ERROR neutron Traceback (most recent call last): 4147599 2017-10-15 14:21:37.833 457714 ERROR neutron File "/bin/neutron-keepalived-state-change", line 10, in <module> 4147600 2017-10-15 14:21:37.833 457714 ERROR neutron sys.exit(main()) 4147601 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/cmd/keepalived_state_change.py", line 19, in main 4147602 2017-10-15 14:21:37.833 457714 ERROR neutron keepalived_state_change.main() 4147603 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/agent/l3/keepalived_state_change.py", line 157, in main 4147604 2017-10-15 14:21:37.833 457714 ERROR neutron cfg.CONF.monitor_cidr).start() 4147605 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/agent/linux/daemon.py", line 249, in start 4147606 2017-10-15 14:21:37.833 457714 ERROR neutron self.run() 4147607 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/agent/l3/keepalived_state_change.py", line 70, in run 4147608 2017-10-15 14:21:37.833 457714 ERROR neutron for iterable in self.monitor: 4147609 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/agent/linux/async_process.py", line 256, in _iter_queue 4147610 2017-10-15 14:21:37.833 457714 ERROR neutron yield queue.get(block=block) 4147611 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/eventlet/queue.py", line 313, in get 4147612 2017-10-15 14:21:37.833 457714 ERROR neutron return waiter.wait() 4147613 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/eventlet/queue.py", line 141, in wait 4147614 2017-10-15 14:21:37.833 457714 ERROR neutron return get_hub().switch() 4147615 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 294, in switch 4147616 2017-10-15 14:21:37.833 457714 ERROR neutron return self.greenlet.switch() 4147617 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 346, in run 4147618 2017-10-15 14:21:37.833 457714 ERROR neutron self.wait(sleep_time) 4147619 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/eventlet/hubs/poll.py", line 85, in wait 4147620 2017-10-15 14:21:37.833 457714 ERROR neutron presult = self.do_poll(seconds) 4147621 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/eventlet/hubs/epolls.py", line 62, in do_poll 4147622 2017-10-15 14:21:37.833 457714 ERROR neutron return self.poll.poll(seconds) 4147623 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/agent/l3/keepalived_state_change.py", line 134, in handle_sigterm 4147624 2017-10-15 14:21:37.833 457714 ERROR neutron self._kill_monitor() 4147625 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/agent/l3/keepalived_state_change.py", line 131, in _kill_monitor 4147626 2017-10-15 14:21:37.833 457714 ERROR neutron run_as_root=True) 4147627 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 244, in kill_process 4147628 2017-10-15 14:21:37.833 457714 ERROR neutron execute(['kill', '-%d' % signal, pid], run_as_root=run_as_root) 4147629 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 164, in execute 4147630 2017-10-15 14:21:37.833 457714 ERROR neutron greenthread.sleep(0) 4147631 2017-10-15 14:21:37.833 457714 ERROR neutron File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 31, in sleep 4147632 2017-10-15 14:21:37.833 457714 ERROR neutron assert hub.greenlet is not current, 'do not call blocking functions from the mainloop' 4147633 2017-10-15 14:21:37.833 457714 ERROR neutron AssertionError: do not call blocking functions from the mainloop 4147634 2017-10-15 14:21:37.833 457714 ERROR neutron 4147635 Oct 15 14:21:37 esclos041019.eecloud.nsn-net.net neutron-keepalived-state-change[457714]: 2017-10-15 14:21:37.836 457714 INFO oslo_rootwrap.client [-] Stopping rootwrap daemon process with pid=600551 4147636 Oct 15 14:21
Ihar has experience with this trace, he's gonna provide more information.
The error message was always there, but recently we started to log keepalived-state-change messages to syslog, at which point it started to trigger monitoring tools. There is no known functional deficiency when the error happens. It occurs when the router is deleted, and the -change daemon receives a SIGTERM. At this point, it's not a problem that the daemon doesn't shuts down gracefully, since the process has nothing to clean up after itself anyway. I agree that this should be solved, but lowering the priority/severity to reflect the description above.
Actually, I believe I won't find time for the bug in next weeks, unassigning.
Fails ROI & prioritization, unfortunately our capacity is limited and we have more urgent issues to fix.