Bug 1502282

Summary: neutron-keepalived-state-change faiining with AssertionError: do not call blocking functions from the mainloop
Product: Red Hat OpenStack Reporter: Pablo Caruana <pcaruana>
Component: openstack-neutronAssignee: Assaf Muller <amuller>
Status: CLOSED WONTFIX QA Contact: Toni Freger <tfreger>
Severity: low Docs Contact:
Priority: medium    
Version: 11.0 (Ocata)CC: akaris, amuller, bperkins, chrisw, ihrachys, ipetrova, jlibosva, nyechiel, ojanas, pablo.iranzo, srevivo
Target Milestone: ---Keywords: Triaged, ZStream
Target Release: 11.0 (Ocata)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-01 12:26:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Pablo Caruana 2017-10-15 14:08:34 UTC
Observed at journal logs a repetition of  executing a kill operation coming utils.py and a exemption triggered when invoking   greenthread.sleep(0) ending with neutron AssertionError: do not call blocking functions from the mainloop.
Neutron rootwrap daemon is recorded as succesfully stopped.

4147595 Oct 15 14:21:37 esclos041019.eecloud.nsn-net.net neutron-keepalived-state-change[457714]: 2017-10-15 14:21:37.818 457714 INFO oslo_rootwrap.client [-] Spawned new rootwrap daemon process with pid=600551
4147596 Oct 15 14:21:37 esclos041019.eecloud.nsn-net.net neutron-keepalived-state-change[457714]: 2017-10-15 14:21:37.832 457714 DEBUG neutron.agent.linux.utils [-] Exit code: 0 execute /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:158
4147597 Oct 15 14:21:37 esclos041019.eecloud.nsn-net.net neutron-keepalived-state-change[457714]: 2017-10-15 14:21:37.833 457714 CRITICAL neutron [-] AssertionError: do not call blocking functions from the mainloop
4147598   2017-10-15 14:21:37.833 457714 ERROR neutron Traceback (most recent call last):
4147599   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/bin/neutron-keepalived-state-change", line 10, in <module>
4147600   2017-10-15 14:21:37.833 457714 ERROR neutron     sys.exit(main())
4147601   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/cmd/keepalived_state_change.py", line 19, in main
4147602   2017-10-15 14:21:37.833 457714 ERROR neutron     keepalived_state_change.main()
4147603   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/keepalived_state_change.py", line 157, in main
4147604   2017-10-15 14:21:37.833 457714 ERROR neutron     cfg.CONF.monitor_cidr).start()
4147605   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/daemon.py", line 249, in start
4147606   2017-10-15 14:21:37.833 457714 ERROR neutron     self.run()
4147607   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/keepalived_state_change.py", line 70, in run
4147608   2017-10-15 14:21:37.833 457714 ERROR neutron     for iterable in self.monitor:
4147609   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/async_process.py", line 256, in _iter_queue
4147610   2017-10-15 14:21:37.833 457714 ERROR neutron     yield queue.get(block=block)
4147611   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/eventlet/queue.py", line 313, in get
4147612   2017-10-15 14:21:37.833 457714 ERROR neutron     return waiter.wait()
4147613   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/eventlet/queue.py", line 141, in wait
4147614   2017-10-15 14:21:37.833 457714 ERROR neutron     return get_hub().switch()
4147615   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 294, in switch
4147616   2017-10-15 14:21:37.833 457714 ERROR neutron     return self.greenlet.switch()
4147617   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 346, in run
4147618   2017-10-15 14:21:37.833 457714 ERROR neutron     self.wait(sleep_time)
4147619   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/eventlet/hubs/poll.py", line 85, in wait
4147620   2017-10-15 14:21:37.833 457714 ERROR neutron     presult = self.do_poll(seconds)
4147621   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/eventlet/hubs/epolls.py", line 62, in do_poll
4147622   2017-10-15 14:21:37.833 457714 ERROR neutron     return self.poll.poll(seconds)
4147623   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/keepalived_state_change.py", line 134, in handle_sigterm
4147624   2017-10-15 14:21:37.833 457714 ERROR neutron     self._kill_monitor()
4147625   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/keepalived_state_change.py", line 131, in _kill_monitor
4147626   2017-10-15 14:21:37.833 457714 ERROR neutron     run_as_root=True)
4147627   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 244, in kill_process
4147628   2017-10-15 14:21:37.833 457714 ERROR neutron     execute(['kill', '-%d' % signal, pid], run_as_root=run_as_root)
4147629   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 164, in execute
4147630   2017-10-15 14:21:37.833 457714 ERROR neutron     greenthread.sleep(0)
4147631   2017-10-15 14:21:37.833 457714 ERROR neutron   File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 31, in sleep
4147632   2017-10-15 14:21:37.833 457714 ERROR neutron     assert hub.greenlet is not current, 'do not call blocking functions from the mainloop'
4147633   2017-10-15 14:21:37.833 457714 ERROR neutron AssertionError: do not call blocking functions from the mainloop
4147634   2017-10-15 14:21:37.833 457714 ERROR neutron
4147635 Oct 15 14:21:37 esclos041019.eecloud.nsn-net.net neutron-keepalived-state-change[457714]: 2017-10-15 14:21:37.836 457714 INFO oslo_rootwrap.client [-] Stopping rootwrap daemon process with pid=600551
4147636 Oct 15 14:21

Comment 3 Jakub Libosvar 2017-10-16 13:41:43 UTC
Ihar has experience with this trace, he's gonna provide more information.

Comment 7 Ihar Hrachyshka 2017-10-17 21:53:08 UTC
The error message was always there, but recently we started to log keepalived-state-change messages to syslog, at which point it started to trigger monitoring tools. There is no known functional deficiency when the error happens. It occurs when the router is deleted, and the -change daemon receives a SIGTERM. At this point, it's not a problem that the daemon doesn't shuts down gracefully, since the process has nothing to clean up after itself anyway.

I agree that this should be solved, but lowering the priority/severity to reflect the description above.

Comment 9 Ihar Hrachyshka 2017-10-20 17:44:13 UTC
Actually, I believe I won't find time for the bug in next weeks, unassigning.

Comment 10 Assaf Muller 2018-04-01 12:26:58 UTC
Fails ROI & prioritization, unfortunately our capacity is limited and we have more urgent issues to fix.