Description of problem: When starting nova-network immediately before nova-cert: systemctl start openstack-nova-network.service openstack-nova-cert.service nova-network crashes. When started in the opposite order: systemctl start openstack-nova-cert.service openstack-nova-network.service everything works fine. Everything also works when any other nova-* service is substituted for nova-cert in the first command above. Log: Sep 16 14:36:18 mrg-04 nova-network[17166]: Traceback (most recent call last): Sep 16 14:36:18 mrg-04 nova-network[17166]: File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 336, in fire_timers Sep 16 14:36:18 mrg-04 nova-network[17166]: timer() Sep 16 14:36:18 mrg-04 nova-network[17166]: File "/usr/lib/python2.7/site-packages/eventlet/hubs/timer.py", line 56, in __call__ Sep 16 14:36:18 mrg-04 nova-network[17166]: cb(*args, **kw) Sep 16 14:36:18 mrg-04 nova-network[17166]: File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 192, in main Sep 16 14:36:18 mrg-04 nova-network[17166]: result = function(*args, **kwargs) Sep 16 14:36:18 mrg-04 nova-network[17166]: File "/usr/lib/python2.7/site-packages/nova/service.py", line 101, in run_server Sep 16 14:36:18 mrg-04 nova-network[17166]: server.start() Sep 16 14:36:18 mrg-04 nova-network[17166]: File "/usr/lib/python2.7/site-packages/nova/service.py", line 162, in start Sep 16 14:36:18 mrg-04 nova-network[17166]: self.manager.init_host() Sep 16 14:36:18 mrg-04 nova-network[17166]: File "/usr/lib/python2.7/site-packages/nova/network/manager.py", line 1702, in init_host Sep 16 14:36:18 mrg-04 nova-network[17166]: self.l3driver.initialize() Sep 16 14:36:18 mrg-04 nova-network[17166]: File "/usr/lib/python2.7/site-packages/nova/network/l3.py", line 84, in initialize Sep 16 14:36:18 mrg-04 nova-network[17166]: linux_net.init_host() Sep 16 14:36:18 mrg-04 nova-network[17166]: File "/usr/lib/python2.7/site-packages/nova/network/linux_net.py", line 442, in init_host Sep 16 14:36:18 mrg-04 nova-network[17166]: add_snat_rule(ip_range) Sep 16 14:36:18 mrg-04 nova-network[17166]: File "/usr/lib/python2.7/site-packages/nova/network/linux_net.py", line 432, in add_snat_rule Sep 16 14:36:18 mrg-04 nova-network[17166]: iptables_manager.apply() Sep 16 14:36:18 mrg-04 nova-network[17166]: File "/usr/lib/python2.7/site-packages/nova/utils.py", line 944, in inner Sep 16 14:36:18 mrg-04 nova-network[17166]: retval = f(*args, **kwargs) Sep 16 14:36:18 mrg-04 nova-network[17166]: File "/usr/lib/python2.7/site-packages/lockfile/__init__.py", line 233, in __exit__ Sep 16 14:36:18 mrg-04 nova-network[17166]: self.release() Sep 16 14:36:18 mrg-04 nova-network[17166]: File "/usr/lib/python2.7/site-packages/lockfile/linklockfile.py", line 52, in release Sep 16 14:36:18 mrg-04 nova-network[17166]: raise NotLocked Sep 16 14:36:18 mrg-04 nova-network[17166]: NotLocked Sep 16 14:36:18 mrg-04 nova-network[17166]: 2012-09-16 14:36:18 CRITICAL nova [-] Sep 16 14:36:18 mrg-04 systemd[1]: openstack-nova-network.service: main process exited, code=exited, status=1 Sep 16 14:36:18 mrg-04 systemd[1]: Unit openstack-nova-network.service entered failed state. The cause is the SELinux policy for nova-cert. This prevents nova-cert from determining if nova-network is running by doing kill(pid, 0) in nova.utils.cleanup_file_locks(). The nova utils code interprets the denial of permission as meaning that the process is not running, and consequently deletes any locks that are in use. When nova-network, which is indeed still running, goes to delete its lock and finds it gone, it raises an exception. Audit log: type=AVC msg=audit(1347820578.676:7219): avc: denied { signull } for pid=1716 7 comm="nova-cert" scontext=system_u:system_r:nova_cert_t:s0 tcontext=system_u:s ystem_r:nova_network_t:s0 tclass=process Version-Release number of selected component (if applicable): openstack-nova-2012.1.1-15.fc17 openstack-nova-network-2012.1.1-15.fc17 openstack-nova-cert-2012.1.1-15.fc17 selinux-policy-3.10.0-146.fc17 How reproducible: 100% on a very fast machine in the lab. The timing may be different on other machines. Steps to Reproduce: 1. systemctl start openstack-nova-network.service openstack-nova-cert.service 2. ... 3. Profit! Actual results: # systemctl status openstack-nova-network.serviceopenstack-nova-network.service - OpenStack Nova Network Server Loaded: loaded (/usr/lib/systemd/system/openstack-nova-network.service; disabled) Active: failed (Result: exit-code) since Sun, 16 Sep 2012 14:36:18 -0400; 7s ago Process: 17166 ExecStart=/usr/bin/nova-network --config-file /etc/nova/nova.conf --logfile /var/log/nova/network.log (code=exited, status=1/FAILURE) CGroup: name=systemd:/system/openstack-nova-network.service Sep 16 14:36:18 mrg-04.mpc.lab.eng.bos.redhat.com nova-network[17166]: File "... Sep 16 14:36:18 mrg-04.mpc.lab.eng.bos.redhat.com nova-network[17166]: iptabl... Sep 16 14:36:18 mrg-04.mpc.lab.eng.bos.redhat.com nova-network[17166]: File "... Sep 16 14:36:18 mrg-04.mpc.lab.eng.bos.redhat.com nova-network[17166]: retval... Sep 16 14:36:18 mrg-04.mpc.lab.eng.bos.redhat.com nova-network[17166]: File "... Sep 16 14:36:18 mrg-04.mpc.lab.eng.bos.redhat.com nova-network[17166]: self.r... Sep 16 14:36:18 mrg-04.mpc.lab.eng.bos.redhat.com nova-network[17166]: File "... Sep 16 14:36:18 mrg-04.mpc.lab.eng.bos.redhat.com nova-network[17166]: raise ... Sep 16 14:36:18 mrg-04.mpc.lab.eng.bos.redhat.com nova-network[17166]: NotLocked Sep 16 14:36:18 mrg-04.mpc.lab.eng.bos.redhat.com nova-network[17166]: 2012-0... Expected results: # systemctl status openstack-nova-network.serviceopenstack-nova-network.service - OpenStack Nova Network Server Loaded: loaded (/usr/lib/systemd/system/openstack-nova-network.service; disabled) Active: active (running) since Sun, 16 Sep 2012 14:39:46 -0400; 3s ago Main PID: 17224 (nova-network) CGroup: name=systemd:/system/openstack-nova-network.service └ 17224 /usr/bin/python /usr/bin/nova-network --config-fil... Sep 16 14:39:46 mrg-04.mpc.lab.eng.bos.redhat.com sudo[17241]: nova : TTY=unk... Sep 16 14:39:46 mrg-04.mpc.lab.eng.bos.redhat.com sudo[17244]: nova : TTY=unk... Sep 16 14:39:47 mrg-04.mpc.lab.eng.bos.redhat.com sudo[17247]: nova : TTY=unk... Sep 16 14:39:47 mrg-04.mpc.lab.eng.bos.redhat.com sudo[17250]: nova : TTY=unk... Sep 16 14:39:47 mrg-04.mpc.lab.eng.bos.redhat.com sudo[17253]: nova : TTY=unk... Sep 16 14:39:47 mrg-04.mpc.lab.eng.bos.redhat.com sudo[17256]: nova : TTY=unk... Sep 16 14:39:47 mrg-04.mpc.lab.eng.bos.redhat.com sudo[17259]: nova : TTY=unk... Sep 16 14:39:47 mrg-04.mpc.lab.eng.bos.redhat.com sudo[17262]: nova : TTY=unk... Sep 16 14:39:47 mrg-04.mpc.lab.eng.bos.redhat.com sudo[17265]: nova : TTY=unk... Sep 16 14:39:47 mrg-04.mpc.lab.eng.bos.redhat.com nova-network[17224]: 2012-0... Additional info: It seems odd that SELinux denies permission to do kill(pid, 0), since this does not actually send a signal. However, another potential fix would be to check the errno of the OSError exception raised in nova.utils.cleanup_file_locks(), rather than just assuming that an exception means the process is not running.
Thanks for the excellent analysis Zane. Note we got rid of cleanup_file_locks completely in the OpenStack Folsom release. It's tempting to backport that patch, but I've proposed a simpler fix at: https://review.openstack.org/13106 thanks!
This is now upstream and should be released very soon for F17
This message is a reminder that Fedora 17 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 17. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '17'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 17's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 17 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged change the 'version' to a later Fedora version prior to Fedora 17's end of life. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.