Description of problem:
When testing OSP13 with the latest RHEL 7.6 partner snapshot, neutron-openvswitch-agent was found to be constantly crashing and restarting on the undercloud with the following traceback:

2018-09-13 13:41:05.030 30641 ERROR neutron Traceback (most recent call last):
2018-09-13 13:41:05.030 30641 ERROR neutron   File "/usr/bin/neutron-openvswitch-agent", line 10, in <module>
2018-09-13 13:41:05.030 30641 ERROR neutron     sys.exit(main())
2018-09-13 13:41:05.030 30641 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/cmd/eventlet/plugins/ovs_neutron_agent.py", line 20, in main
2018-09-13 13:41:05.030 30641 ERROR neutron     agent_main.main()
2018-09-13 13:41:05.030 30641 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/main.py", line 47, in main
2018-09-13 13:41:05.030 30641 ERROR neutron     mod.main()
2018-09-13 13:41:05.030 30641 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/main.py", line 35, in main
2018-09-13 13:41:05.030 30641 ERROR neutron     'neutron.plugins.ml2.drivers.openvswitch.agent.'
2018-09-13 13:41:05.030 30641 ERROR neutron   File "/usr/lib/python2.7/site-packages/ryu/base/app_manager.py", line 375, in run_apps
2018-09-13 13:41:05.030 30641 ERROR neutron     hub.joinall(services)
2018-09-13 13:41:05.030 30641 ERROR neutron   File "/usr/lib/python2.7/site-packages/ryu/lib/hub.py", line 103, in joinall
2018-09-13 13:41:05.030 30641 ERROR neutron     t.wait()
2018-09-13 13:41:05.030 30641 ERROR neutron   File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 175, in wait
2018-09-13 13:41:05.030 30641 ERROR neutron     return self._exit_event.wait()
2018-09-13 13:41:05.030 30641 ERROR neutron   File "/usr/lib/python2.7/site-packages/eventlet/event.py", line 125, in wait
2018-09-13 13:41:05.030 30641 ERROR neutron     current.throw(*self._exc)
2018-09-13 13:41:05.030 30641 ERROR neutron   File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 214, in main
2018-09-13 13:41:05.030 30641 ERROR neutron     result = function(*args, **kwargs)
2018-09-13 13:41:05.030 30641 ERROR neutron   File "/usr/lib/python2.7/site-packages/ryu/lib/hub.py", line 65, in _launch
2018-09-13 13:41:05.030 30641 ERROR neutron     raise e
2018-09-13 13:41:05.030 30641 ERROR neutron Exception: Failed to spawn rootwrap process.
2018-09-13 13:41:05.030 30641 ERROR neutron stderr:
2018-09-13 13:41:05.030 30641 ERROR neutron sudo: PAM account management error: Authentication service cannot retrieve authentication info

Version-Release number of selected component (if applicable):
OSP13 Puddle 2018-09-11.1 with RHEL 7.6 Partner Snapshot 2

How reproducible:
Deploy an OSP undercloud with the versions above and observe tracebacks in /var/log/neutron/openvswitch-agent.log

Steps to Reproduce:
1. Deploy an undercloud based on a RHEL 7.5 image
2. Enable OSP puddle and RHEL 7.6 snapshot repositories
3. yum update -y
4. Install undercloud normally
5. Observe tracebacks in /var/log/neutron/openvswitch-agent.log
6. Observe avc deny messages in /var/log/audit/audit.log

Actual results:
neutron-openvswitch-agent is in a crash loop and there are many SELinux denials logged

Expected results:
neutron-openvswitch-agent should run normally

Additional info:
[root@undercloud selinux]# cat /var/log/audit/audit.log | audit2allow

#============= neutron_t ==============
allow neutron_t chkpwd_exec_t:file { execute execute_no_trans open read };
allow neutron_t pam_var_run_t:file { read write };
allow neutron_t sendmail_exec_t:file execute;
allow neutron_t shadow_t:file { getattr open read };
allow neutron_t sudo_db_t:dir search;
allow neutron_t var_log_t:file { create open };

[root@undercloud selinux]# rpm -qa | grep openstack-selinux
openstack-selinux-0.8.14-14.el7ost.noarch

[root@undercloud selinux]# rpm -qa | grep selinux
openvswitch-selinux-extra-policy-1.0-5.el7fdp.noarch
libselinux-utils-2.5-14.1.el7.x86_64
openstack-selinux-0.8.14-14.el7ost.noarch
libselinux-python-2.5-14.1.el7.x86_64
selinux-policy-3.13.1-223.el7.noarch
libselinux-2.5-14.1.el7.x86_64
container-selinux-2.68-1.el7.noarch
selinux-policy-targeted-3.13.1-223.el7.noarch
libselinux-ruby-2.5-14.1.el7.x86_64
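
For anyone triaging a similar audit.log without audit2allow at hand, the grouping it performs can be approximated with a short script. This is only a rough sketch, not how audit2allow is implemented: the function names and the SAMPLE records are made up for illustration (they are not taken from the attached log), and it assumes the usual "avc: denied { ... } ... scontext=... tcontext=... tclass=..." record layout.

```python
import re
from collections import defaultdict

# Pull the denied permissions, both contexts and the object class out of an
# AVC record. The layout is assumed, not verified against the attached log.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{(?P<perms>[^}]*)\}.*?"
    r"scontext=(?P<scontext>\S+)\s+"
    r"tcontext=(?P<tcontext>\S+)\s+"
    r"tclass=(?P<tclass>\S+)"
)


def selinux_type(context):
    """Return the type field of a user:role:type:level context string."""
    return context.split(":")[2]


def summarize_denials(lines):
    """Group denied permissions by (source type, target type, object class)."""
    rules = defaultdict(set)
    for line in lines:
        match = AVC_RE.search(line)
        if match is None:
            continue  # not an AVC denial record
        key = (
            selinux_type(match.group("scontext")),
            selinux_type(match.group("tcontext")),
            match.group("tclass"),
        )
        rules[key].update(match.group("perms").split())
    return rules


def format_rules(rules):
    """Render the grouped denials as audit2allow-style allow rules."""
    out = []
    for (source, target, tclass), perms in sorted(rules.items()):
        perm_list = sorted(perms)
        if len(perm_list) == 1:
            perm_text = perm_list[0]
        else:
            perm_text = "{ " + " ".join(perm_list) + " }"
        out.append("allow %s %s:%s %s;" % (source, target, tclass, perm_text))
    return out


# Hypothetical records, written by hand for this sketch.
SAMPLE = [
    'type=AVC msg=audit(0.0:1): avc:  denied  { execute } for  pid=1 '
    'comm="sudo" scontext=system_u:system_r:neutron_t:s0 '
    'tcontext=system_u:object_r:chkpwd_exec_t:s0 tclass=file',
    'type=AVC msg=audit(0.0:2): avc:  denied  { read open } for  pid=1 '
    'comm="sudo" scontext=system_u:system_r:neutron_t:s0 '
    'tcontext=system_u:object_r:chkpwd_exec_t:s0 tclass=file',
    'type=AVC msg=audit(0.0:3): avc:  denied  { search } for  pid=1 '
    'comm="sudo" scontext=system_u:system_r:neutron_t:s0 '
    'tcontext=system_u:object_r:sudo_db_t:s0 tclass=dir',
]

if __name__ == "__main__":
    for rule in format_rules(summarize_denials(SAMPLE)):
        print(rule)
```

To run it against a real log, pass an open file object to summarize_denials() in place of SAMPLE. The result only lists what was denied; it says nothing about whether allowing it is the right policy fix.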
allow neutron_t chkpwd_exec_t:file { execute execute_no_trans open read };
^ This has been seen before. It seems the way unix_chkpwd is executed has changed, or else something is different in the sudo stack that breaks the existing policies here.
This behavior does not occur on 7.5 and prior.
Is there an audit.log we can attach here?
I did not capture the audit log from the DCI environment before it was destroyed. I also won't be able to work on reproducing this issue in my own lab until next week at the earliest.
Created attachment 1487734
audit.log gathered with permissive mode set

audit.log gathered by our QA colleagues with SELinux in Permissive mode
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:3435