Description of problem:
Deployment of ceph-metrics fails at collectd stage because the service fails to start. This is displayed in the system journal:

Jan 09 11:52:26 MON0 collectd[16916]: plugin_load: plugin "python" successfully loaded.
Jan 09 11:52:26 MON0 collectd[16916]: plugin_load: plugin "threshold" successfully loaded.
Jan 09 11:52:26 MON0 collectd[16916]: plugin_load: plugin "aggregation" successfully loaded.
Jan 09 11:52:26 MON0 collectd[16916]: cephmetrics: Event messages enabled for target http://CLIENT0:8080/events/
Jan 09 11:52:26 MON0 collectd[16916]: Unhandled python exception in loading module: OSError: [Errno 13] Permission denied
Jan 09 11:52:26 MON0 collectd[16916]: Traceback (most recent call last):
Jan 09 11:52:26 MON0 collectd[16916]:   File "/usr/lib64/collectd/cephmetrics/cephmetrics.py", line 129, in configure_callback
Jan 09 11:52:26 MON0 collectd[16916]:     CEPH.probe()
Jan 09 11:52:26 MON0 collectd[16916]:   File "/usr/lib64/collectd/cephmetrics/cephmetrics.py", line 51, in probe
Jan 09 11:52:26 MON0 collectd[16916]:     self.iscsi = ISCSIGateway(self, self.cluster_name)
Jan 09 11:52:26 MON0 collectd[16916]:   File "/usr/lib64/collectd/cephmetrics/collectors/iscsi.py", line 111, in __init__
Jan 09 11:52:26 MON0 collectd[16916]:     self._root = RTSRoot()
Jan 09 11:52:26 MON0 collectd[16916]:   File "/usr/lib/python2.7/site-packages/rtslib_fb/root.py", line 74, in __init__
Jan 09 11:52:26 MON0 collectd[16916]:     mount_configfs()
Jan 09 11:52:26 MON0 collectd[16916]:   File "/usr/lib/python2.7/site-packages/rtslib_fb/utils.py", line 438, in mount_configfs
Jan 09 11:52:26 MON0 collectd[16916]:     stderr=subprocess.PIPE)
Jan 09 11:52:26 MON0 collectd[16916]:   File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
Jan 09 11:52:26 MON0 collectd[16916]:     errread, errwrite)
Jan 09 11:52:26 MON0 collectd[16916]:   File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
Jan 09 11:52:26 MON0 collectd[16916]:     raise child_exception
Jan 09 11:52:26 MON0 collectd[16916]: OSError: [Errno 13] Permission denied
Jan 09 11:52:26 MON0 collectd[16916]: plugin_load: plugin "cpu" successfully loaded.
Jan 09 11:52:26 MON0 collectd[16916]: plugin_load: plugin "memory" successfully loaded.
Jan 09 11:52:26 MON0 collectd[16916]: plugin_load: plugin "interface" successfully loaded.
Jan 09 11:52:26 MON0 collectd[16916]: plugin_load: plugin "write_graphite" successfully loaded.
Jan 09 11:52:26 MON0 collectd[16916]: Error: Reading the config file failed!
Jan 09 11:52:26 MON0 collectd[16916]: Read the logs for details.
Jan 09 11:52:26 MON0 systemd[1]: collectd.service: main process exited, code=exited, status=1/FAILURE

In audit.log you can see that SELinux is blocking collectd from accessing resources:

[...]
type=AVC msg=audit(1515500056.751:5010): avc: denied { getattr } for pid=25396 comm="gwcli" path="/usr/bin/rpm" dev="dm-0" ino=25314520 scontext=system_u:system_r:collectd_t:s0 tcontext=system_u:object_r:rpm_exec_t:s0 tclass=file
type=SYSCALL msg=audit(1515500056.751:5010): arch=c000003e syscall=4 success=no exit=-13 a0=2d79270 a1=7ffc0fa916b0 a2=7ffc0fa916b0 a3=7f6821e26610 items=0 ppid=25388 pid=25396 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="gwcli" exe="/usr/bin/python2.7" subj=system_u:system_r:collectd_t:s0 key=(null)
type=PROCTITLE msg=audit(1515500056.751:5010): proctitle=2F7573722F62696E2F707974686F6E002F7573722F62696E2F6777636C69002D76
type=AVC msg=audit(1515500056.818:5011): avc: denied { search } for pid=25388 comm="collectd" name="/" dev="configfs" ino=7700 scontext=system_u:system_r:collectd_t:s0 tcontext=system_u:object_r:configfs_t:s0 tclass=dir
type=SYSCALL msg=audit(1515500056.818:5011): arch=c000003e syscall=6 success=no exit=-13 a0=55a7666312a0 a1=7ffe455935a0 a2=7ffe455935a0 a3=0 items=0 ppid=1 pid=25388 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="collectd" exe="/usr/sbin/collectd" subj=system_u:system_r:collectd_t:s0 key=(null)
type=PROCTITLE msg=audit(1515500056.818:5011): proctitle="/usr/sbin/collectd"
type=AVC msg=audit(1515500056.818:5012): avc: denied { execute } for pid=25405 comm="collectd" name="mount" dev="dm-0" ino=25314940 scontext=system_u:system_r:collectd_t:s0 tcontext=system_u:object_r:mount_exec_t:s0 tclass=file
type=SYSCALL msg=audit(1515500056.818:5012): arch=c000003e syscall=59 success=no exit=-13 a0=55a7666312a0 a1=55a766631c50 a2=7ffe45595538 a3=0 items=0 ppid=25388 pid=25405 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="collectd" exe="/usr/sbin/collectd" subj=system_u:system_r:collectd_t:s0 key=(null)
type=PROCTITLE msg=audit(1515500056.818:5012): proctitle="/usr/sbin/collectd"
type=SERVICE_START msg=audit(1515500056.830:5013): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=collectd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
[...]

Version-Release number of selected component (if applicable):
ceph-selinux-12.2.1-40.el7cp.x86_64
selinux-policy-targeted-3.13.1-166.el7_4.7.noarch
pcp-selinux-3.11.8-7.el7.x86_64
collectd-python-5.7.2-1.el7cp.x86_64
collectd-5.7.2-1.el7cp.x86_64
cephmetrics-collectors-1.0-7.el7cp.x86_64
cephmetrics-1.0-7.el7cp.x86_64
cephmetrics-ansible-1.0-7.el7cp.x86_64
cephmetrics-grafana-plugins-1.0-7.el7cp.x86_64

How reproducible:
Install ceph-metrics with SELinux enabled on the Ceph nodes.

Additional info:
collectd starts normally as soon as I temporarily disable SELinux:
setenforce 0
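
For anyone triaging similar failures, the AVC records above can be condensed into a per-target summary of what collectd_t was denied. The following is only a rough sketch, not part of cephmetrics or of the eventual fix: the names AVC_RE, selinux_type, summarize_denials and SAMPLE are made up for illustration, and the sample lines are the three AVC records quoted in this report. On a real system, audit2allow is the usual tool for this job.

```python
import re
from collections import defaultdict

# Matches the parts of an AVC denial record we care about.
# Illustrative pattern only; it assumes the field order seen in this report.
AVC_RE = re.compile(
    r'avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?'
    r'scontext=(?P<scontext>\S+)\s+'
    r'tcontext=(?P<tcontext>\S+)\s+'
    r'tclass=(?P<tclass>\S+)'
)


def selinux_type(context):
    """Return the type field of a user:role:type:level context string."""
    return context.split(":")[2]


def summarize_denials(lines):
    """Group denied permissions by (source type, target type, object class)."""
    summary = defaultdict(set)
    for line in lines:
        match = AVC_RE.search(line)
        if match is None:
            continue  # SYSCALL, PROCTITLE and other record types are skipped
        key = (
            selinux_type(match["scontext"]),
            selinux_type(match["tcontext"]),
            match["tclass"],
        )
        summary[key].update(match["perms"].split())
    return dict(summary)


# The three AVC records from the audit.log excerpt in this report.
SAMPLE = [
    'type=AVC msg=audit(1515500056.751:5010): avc: denied { getattr } for '
    'pid=25396 comm="gwcli" path="/usr/bin/rpm" dev="dm-0" ino=25314520 '
    'scontext=system_u:system_r:collectd_t:s0 '
    'tcontext=system_u:object_r:rpm_exec_t:s0 tclass=file',
    'type=AVC msg=audit(1515500056.818:5011): avc: denied { search } for '
    'pid=25388 comm="collectd" name="/" dev="configfs" ino=7700 '
    'scontext=system_u:system_r:collectd_t:s0 '
    'tcontext=system_u:object_r:configfs_t:s0 tclass=dir',
    'type=AVC msg=audit(1515500056.818:5012): avc: denied { execute } for '
    'pid=25405 comm="collectd" name="mount" dev="dm-0" ino=25314940 '
    'scontext=system_u:system_r:collectd_t:s0 '
    'tcontext=system_u:object_r:mount_exec_t:s0 tclass=file',
]

if __name__ == "__main__":
    for (source, target, tclass), perms in sorted(summarize_denials(SAMPLE).items()):
        print(f"{source} -> {target}:{tclass} denied {sorted(perms)}")
```

Run against the excerpt, this groups the denials into three targets (rpm_exec_t, configfs_t and mount_exec_t), which lines up with the traceback: the iSCSI collector's RTSRoot() call tries to run mount for configfs and is refused.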
We no longer use collectd in 3.1, so retargeting to 3.0 z4.
This should be fixed by this upstream PR: https://github.com/ceph/cephmetrics/pull/186
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2177