Description of problem
======================

This is a tracker BZ for SELinux support in RHGS WA, so that:

* we don't have to disable SELinux system wide (switching the whole OS to
  permissive or disabled state)
* we have either permissive or enforcing domains for RHGS WA components

Version-Release number of selected component
============================================

tendrl-selinux-1.5.3-2.el7rhgs.noarch
Moving this BZ into the SELinux component.
Status update: there are many AVC denials on both server and storage
machines. I need to investigate deeper and see which ones are actually a
problem for us (some of them, e.g. from permissive domains, are expected).

Checking with tendrl-selinux-1.5.4-1.el7rhgs.noarch::

    [root@mbukatov-usm1-server ~]# ausearch -m avc | grep ^time | wc -l
    799
    [root@mbukatov-usm1-gl1 ~]# ausearch -m avc | grep ^time | wc -l
    756
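Since raw counts alone don't show which domains trigger the denials, grouping the records by scontext gives a quick overview before reading individual messages. A minimal sketch of such a pipeline (the sample records in the heredoc are illustrative stand-ins, not taken from the affected machines):

```shell
# Group AVC denial records by source domain (scontext).
# The sample records below stand in for real `ausearch -m avc` output.
cat <<'EOF' > /tmp/avc-sample.log
type=AVC msg=audit(1512909756.261:3444): avc: denied { create } for pid=563 comm=rsyslogd scontext=system_u:system_r:syslogd_t:s0 tcontext=system_u:object_r:tendrl_log_t:s0 tclass=dir
type=AVC msg=audit(1512909757.100:3445): avc: denied { write } for pid=101 comm=tendrl scontext=system_u:system_r:tendrl_t:s0 tcontext=system_u:object_r:etc_t:s0 tclass=file
type=AVC msg=audit(1512909758.200:3446): avc: denied { read } for pid=102 comm=tendrl scontext=system_u:system_r:tendrl_t:s0 tcontext=system_u:object_r:etc_t:s0 tclass=file
EOF
# Extract the scontext field, then count occurrences per source domain.
grep -o 'scontext=[^ ]*' /tmp/avc-sample.log | sort | uniq -c | sort -rn
```

On the real machines, the heredoc would be replaced by `ausearch -m avc` output; denials from permissive domains can then be filtered out in one more `grep -v` step.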
Test Environment
================

Testing with the following packages.

On Gluster Storage Servers::

    libselinux-2.5-11.el7.x86_64
    libselinux-python-2.5-11.el7.x86_64
    libselinux-utils-2.5-11.el7.x86_64
    selinux-policy-3.13.1-166.el7_4.7.noarch
    selinux-policy-targeted-3.13.1-166.el7_4.7.noarch
    tendrl-collectd-selinux-1.5.4-1.el7rhgs.noarch
    tendrl-commons-1.5.4-6.el7rhgs.noarch
    tendrl-gluster-integration-1.5.4-8.el7rhgs.noarch
    tendrl-node-agent-1.5.4-9.el7rhgs.noarch
    tendrl-selinux-1.5.4-1.el7rhgs.noarch

On RHGS WA server::

    carbon-selinux-1.5.4-1.el7rhgs.noarch
    libselinux-2.5-11.el7.x86_64
    libselinux-python-2.5-11.el7.x86_64
    libselinux-utils-2.5-11.el7.x86_64
    selinux-policy-3.13.1-166.el7_4.7.noarch
    selinux-policy-targeted-3.13.1-166.el7_4.7.noarch
    tendrl-ansible-1.5.4-2.el7rhgs.noarch
    tendrl-api-1.5.4-4.el7rhgs.noarch
    tendrl-api-httpd-1.5.4-4.el7rhgs.noarch
    tendrl-commons-1.5.4-6.el7rhgs.noarch
    tendrl-grafana-plugins-1.5.4-11.el7rhgs.noarch
    tendrl-grafana-selinux-1.5.4-1.el7rhgs.noarch
    tendrl-monitoring-integration-1.5.4-11.el7rhgs.noarch
    tendrl-node-agent-1.5.4-9.el7rhgs.noarch
    tendrl-notifier-1.5.4-6.el7rhgs.noarch
    tendrl-selinux-1.5.4-1.el7rhgs.noarch
    tendrl-ui-1.5.4-5.el7rhgs.noarch

On all machines, SELinux is enabled and enforcing (just after standard
installation with tendrl-ansible)::

    SELinux status:                 enabled
    SELinuxfs mount:                /sys/fs/selinux
    SELinux root directory:         /etc/selinux
    Loaded policy name:             targeted
    Current mode:                   enforcing
    Mode from config file:          enforcing
    Policy MLS status:              enabled
    Policy deny_unknown status:     allowed
    Max kernel policy version:      28

Update on implementation
========================

As noted in the description of this BZ and the following comments, we have
SELinux running in enforcing mode, while keeping RHGS WA related domains
running in permissive mode.
The SELinux policy for RHGS WA, the so called *independent SELinux policy*,
is provided in the tendrl-selinux package, which consists of multiple rpm
subpackages:

* carbon-selinux
* tendrl-collectd-selinux
* tendrl-grafana-selinux
* tendrl-selinux

Names of these rpm subpackages are consistent with SELinux package naming
practices (verified a few months ago during a consultation with the SELinux
team). In these packages, we define (and are directly responsible for) the
following SELinux domains, which are all configured as permissive:

* carbon
* grafana
* tendrl

Validation
==========

I performed the following actions on QE virtual machines in the standard
configuration with volume_beta_arbiter_2_plus_1x2:

* Install RHEL 7 on freshly created virtual machines
* Install Gluster there
* Configure a trusted storage pool with volume_beta_arbiter_2_plus_1x2
* Install RHGS WA using tendrl-ansible
* Import the cluster via RHGS WA
* Check all Grafana dashboards via web browser
* Configure both SNMP and SMTP alerting using usmqe-setup playbooks
* Shut down one storage machine
* Reboot all machines

This was necessary to cover most core use cases where SELinux could prevent
some action from being completed. Alerting was included as I'm familiar with
the feature and have the setup completely automated.

Since we have been testing with SELinux in enforcing mode only since last
week, we can't be 100% sure that there are no other special cases where the
incomplete RHGS WA SELinux policy would prevent something from happening,
breaking some RHGS WA feature. For better test coverage here, we would need
to test with this setup from the beginning.
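For reference, domains like carbon, grafana, and tendrl above are made permissive directly in the policy sources via the ``permissive`` statement. A minimal illustrative sketch of what such a type enforcement (.te) file can look like (this is not the actual tendrl-selinux source, just the general shape)::

    policy_module(tendrl, 1.0)

    # Declare the domain type and mark the whole domain permissive:
    # AVC denials for tendrl_t are then logged but not enforced.
    type tendrl_t;
    permissive tendrl_t;

A permissive domain logs every would-be denial, which is exactly what makes the AVC analysis below possible without breaking the services.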
First, checking the list of permissive domains. On the RHGS WA Server::

    # semanage permissive -l

    Customized Permissive Types

    grafana_t
    carbon_t
    tendrl_t

    Builtin Permissive Types

    sanlk_resetd_t
    hsqldb_t
    systemd_hwdb_t
    blkmapd_t
    ipmievd_t
    targetd_t

While on all Gluster Storage servers::

    Customized Permissive Types

    tendrl_t
    collectd_t

    Builtin Permissive Types

    sanlk_resetd_t
    hsqldb_t
    systemd_hwdb_t
    blkmapd_t
    ipmievd_t
    targetd_t

All our domains listed above are there. OK.

Checking the AVC denial messages from all machines (gathered via the usmqe
evidence playbook), we see that there are a lot of them::

    $ git grep 'avc: denied' | wc -l
    105509

And that almost all of them are related to the tendrl domain::

    $ git grep 'avc: denied' | grep scontext=.*tendrl_t:s0 | wc -l
    104981

We don't have to worry about these, as the tendrl domain is permissive.

The remaining few hundred messages::

    $ git grep 'avc: denied' | grep -v scontext=.*tendrl_t:s0 | wc -l
    528

could be classified into 2 groups::

    $ git grep 'avc: denied' | grep -v scontext=.*tendrl_t:s0 | sed 's/.*\(scontext=.*\ \).*/\1/' | sort | uniq
    scontext=system_u:system_r:glusterd_t:s0 tcontext=system_u:object_r:ephemeral_port_t:s0
    scontext=system_u:system_r:syslogd_t:s0 tcontext=system_u:object_r:tendrl_log_t:s0

Checking both groups, I see that there are:

* 12 messages related to the glusterd_t scontext, which are out of scope of
  the RHGS WA SELinux policy: TODO create a Gluster SELinux BZ for this
* 516 messages for syslogd_t

Checking 516 messages for syslogd_t domain
``````````````````````````````````````````

While checking these 516 syslogd_t AVC denials in detail, all of them follow
this pattern (only the timestamp differs)::

    type=AVC msg=audit(1512909756.261:3444): avc: denied { create } for pid=563 comm=72733A6D61696E20513A526567 name="node-agent" scontext=system_u:system_r:syslogd_t:s0 tcontext=system_u:object_r:tendrl_log_t:s0 tclass=dir

And I see it only on the RHGS WA server. Moreover, the denial appeared after
the reboot.
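Note that the ``comm=`` value in these records is hex-encoded, which the audit subsystem does when the command name contains characters like spaces. Decoding it shows which thread triggered the denials (python3 is used here only as a convenient hex decoder):

```shell
# Decode the hex-encoded comm= field from the AVC record above.
# The audit subsystem hex-encodes comm when it contains untrusted
# characters (here, a space).
python3 -c "print(bytes.fromhex('72733A6D61696E20513A526567').decode())"
# prints: rs:main Q:Reg
```

The decoded name ``rs:main Q:Reg`` is rsyslogd's main queue worker thread, which is consistent with rsyslog being the process denied the directory creation.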
The denial itself indicates that rsyslogd is trying to create the log
directory for node-agent and fails. As can be checked on the server::

    # ls -l /var/log/tendrl/node-agent
    ls: cannot access /var/log/tendrl/node-agent: No such file or directory
    # journalctl -u rsyslog -e | tail -1
    Dec 10 08:41:16 mbukatov-usm1-server.example.com rsyslogd[563]: omfile: creating parent directories for file '/var/log/tendrl/node-agent/node-agent.log' failed: Permission denied [v8.24.0]

While on all Gluster storage machines, the node-agent log directory exists
and there is no such AVC denial message.

Now the question is why the node-agent log directory is missing on the RHGS
WA machine, while on the Gluster storage machines the directory exists.

TODO: To finish verification of this BZ, we need to investigate why that is.

Additional Details
==================

During the verification, I haven't noticed any additional functionality
problem caused by SELinux. Alerts were delivered to both SNMP and SMTP
clients.
Update on the missing log directory
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Just after installation, there is no /var/log/tendrl/ directory on the
storage nodes, but there is /var/log/tendrl/api/ on the WA server. There are
no AVC denials for rsyslog anywhere at this point.

Then, after reboot, the log directory structure is created, so that there is
a /var/log/tendrl/node-agent/ directory on the storage nodes, but this fails
on the WA server because SELinux prevents rsyslogd from creating the
directory there. SELinux is enforcing on all machines.

This is weird because:

* the log directories are missing before the 1st reboot after installation
* after the reboot, only the storage nodes have the log directory for node
  agent created

I don't see the root cause of this issue clearly enough to blame the problem
on the SELinux policy, as:

* the fact that the log directory is missing after installation seems like a
  bug not directly related to SELinux
* the fact that the log directory is created on the Gluster Storage machines
  but not on the WA machine (where SELinux actually blocks it), while SELinux
  is enforcing on both, also suggests that the problem could be related to
  other things

Because of this lack of understanding, I haven't created a new BZ for the
SELinux policy to allow rsyslog to create the directory in question.
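When debugging why rsyslogd can create the directory on one class of machines but not the other, a usual first step is to compare the actual label of the parent directory with what the loaded policy expects for that path. A hypothetical diagnostic sketch (these commands would need to run on the affected machines; they are not part of the verification above)::

    # show the label rsyslogd would inherit when creating the subdirectory
    # ls -dZ /var/log/tendrl
    # show what the loaded policy expects for these paths
    # matchpathcon /var/log/tendrl /var/log/tendrl/node-agent
    # dry run: list what restorecon would relabel, without changing anything
    # restorecon -nRv /var/log/tendrl

A mismatch between the two outputs would point at a labeling problem rather than a missing allow rule.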
What does the dev team think about this system logging integration problem?
What is the expected behavior there? An answer would help the QE team to
debug this and create a separate BZ.
Besides the issue with rsyslog described in comment 17, there are no other
AVC denial messages nor evidence of something broken because of SELinux, wrt
the scenario described in comment 16, with the limitations described in the
same place.
/var/log/tendrl is the directory where the rsyslog and tendrl-api services
want to keep their log files. Currently tendrl-api creates this directory and
adds the tendrl_log context to it using the selinux-policy, which blocks
rsyslog from accessing this directory.
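A common way to resolve this kind of conflict is to define the label in the file context database and let restorecon apply it, rather than one service hard-wiring a label that locks out another. A hypothetical sketch of the admin-side equivalent (the actual fix belongs in the tendrl-selinux policy sources, and the tendrl_log_t type is taken from the denials above)::

    # register the expected label for the whole log tree
    # semanage fcontext -a -t tendrl_log_t '/var/log/tendrl(/.*)?'
    # apply the registered label recursively
    # restorecon -Rv /var/log/tendrl

With the label managed by policy, both rsyslog and tendrl-api see a consistent context, and the policy only needs allow rules for syslogd_t on tendrl_log_t.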
Patch sent to upstream for review: https://github.com/Tendrl/tendrl-selinux/pull/9
(In reply to Martin Bukatovic from comment #19) > Besides issue with rsyslog described in comment 17, there are no other avc > denial messages nor evidence of something broken because of SELinux, wrt the > scenario described in comment 16, with limitations described in the same > place. If you consider the rsyslog problem as not blocking wrt verification of this BZ and agree with verification report in comment 16, you can move this bugzilla into verified state (based on comment 16 and 19). On the other hand, if you decide that the rsyslog problem blocks verification, move this BZ into ASSIGNED state.
Filip, could you check the issues related to this BZ:

* missing log directories after installation
* glusterd_t avc denial messages

and report separate BZs, after additional checking if needed.
Now, after reboot, /var/log/tendrl/node-agent is created on all nodes as well
as on the server, and I see no AVC denials for rsyslog. I need to investigate
further why the directories appear only after reboot, but based on comment 22
and a mail conversation I am moving this BZ to VERIFIED.

Tested with::

    tendrl-ansible-1.5.4-7.el7rhgs.noarch
    tendrl-ui-1.5.4-6.el7rhgs.noarch
    tendrl-grafana-plugins-1.5.4-14.el7rhgs.noarch
    tendrl-selinux-1.5.4-2.el7rhgs.noarch
    tendrl-commons-1.5.4-9.el7rhgs.noarch
    tendrl-api-1.5.4-4.el7rhgs.noarch
    tendrl-api-httpd-1.5.4-4.el7rhgs.noarch
    tendrl-monitoring-integration-1.5.4-14.el7rhgs.noarch
    tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
    tendrl-node-agent-1.5.4-16.el7rhgs.noarch
    tendrl-notifier-1.5.4-6.el7rhgs.noarch
Martin, I have filed BZ 1525376 about the missing log directories after
installation, but I don't think it is related to SELinux. The SELinux problem
with creating the log directory on the server node was resolved. I think that
the problem with the glusterd_t AVC denial messages is that we are
encountering the issues described in BZ 1369420.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:3478