Bug 1823829 - Deployment hangs when selinux is set to enforcing and getcert tries to resubmit a certificate (TLS everywhere)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-selinux
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z14
Target Release: 13.0 (Queens)
Assignee: Julie Pichon
QA Contact: Julie Pichon
URL:
Whiteboard:
Duplicates: 1822411
Depends On:
Blocks:
Reported: 2020-04-14 15:03 UTC by Andrea Veri
Modified: 2023-09-07 22:51 UTC (History)

Fixed In Version: openstack-selinux-0.8.18-4.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-16 13:59:31 UTC
Target Upstream Version:
Embargoed:


Attachments
openstack_related_packages (19.16 KB, text/plain), 2020-04-16 17:49 UTC, Andrea Veri
system_related_packages (62.09 KB, text/plain), 2020-04-16 17:50 UTC, Andrea Veri


Links
System                    ID              Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker     OSP-28410       0        None      None    None     2023-09-07 22:51:56 UTC
Red Hat Product Errata    RHBA-2020:5574  0        None      None    None     2020-12-16 13:59:52 UTC

Description Andrea Veri 2020-04-14 15:03:54 UTC
Description of problem:

The deployment hangs whenever SELinux is set to enforcing and the TLS everywhere templates are included: certmonger tries to save the certificate and fails. I bet the issue is similar, if not equivalent, to https://bugzilla.redhat.com/show_bug.cgi?id=1743485.

audit2allow mentions:

#============= certmonger_t ==============
allow certmonger_t NetworkManager_t:dir { getattr search };
allow certmonger_t NetworkManager_t:file { open read };
allow certmonger_t auditd_t:dir { getattr search };
allow certmonger_t auditd_t:file { open read };
allow certmonger_t cluster_t:dir { getattr search };
allow certmonger_t cluster_t:file { open read };
allow certmonger_t container_runtime_exec_t:file { execute execute_no_trans getattr ioctl open read };
allow certmonger_t container_runtime_t:dir { getattr search };
allow certmonger_t container_runtime_t:file { open read };
allow certmonger_t crond_t:dir { getattr search };
allow certmonger_t crond_t:file { open read };
allow certmonger_t dhcpc_t:dir { getattr search };
allow certmonger_t dhcpc_t:file { open read };

#!!!! WARNING: 'etc_t' is a base type.
allow certmonger_t etc_t:file write;
allow certmonger_t getty_t:dir { getattr search };
allow certmonger_t getty_t:file { open read };
allow certmonger_t gssd_t:dir { getattr search };
allow certmonger_t gssd_t:file { open read };
allow certmonger_t gssproxy_t:dir { getattr search };
allow certmonger_t gssproxy_t:file { open read };
allow certmonger_t inetd_t:dir { getattr search };
allow certmonger_t inetd_t:file { open read };
allow certmonger_t irqbalance_t:dir { getattr search };
allow certmonger_t irqbalance_t:file { open read };
allow certmonger_t kernel_t:dir { getattr search };
allow certmonger_t kernel_t:file { open read };
allow certmonger_t ksmtuned_t:dir { getattr search };
allow certmonger_t ksmtuned_t:file { open read };
allow certmonger_t lvm_t:dir { getattr search };
allow certmonger_t lvm_t:file { open read };
allow certmonger_t mysqld_unit_file_t:service reload;
allow certmonger_t ntpd_t:dir { getattr search };
allow certmonger_t ntpd_t:file { open read };
allow certmonger_t openvswitch_t:dir { getattr search };
allow certmonger_t openvswitch_t:file { open read };
allow certmonger_t policykit_t:dir { getattr search };
allow certmonger_t policykit_t:file { open read };
allow certmonger_t postfix_master_t:dir { getattr search };
allow certmonger_t postfix_master_t:file { open read };
allow certmonger_t postfix_pickup_t:dir { getattr search };
allow certmonger_t postfix_pickup_t:file { open read };
allow certmonger_t postfix_qmgr_t:dir { getattr search };
allow certmonger_t postfix_qmgr_t:file { open read };
allow certmonger_t puppet_etc_t:dir search;
allow certmonger_t puppet_etc_t:file { getattr ioctl open read };
allow certmonger_t rhnsd_t:dir { getattr search };
allow certmonger_t rhnsd_t:file { open read };
allow certmonger_t rhsmcertd_t:dir { getattr search };
allow certmonger_t rhsmcertd_t:file { open read };

#!!!! WARNING: 'root_t' is a base type.
#!!!! This avc can be allowed using the boolean 'daemons_dump_core'
allow certmonger_t root_t:dir write;
allow certmonger_t rpcbind_t:dir { getattr search };
allow certmonger_t rpcbind_t:file { open read };
allow certmonger_t snmpd_t:dir { getattr search };
allow certmonger_t snmpd_t:file { open read };
allow certmonger_t spc_t:dir { getattr search };
allow certmonger_t spc_t:file { open read };
allow certmonger_t spc_t:process signal;
allow certmonger_t sshd_t:dir { getattr search };
allow certmonger_t sshd_t:file { open read };
allow certmonger_t sssd_t:dir { getattr search };
allow certmonger_t sssd_t:file { open read };
allow certmonger_t sysctl_net_t:dir search;
allow certmonger_t sysctl_net_t:file { open read };
allow certmonger_t syslogd_t:dir { getattr search };
allow certmonger_t syslogd_t:file { open read };
allow certmonger_t system_dbusd_t:dir { getattr search };
allow certmonger_t system_dbusd_t:file { open read };
allow certmonger_t systemd_logind_t:dir { getattr search };
allow certmonger_t systemd_logind_t:file { open read };
allow certmonger_t tuned_t:dir { getattr search };
allow certmonger_t tuned_t:file { open read };
allow certmonger_t udev_t:dir { getattr search };
allow certmonger_t udev_t:file { open read };
allow certmonger_t unconfined_service_t:dir { getattr search };
allow certmonger_t unconfined_service_t:file { open read };
allow certmonger_t unconfined_t:dir { getattr search };
allow certmonger_t unconfined_t:file { open read };
allow certmonger_t virtd_t:dir { getattr search };
allow certmonger_t virtd_t:file { open read };
allow certmonger_t virtd_unit_file_t:service reload;

Setting SELinux to permissive while the deployment runs works around the issue.

Version-Release number of selected component (if applicable):
Openstack 13

How reproducible:
100% of the time

Steps to Reproduce:
1. Include the TLS everywhere templates (which use Red Hat IDM behind the scenes)
2. Run deployment

Actual results:

getcert gets stuck with NEED_TO_SAVE_CERT (and takes ages to fail because it uses the -w (wait) flag)

Expected results:

Certificate to be generated and stored successfully.


Additional info:

Comment 1 Julie Pichon 2020-04-14 15:14:20 UTC
Thank you for the bug report. Could you share the permissive audit logs, as well as the openstack-selinux RPM version? Thank you.

Comment 3 Andrea Veri 2020-04-14 15:45:30 UTC
[root@overcloud-controller-0 heat-admin]# rpm -qa | grep openstack-selinux
openstack-selinux-0.8.18-1.el7ost.noarch

Comment 5 Julie Pichon 2020-04-15 10:10:14 UTC
Hi Andrea, thank you for the additional information, however I think this isn't the correct log file. There are no denials related to certmonger_t and the denials are in Enforcing mode. Would it be possible to get the permissive logs where the certmonger issue was reproduced while running in Permissive mode? Thank you.

We fixed a number of issues related to certmonger recently (e.g. bug 1777263, bug 1777738, bug 1777368) but it would be helpful to have the logs to compare more closely. A lot of the denials are likely related to pkill trying to read all the processes to find the right one to kill and can be ignored.

Would you be able to test a scratch build if provided? Thank you.

Comment 6 Andrea Veri 2020-04-15 10:16:31 UTC
My bad, I pointed you to the director audit.log instead of one from the controllers. Attached the correct file now, thanks!

Comment 8 Julie Pichon 2020-04-15 14:08:42 UTC
Posting the details of this one denial here as it might benefit from the Security DFG input like in the other bug. It's in relation to the very generic "allow certmonger_t etc_t:file write" rule:

type=PROCTITLE msg=audit(14/04/20 21:01:39.619:856940) : proctitle=/usr/sbin/certmonger -S -p /var/run/certmonger.pid -n 
type=SYSCALL msg=audit(14/04/20 21:01:39.619:856940) : arch=x86_64 syscall=open success=yes exit=11 a0=0x557dbbc3bd70 a1=O_WRONLY|O_CREAT|O_TRUNC a2=0666 a3=0x24 items=0 ppid=433753 pid=433761 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=certmonger exe=/usr/sbin/certmonger subj=system_u:system_r:certmonger_t:s0 key=(null) 
type=AVC msg=audit(14/04/20 21:01:39.619:856940) : avc:  denied  { write } for  pid=433761 comm=certmonger name=ca.crt dev="sda2" ino=4526027 scontext=system_u:system_r:certmonger_t:s0 tcontext=system_u:object_r:etc_t:s0 tclass=file permissive=1 

Although the command causing the denial looked different in bug 1743485 (cf. comments 7 & 12), this was eventually fixed with a code change in THT rather than by adding a new rule. I am not sure that ca.crt has the correct label here either, as etc_t looks too generic. If you can find the file, it would be nice to run ls -Z on it to get its full location and label.
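
For readers comparing denials across log files, the following is a minimal Python sketch that pulls the key fields out of one "type=AVC" record. The function name parse_avc and the returned dictionary keys are hypothetical choices for illustration; the sample line is the AVC record quoted above, and the parsing assumes the interpreted (ausearch -i style) layout shown there.

```python
import re

def parse_avc(line):
    """Extract the main fields from one 'type=AVC' audit record (illustrative helper)."""
    perms = re.search(r"denied\s+\{([^}]*)\}", line)
    if perms is None:
        return None
    # Collect key=value pairs; values are either quoted or run up to the next space.
    fields = dict(re.findall(r'(\w+)=("[^"]*"|\S+)', line))
    return {
        "perms": perms.group(1).split(),
        "comm": fields.get("comm"),
        "name": fields.get("name"),
        # SELinux contexts have the form user:role:type:level.
        "source_type": fields["scontext"].split(":")[2],
        "target_type": fields["tcontext"].split(":")[2],
        "tclass": fields.get("tclass"),
        "enforced": fields.get("permissive") == "0",
    }

sample = (
    'type=AVC msg=audit(14/04/20 21:01:39.619:856940) : avc:  denied  { write } for  '
    'pid=433761 comm=certmonger name=ca.crt dev="sda2" ino=4526027 '
    'scontext=system_u:system_r:certmonger_t:s0 '
    'tcontext=system_u:object_r:etc_t:s0 tclass=file permissive=1 '
)
record = parse_avc(sample)
print(record)
```

The record only carries the file name (ca.crt), not its directory, which is why the full path still has to be confirmed on the node.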

Comment 9 Andrea Veri 2020-04-15 14:12:13 UTC
Julie, that would be the default path of IPA's ca.crt: /etc/ipa/ca.crt. The label it gets with the default contexts is etc_t, it should ideally be cert_t. Does the fix on THT take care of setting the cert_t context accordingly?

Comment 10 Stan Toporek 2020-04-15 14:14:22 UTC
The customer has fixed their issue with SELinux rules and has closed the case.

Comment 11 Alex Schultz 2020-04-15 17:59:29 UTC
*** Bug 1822411 has been marked as a duplicate of this bug. ***

Comment 12 Julie Pichon 2020-04-16 13:23:24 UTC
Looking through the logs, there are a little over 1200 denials. The majority do seem related to reading through all processes to find the ones we want to kill to refresh the certs, so we can ignore that.

Most denials left are related to restarting containers once the process is found. I believe this should be handled with the container domain transition added for certmonger in the bugs I mentioned in comment 5, so I will prepare a rebase to pull those fixes in.

That still leaves us with certmonger_t needing to write to etc_t for the denial in comment 8. 
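
The triage described above can be approximated with a short Python sketch that splits audit2allow output into likely process-scan noise and rules that need review. This is a heuristic under stated assumptions, not part of openstack-selinux: it treats a rule as noise only when it is exactly "dir { getattr search }" or "file { open read }", and the names SCAN_PERMS, parse_rule and split_rules are hypothetical.

```python
import re

RULE = re.compile(r"^allow\s+(\S+)\s+(\S+):(\S+)\s+(?:\{([^}]*)\}|(\S+));$")

# Assumption: these exact permission sets correspond to scanning other processes.
SCAN_PERMS = {"dir": {"getattr", "search"}, "file": {"open", "read"}}

def parse_rule(line):
    """Return (source, target, tclass, perms) for an allow rule, else None."""
    match = RULE.match(line.strip())
    if match is None:
        return None
    source, target, tclass, braced, single = match.groups()
    perms = set(braced.split()) if braced is not None else {single}
    return source, target, tclass, perms

def split_rules(lines):
    """Separate likely process-scan noise from rules that deserve a closer look."""
    noise, review = [], []
    for line in lines:
        rule = parse_rule(line)
        if rule is None:
            continue  # comments, blank lines, warnings
        _, _, tclass, perms = rule
        bucket = noise if SCAN_PERMS.get(tclass) == perms else review
        bucket.append(line.strip())
    return noise, review

sample = [
    "#============= certmonger_t ==============",
    "allow certmonger_t sshd_t:dir { getattr search };",
    "allow certmonger_t sshd_t:file { open read };",
    "allow certmonger_t etc_t:file write;",
    "allow certmonger_t virtd_unit_file_t:service reload;",
]
noise, review = split_rules(sample)
print(review)
```

The heuristic does not look at the target type, so a real review would still check that the "noise" targets are process domains rather than file types.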

(In reply to Andrea Veri from comment #9)
> Julie, that would be the default path of IPA's ca.crt: /etc/ipa/ca.crt. The
> label it gets with the default contexts is etc_t, it should ideally be
> cert_t. Does the fix on THT take care of setting the cert_t context
> accordingly?

Thank you for providing the path. I looked around the current policies but it doesn't look like IPA defines any SELinux file context for /etc/ipa. That means the directory and any files within get the default etc_t type, which isn't ideal... If ca.crt was of type ipa_cert_t (or cert_t as you suggest) Certmonger would be fine working on it.

If this was a file specific to an OpenStack install we could manage its context in openstack-selinux (or maybe even in THT, which I don't know well). But since it seems to be a default IPA file, we can't do that without risking a conflict down the line if IPA adds a rule for it, or expects it to be etc_t. The other solution is to allow Certmonger to write to anything etc_t, which seems wide.

It looks like the THT patch in bug 1743485 simply changed a cert path from /etc/ipa/ca.crt back to a cert under /etc/pki, which defaults to a context certmonger_t can work on:

/etc/pki(/.*)?                                     all files          system_u:object_r:cert_t:s0

However that particular template parameter is already set to /etc/pki in OSP13 so it doesn't look like the exact same issue.
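
To illustrate why a path under /etc/pki gets cert_t while /etc/ipa/ca.crt does not, here is a simplified Python sketch of file-context matching. It assumes the pattern must match the whole path and uses only the single entry quoted above; real policy lookup involves many more entries and precedence rules. The names FILE_CONTEXTS, DEFAULT_ETC_TYPE and expected_type are hypothetical.

```python
import re

# (pattern, type) pairs; only the /etc/pki entry is taken from the thread.
FILE_CONTEXTS = [
    (r"/etc/pki(/.*)?", "cert_t"),
]

# Fallback reported in the thread for files under /etc with no specific entry.
DEFAULT_ETC_TYPE = "etc_t"

def expected_type(path):
    """Return the type of the first entry whose pattern matches the full path."""
    for pattern, setype in FILE_CONTEXTS:
        if re.fullmatch(pattern, path):
            return setype
    return DEFAULT_ETC_TYPE

for candidate in ("/etc/pki/CA/certs/vnc.crt", "/etc/ipa/ca.crt"):
    print(candidate, expected_type(candidate))
```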

Harry, in bug 1743485 (comment 17) you worked on a revert to help resolve a SELinux denial that was very similar to the one I pasted in comment 8. I'm wondering if you may have any thoughts or insights on whether it's possible a similar issue cropped up in OSP13? I'll note that comment 12 on that other bug says that "We definitely don't want to have certmonger attempt to overwrite /etc/ipa/ca.crt, as this is obtained when the node is enrolled with IdM"... but then again, the reporter here mentions that the deployment succeeds in permissive mode so perhaps overwriting that file doesn't always cause a problem (?).

Andrea, did this used to work? I noticed the support case in the duplicate bug talks about a recent SELinux package upgrade. What package was that, openstack-selinux or libselinux-policy? From what versions? Did other things change?

Thank you.

Comment 13 Julie Pichon 2020-04-16 14:36:30 UTC
Hi Nathan,

In bug 1743485 comment 12, you wrote: "We definitely don't want to have certmonger attempt to overwrite /etc/ipa/ca.crt, as this is obtained when the node is enrolled with IdM."

We are thinking of adding a SELinux rule to allow certmonger to do just that as part of this bug. Could I clarify whether your comment above was highlighting an issue relevant only in the context of that other bug, or is this a file we never want to allow certmonger to touch in general?

Comment 14 Julie Pichon 2020-04-16 14:43:42 UTC
Potentially we could also test the rebase on its own to see if that's enough to resolve the "stuck" issue, and whether to ignore the etc_t write denial.

Comment 15 Andrea Veri 2020-04-16 17:49:29 UTC
Julie,

this used to work just fine, yes. The problem started when we upgraded these systems; the list of packages that were upgraded, as per yum history, will follow as attachments.

Thanks!

Comment 16 Andrea Veri 2020-04-16 17:49:49 UTC
Created attachment 1679452 [details]
openstack_related_packages

Comment 17 Andrea Veri 2020-04-16 17:50:06 UTC
Created attachment 1679453 [details]
system_related_packages

Comment 18 Nathan Kinder 2020-05-07 00:27:43 UTC
(In reply to Julie Pichon from comment #13)
> In bug 1743485 comment 12, you wrote: "We definitely don't want to have
> certmonger attempt to overwrite /etc/ipa/ca.crt, as this is obtained when
> the node is enrolled with IdM."
> 
> We are thinking of adding a SELinux rule to allow certmonger to do just that
> as part of this bug. Could I clarify whether your comment above was
> highlighting an issue relevant only in the context of that other bug, or is
> this a file we never want to allow certmonger to touch in general?

Apologies for the delay.  There are a few observations I have:

- The /etc/ipa/ca.crt should be considered configuration for the IPA client.  It is not a general purpose cert for other software on the system like those in /etc/pki.  You should not change the label of /etc/ipa/ca.crt to cert_t.
- You should not allow certmonger_t to overwrite /etc/ipa/ca.crt.  It should only be written to when running ipa-client-install.

It would be useful to look at the logs from the deployment where Director is using certmonger at the time that the AVC happens.  In bug#1743485, we saw an error like this showing an attempt to use 'getcert' with '-F /etc/ipa/ca.crt':

-----------------------------------------------------------------------------------------------
Aug 19 21:13:58 controller-0.redhat.local puppet-user[25453]: Warning: Could not get certificate: Execution of '/usr/bin/getcert request -I libvirt-vnc-client-cert -f /etc/pki/libvirt-vnc/client-cert.pem -c IPA -N CN=controller-0.internalapi.redhat.local -K libvirt-vnc/controller-0.internalapi.redhat.local -D controller-0.internalapi.redhat.local -C systemctl reload libvirtd -w -k /etc/pki/libvirt-vnc/client-key.pem -F /etc/ipa/ca.crt' returned 4: New signing request "libvirt-vnc-client-cert" added.
-----------------------------------------------------------------------------------------------

Is a similar error and certmonger invocation seen in this case?

Comment 19 Nathan Kinder 2020-05-07 00:53:46 UTC
I overlooked the invocation that triggered the AVC in comment#8.  The fact that it is not getcert makes me think that this is the certmonger daemon trying to resubmit an old request for a cert it is tracking (perhaps for a renewal).  I assume this is a deployment that has been around for some time, and there is an old request that used '-F /etc/ipa/ca.crt' due to bug#1743485 that certmonger is still tracking.  This can be checked on the system where the AVC occurred by running 'getcert list'.  Are you able to provide this output to prove/disprove the theory?

If my theory is correct, it should be possible to clean this up properly by using 'getcert start-tracking' to modify the existing request and change the '-F' option.  I have not personally tried modifying an existing request in this way, but it is explicitly mentioned in the 'getcert start-tracking' help output that is returned if you run it with no additional options.
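
As a first step for the check above, the following Python sketch lists tracked requests by status so that any request sitting in NEED_TO_SAVE_CERT stands out. The sample text is an abridged, illustrative stand-in for 'getcert list' output (only the "Request ID" and "status" lines are assumed), the second request name is made up, and the sketch does not try to show whether a request was created with '-F'. The function name requests_by_status is hypothetical.

```python
import re

def requests_by_status(getcert_list_output):
    """Map each tracked request ID to its reported status."""
    statuses = {}
    current = None
    for line in getcert_list_output.splitlines():
        header = re.match(r"Request ID '([^']+)':", line)
        if header:
            current = header.group(1)
            continue
        field = re.match(r"\s+status:\s+(\S+)", line)
        if field and current is not None:
            statuses[current] = field.group(1)
    return statuses

# Abridged, illustrative sample; 'example-other-cert' is a hypothetical request.
sample_output = """\
Number of certificates and requests being tracked: 2.
Request ID 'libvirt-vnc-client-cert':
    status: NEED_TO_SAVE_CERT
    stuck: no
Request ID 'example-other-cert':
    status: MONITORING
    stuck: no
"""
statuses = requests_by_status(sample_output)
stuck = [rid for rid, status in statuses.items() if status == "NEED_TO_SAVE_CERT"]
print(stuck)
```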

Comment 20 Andrea Veri 2020-05-11 12:55:38 UTC
Nathan,

was the fix mentioned on https://bugzilla.redhat.com/show_bug.cgi?id=1743485 ported to OSP 16 as well (this specific cluster is particularly recent and based on OSP 16)? I just removed the certificate request for libvirt-vnc-client-cert and submitted a new one without the -F as requested. I just want to make sure the fix is contained on both OSP 13 and OSP 16 at this point.

Comment 21 Nathan Kinder 2020-05-13 20:12:52 UTC
(In reply to Andrea Veri from comment #20)
> Nathan,
> 
> was the fix mentioned on https://bugzilla.redhat.com/show_bug.cgi?id=1743485
> ported to OSP 16 as well (this specific cluster is particularly recent and
> based on OSP 16)? I just removed the certificate request for
> libvirt-vnc-client-cert and submitted a new one without the -F as requested.
> I just want to make sure the fix is contained on both OSP 13 and OSP 16 at
> this point.

Yes, I believe the changes are all in place in newer versions.  Did you see this issue in a fresh OSP 16 environment, or was it upgraded from an earlier version?  Was the only certificate that used "-F" in your environment the libvirt-vnc-client-cert, or were there others that used this option as well?

From what I can see in current code, puppet-certmonger will use the "-F" option when requesting a certificate if the "cacertificate" parameter is set.  There are only a few certificates obtained via tripleo-heat-templates that will ever set the "cacertificate" parameter:

  libvirt-vnc-server-cert (uses InternalTLSVncCAFile if LibvirtVncCACert is not set)
  libvirt-vnc-client-cert (uses InternalTLSVncProxyCAFile if LibvirtVncCACert is not set)
  qemu-server-cert (uses InternalTLSQemuCAFile if QemuCACert is not set)
  qemu-nbd-client-cert (uses InternalTLSQemuCAFile if QemuCACert is not set)

If LibvirtVncCACert or QemuCACert are set, the "cacertificate" parameter is not set when puppet-certmonger is called.  These parameters are only used to create symlinks in the expected locations that use an existing CA file as well as to mount custom CA files from the host into the relevant containers.  In general, these should not be set when using IdM to issue certs for OSP.

The defaults for these parameters are as follows:

  InternalTLSVncCAFile:      '/etc/pki/CA/certs/vnc.crt'
  InternalTLSVncProxyCAFile: '/etc/pki/CA/certs/vnc.crt'
  InternalTLSQemuCAFile:     '/etc/pki/CA/certs/qemu.pem'

Since none of these are '/etc/ipa/ca.crt' in the current code, there should be no chance of getcert using "-F /etc/ipa/ca.crt" unless you override these defaults.  I also confirmed that the code in this area is the same in 'stable/train'.
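
The parameter logic described in this comment can be modelled in a few lines of Python, which may help when checking a set of template overrides. This is a sketch of the behaviour as described here, not the puppet-certmonger or tripleo-heat-templates code; the names DEFAULTS, CERTS, cacertificate_for and risky_certs are hypothetical.

```python
# Defaults as listed in the comment above.
DEFAULTS = {
    "InternalTLSVncCAFile": "/etc/pki/CA/certs/vnc.crt",
    "InternalTLSVncProxyCAFile": "/etc/pki/CA/certs/vnc.crt",
    "InternalTLSQemuCAFile": "/etc/pki/CA/certs/qemu.pem",
}

# certificate -> (CA file parameter, custom CA parameter that suppresses it)
CERTS = {
    "libvirt-vnc-server-cert": ("InternalTLSVncCAFile", "LibvirtVncCACert"),
    "libvirt-vnc-client-cert": ("InternalTLSVncProxyCAFile", "LibvirtVncCACert"),
    "qemu-server-cert": ("InternalTLSQemuCAFile", "QemuCACert"),
    "qemu-nbd-client-cert": ("InternalTLSQemuCAFile", "QemuCACert"),
}

def cacertificate_for(cert, overrides=None):
    """Return the path that would be passed as 'cacertificate' (-F), or None."""
    params = {**DEFAULTS, **(overrides or {})}
    ca_file_param, custom_ca_param = CERTS[cert]
    if params.get(custom_ca_param):
        return None  # custom CA set: 'cacertificate' is not passed
    return params[ca_file_param]

def risky_certs(overrides=None):
    """List certificates whose -F target would be /etc/ipa/ca.crt."""
    return [c for c in CERTS if cacertificate_for(c, overrides) == "/etc/ipa/ca.crt"]

print(risky_certs())
print(risky_certs({"InternalTLSVncCAFile": "/etc/ipa/ca.crt"}))
```

With the defaults the first list is empty; an override pointing one of the CA file parameters at /etc/ipa/ca.crt is what makes a certificate show up in the second.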

Comment 22 Andrea Veri 2020-05-14 10:28:40 UTC
Nathan,

this specific environment was based on a fresh OSP 16 installation. I believe this specific template we landed to workaround a bug within IDM Openstack integration would be the problem:

```
parameter_defaults:
  # https://opendev.org/openstack/tripleo-heat-templates/commit/ade09f3a3405f47fa9524ef4baea155a4262175b
  # https://bugzilla.redhat.com/show_bug.cgi?id=1710632
  # RHEL 8 prior to 8.2 has a bug and certmonger doesn't download the whole CA
  # chain properly (no intermediate chaining certs)
  # work around this until it's fixed
  InternalTLSVncCAFile: '/etc/ipa/ca.crt'
```

Technically with RHEL 8.2 this specific issue should go away together with the need of running a custom template to workaround VM consoles not showing anything useful due to SSL certificate verification failures. I'm wondering how a customer should be approaching this at this point and until RHEL 8.2 gets released. Avoid pointing InternalTLSVncCAFile to a certificate file that is managed by a system tool like certmonger? Copy /etc/ipa/ca.crt content to a separate file and reference that? Drop the -F altogether, as certificate files should ideally remain static and self-managed by the platform owners?

Comment 23 Nathan Kinder 2020-05-14 13:15:02 UTC
(In reply to Andrea Veri from comment #22)
> Nathan,
> 
> this specific environment was based on a fresh OSP 16 installation. I
> believe this specific template we landed to workaround a bug within IDM
> Openstack integration would be the problem:
> 
> ```
> parameter_defaults:
>   #
> https://opendev.org/openstack/tripleo-heat-templates/commit/
> ade09f3a3405f47fa9524ef4baea155a4262175b
>   # https://bugzilla.redhat.com/show_bug.cgi?id=1710632
>   # RHEL 8 prior to 8.2 has a bug and certmonger doesn't download the whole
> CA
>   # chain properly (no intermediate chaining certs)
>   # work around this until it's fixed
>   InternalTLSVncCAFile: '/etc/ipa/ca.crt'
> ```

Yes, I believe you have IdM set up as a subordinate CA and you are setting InternalTLSVncCAFile to ensure you have the full chain.
> 
> Technically with RHEL 8.2 this specific issue should go away together with
> the need of running a custom template to workaround VM consoles not showing
> anything useful due to SSL certificate verification failures. I'm wondering
> how a customer should be approaching this at this point and until RHEL 8.2
> gets released.

I'll ensure we get a test-only bug created to verify this RHEL 8.2 certmonger fix solves the problem at the OSP level.

> Avoid pointing InternalTLSVncCAFile to a certificate file
> that is managed by a system tool like certmonger? Copy /etc/ipa/ca.crt
> content to a separate file and reference that?

This would be the best option IMHO.  Also, be sure to revert the SELinux policy customizations that you made on your system(s) to avoid any potential future related problems.
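
A minimal Python sketch of the "copy to a separate file and reference that" option follows. It is an illustration only: the destination file name ipa-chain.crt is hypothetical, the helper stage_ca_copy is not part of any OSP tooling, and the demonstration runs in a scratch directory rather than against /etc. On a real node, a destination under /etc/pki would fall under the cert_t file context mentioned in comment 12, but the resulting label should still be verified with ls -Z.

```python
import shutil
import tempfile
from pathlib import Path

def stage_ca_copy(source, destination):
    """Copy the CA bundle to a separate file and return the template override line."""
    destination = Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    # copyfile copies the content only, not ownership, mode, or extended attributes.
    shutil.copyfile(source, destination)
    return f"InternalTLSVncCAFile: '{destination}'"

# Dry run in a scratch directory. On a real node the source would be
# /etc/ipa/ca.crt and the destination a path of your choosing, for example
# the hypothetical /etc/pki/CA/certs/ipa-chain.crt.
with tempfile.TemporaryDirectory() as scratch:
    source = Path(scratch) / "ca.crt"
    source.write_text("-----BEGIN CERTIFICATE-----\n(placeholder)\n")
    destination = Path(scratch) / "certs" / "ipa-chain.crt"
    override = stage_ca_copy(source, destination)
    copied = destination.read_text()
    print(override)
```

The copy is static, so it would need refreshing if the IdM CA chain changes.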

Comment 24 Julie Pichon 2020-09-18 12:18:04 UTC
Nathan, thank you so much for the detailed explanations.

Andrea, I am wondering if it's okay to close this openstack-selinux bug? It seems like the issue was resolved through updating templates and I am not sure if there is still something missing. Thank you.

Comment 25 Andrea Veri 2020-09-23 15:09:22 UTC
Julie,

I had the exact same issue reported here during a recent OSP 13 z release upgrade. Is the fix expected to land in OSP 13 at this point?

Comment 26 Julie Pichon 2020-09-23 16:02:33 UTC
I'm a bit lost as to what the fix is now, I thought this was about updating a template. Are we talking about THT bug 1743485? We may want to change the bz component if so. Or is this about other SELinux denials that are still causing issues? Thank you.

Comment 27 Andrea Veri 2020-09-23 16:05:02 UTC
Julie, the behavior is the one outlined at https://bugzilla.redhat.com/show_bug.cgi?id=1743485, in our case with SELinux set as enforcing we had certmonger timing out with a NEED_TO_SAVE_CERT status. Probably the fix wasn't backported to OSP 13?

Comment 28 Julie Pichon 2020-09-24 08:21:55 UTC
My apologies, I'm still having trouble understanding. From comments 22 and 23, I thought the issue was due to a custom template, and changing that template + using the workaround in comment 23 resolved the issue, in a recent environment (OSP16) at least?

From what I see in bug 1743485, the problem was due to a recent OSP15 patch that was then reverted so I don't think this should be affecting an OSP13 update...

If updating a template resolves the issue, I think we can move this to openstack-tripleo-heat-templates and to the Security DFG to help with this, as the advice in comment 23 was that this should/can be resolved without adding new openstack-selinux rules?

Does this make sense as a next step, or am I missing something else? Thank you.

Comment 29 Andrea Veri 2020-09-24 08:30:11 UTC
Julie, it seems we're mixing topics here. This bug was originally opened for OSP 13 and then diverged into a discussion around OSP 16 and possible workarounds to apply there, given we were using a custom template (the one around /etc/ipa/ca.crt). If you want me to open a new bug at the next OSP 13 upgrade we'll be performing, I can go ahead and do that. What I can tell is that we had to turn SELinux off while a specific deployment task was running, as certmonger was stuck with a NEED_TO_SAVE_CERT error.

Comment 30 Julie Pichon 2020-09-24 10:43:05 UTC
Thank you for the clarifications and my apologies for the delay, I thought the problem was resolved from the discussion. No need to open a new bug (although attaching additional permissive logs whenever it happens again may help.)

A couple of questions:
1. Is there a custom template related to certs on the OSP13 environments as well?
2. Does the workaround to fix this suggested by Nathan in comment 19 work to resolve the problem on OSP13?
3. Is SELinux still disabled on that environment? Would you be able to try a test package to confirm that a new openstack-selinux package works to fix it, or would we need to wait until another update?

At this point, I'm fairly sure a fix only in openstack-selinux won't be sufficient and we'll need SMEs from the Security DFG to look at any changes also required in THT.

If we can test an openstack-selinux scratch build on the broken environment, we could confirm that and then move the bug to THT as needed. However, if testing the change will require waiting longer, I'll clone this bug instead so that the issue with certmonger trying to modify /etc/ipa/ca.crt can get looked at in parallel.

Hopefully, this makes sense. Thank you!

Comment 31 Julie Pichon 2020-09-24 14:08:33 UTC
Rebase wouldn't be appropriate as that'd bring in too many changes (40+ patches) but I think I narrowed down the relevant certmonger patches:

commit f3686f13f83ab9715db9ca9ea1ddf85d4f94f859
Author: Cédric Jeanneret <cjeanner>
Date:   Wed Nov 27 15:49:50 2019 +0100

    Allow certmonger to access puppet_etc_t content
    
    Certmonger is calling scripts in order to reload containers. Those
    scripts call hiera in order to get a bunch of parameters, and
    certmonger_t isn't allowed to search/open/read puppet_etc_file_t
    content.
    
    This issue has been described in the following rhbz:
    https://bugzilla.redhat.com/show_bug.cgi?id=1777263


commit 8cd93366f5d96be5e419825a4cd22235ad60e083
Author: Cédric Jeanneret <cjeanner>
Date:   Fri Nov 29 15:39:29 2019 +0100

    Allow certmonger to actually manage containers
    
    Certmonger needs to run ps, exec and kill on containers in order to
    update the certificates used by the service within them.
    
    The following patch allow certmonger_t to "transition" to container_t
    and run the wanted commands.
    
    To my knowledge, we can't push example AVCs in the "test" directory
    because the transition is something the current test can't properly
    catch.
    
    This fixes rhbz#1777368
    https://bugzilla.redhat.com/show_bug.cgi?id=1777368


Note that:
- This won't get rid of all of the read-only denials, but that shouldn't be a problem. These are due to certmonger trying to read every running process on the system to figure out which ones it should kill.
- The part of the problem where certmonger_t tries to overwrite /etc/ipa/ca.crt (label: etc_t) will still be present, and will likely still cause a problem. We should confirm what we can (which certmonger call causes it, if there are relevant custom templates on the environment, etc) before opening a separate bug against THT (tripleo-heat-templates) with that information.

I'll prepare a proper openstack-selinux build that includes the patches above.

Comment 32 Julie Pichon 2020-09-24 16:12:19 UTC
An OSP13 build that includes these patches is now available on brew (openstack-selinux-0.8.18-4.el7ost). I think we'll need the THT fix too before the bug in the description is fully fixed, so not setting to MODIFIED yet until that is confirmed and/or the blocker bug opened. If there is a chance to test this package, that would be great! Providing permissive logs if it doesn't work would be good too, to confirm that at least the expected denials are gone. Thank you.

Comment 34 Andrea Veri 2020-10-23 13:55:41 UTC
Julie, landed your package today and tested a deployment, it still fails. Attaching audit.log. 

[root@overcloud-controller-2 heat-admin]# rpm -qa | grep openstack-selinux
openstack-selinux-0.8.18-4.el7ost.noarch

Comment 36 Julie Pichon 2020-10-23 15:52:32 UTC
The only two relevant denials that remain are:

allow certmonger_t cert_t:dir create;
allow certmonger_t etc_t:file write;

type=PROCTITLE msg=audit(23/10/20 14:25:30.472:13603) : proctitle=/usr/libexec/certmonger/ipa-submit 
type=SYSCALL msg=audit(23/10/20 14:25:30.472:13603) : arch=x86_64 syscall=mkdir success=no exit=EACCES(Permission denied) a0=0x559404f28810 a1=0700 a2=0x77 a3=0x5f92d9ca items=0 ppid=3333 pid=390271 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=ipa-submit exe=/usr/libexec/certmonger/ipa-submit subj=system_u:system_r:certmonger_t:s0 key=(null) 
type=AVC msg=audit(23/10/20 14:25:30.472:13603) : avc:  denied  { create } for  pid=390271 comm=ipa-submit name=dbTemp.p1tXPS scontext=system_u:system_r:certmonger_t:s0 tcontext=system_u:object_r:cert_t:s0 tclass=dir permissive=0 


type=PROCTITLE msg=audit(23/10/20 14:25:40.657:15861) : proctitle=/usr/sbin/certmonger -S -p /var/run/certmonger.pid -n 
type=SYSCALL msg=audit(23/10/20 14:25:40.657:15861) : arch=x86_64 syscall=open success=no exit=EACCES(Permission denied) a0=0x55ee771da540 a1=O_WRONLY|O_CREAT|O_TRUNC a2=0666 a3=0x24 items=0 ppid=392443 pid=392444 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=certmonger exe=/usr/sbin/certmonger subj=system_u:system_r:certmonger_t:s0 key=(null) 
type=AVC msg=audit(23/10/20 14:25:40.657:15861) : avc:  denied  { write } for  pid=392444 comm=certmonger name=ca.crt dev="sda2" ino=4704484 scontext=system_u:system_r:certmonger_t:s0 tcontext=system_u:object_r:etc_t:s0 tclass=file permissive=0 

We can't allow the etc_t one as per Nathan's comments and I'm not sure about that cert_t:dir one either. Could you provide the information Nathan requested in comments 18/19? Thank you.

Comment 37 Andrea Veri 2020-10-27 12:26:15 UTC
Julie,

sorry, I missed that specific comment from Nathan. I can confirm dropping the -F flag when re-creating the cert request fixed the problem we were seeing. What's the next step for you to have that openstack-selinux package included in the next z release for OSP 13?

Thanks!

Comment 38 Julie Pichon 2020-10-27 14:05:20 UTC
I can set this bug to MODIFIED and will investigate when the next z-release is planned for, however could you clarify how you resolved the -F issue? Did that come from a custom template or command, or do we need to open another bug to get it fixed properly as well?

Comment 39 Andrea Veri 2020-10-27 14:18:15 UTC
Julie,

this cluster was originally set up ~1.5 to 2 years ago, which most likely means we hit https://bugzilla.redhat.com/show_bug.cgi?id=1743485 during one of the initial deployments, as Nathan originally mentioned. How I fixed it is mainly:

getcert stop-tracking -i libvirt-vnc-client-cert

and then:

/usr/bin/getcert request -I libvirt-vnc-client-cert -f /etc/pki/libvirt-vnc/client-cert.pem -c IPA -N CN=controller-0.internalapi.$domain -K libvirt-vnc/controller-0.internalapi.$domain -D controller-0.internalapi.$domain -C "systemctl reload libvirtd" -w -k /etc/pki/libvirt-vnc/client-key.pem

Thanks!
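
For anyone scripting the same one-off cleanup, here is a Python sketch that only builds the two command lines above without the -F option; it does not execute anything. The flags are the ones shown in this thread. The function name rerequest_commands and the domain example.com are hypothetical placeholders.

```python
import shlex

def rerequest_commands(request_id, cert_file, key_file, fqdn, principal, post_save):
    """Build the stop-tracking and re-request argument lists (no -F option)."""
    stop = ["getcert", "stop-tracking", "-i", request_id]
    request = [
        "/usr/bin/getcert", "request",
        "-I", request_id,
        "-f", cert_file,
        "-c", "IPA",
        "-N", f"CN={fqdn}",
        "-K", principal,
        "-D", fqdn,
        "-C", post_save,
        "-w",
        "-k", key_file,
    ]
    return stop, request

# 'example.com' stands in for the real domain.
fqdn = "controller-0.internalapi.example.com"
stop, request = rerequest_commands(
    request_id="libvirt-vnc-client-cert",
    cert_file="/etc/pki/libvirt-vnc/client-cert.pem",
    key_file="/etc/pki/libvirt-vnc/client-key.pem",
    fqdn=fqdn,
    principal=f"libvirt-vnc/{fqdn}",
    post_save="systemctl reload libvirtd",
)
print(shlex.join(stop))
print(shlex.join(request))
```

Keeping the arguments as a list avoids quoting problems with the -C value if the commands are later passed to subprocess.run.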

Comment 40 Julie Pichon 2020-10-27 15:05:07 UTC
That other bug was fixed by reverting recent OSP15-only patches so I'm not sure if it is related. If you're satisfied that the new openstack-selinux package + the manual workaround from comment 39 is sufficient to resolve this though, then good! All the flags are correctly set for this bug to be picked up in OSP13z14, though I don't think there is a date for it yet.

Comment 41 Andrea Veri 2020-10-27 15:10:31 UTC
Julie,

the workaround was a one-off. You may want to ask CEE to include it in their documentation in case another customer hits this and needs a quick resolution. Thanks!

Comment 49 Julie Pichon 2020-11-09 15:20:20 UTC
Sanity-checked that the certmonger patches are included in the RPM. Based on this + comment 37 from the reporter confirming that the package together with a manual workaround resolved the issue, moving to VERIFIED.

Comment 54 errata-xmlrpc 2020-12-16 13:59:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 13 Bug Fix and Enhancement Advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5574

