Bug 1859592 - /var/lib/kolla/ has wrong SELinux context in some cases, which prevents containers from starting
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-selinux
Version: 16.0 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Julie Pichon
QA Contact: nlevinki
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-07-22 14:21 UTC by David Hill
Modified: 2023-10-06 21:12 UTC (History)

Fixed In Version: openstack-selinux-0.8.24-1.20211201143442.26243bf.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-24 10:59:07 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github redhat-openstack openstack-selinux pull 70 0 None closed Add container_file_t type for /var/lib/kolla 2021-01-18 14:26:16 UTC
Red Hat Issue Tracker OSP-7114 0 None None None 2022-01-25 16:44:16 UTC
Red Hat Knowledge Base (Solution) 5247321 0 None None None 2020-07-24 12:24:08 UTC
Red Hat Product Errata RHBA-2022:0986 0 None None None 2022-03-24 10:59:41 UTC

Description David Hill 2020-07-22 14:21:18 UTC
Description of problem:
/var/lib/kolla/ has the wrong SELinux context in some cases, which prevents containers from starting.


Version-Release number of selected component (if applicable):
Latest

How reproducible:
That customer environment

Steps to Reproduce:
1. unknown
2.
3.

Actual results:
Podman containers fail to start because of the wrong SELinux context on /var/lib/kolla. It had the var_lib_t context instead of container_file_t.

Expected results:
Something wicked happened.

Additional info:
Really wicked. Also, libvirtd was enabled, which spawned dnsmasq, which bound to :53 and prevented ironic-dnsmasq from binding to the same port.
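
A minimal sketch of how the label mismatch described above could be checked, assuming the two types involved are var_lib_t and container_file_t (the latter is the type named in the linked pull request). The helper names are hypothetical, and the context strings are sample values of the kind `ls -Zd /var/lib/kolla` prints, not output captured from the affected system.

```python
def selinux_type(context: str) -> str:
    """Extract the type field from a user:role:type:range context string."""
    # The range field can itself contain colons (e.g. s0:c1,c2),
    # so split at most three times.
    fields = context.split(":", 3)
    if len(fields) < 3:
        raise ValueError(f"not an SELinux context: {context!r}")
    return fields[2]


def has_expected_type(context: str, expected: str = "container_file_t") -> bool:
    """Return True if the context carries the expected type (assumed default)."""
    return selinux_type(context) == expected


if __name__ == "__main__":
    # Sample contexts for illustration only.
    for ctx in ("system_u:object_r:var_lib_t:s0",
                "system_u:object_r:container_file_t:s0"):
        print(ctx, has_expected_type(ctx))
```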

Comment 1 David Hill 2020-07-22 14:41:23 UTC
Shouldn't we add something like this 

semanage fcontext -a -t container_file_t "/var/lib/kolla(/.*)?"

?
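
To illustrate which paths the pattern in that rule would cover, here is a small sketch. It assumes the usual SELinux behaviour of anchoring file-context regular expressions at both ends, which Python's re.fullmatch approximates; the function name and sample paths are made up for the example.

```python
import re

# Pattern copied from the proposed semanage rule. File-context regexes are
# anchored at both ends; re.fullmatch stands in for that anchoring here.
KOLLA_PATTERN = re.compile(r"/var/lib/kolla(/.*)?")


def covered_by_rule(path: str) -> bool:
    """Return True if the path would match the proposed fcontext pattern."""
    return KOLLA_PATTERN.fullmatch(path) is not None


if __name__ == "__main__":
    samples = [
        "/var/lib/kolla",
        "/var/lib/kolla/config_files/mysql.json",
        "/var/lib/kolla-extra",  # hypothetical sibling directory
    ]
    for sample in samples:
        print(sample, covered_by_rule(sample))
```

The optional group `(/.*)?` is what lets a single rule cover both the directory itself and everything beneath it, without also matching sibling paths that merely share the prefix.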

Comment 2 Julie Pichon 2020-07-22 14:54:17 UTC
I believe directories under /var/lib/kolla/ are bind-mounted with :z in tripleo-heat-templates, which gives them the right context. But we could also add this to openstack-selinux as an extra safety measure. Were you able to set the right context, and did that resolve the issue with the containers not starting?

Comment 3 Julie Pichon 2020-07-22 15:33:03 UTC
Is it possible a restorecon command was run after starting the containers? That would reset the bind-mounts among other things, and generally shouldn't be done here.

Comment 4 David Hill 2020-07-22 15:44:52 UTC
Bonjour Julie! I've tried everything so far: restorecon, reboot, manually setting the permissions on /var/lib/kolla, etc. The only thing that solves their problem is to reboot and put SELinux in permissive mode. Something's utterly broken here in this case.

There was a libvirtd service running which prevented the ironic-inspector-dnsmasq process from starting properly so I'm wondering if it could've snowballed to a selinux issue.  Customer will try deploying a new minimal installation instead of the full installation with GUI and then re-install the undercloud. 

The main issue they faced was that introspection was failing due to libvirtd, but containers like mistral were in a "Z" state because they couldn't read some .json files. Another thing I found puzzling is that some containers failed to start with "sudo -E kolla_set_configs" whereas other containers like mistral would just fail to load some .json files due to SELinux.
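
The port conflict mentioned here (one dnsmasq already holding :53 so another cannot bind) can be reproduced in miniature. This is only a sketch: the function name is hypothetical, it probes 127.0.0.1 rather than the interface ironic-dnsmasq would actually use, and checking port 53 itself normally requires root.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1", udp: bool = True) -> bool:
    """Return True if a new socket can bind to host:port, False if it is taken."""
    kind = socket.SOCK_DGRAM if udp else socket.SOCK_STREAM
    with socket.socket(socket.AF_INET, kind) as probe:
        try:
            probe.bind((host, port))
        except PermissionError:
            # Lack of privilege says nothing about whether the port is in use.
            raise
        except OSError:
            return False
    return True


if __name__ == "__main__":
    # Hold an ephemeral UDP port, then show that a second bind is refused.
    holder = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    holder.bind(("127.0.0.1", 0))
    taken = holder.getsockname()[1]
    print("free while held:", port_is_free(taken))
    holder.close()
    print("free after close:", port_is_free(taken))
```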

Comment 5 Cédric Jeanneret 2020-07-22 15:53:42 UTC
Hello,

Just stepping in.
Adding the fcontext rule is a good idea, but keep in mind this action is slow. Really slow (and it's a pity). Also, fcontext won't update existing files, so it probably will not correct the environment. But with that fcontext in place, we might then be able to run restorecon on that precise location.
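
The two-step sequence described above (record the rule, then relabel only that location) might look like the following sketch. It only builds and prints the commands and executes nothing; the type container_file_t is taken from the linked pull request, the function name is hypothetical, and whether running this on a live undercloud is safe has not been verified here.

```python
import shlex


def relabel_plan(path: str = "/var/lib/kolla",
                 setype: str = "container_file_t") -> list:
    """Build the two commands as argv lists; nothing is executed."""
    pattern = f"{path}(/.*)?"
    return [
        # Step 1: record the persistent rule (does not touch existing files).
        ["semanage", "fcontext", "-a", "-t", setype, pattern],
        # Step 2: apply the rule to existing files, limited to this one path.
        ["restorecon", "-R", "-v", path],
    ]


if __name__ == "__main__":
    for argv in relabel_plan():
        print(shlex.join(argv))
```

Keeping restorecon scoped to the single directory is the point of the sketch: a system-wide relabel is what the rest of this thread warns against.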

That said, most of the files being mounted FROM that location have the :ro flag, meaning read-only, so the SELinux type shouldn't really cause any issue. The locations with other flags are:
deployment/database/mysql-pacemaker-puppet.yaml:                  - /var/lib/kolla/config_files/mysql.json:/var/lib/kolla/config_files/config.json:rw,z
deployment/glance/glance-api-container-puppet.yaml:                  - /var/lib/kolla/config_files/glance_api.json:/var/lib/kolla/config_files/config.json (no flag, implies :rw)

The "z" flag relabels the mounted content, but that would affect only the subdirectory.

This is it. Not really sure WHAT creates this location, apparently it's not within tripleo-heat-templates - maybe from a package?

Cheers,

C.
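
As a rough illustration of the volume flags discussed in this comment, the sketch below splits a source:dest[:options] string and reports whether it is read-only and whether it asks for relabelling (z or Z). The function name is hypothetical and the parser is deliberately simplified; it is not how podman or tripleo-heat-templates process these entries.

```python
def parse_volume(spec: str) -> dict:
    """Split a 'source:dest[:options]' volume string into its parts."""
    parts = spec.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected volume spec: {spec!r}")
    options = set(parts[2].split(",")) if len(parts) == 3 else set()
    return {
        "source": parts[0],
        "dest": parts[1],
        # No explicit "ro" option means the mount is read-write.
        "read_only": "ro" in options,
        # "z" (shared) or "Z" (private) requests an SELinux relabel.
        "relabel": bool(options & {"z", "Z"}),
    }


if __name__ == "__main__":
    spec = ("/var/lib/kolla/config_files/mysql.json:"
            "/var/lib/kolla/config_files/config.json:rw,z")
    print(parse_volume(spec))
```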

Comment 6 Julie Pichon 2020-07-22 15:59:03 UTC
(In reply to David Hill from comment #4)
> Bonjour Julie! I've tried everything so far: restorecon, reboot, manually
> setting the permissions on /var/lib/kolla, etc. The only thing that solves
> their problem is to reboot and put SELinux in permissive mode. Something's
> utterly broken here in this case.

I would strongly advise against running restorecon commands, unless they're very targeted to a single file. This can cause a number of other problems.

What are the openstack-selinux and container-selinux versions?

It does seem like there are a number of other issues on-going in addition to the SELinux one...

Comment 7 David Hill 2020-07-22 16:03:15 UTC
@julie: so touch /.autorelabel is not advisable? What happens if some kind of rpm update or installation generates one? Everything blows up?
@cedric: I'm not sure either ... could be a package or even a python script...

Comment 8 David Hill 2020-07-22 16:11:47 UTC
I think we can reproduce this issue just by running restorecon or touch /.autorelabel... customer might have done something there.

Comment 9 Julie Pichon 2020-07-22 16:19:17 UTC
touch /.autorelabel may be the least bad way to go about it, since it requires a reboot and the bind-mounts should be recreated when the containers restart after... But there are other labels that are e.g. only applied at deploy time in THT. I generally wouldn't recommend doing a wide restorecon on a running system. There are a lot of ways in which it can create problems (e.g. bug 1846540).

Comment 10 David Hill 2020-07-22 19:06:09 UTC
I'm not sure I agree with this at this point. It should be a permanent context for those paths if it's required, and /.autorelabel shouldn't break anything. If we have SELinux bugs, we should definitely fix them, and I can see many reasons why an SELinux relabel could be required.

Comment 11 David Hill 2020-07-22 19:17:36 UTC
So basically here, a /.autorelabel will not restore the contexts to what they should be, and then my undercloud deployment is no longer functional. Even if I reboot, it's broken now, and from what I understand, I'd have to run "openstack undercloud install" again to restore those contexts?

Comment 12 Julie Pichon 2020-07-23 08:48:16 UTC
I think an update would reset the proper context as well. Also, when the containers are restarted after the reboot, the volumes bind-mounted with :z would also have the correct label.

Comment 13 David Hill 2020-07-23 12:11:52 UTC
I re-ran "openstack undercloud install" and it fixed the issue in my lab, so this isn't as bad as it looks, even though it's not how it should be fixed in my book. Adding the SELinux context to the path with semanage would be more appropriate.

Comment 31 errata-xmlrpc 2022-03-24 10:59:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.8 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0986

