Description of problem:
Update of the overcloud fails on bootstrap_node (controller-0 in this case).

During the update the rabbitmq bundle fails to restart:

INFO:__main__:Setting permission for /var/log/rabbitmq
PermissionError: [Errno 13] Permission denied: '/var/log/rabbitmq'

(operation_finished) notice: rabbitmq_start_0:210:stderr [ sh: /var/log/rabbitmq/startup_log: Permission denied ]

This is detected during the deployment phase of the update:

Error running ['podman', 'run', '--name', 'rabbitmq_init_bundle', '--label', 'config_id=tripleo_step2'
2019-08-30 20:49:50 | "Error: 'rabbitmqctl eval \"rabbit_mnesia:is_clustered().\" | grep -q true' returned 1 instead of one of [0]",
2019-08-30 20:49:50 | "Error: /Stage[main]/Tripleo::Profile::Pacemaker::Rabbitmq_bundle/Exec[rabbitmq-ready]/returns: change from 'notrun' to ['0'] failed: 'rabbitmqctl eval \"rabbit_mnesia:is_clustered().\" | grep -q true' returned 1 instead of one of [0]",

The final error message is the generic one:

ERROR tripleoclient.v1.overcloud_update.MinorUpdateRun RuntimeError: Update failed with: Ansible failed, check log at /var/lib/mistral/adda556e-b284-4e67-8304-c2efba72e13e/ansible.log

Version-Release number of selected component (if applicable):
- openstack-selinux original: openstack-selinux-0.8.19-0.20190606150404.06faac7.el8ost (Beta-1.0)
- openstack-selinux final: openstack-selinux-0.8.20-0.20190823110429.50e6b42.el8ost

How reproducible:
All the time.

Steps to Reproduce:
1. Deploy OSP15 beta-1.0
2. Update to passed_phase2 (or passed_phase1)
3. The update fails
So we had a big debug session with Damien and Cedric this morning. We can see that something changes the SELinux context:

[root@controller-0 containers]# ls -lrthZ
total 24K
drwxr-xr-x. 14 root root system_u:object_r:var_log_t:s0 221 Aug 22 14:03 httpd
drwxr-xr-x. 2 48 48 system_u:object_r:container_file_t:s0 25 Aug 22 14:17 horizon
drwxr-xr-x. 2 42402 42402 system_u:object_r:container_file_t:s0 120 Aug 25 06:00 aodh
drwxr-xr-x. 2 42434 42434 system_u:object_r:container_file_t:s0 24 Aug 26 00:01 mysql
drwxr-xr-x. 2 42416 42416 system_u:object_r:container_file_t:s0 103 Aug 27 00:00 gnocchi
drwxr-xr-x. 2 root root system_u:object_r:container_file_t:s0 42 Aug 27 00:00 swift
drwxr-xr-x. 2 42438 42438 system_u:object_r:container_file_t:s0 97 Aug 27 01:00 panko
drwxr-xr-x. 2 42415 42415 system_u:object_r:container_file_t:s0 78 Aug 27 14:00 glance
drwxr-xr-x. 3 42439 42439 system_u:object_r:container_file_t:s0 170 Aug 27 19:14 rabbitmq
drwxr-xr-x. 2 42460 42460 system_u:object_r:container_file_t:s0 98 Aug 27 19:26 redis
drwxr-xr-x. 2 42405 42405 system_u:object_r:container_file_t:s0 117 Aug 27 20:01 ceilometer
drwxr-xr-x. 2 42407 42407 system_u:object_r:container_file_t:s0 4.0K Aug 27 20:01 cinder
drwxr-xr-x. 2 root root system_u:object_r:var_log_t:s0 46 Aug 27 20:01 haproxy
drwxr-xr-x. 2 42418 42418 system_u:object_r:container_file_t:s0 98 Aug 27 20:01 heat
drwxr-xr-x. 2 42425 42425 system_u:object_r:container_file_t:s0 112 Aug 27 20:01 keystone
drwxr-xr-x. 2 42435 42435 system_u:object_r:container_file_t:s0 4.0K Aug 27 20:01 neutron
drwxr-xr-x. 2 42436 42436 system_u:object_r:container_file_t:s0 4.0K Aug 27 20:01 nova
drwxr-xr-x. 2 root root system_u:object_r:var_log_t:s0 8.0K Aug 27 20:01 stdouts

[root@controller-0 containers]# ls -lrthZ rabbitmq/
total 68K
-rw-r-----. 1 root root system_u:object_r:var_log_t:s0 0 Aug 25 13:44 startup_err
-rw-r-----. 1 42439 42439 system_u:object_r:var_log_t:s0 0 Aug 25 14:01 rabbit
-rw-r-----. 1 root root system_u:object_r:var_log_t:s0 0 Aug 25 14:01 startup_log
drwxr-x--x. 2 42439 42439 system_u:object_r:var_log_t:s0 140 Aug 27 00:00 log
-rw-r-----. 1 42439 42439 system_u:object_r:container_file_t:s0 1.4K Aug 27 00:00 rabbit.1
-rw-r--r--. 1 root root unconfined_u:object_r:container_file_t:s0 0 Aug 27 19:14 foo
-rw-r-----. 1 42439 42439 system_u:object_r:var_log_t:s0 64K Aug 27 20:56 rabbit

"foo" was created afterward to check the original context. Before the update everything was var_log_t.
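For anyone comparing these listings across controllers, here is a rough Python sketch that pulls the SELinux type out of `ls -lZ` lines and lists the entries that do not carry a given type. The function names are made up for illustration, the sample lines are copied from the listing above, and whether a particular directory is supposed to be container_file_t is left for the reader to judge.

```python
def selinux_type(ls_line):
    """Return (name, selinux_type) for one `ls -lZ` line, or None if it has no context."""
    fields = ls_line.split()
    for field in fields:
        parts = field.split(":")
        # A context looks like user:role:type:level, e.g. system_u:object_r:var_log_t:s0
        if len(parts) >= 4 and parts[1].endswith("_r"):
            return fields[-1], parts[2]
    return None


def entries_without_type(ls_output, expected="container_file_t"):
    """List entry names whose SELinux type differs from `expected`."""
    names = []
    for line in ls_output.splitlines():
        parsed = selinux_type(line)
        if parsed and parsed[1] != expected:
            names.append(parsed[0])
    return names


sample = """\
total 24K
drwxr-xr-x. 14 root root system_u:object_r:var_log_t:s0 221 Aug 22 14:03 httpd
drwxr-xr-x. 3 42439 42439 system_u:object_r:container_file_t:s0 170 Aug 27 19:14 rabbitmq
-rw-r-----. 1 root root system_u:object_r:var_log_t:s0 0 Aug 25 14:01 startup_log
"""
print(entries_without_type(sample))  # ['httpd', 'startup_log']
```

Running it on saved `ls -lrthZ` output from before and after the package update gives a quick diff of which entries changed type.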
We tried updating just the openstack-selinux package, and all the SELinux contexts switched from container_file_t to var_log_t:

# dnf upgrade openstack-selinux

then:

[root@controller-1 ~]# ls -lrthZ /var/log/containers/
total 48K
drwxr-xr-x. 14 root root system_u:object_r:var_log_t:s0 221 Aug 22 14:03 httpd
drwxr-xr-x. 2 48 48 system_u:object_r:var_log_t:s0 25 Aug 22 14:17 horizon
drwxr-xr-x. 2 42434 42434 system_u:object_r:var_log_t:s0 136 Aug 27 00:00 mysql
drwxr-xr-x. 2 root root system_u:object_r:var_log_t:s0 46 Aug 27 00:00 haproxy
drwxr-xr-x. 3 42439 42439 system_u:object_r:var_log_t:s0 4.0K Aug 27 00:00 rabbitmq
drwxr-xr-x. 2 root root system_u:object_r:var_log_t:s0 42 Aug 27 00:00 swift
drwxr-xr-x. 2 42438 42438 system_u:object_r:var_log_t:s0 180 Aug 27 01:01 panko
drwxr-xr-x. 2 42415 42415 system_u:object_r:var_log_t:s0 219 Aug 27 14:00 glance
drwxr-xr-x. 2 42435 42435 system_u:object_r:var_log_t:s0 4.0K Aug 27 17:00 neutron
-rw-r--r--. 1 root root system_u:object_r:var_log_t:s0 0 Aug 27 17:58 foo
drwxr-xr-x. 2 42460 42460 system_u:object_r:var_log_t:s0 4.0K Aug 27 18:01 redis
drwxr-xr-x. 2 42407 42407 system_u:object_r:var_log_t:s0 4.0K Aug 27 19:37 cinder
drwxr-xr-x. 2 42402 42402 system_u:object_r:var_log_t:s0 4.0K Aug 27 20:00 aodh
drwxr-xr-x. 2 42405 42405 system_u:object_r:var_log_t:s0 4.0K Aug 27 20:00 ceilometer
drwxr-xr-x. 2 42416 42416 system_u:object_r:var_log_t:s0 4.0K Aug 27 20:00 gnocchi
drwxr-xr-x. 2 42418 42418 system_u:object_r:var_log_t:s0 230 Aug 27 20:00 heat
drwxr-xr-x. 2 42425 42425 system_u:object_r:var_log_t:s0 4.0K Aug 27 20:00 keystone
drwxr-xr-x. 2 42436 42436 system_u:object_r:var_log_t:s0 4.0K Aug 27 20:00 nova
drwxr-xr-x. 2 root root system_u:object_r:var_log_t:s0 8.0K Aug 27 20:00 stdouts

To complicate matters, we ran an update on controller-2 only and it went fine: everything ended up labelled container_file_t.
Some more info:
- the directories located in /var/log/containers are bind-mounted "as is" into the containers (no :Z nor any other flag)
- those directories are created within tripleo-heat-templates, for instance:

    - name: create persistent directories
      file:
        path: "{{ item.path }}"
        state: directory
        setype: "{{ item.setype }}"
      with_items:
        - { 'path': /var/log/memcached, 'setype': svirt_sandbox_file_t }

During the tests on another controller, we updated only the "openstack-selinux" package, and all the content of /var/log/containers was set to var_log_t. This is an issue, since container processes aren't allowed to write to files with this context.
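Since the directories are bind-mounted without relabeling, the label on the host is the one the container process sees. A rough Python sketch for reading that label directly from the host follows; it assumes a Linux host where the label is stored in the `security.selinux` extended attribute, and the helper names are invented for this example.

```python
import os


def parse_setype(raw_context):
    """Extract the type field from a raw security.selinux xattr value."""
    text = raw_context.rstrip(b"\x00").decode("ascii")
    # Context format is user:role:type:level; only the third field is the type.
    return text.split(":")[2]


def directory_setype(path):
    """Return the SELinux type of `path`, or None if no label can be read."""
    try:
        return parse_setype(os.getxattr(path, "security.selinux"))
    except (OSError, AttributeError):
        # OSError: missing path or no label; AttributeError: os.getxattr is Linux-only.
        return None


def check_log_dirs(paths, expected="container_file_t"):
    """Map each path to True/False depending on whether it carries `expected`."""
    return {path: directory_setype(path) == expected for path in paths}


if __name__ == "__main__":
    for path, ok in check_log_dirs(["/var/log/containers/rabbitmq"]).items():
        print(path, "ok" if ok else "unexpected or unreadable label")
```

The path in the `__main__` block is only an example taken from the listings in this report; any of the bind-mounted directories can be passed instead.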
This is probably surfacing due to the recent patch to fix issues around applying file contexts on upgrades (2ab4fae8 related to bug 1744259). I thought it only caused the file contexts semanage statement to fail but it seems like perhaps we broke out of the entire function early and never got to the relabel_files part at all?

Because the container_file_t label is applied by THT on install but not defined in policy, there is no reference to it when restoring the contexts. So it isn't set permanently. It sounds like running restorecon on an existing system may cause the same issue then.

After discussing on IRC, the current path forward would be to update set_file_contexts() in openstack-selinux local_settings.sh so that the file context on /var/log/containers is set to container_file_t as expected until now. Cedric confirmed that /var/log/containers is something TripleO creates and owns, so we should be okay. However, it sounds like it is possible OpenShift, when installed with OpenStack (up to 14?), also uses that directory for some logs so we'll probably want to confirm the label OpenShift expects there before rebuilding openstack-selinux for older OSP versions later on...
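To illustrate the idea of recording the label in policy so that a relabel keeps it, here is a small Python sketch that only builds the usual `semanage fcontext` and `restorecon` command lines for /var/log/containers. It is an assumption about what such a rule could look like, not the content of local_settings.sh or of any proposed patch, and the function name is made up.

```python
import shlex

LOG_DIR = "/var/log/containers"


def fcontext_commands(path=LOG_DIR, setype="container_file_t"):
    """Build (but do not run) commands that record a file-context rule and relabel `path`."""
    pattern = path + "(/.*)?"  # regex covering the directory and everything below it
    return [
        ["semanage", "fcontext", "-a", "-t", setype, pattern],
        ["restorecon", "-R", path],
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in fcontext_commands():
        print(shlex.join(argv))
```

With a rule like this present, restorecon has a reference for the directory and would be expected to restore container_file_t rather than fall back to var_log_t.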
Cedric proposed a PR at https://github.com/redhat-openstack/openstack-selinux/pull/40.
Adding the cleaner version (pull/41) as discussed on IRC. It could prevent unwanted side effects during the update and is cleaner overall.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:2811