Description of problem:
After upgrading from 3.7 to 3.9, the SELinux boolean container_manage_cgroup is not enabled on the existing nodes. However, when we scale up the cluster, the openshift node playbook enables this boolean by default on the new nodes.

Version-Release number of selected component (if applicable):
3.9.40

How reproducible:
On RHEL 7.4/7.5

Steps to Reproduce:
0. Check the SELinux boolean container_manage_cgroup; it should be off by default on all the nodes
1. Upgrade cluster from OCP 3.7 to 3.9
2. Scale up cluster
3. Check the SELinux boolean container_manage_cgroup

Actual results:
Existing OpenShift nodes do not have the SELinux boolean container_manage_cgroup enabled.
New OpenShift nodes have this SELinux boolean enabled by default.

Expected results:
Remove discrepancies between the nodes.

NB: Users should have the flexibility to enable or disable this boolean in the Ansible inventory host file, since not all users run systemd containers.
Please manually set the boolean on existing nodes as a workaround.
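A minimal sketch of that workaround, assuming it is run as root on each existing node and that the standard `getsebool` and `setsebool` tools are installed. The function names `sebool_is_on` and `ensure_container_manage_cgroup` are made up for illustration and are not part of openshift-ansible:

```shell
#!/bin/sh
# Hypothetical helper: succeed if a line of getsebool output reports "on".
# getsebool prints lines of the form "<name> --> on" or "<name> --> off".
sebool_is_on() {
    case "$1" in
        *"--> on") return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: enable container_manage_cgroup persistently (-P)
# unless it is already on. Must be run as root on the node.
ensure_container_manage_cgroup() {
    state=$(getsebool container_manage_cgroup) || return 1
    if sebool_is_on "$state"; then
        echo "container_manage_cgroup already on"
    else
        setsebool -P container_manage_cgroup on
    fi
}
```

Calling `ensure_container_manage_cgroup` on a node leaves the boolean on and persistent across reboots; nodes that do not run systemd containers may not need it.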
(In reply to Scott Dodson from comment #1)
> Please manually set the boolean on existing nodes as a workaround.

Why wouldn't we fix this?
Hello Team,

We need to change the upgrade playbook to enable the "container_manage_cgroup" boolean.
Also, the documentation should note that this bug currently exists.

Regards,
Anshul Verma
(In reply to Michael Gugino from comment #2)
> (In reply to Scott Dodson from comment #1)
> > Please manually set the boolean on existing nodes as a workaround.
>
> Why wouldn't we fix this?

Because we didn't break it and the problem can be introduced entirely outside of the installer. You `yum upgrade` your selinux policy and now your cluster is broken without any involvement of openshift-ansible.

If you have time to fix it in the upgrade go for it, please make sure it's addressed in 3.10 too.
(In reply to Scott Dodson from comment #4)
> (In reply to Michael Gugino from comment #2)
> > (In reply to Scott Dodson from comment #1)
> > > Please manually set the boolean on existing nodes as a workaround.
> >
> > Why wouldn't we fix this?
>
> Because we didn't break it and the problem can be introduced entirely
> outside of the installer. You `yum upgrade` your selinux policy and now your
> cluster is broken without any involvement of openshift-ansible.
>
> If you have time to fix it in the upgrade go for it, please make sure it's
> addressed in 3.10 too.

Yeah, we're in a tough spot. This seems like one of those problems that we have to be quite reactive to as it's certainly nothing the users are doing to break themselves other than properly patching their hosts (which should be encouraged).

I will try to take this on.
PR Created in master: https://github.com/openshift/openshift-ansible/pull/9824
3.9 merged: https://github.com/openshift/openshift-ansible/pull/9832
In openshift-ansible-3.9.42-1 and later
Fixed.

openshift-ansible-3.9.47-1.git.0.8180c87.el7.noarch

Before upgrade, atomic-openshift version: v3.7.68
# getsebool -a | grep container_manage_cgroup
container_manage_cgroup --> off

After upgrade to 3.9, openshift v3.9.47
# getsebool container_manage_cgroup
container_manage_cgroup --> on

This value is now consistent with a fresh v3.9 install.
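To run the same check across several nodes at once (for example, existing nodes next to scaled-up ones), something like the following sketch could be used. It assumes root ssh access to each node; `report_cgroup_boolean` and the node names in the usage line are hypothetical, not taken from this bug:

```shell
#!/bin/sh
# Hypothetical helper: print "<node> <state>" for every node name passed
# as an argument, where <state> is "on" or "off".
report_cgroup_boolean() {
    for node in "$@"; do
        # getsebool prints "container_manage_cgroup --> on|off";
        # the third whitespace-separated field is the state.
        state=$(ssh "$node" getsebool container_manage_cgroup | awk '{print $3}')
        echo "$node $state"
    done
}
```

For example, `report_cgroup_boolean node1.example.com node2.example.com` prints one line per node, so any node still reporting `off` after the upgrade stands out.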
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:3748