Bug 1098446

Summary: Running setsebool lasts long time and is killed in the end
Product: Red Hat Enterprise Linux 7 Reporter: Martin Magr <mmagr>
Component: libsemanage Assignee: Petr Lautrbach <plautrba>
Status: CLOSED ERRATA QA Contact: Milos Malik <mmalik>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 7.0 CC: amsharma, aortega, dneary, dwalsh, eparis, gdubreui, lbezdick, mgrepl, mmagr, mmalik, plautrba, rmeggins, sdsmall, yeylon
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1080481 Environment:
Last Closed: 2015-11-19 12:59:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1080481    
Attachments:
Description Flags
libsepol valgrind happy
none
massif output of setsebol
none
Skip policy module re-linking when only setting a boolean
none

Description Martin Magr 2014-05-16 08:26:23 UTC
+++ This bug was initially created as a clone of Bug #1080481 +++

Description of problem:
Fedora20: packstack gives undefined method `split' for nil:NilClass

Version-Release number of selected component (if applicable):
openvswitch.x86_64 0:2.0.1-1.fc20 installed

Steps to Reproduce:
1.packstack --answer-file=/root/packstack-answers-20140325-030016.txt


Actual results:
ERROR : Error appeared during Puppet run: 10.65.201.125_horizon.pp
Error: undefined method `split' for nil:NilClass
You will find full trace in log /var/tmp/packstack/20140325-090940-WGyOcv/manifests/10.65.201.125_horizon.pp.log
Please check log file /var/tmp/packstack/20140325-090940-WGyOcv/openstack-setup.log for more information


Expected results:
Installation should be successful.

Additional info:
PFA for log files.

--- Additional comment from Amita Sharma on 2014-03-25 10:14:46 EDT ---



--- Additional comment from Francesco Vollero on 2014-04-08 11:41:05 EDT ---

Hi Amita, 

I wasn't able to replicate the bug. Could you please also paste the logs from 10.65.201.125_horizon.pp.log so that we can track it down properly?

--- Additional comment from Francesco Vollero on 2014-04-10 13:00:24 EDT ---

OK, I did a little digging, and it seems that the problem is in setsebool, so at the moment we cannot do anything because it is not packstack related. What you could try anyway is to test with a bigger timeout and see if it works properly.

--- Additional comment from Martin Magr on 2014-04-24 07:59:10 EDT ---

There is no way to set a timeout for setsebool. Unfortunately, the problem here is Puppet itself. I created a bug for that some time ago:

https://tickets.puppetlabs.com/browse/PUP-1948

--- Additional comment from Gilles Dubreuil on 2014-05-15 00:37:42 EDT ---

The same symptoms are occurring on RHEL7RC.
Meanwhile this is a RHEL7 bug:
Here is the manifest excerpt from <IP>_horizon.pp generating the error:
--------------------
if ($::selinux != "false"){
    selboolean{'httpd_can_network_connect':
        value => on,
        persistent => true,
    }
}
-------------------

Debugging the selboolean resource shows that the setsebool -P command fails:
---------
[root@p1-rh7 ~]# /usr/sbin/setsebool -P httpd_can_network_connect on
Killed
---------

As a matter of fact, setting any boolean with the persistent option '-P' fails.

This has been fixed in F20, tested using latest policycoreutils-2.2.5-3.fc20.x86_64.


Note, it doesn't matter if selinux is enforced or not (permissive).
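
A bare "Killed" from the shell usually means the process received SIGKILL, which the shell reports as exit status 137 (128 + 9). The sketch below is one way to confirm that and then look for OOM-killer entries in the kernel log. The helper name describe_exit is hypothetical and not part of policycoreutils.

```shell
# Hypothetical helper: describe a shell exit status.
# Status 137 = 128 + 9, i.e. the process was terminated by SIGKILL,
# which is the signal the kernel OOM killer sends.
describe_exit() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "succeeded"
    elif [ "$status" -eq 137 ]; then
        echo "killed by signal 9"
    else
        echo "failed with exit status $status"
    fi
}

# Intended use on the affected machine (as root):
#   /usr/sbin/setsebool -P httpd_can_network_connect on
#   describe_exit $?
#   dmesg | grep -i -E 'out of memory|killed process'
```

If the dmesg search shows setsebool among the killed processes, the failure is memory exhaustion rather than a timeout.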

Comment 1 Miroslav Grepl 2014-05-16 08:34:36 UTC
I don't see this issue.

Could you provide a machine where it happens?

Comment 3 Martin Magr 2014-05-16 08:43:47 UTC
I believe Gilles is able to provide you the machine.

Comment 4 Daniel Walsh 2014-05-17 10:01:15 UTC
Only reason I could see for this would be if there was limited memory on the system.  setsebool and semanage commands use a lot of memory to compile policy
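
As a rough pre-flight check for the limited-memory theory, available memory can be read from /proc/meminfo before calling setsebool. This is only a sketch: the function names are hypothetical, and the threshold is left to the caller because this report does not state an exact requirement.

```shell
# Print available memory in kB from a meminfo-style file
# (defaults to /proc/meminfo). Falls back to MemFree when the
# kernel does not report MemAvailable.
mem_available_kb() {
    meminfo=${1:-/proc/meminfo}
    awk '
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "MemFree:"      { free = $2 }
        END { print (avail != "" ? avail : free) }
    ' "$meminfo"
}

# Succeed only if at least $1 kB are available.
# The second argument (meminfo path) is optional.
has_enough_memory() {
    required_kb=$1
    [ "$(mem_available_kb "${2:-/proc/meminfo}")" -ge "$required_kb" ]
}

# Intended use; 1048576 kB (1 GiB) is an arbitrary example value:
#   has_enough_memory 1048576 || echo "low memory, setsebool -P may be killed" >&2
```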

Comment 5 Gilles Dubreuil 2014-05-19 01:12:20 UTC
Hi Daniel,

(In reply to Daniel Walsh from comment #4)
> Only reason I could see for this would be if there was limited memory on the
> system.  

That makes sense and will explain why the issue is not happening consistently,
I was starting to suspect a memory issue.

In the case of deploying an All-in-one OpenStack scenario the machine is under heavy load of services. My initial test case was using a VM with 2048MB of RAM, which I doubled up later!

>setsebool and semanage commands use a lot of memory to compile
> policy

Is there an option or way to extend the timeout before the kill?

Thanks

Comment 6 Daniel Walsh 2014-05-19 16:22:53 UTC
No sorry.

Comment 7 Lukas Bezdicka 2014-05-21 09:41:13 UTC
What could be done is running setsebool before starting all the services we start, and probably setting vm.min_free_kbytes = 45056 might help with oomkill :/
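
A minimal sketch of how the vm.min_free_kbytes suggestion could be applied. The value 45056 comes from the comment above. The helper name and the file name 90-min-free-kbytes.conf are hypothetical choices. Raising this setting only changes how much memory the kernel keeps in reserve and does not reduce what setsebool itself needs.

```shell
# Apply immediately (needs root, lost on reboot):
#   sysctl -w vm.min_free_kbytes=45056

# Hypothetical helper: write a persistent sysctl fragment.
#   $1 = target directory (default /etc/sysctl.d)
#   $2 = value in kB      (default 45056, taken from the comment above)
write_min_free_conf() {
    conf_dir=${1:-/etc/sysctl.d}
    value=${2:-45056}
    printf 'vm.min_free_kbytes = %s\n' "$value" \
        > "$conf_dir/90-min-free-kbytes.conf"
}

# Intended use (as root):
#   write_min_free_conf
#   sysctl -p /etc/sysctl.d/90-min-free-kbytes.conf
```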

Comment 8 Gilles Dubreuil 2014-05-21 10:47:09 UTC
(In reply to Lukas Bezdicka from comment #7)
> What could be done is running setsebool before starting all the services we
> start, and probably setting vm.min_free_kbytes = 45056 might help with
> oomkill :/

That's a good idea. 
But I'm not sure if that's going to make an overall impact because every individual service is managed atomically (package, configuration and service).

I wonder if there is any way to pre-compile policies before hand in order to alleviate the process when setsebool is executed.

Comment 9 Lukas Bezdicka 2014-05-21 13:53:40 UTC
Created attachment 897992 [details]
libsepol valgrind happy

Comment 10 Gilles Dubreuil 2014-05-23 03:17:06 UTC
(In reply to Lukas Bezdicka from comment #9)
> Created attachment 897992 [details]
> libsepol valgrind happy

One thing, that might not be related but potentially bad, see lines 1460/1461:
------
if (linear_probe_create(&probe, 4096)) { /* Assume 4096 is enough for most cases */
------

Anyway, the question remains: 
Why is this happening at all?
Compiling time is inversely proportional to available resources.
Less memory usually translates to more process time but not a failure.

Comment 11 Lukas Bezdicka 2014-05-23 10:11:50 UTC
Created attachment 898647 [details]
massif output of setsebol

Comment 12 Gilles Dubreuil 2014-05-23 10:40:23 UTC
Hi Lukas,

That looks good, but can you please comment on your attachments, and more specifically on this massif output?

Thanks,
Gilles

Comment 13 Lukas Bezdicka 2014-05-23 11:03:32 UTC
Well, the first patch is useless, as upstream dropped the whole linear_probe patch (see e910cf6e62d94d09e810bd173c14c5c4afb72242). As for the massif output, it's clear that the issue is in semanage_link_sandbox(). I tried two different approaches. One would be loading a module, linking it and clearing it out, but the link_modules() function seems too complex for me, and I don't think it can be changed in this way without a major refactor. The other approach was to try to store only the required data from each module, but there I got lost in all the pointers in mod_pols in sepol_link_packages(). This is probably for someone more experienced with the SELinux userspace code.

Comment 14 Gilles Dubreuil 2014-05-26 01:02:32 UTC
Let's see if we have SELINUX resource available to help confirm there's nothing much we could do on our side and see how to work that upstream.

Comment 15 Rich Megginson 2014-07-10 01:46:37 UTC
I am running F20 in a kvm/qemu virtual machine with 2 cores and 2GB RAM.  Both setsebool and semanage boolean run OOM trying to set httpd_can_network_connect.

If I increase the RAM in the VM to 4GB, it works.  So at least there is a workaround.

Comment 16 Gilles Dubreuil 2014-07-11 01:02:00 UTC
(In reply to Rich Megginson from comment #15)

> If I increase the RAM in the VM to 4GB, it works.  So at least there is a
> workaround.

Not sure if that's a workaround for everyone ;-)

Comment 17 Stephen Smalley 2014-07-25 16:52:36 UTC
Created attachment 921017 [details]
Skip policy module re-linking when only setting a boolean

Comment 18 Lukas Bezdicka 2014-07-28 09:42:13 UTC
(In reply to Stephen Smalley from comment #17)
> Created attachment 921017 [details]
> Skip policy module re-linking when only setting a boolean

I confirm that with the patch, setsebool -P httpd_can_network_connect takes less time and uses about 400MB, which is much better.

Comment 19 Stephen Smalley 2014-07-28 15:40:01 UTC
~400MB?  In my testing, valgrind --tool=massif setsebool -P httpd_can_network_connect=1 peaks at 26.97MB with this patch.  On Fedora 20.
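
For anyone who wants to repeat this measurement, here is a sketch of pulling the peak out of a massif output file without opening ms_print. It assumes the usual massif file layout, where each snapshot has mem_heap_B= and mem_heap_extra_B= lines. The function name and the output path are hypothetical. By default massif profiles heap allocations only, so the figure is not the same thing as resident memory.

```shell
# Hypothetical helper: print the largest heap + heap-overhead total,
# in bytes, found across all snapshots of a massif output file.
massif_peak_bytes() {
    awk -F= '
        $1 == "mem_heap_B"       { heap = $2 }
        $1 == "mem_heap_extra_B" { total = heap + $2
                                   if (total > peak) peak = total }
        END { printf "%.0f\n", peak }
    ' "$1"
}

# Intended use (as root; the output path is an arbitrary example):
#   valgrind --tool=massif --massif-out-file=/tmp/setsebool.massif \
#       setsebool -P httpd_can_network_connect=1
#   massif_peak_bytes /tmp/setsebool.massif
#   ms_print /tmp/setsebool.massif    # full per-snapshot report
```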

Comment 20 Lukas Bezdicka 2014-07-29 09:59:21 UTC
Oh yes, I was testing purely the patch on clean source; the Red Hat patch in the libsemanage package does the magic. 28MB now.

Comment 21 Miroslav Grepl 2014-07-29 10:53:03 UTC
I can confirm it too.

Comment 22 Gilles Dubreuil 2014-07-30 04:40:01 UTC
That's great! 
Bravo to Stephen.
Thank you everyone for helping/testing.

Comment 23 Dave Neary 2014-11-29 00:36:22 UTC
Hi,

Has an updated package been released? I still see this problem on an up-to-date CentOS 7 image with 1024M RAM (it's a VM on a resource-constrained machine).

Thanks,
Dave.

Comment 25 Lukas Bezdicka 2014-12-01 11:20:10 UTC
*** Bug 1080481 has been marked as a duplicate of this bug. ***

Comment 29 errata-xmlrpc 2015-11-19 12:59:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2148.html

Comment 30 Red Hat Bugzilla 2023-09-14 02:08:00 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days