Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1566729

Summary: libsemanage installs policies very slowly
Product: Red Hat Enterprise Linux 7 Reporter: Rick Masters <masters>
Component: libsemanageAssignee: Vit Mojzis <vmojzis>
Status: CLOSED ERRATA QA Contact: Jan Zarsky <jzarsky>
Severity: high Docs Contact:
Priority: high    
Version: 7.5CC: dapospis, dshumake, dwalsh, jzarsky, lvrabec, mgrepl, mmalik, plautrba, sardella, ssekidde, vmojzis
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-10-30 09:35:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Rick Masters 2018-04-12 21:24:47 UTC
Description of problem:
Installing selinux policy modules is extremely slow. Installing the existing policies after extraction takes 43 minutes. It should be less than 30 seconds.

Version-Release number of selected component (if applicable):
libsemanage-2.5-11

How reproducible:
Every time.

Steps to Reproduce:
1. Install RHEL 7.5 and login as root.
2. for module in $(semodule -l | awk '{print $1}'); do semodule -E $module; done
3. semodule -X 100 -s targeted -i *.pp

Actual results:
Takes 43 minutes on my machine.

Expected results:
Should takes less than a minute. The same commands take less than 30 seconds on Fedora 27 on the same hardware.

Additional info:
This is a performance bug introduced by a Red Hat patch to libsemanage in order to support showing versions for semodule -l. The patch in question is located in libsemanage-rhel.patch.

At the start of the process, semodule calls libsemanage to install the policy file. Inside libsemanage, semanage_direct_install_info calls semanage_direct_get_module_info.
The problem is in semanage_direct_get_module_info. It calls semanage_direct_list_all. That gets info on *all* existing policies. But, semanage_direct_get_module_info doesn't really need all policies; it really just needs policies with the same name but varying priorities. This list is used to warn about which priority will actually be active.

Historically, calling semanage_direct_get_module_info and actually getting *all* policies was not a significant performance issue because the call to semanage_direct_get_module_info was relatively lightweight. However, the Red Hat patch mentioned above adds code that calls map_file on every module. That turns a relatively lightweight operation into a significantly expensive operation which reads and does bz2 decompression on entire files.

I've implemented a solution to change semanage_direct_get_module_info to stop getting all policies when it really just needs to get the ones with a matching name. After retrieving the list of files, it frees and zeros out the entries that do not match the name it cares about. This patch reduces policy installation time over 90%.

The following patch is against 2.7.1 but it also applies cleanly against 2.5-11.


--- libsemanage-2.5-orig/src/direct_api.c       2018-04-10 14:13:20.218745088 -0700
+++ libsemanage-2.5/src/direct_api.c    2018-04-11 10:14:03.598102408 -0700
@@ -91,6 +91,11 @@
                                    semanage_module_info_t **modinfo,
                                    int *num_modules);

+static int semanage_direct_list_by_name(semanage_handle_t *sh,
+                                   semanage_module_info_t **modinfo,
+                                   int *num_modules,
+                                   char *optional_module_name);
+
 static int semanage_direct_install_info(semanage_handle_t *sh,
                                        const semanage_module_info_t *modinfo,
                                        char *data,
@@ -2004,7 +2009,10 @@

        /* if priority == 0, then find the highest priority available */
        if (modkey->priority == 0) {
-               ret = semanage_direct_list_all(sh, &modinfos, &modinfos_len);
+               ret = semanage_direct_list_by_name(sh,
+                                                  &modinfos,
+                                                  &modinfos_len,
+                                                  modkey->name);
                if (ret != 0) {
                        status = -1;
                        goto cleanup;
@@ -2343,6 +2351,14 @@
                                    semanage_module_info_t **modinfos,
                                    int *modinfos_len)
 {
+       return semanage_direct_list_by_name(sh, modinfos, modinfos_len, NULL);
+}
+
+static int semanage_direct_list_by_name(semanage_handle_t *sh,
+                                       semanage_module_info_t **modinfos,
+                                       int *modinfos_len,
+                                       char *optional_module_name)
+{
        assert(sh);
        assert(modinfos);
        assert(modinfos_len);
@@ -2365,6 +2381,7 @@

        struct dirent **modules = NULL;
        int modules_len = 0;
+       int num_modules_found = 0;

        uint16_t priority = 0;

@@ -2428,8 +2445,11 @@
                /* cleanup old modules */
                if (modules != NULL) {
                        for (j = 0; j < modules_len; j++) {
-                               free(modules[j]);
-                               modules[j] = NULL;
+                               /* may have been filtered out */
+                               if (modules[j]) {
+                                       free(modules[j]);
+                                       modules[j] = NULL;
+                               }
                        }
                        free(modules);
                        modules = NULL;
@@ -2451,10 +2471,25 @@

                if (modules_len == 0) continue;

+               num_modules_found=0;
+               if (optional_module_name) {
+                       for (j = 0; j < modules_len; j++) {
+                               /* filter out */
+                               if (strcmp(modules[j]->d_name, optional_module_name)) {
+                                       free(modules[j]);
+                                       modules[j] = NULL;
+                               } else {
+                                       num_modules_found++;
+                               }
+                       }
+               }
+
                /* add space for modules */
                tmp = realloc(*modinfos,
                              sizeof(semanage_module_info_t) *
-                               (*modinfos_len + modules_len));
+                               (*modinfos_len +
+                               (optional_module_name ?
+                                       num_modules_found : modules_len)));
                if (tmp == NULL) {
                        ERR(sh, "Error allocating memory for module array.");
                        status = -1;
@@ -2464,6 +2499,8 @@

                /* for each module directory */
                for(j = 0; j < modules_len; j++) {
+                       if (!modules[j]) continue;
+
                        /* set module name */
                        ret = semanage_module_info_set_name(
                                        sh,
@@ -2527,7 +2564,10 @@

        if (modules != NULL) {
                for (i = 0; i < modules_len; i++) {
-                       free(modules[i]);
+                       /* may have been filtered out */
+                       if (modules[j]) {
+                               free(modules[i]);
+                       }
                }
                free(modules);
        }

Comment 2 Rick Masters 2018-04-19 17:36:00 UTC
Please note there is a typo in the last part of the patch I posted:

+                       if (modules[j]) {
+                               free(modules[i]);
+                       }

Should be:

+                       if (modules[i]) {
+                               free(modules[i]);
+                       }

Comment 3 Petr Lautrbach 2018-04-20 09:28:37 UTC
Thanks for the patch. The general approach looks good to me. I talked about this Vit with and we agreed that it could be improved a little bit, see comments bellow:

(In reply to Rick Masters from comment #0)
> --- libsemanage-2.5-orig/src/direct_api.c       2018-04-10
> 14:13:20.218745088 -0700
> +++ libsemanage-2.5/src/direct_api.c    2018-04-11 10:14:03.598102408 -0700
> @@ -91,6 +91,11 @@
>                                     semanage_module_info_t **modinfo,
>                                     int *num_modules);
> 
> +static int semanage_direct_list_by_name(semanage_handle_t *sh,
> +                                   semanage_module_info_t **modinfo,
> +                                   int *num_modules,
> +                                   char *optional_module_name);
> +
>  static int semanage_direct_install_info(semanage_handle_t *sh,
>                                         const semanage_module_info_t
> *modinfo,
>                                         char *data,
> @@ -2004,7 +2009,10 @@
> 
>         /* if priority == 0, then find the highest priority available */
>         if (modkey->priority == 0) {
> -               ret = semanage_direct_list_all(sh, &modinfos, &modinfos_len);
> +               ret = semanage_direct_list_by_name(sh,
> +                                                  &modinfos,
> +                                                  &modinfos_len,
> +                                                  modkey->name);
>                 if (ret != 0) {
>                         status = -1;
>                         goto cleanup;
> @@ -2343,6 +2351,14 @@
>                                     semanage_module_info_t **modinfos,
>                                     int *modinfos_len)
>  {
> +       return semanage_direct_list_by_name(sh, modinfos, modinfos_len,
> NULL);
> +}
> +
> +static int semanage_direct_list_by_name(semanage_handle_t *sh,
> +                                       semanage_module_info_t **modinfos,
> +                                       int *modinfos_len,
> +                                       char *optional_module_name)
> +{
>         assert(sh);
>         assert(modinfos);
>         assert(modinfos_len);
> @@ -2365,6 +2381,7 @@
> 
>         struct dirent **modules = NULL;
>         int modules_len = 0;
> +       int num_modules_found = 0;

This can be dropped.
 
>         uint16_t priority = 0;
> 
> @@ -2428,8 +2445,11 @@
>                 /* cleanup old modules */
>                 if (modules != NULL) {
>                         for (j = 0; j < modules_len; j++) {
> -                               free(modules[j]);
> -                               modules[j] = NULL;
> +                               /* may have been filtered out */
> +                               if (modules[j]) {
> +                                       free(modules[j]);
> +                                       modules[j] = NULL;
> +                               }
>                         }
>                         free(modules);
>                         modules = NULL;

It can be dropped as well.

> @@ -2451,10 +2471,25 @@
> 
>                 if (modules_len == 0) continue;
> 
> +               num_modules_found=0;
> +               if (optional_module_name) {
> +                       for (j = 0; j < modules_len; j++) {
> +                               /* filter out */
> +                               if (strcmp(modules[j]->d_name,
> optional_module_name)) {
> +                                       free(modules[j]);
> +                                       modules[j] = NULL;
> +                               } else {
> +                                       num_modules_found++;
> +                               }
> +                       }
> +               }
> +

We can expect that the modules array will have one element at most. Therefore it should be safe to create a new one element array, copy pointer to the matching element there, drop everything from the original array replace the old array with the new one.


>                 /* add space for modules */
>                 tmp = realloc(*modinfos,
>                               sizeof(semanage_module_info_t) *
> -                               (*modinfos_len + modules_len));
> +                               (*modinfos_len +
> +                               (optional_module_name ?
> +                                       num_modules_found : modules_len)));
>                 if (tmp == NULL) {
>                         ERR(sh, "Error allocating memory for module array.");
>                         status = -1;

It can be dropped

> @@ -2464,6 +2499,8 @@
> 
>                 /* for each module directory */
>                 for(j = 0; j < modules_len; j++) {
> +                       if (!modules[j]) continue;
> +
>                         /* set module name */
>                         ret = semanage_module_info_set_name(
>                                         sh,

It can be dropped

> @@ -2527,7 +2564,10 @@
> 
>         if (modules != NULL) {
>                 for (i = 0; i < modules_len; i++) {
> -                       free(modules[i]);
> +                       /* may have been filtered out */
> +                       if (modules[j]) {
> +                               free(modules[i]);
> +                       }
>                 }
>                 free(modules);
>         }

It can be dropped


What do you think about it?

Vit agreed that he'll prepare a new patch based on this suggestion for review.

Comment 4 Rick Masters 2018-04-20 21:21:22 UTC
The provided patch attempts to minimize the changes and keep the overall structure intact. It does feel like it might benefit from a more significant refactor but I'm afraid I'm not familiar enough with the code to suggest how to do it.

If you do decide to go with the approach in the provided patch, please be aware that further testing revealed another problem. An extra line to continue if no modules are found is necessary after filtering:

> +               if (optional_module_name) {
> +                       for (j = 0; j < modules_len; j++) {
> +                               /* filter out */
> +                               if (strcmp(modules[j]->d_name,
> optional_module_name)) {
> +                                       free(modules[j]);
> +                                       modules[j] = NULL;
> +                               } else {
> +                                       num_modules_found++;
> +                               }
> +                       }
HERE > +                       if (num_modules_found == 0) continue;
> +               }
> +


Without that, it subsequently attempts to realloc with size zero, which can return NULL and trigger error handling improperly.

Comment 5 Vit Mojzis 2018-04-30 11:39:55 UTC
Thank you for the time you put to perfecting the patch. 
I slightly simplified it and came up with the following (I still need to do some more testing):

---
 libsemanage/src/direct_api.c | 46 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 45 insertions(+), 1 deletion(-)

diff --git a/libsemanage/src/direct_api.c b/libsemanage/src/direct_api.c
index 042c9ae8..6dd21dd8 100644
--- a/libsemanage/src/direct_api.c
+++ b/libsemanage/src/direct_api.c
@@ -89,6 +89,11 @@ static int semanage_direct_get_module_info(semanage_handle_t *sh,
 					   const semanage_module_key_t *modkey,
 					   semanage_module_info_t **modinfo);
 
+static int semanage_direct_list_by_name(semanage_handle_t *sh,
+					semanage_module_info_t **modinfo,
+					int *num_modules,
+					char *optional_module_name);
+
 static int semanage_direct_list_all(semanage_handle_t *sh,
 				    semanage_module_info_t **modinfo,
 				    int *num_modules);
@@ -2166,7 +2171,10 @@ static int semanage_direct_get_module_info(semanage_handle_t *sh,
 
 	/* if priority == 0, then find the highest priority available */
 	if (modkey->priority == 0) {
-		ret = semanage_direct_list_all(sh, &modinfos, &modinfos_len);
+		ret = semanage_direct_list_by_name(sh,
+			&modinfos,
+			&modinfos_len,
+			modkey->name);
 		if (ret != 0) {
 			status = -1;
 			goto cleanup;
@@ -2504,6 +2512,14 @@ static int semanage_modules_filename_select(const struct dirent *d)
 static int semanage_direct_list_all(semanage_handle_t *sh,
 				    semanage_module_info_t **modinfos,
 				    int *modinfos_len)
+{
+	return semanage_direct_list_by_name(sh, modinfos, modinfos_len, NULL);
+}
+
+static int semanage_direct_list_by_name(semanage_handle_t *sh,
+					semanage_module_info_t **modinfos,
+					int *modinfos_len,
+					char *optional_module_name)
 {
 	assert(sh);
 	assert(modinfos);
@@ -2603,6 +2619,7 @@ static int semanage_direct_list_all(semanage_handle_t *sh,
 				      &modules,
 				      semanage_modules_filename_select,
 				      versionsort);
+
 		if (modules_len == -1) {
 			ERR(sh,
 			    "Error while scanning directory %s.",
@@ -2613,6 +2630,33 @@ static int semanage_direct_list_all(semanage_handle_t *sh,
 
 		if (modules_len == 0) continue;
 
+		if (optional_module_name) {
+			for (j = 0; j < modules_len; j++) {
+				/* try to find specified module */
+				if (strcmp(modules[j]->d_name, optional_module_name)) {
+					free(modules[j]);
+				} else {
+					/* module found, move it to the beginning of */
+					/* the list and clean up the remaining entries */
+					modules[0] = modules[j];
+					for (j++; j < modules_len; j++){
+						free(modules[j]);
+					}
+
+					modules_len = 1;
+					j = 0;
+					break;
+				}
+			}
+			/* module not found on this priority, clean up and continue */
+			if (j == modules_len) {
+				modules_len = 0;
+				free(modules);
+				modules = NULL;
+				continue;
+			}
+		}
+
 		/* add space for modules */
 		tmp = realloc(*modinfos,
 			      sizeof(semanage_module_info_t) *
-- 
2.14.3

Comment 9 Vit Mojzis 2018-08-06 08:20:43 UTC
*** Bug 1609904 has been marked as a duplicate of this bug. ***

Comment 12 errata-xmlrpc 2018-10-30 09:35:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3088