Bug 1313457 - krb5 selinux patch leaks memory
Summary: krb5 selinux patch leaks memory
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: krb5
Version: 7.2
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: rc
: ---
Assignee: Robbie Harwood
QA Contact: Patrik Kis
URL:
Whiteboard:
Depends On: 1311287
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-03-01 15:59 UTC by Robbie Harwood
Modified: 2016-11-03 20:24 UTC (History)
20 users (show)

Fixed In Version: krb5-1.14.1-9.el7
Doc Type: Bug Fix
Doc Text:
Acquiring keytabs takes longer with SELinux after memory leaks have been fixed Previously, SELinux support in the _krb5_ packages caused _krb5_ to leak memory. This bug has been fixed. Note that acquiring keytabs now takes longer than before when SELinux is in `enforcing` or `permissive` mode.
Clone Of: 1311287
Environment:
Last Closed: 2016-11-03 20:24:44 UTC
Target Upstream Version:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2016:2591 0 normal SHIPPED_LIVE Low: krb5 security, bug fix, and enhancement update 2016-11-03 12:10:29 UTC

Description Robbie Harwood 2016-03-01 15:59:41 UTC
+++ This bug was initially created as a clone of Bug #1311287 +++

+++ This bug was initially created as a clone of Bug #1309730 +++

Description of problem:

selabel_lookup leaks memory per call on RHEL 6.7 under Samba.

Richard Sharpe of the Samba Team reported, that his malloc_hook experiment suggests that selabel_linux on selinux-2.0.94-5.8.el6 (for RHEL/CentOS 6.7) is leaking ~10 MB of memory under Samba.

Following is the stack trace:

[2016/02/15 14:12:37.713576,  0] ../source3/smbd/server.c:92(samba_malloc_hook)
  2048 allocate at 0x7f8bbf2fd930
[2016/02/15 14:12:37.713724,  0] ../source3/smbd/server.c:85(samba_malloc_hook)
  malloc size of 2048 requested
[2016/02/15 14:12:37.714175,  0] ../source3/lib/util.c:901(log_stack_trace)
  BACKTRACE: 46 stack frames:
   #0 /usr/lib/libsmbconf.so.0(log_stack_trace+0x1f) [0x7f8bbc22e542]
   #1 smbd(+0xa16c) [0x7f8bbe84416c]
   #2 /lib64/libc.so.6(__libc_calloc+0x331) [0x7f8bba78fa51]
   #3 /lib64/libc.so.6(+0xbe7db) [0x7f8bba7d37db]
   #4 /lib64/libc.so.6(+0xc91a9) [0x7f8bba7de1a9]
   #5 /lib64/libc.so.6(regexec+0xc3) [0x7f8bba7de553]
   #6 /lib64/libselinux.so.1(+0xefc7) [0x7f8baeb60fc7]
   #7 /lib64/libselinux.so.1(+0xd22a) [0x7f8baeb5f22a]
   #8 /lib64/libselinux.so.1(selabel_lookup+0xe) [0x7f8baeb5f36e]
   #9 /lib64/libkrb5support.so.0(+0x3af6) [0x7f8bbe653af6]
   #10 /lib64/libkrb5support.so.0(krb5int_push_fscreatecon_for+0x8c)
[0x7f8bbe653dec]
   #11 /lib64/libkrb5.so.3(+0x78d01) [0x7f8bbe70cd01]
   #12 /lib64/libkrb5.so.3(+0x78b8e) [0x7f8bbe70cb8e]
   #13 /lib64/libkrb5.so.3(+0x78ecf) [0x7f8bbe70cecf]
   #14 /lib64/libkrb5.so.3(krb5_get_server_rcache+0x1a1) [0x7f8bbe709101]
   #15 /lib64/libkrb5.so.3(+0x6d584) [0x7f8bbe701584]
   #16 /lib64/libkrb5.so.3(krb5_rd_req_decoded+0x2a) [0x7f8bbe7018ea]
   #17 /lib64/libkrb5.so.3(krb5_rd_req+0xbd) [0x7f8bbe7008fd]
   #18 /lib64/libgssapi_krb5.so.2(+0x1b4d5) [0x7f8bb1f6a4d5]
   #19 /lib64/libgssapi_krb5.so.2(+0x1cc3a) [0x7f8bb1f6bc3a]
   #20 /lib64/libgssapi_krb5.so.2(+0x1cd89) [0x7f8bb1f6bd89]
   #21 /lib64/libgssapi_krb5.so.2(gss_accept_sec_context+0x20a) [0x7f8bb1f5e53a]
   #22 /usr/lib/samba/libgse-samba4.so(+0xd830) [0x7f8bb6f1e830]
   #23 /usr/lib/samba/libgse-samba4.so(+0xe6be) [0x7f8bb6f1f6be]
   #24 /usr/lib/libgensec.so.0(gensec_update_ev+0xc8) [0x7f8bb6d0034d]
   #25 /usr/lib/libgensec.so.0(+0xb650) [0x7f8bb6cee650]
   #26 /usr/lib/libgensec.so.0(+0xc717) [0x7f8bb6cef717]
   #27 /usr/lib/libgensec.so.0(+0xdd93) [0x7f8bb6cf0d93]
   #28 /usr/lib/libgensec.so.0(+0x1d8d4) [0x7f8bb6d008d4]
   #29 /usr/lib/libtevent.so.0(tevent_common_loop_immediate+0x1f9)
[0x7f8bbaaae384]
   #30 /usr/lib/libsmbconf.so.0(run_events_poll+0x57) [0x7f8bbc24fdf5]
   #31 /usr/lib/libsmbconf.so.0(+0x464a2) [0x7f8bbc2504a2]
   #32 /usr/lib/libtevent.so.0(_tevent_loop_once+0xfc) [0x7f8bbaaad449]
   #33 /usr/lib/libtevent.so.0(tevent_common_loop_wait+0x25) [0x7f8bbaaad6c1]
   #34 /usr/lib/libtevent.so.0(_tevent_loop_wait+0x2b) [0x7f8bbaaad78c]
   #35 /usr/lib/samba/libsmbd-base-samba4.so(smbd_process+0xc49)
[0x7f8bbdd5244b]
   #36 smbd(+0xb815) [0x7f8bbe845815]
   #37 /usr/lib/libsmbconf.so.0(run_events_poll+0x544) [0x7f8bbc2502e2]
   #38 /usr/lib/libsmbconf.so.0(+0x465b8) [0x7f8bbc2505b8]
   #39 /usr/lib/libtevent.so.0(_tevent_loop_once+0xfc) [0x7f8bbaaad449]
   #40 /usr/lib/libtevent.so.0(tevent_common_loop_wait+0x25) [0x7f8bbaaad6c1]
   #41 /usr/lib/libtevent.so.0(_tevent_loop_wait+0x2b) [0x7f8bbaaad78c]
   #42 smbd(+0xc5a9) [0x7f8bbe8465a9]
   #43 smbd(main+0x159c) [0x7f8bbe847cf7]
   #44 /lib64/libc.so.6(__libc_start_main+0xfd) [0x7f8bba733d5d]
   #45 smbd(+0x6029) [0x7f8bbe840029]
[2016/02/15 14:12:37.716145,  0] ../source3/smbd/server.c:92(samba_malloc_hook)
  2048 allocate at 0x7f8bbf2ff710

After messing with changing the krb5-libs-1.10.3, Richard hacked out the
guts of the call to krb5int_push_fscreatecon_for above and the problem
went away and Samba now has a nice small heap.

See the thread on the samba-technical mailinglist:

https://lists.samba.org/archive/samba-technical/2016-February/112237.html

--- Additional comment from Richard Sharpe on 2016-02-18 10:35:35 EST ---

Versions: RHEL/CentOS 6.7, krb5-libs-1.10.3-37.el6.x86_64 and libselinux-2.0.94-5.8.el6.x86_64.

After I hacked out this code in the Kerberos library I stopped seeing the leak:

--- krb5-1.10.3/src/util/support/selinux.c.orig 2016-02-17 12:54:22.102145416 -0800
+++ krb5-1.10.3/src/util/support/selinux.c      2016-02-17 13:01:23.185096415 -0800
@@ -252,18 +252,7 @@
 void *
 krb5int_push_fscreatecon_for(const char *pathname)
 {
-       struct stat st;
-       void *retval;
-       k5_once(&labeled_once, label_mutex_init);
-       if (k5_mutex_lock(&labeled_mutex) == 0) {
-               if (stat(pathname, &st) != 0) {
-                       st.st_mode = S_IRUSR | S_IWUSR;
-               }
-               retval = push_fscreatecon(pathname, st.st_mode);
-               return retval ? retval : (void *) -1;
-       } else {
-               return NULL;
-       }
+       return NULL;
 }

 void

--- Additional comment from Jeremy Allison on 2016-02-18 14:35 EST ---

Andreas, Here is the patch. If the init() function inside
libselinux-2.0.94/src/label_file.c fails it doesn't
properly clean up the half-constructed regexp
regex_t contexts (or any of the other allocated
blobs).

Let me know if this works.

--- Additional comment from Petr Lautrbach on 2016-02-18 15:35:09 EST ---

There seems to be a leak in selabel_lookup() which tries to translate raw contexts via mcstransd. selabel_lookup_raw() could be used in krb5 to prevent it:

--- krb5-1.10.3/src/util/support/selinux.c.orig 2016-02-18 21:11:50.420462236 +0100
+++ krb5-1.10.3/src/util/support/selinux.c      2016-02-18 21:12:02.833398079 +0100
@@ -152,7 +152,7 @@
                                                           NULL, 0);
                        }
                        if (selabel_ctx != NULL) {
-                               if (selabel_lookup(selabel_ctx, &configuredsc,
+                               if (selabel_lookup_raw(selabel_ctx, &configuredsc,
                                                   fullpath, mode) != 0) {
                                        free(genpath);
                                        if (previous != NULL) {

--- Additional comment from Miroslav Grepl on 2016-02-18 16:11:52 EST ---

Nice catch. 

Actually *_raw libselinux functions should be used where it is possible (if there is no requirement for readable contexts). We can avoid the overhead caused by context translating.

--- Additional comment from Jeremy Allison on 2016-02-18 16:41:27 EST ---

Changing to using that function doesn't fix an underlying leak though. Plus I'm pretty sure the error path in the init() function will leak memory (see my attached patch).

--- Additional comment from Richard Sharpe on 2016-02-18 21:17:40 EST ---

So, I tried Jeremy's fix, and although I think he is correct about the problems there, it does not fix the problem.

I will try one of the other suggestions.


--- Additional comment from Richard Sharpe on 2016-02-19 10:47:14 EST ---

It looks like that this problem is fixed by combining Petr's suggestion and Jeremy's suggestion.

I need to do some more checking which I will do later today, but it does seem to be fixed now.

--- Additional comment from Richard Sharpe on 2016-02-19 10:48:39 EST ---

(In reply to Richard Sharpe from comment #12)
> It looks like that this problem is fixed by combining Petr's suggestion and
> Jeremy's suggestion.
> 
> I need to do some more checking which I will do later today, but it does
> seem to be fixed now.

Sorry, I was wrong.

One of the smbd's is showing an RSS of 31M which is a reliable indicator that it has the problem.

--- Additional comment from Richard Sharpe on 2016-02-19 11:35:19 EST ---

The test we have for this is to run lots of VDI clients against a Samba server, however, it is probable that a long-running smbtorture nbench test using Kerberos auth will also turn this up.

I have a strategy for finding the problem.

Since I am hooking malloc and am hitting on these 2048-byte allocations, I can add an smb_panic in that code path.

If I have the debuginfo RPMs installed for both krb5-libs and libselinux I should get enough info to pinpoint exactly where the mallocs are coming from and maybe see how to fix it.

--- Additional comment from Jeremy Allison on 2016-02-19 13:04:25 EST ---

+1 on the smb_panic (or abort() strategy). I've used this successfully in the past to find problematic mallocs.

--- Additional comment from Richard Sharpe on 2016-02-19 14:42:28 EST ---

These appear to be the lines/locations accessed:

(gdb) info line *0x7efeb085732b
Line 234 of "label.c" starts at address 0x7efeb085732b <selabel_lookup_raw+11>
   and ends at 0x7efeb0857330 <selabel_lookup_raw+16>.
(gdb) info line *0x7efeb085722a
Line 200 of "label.c"
   starts at address 0x7efeb085721e <selabel_lookup_common+110>
   and ends at 0x7efeb085722d <selabel_lookup_common+125>.
(gdb) info line *0x7efeb08586cb
Line 621 of "label_file.c" starts at address 0x7efeb08586b8 <lookup+344>
   and ends at 0x7efeb08586cd <lookup+365>.


They seem to match the new stack where I am using selabel_lookup_raw:

  BACKTRACE: 46 stack frames:
   #0 /usr/lib/libsmbconf.so.0(log_stack_trace+0x1f) [0x7efebe843542]
   #1 smbd(+0xa16c) [0x7efec0e5916c]
   #2 /lib64/libc.so.6(__libc_calloc+0x331) [0x7efebcda4a51]
   #3 /lib64/libc.so.6(+0x38d40be7db) [0x7efebcde87db]
   #4 /lib64/libc.so.6(+0x38d40c91a9) [0x7efebcdf31a9]
   #5 /lib64/libc.so.6(regexec+0xc3) [0x7efebcdf3553]
   #6 /lib64/libselinux.so.1(+0xe6cb) [0x7efeb08586cb]
   #7 /lib64/libselinux.so.1(+0xd22a) [0x7efeb085722a]
   #8 /lib64/libselinux.so.1(selabel_lookup_raw+0xb) [0x7efeb085732b]
   #9 /lib64/libkrb5support.so.0(+0x3b5a) [0x7efeb0c6fb5a]
   #10 /lib64/libkrb5support.so.0(krb5int_push_fscreatecon_for+0x76) [0x7efeb0c6ffa6]
   #11 /lib64/libkrb5.so.3(+0x7be62) [0x7efeb495ce62]
   #12 /lib64/libkrb5.so.3(+0x7c196) [0x7efeb495d196]
   #13 /lib64/libkrb5.so.3(+0x7c935) [0x7efeb495d935]
   #14 /lib64/libkrb5.so.3(krb5_get_server_rcache+0x1a1) [0x7efeb49593f1]
   #15 /lib64/libkrb5.so.3(+0x700e7) [0x7efeb49510e7]
   #16 /lib64/libkrb5.so.3(krb5_rd_req_decoded+0x2a) [0x7efeb495145a]
   #17 /lib64/libkrb5.so.3(krb5_rd_req+0xf9) [0x7efeb4950489]
   #18 /lib64/libgssapi_krb5.so.2(+0x1c691) [0x7efeb3e6e691]
   #19 /lib64/libgssapi_krb5.so.2(+0x1dfca) [0x7efeb3e6ffca]
   #20 /lib64/libgssapi_krb5.so.2(+0x1e129) [0x7efeb3e70129]
   #21 /lib64/libgssapi_krb5.so.2(gss_accept_sec_context+0x233) [0x7efeb3e5fb43]

--- Additional comment from Jeremy Allison on 2016-02-19 15:25:18 EST ---

OK, I see the issue. Both selabel_lookup_raw() and selabel_lookup() call selabel_lookup_common() which can leak memory on the:

static struct selabel_handle *selabel_ctx

inside krb5/src/util/support/selinux.c. This fits with the regexec() source for the mallocs.

Inside libselinux-2.0.94/src/label.c we have:

----------------------------------------------------------
static struct selabel_lookup_rec *
selabel_lookup_common(struct selabel_handle *rec, int translating,
                      const char *key, int type)
{
        struct selabel_lookup_rec *lr;
        char *ptr = selabel_sub(key);
        if (ptr) { 
                lr = rec->func_lookup(rec, ptr, type); 
                free(ptr);
        } else {
                lr = rec->func_lookup(rec, key, type); 
        }
        if (!lr)
                return NULL;

        if (compat_validate(rec, lr, "file_contexts", 0))
                return NULL;

        if (translating && !lr->ctx_trans &&
            selinux_raw_to_trans_context(lr->ctx_raw, &lr->ctx_trans))
                return NULL;

        return lr;
}
----------------------------------------------------------

The rec->func_lookup() call can allocate memory (on the regexec context), and if compat_validate() fails or the next if clause fails then the memory is left allocated on selabel_ctx inside krb5/src/util/support/selinux.c as there is no cleanup operation called.

This memory will hang around until the cleanup_fscreatecon() function in krb5/src/util/support/selinux.c is called which calls selabel_close(selabel_ctx).

However, that is never called explicitly, only as a MAKE_FINI_FUNCTION, which is a destructor called when the library is unloaded. Which never happens of course as this is a long-lived daemon.

There needs to be a way to call selabel_close() and re-open on failure of selabel_lookup().

--- Additional comment from Jeremy Allison on 2016-02-19 15:30:34 EST ---

Just to clarify:

rec->func_lookup() can call

static struct selabel_lookup_rec *lookup(struct selabel_handle *rec,
                                         const char *key, int type)

inside libselinux-2.0.94/src/label_file.c. That function calls compile_regex(), which then calls regcomp(), but there's no way to call regfree() on an error outside this module - so the memory allocated on regcomp just sits around.

The error handling in these functions is very poor - there's no proper cleanup at all.

My original patch for error cleanup inside the init function is still needed also :-).

--- Additional comment from Jeremy Allison on 2016-02-19 15:38:00 EST ---

Richard, can you try this fix to krb5/src/util/support/selinux.c (this version is a patch to the patch that gets applied inside krb5-1.10.2 but it should be clear what is needed - the extra cleanup_fscreatecon() call if the selabel_lookup() call fails. This should be safe - it looks like the the selabel_ctx is re-created on the next request if it's set to NULL (which cleanup_fscreatecon() will do).

diff -U 6 krb5-1.10.2-selinux-label.patch.orig krb5-1.10.2-selinux-label.patch
--- krb5-1.10.2-selinux-label.patch.orig	2016-02-19 12:32:35.992091371 -0800
+++ krb5-1.10.2-selinux-label.patch	2016-02-19 12:33:34.708495344 -0800
@@ -623,12 +623,13 @@
 +				if (selabel_lookup(selabel_ctx, &configuredsc,
 +						   fullpath, mode) != 0) {
 +					free(genpath);
 +					if (previous != NULL) {
 +						freecon(previous);
 +					}
++					cleanup_fscreatecon();
 +					return NULL;
 +				}
 +			}
 +#else
 +			if (matchpathcon(fullpath, mode, &configuredsc) != 0) {
 +				free(genpath);

--- Additional comment from Richard Sharpe on 2016-02-19 22:05:21 EST ---

Hmm, that is a patch to a patch. My head hurts.

OK, I think you are adding one line: a call to cleanup_fscreatecon() in the error path.

I will apply that change using a separate patch ... and test it out.

Then RedHat can figure out how to package it.

--- Additional comment from Richard Sharpe on 2016-02-19 22:28:10 EST ---

Unfortunately, that does not seem to be enough.

I still see some smbds leaking lots of memory.

I will have to try to figure it out this weekend.

--- Additional comment from Jeremy Allison on 2016-02-19 23:00:19 EST ---

Yep, it's a patch to a patch that just adds cleanup_fscreatecon() in the error path.

Hmmm. You know, you might need to add cleanup_fscreatecon() on successful exit also. I'm at home now with no access to the source code I've been working on, but it might be that the selabel_ctx needs to get freed before anything gets freed of those compiled regexps.

More on Monday when I'm back in at work (sorry).

--- Additional comment from Richard Sharpe on 2016-02-19 23:10:58 EST ---

What is curious, unless the info provided by gdb above is wrong, the memory that is leaked seems to be being allocated in regexec.

Time to look at glibc.

--- Additional comment from Richard Sharpe on 2016-02-20 10:15:03 EST ---

Hmmm, since all the callers of krb5int_push_fscreatecon_for also call krb5_pop_fscreatecon immediately after they have done whatever they were doing, it seems like there should be a call to cleanup_fscreatecon() in krb5int_pop_fscreatecon.

That's what I am going with.

--- Additional comment from Richard Sharpe on 2016-02-20 10:43 EST ---

This patch, along with Jeremy's earlier patch seems to fix the problem.

I will report back later today on whether or not it is holding up.

However, after 20 minutes of run time I am not seeing any smbds with large amounts of heap.

--- Additional comment from Richard Sharpe on 2016-02-20 11:16:45 EST ---

Hmmm, not yet fixed, or maybe a different problem. There is still one smbd that has crept up to 21MB in its heap.

Checking the problem now.

--- Additional comment from Richard Sharpe on 2016-02-20 11:20:24 EST ---

This time around there seems to be 2553 segments of 3680 bytes in the heap that should not be there.

This still points to the krb5/selinux axis of evil, so I will look at tracking this down.

--- Additional comment from Jeremy Allison on 2016-02-20 11:36:24 EST ---

"What is curious, unless the info provided by gdb above is wrong, the memory that is leaked seems to be being allocated in regexec"

No it's not curious, that's what I kept telling you above :-). The regcomp() calls allocate this memory on the selabel_ctx, and it won't get freed until the loop calling regfree() gets done when the ctx cleanup function is done.

Can you use the smb_panic method method to track down the malloc location of the remaining 3680 allocations ? Then we can nail that too.

--- Additional comment from Richard Sharpe on 2016-02-20 11:54:56 EST ---

(In reply to Jeremy Allison from comment #28)
> "What is curious, unless the info provided by gdb above is wrong, the memory
> that is leaked seems to be being allocated in regexec"
> 
> No it's not curious, that's what I kept telling you above :-). The regcomp()
> calls allocate this memory on the selabel_ctx, and it won't get freed until
> the loop calling regfree() gets done when the ctx cleanup function is done.
> 
> Can you use the smb_panic method method to track down the malloc location of
> the remaining 3680 allocations ? Then we can nail that too.

If I get desperate I will. However, I noticed that there are other return paths out of push_fscreatecon that return NULL but have already done a lookup.

So, I changed approaches and am cleaning up in krb5int_push_fscreatcon_for as well as krb5int_pop_fscreatecon.

It that fixes this last problem I will call it a day. If not, then on to the smb_panic method.

--- Additional comment from Jeremy Allison on 2016-02-20 13:07:44 EST ---

Yeah, someone *seriously* needs to run Coverity over this code.

--- Additional comment from Richard Sharpe on 2016-02-20 16:46:00 EST ---

OK, after changing Samba to try to detect the 3680-byte segments I am no longer seeing the leak with my final fixed change for krb5-1.10.3, so I will post the here.

I think the problem I saw was due to Samba's DEBUG facility causing problems but am not sure.

I have run the test for more than four hours and am not seeing any smbds with large amounts of heap.

--- Additional comment from Richard Sharpe on 2016-02-20 17:25 EST ---

This is a better fix, I think.

It applies last and seems to resolve the issue.

--- Additional comment from Jeremy Allison on 2016-02-21 01:50:01 EST ---

Looks like you're calling cleanup_fscreatecon() outside of the protection of the mutex k5_mutex_unlock(&labeled_mutex). Is that safe ?

--- Additional comment from Richard Sharpe on 2016-02-21 10:25:07 EST ---

(In reply to Jeremy Allison from comment #33)
> Comment on attachment 1128878 [details]
> I think this is a more correct fix
> 
> Looks like you're calling cleanup_fscreatecon() outside of the protection of
> the mutex k5_mutex_unlock(&labeled_mutex). Is that safe ?

Good point.

For Samba it probably doesn't matter as we will only ever call this from one thread, but we should keep other users in mind.

I need to move the final cleanup_fscreatecon() to before we release the mutex.

Comment 9 errata-xmlrpc 2016-11-03 20:24:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2591.html


Note You need to log in before you can comment on or make changes to this bug.