RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1528207 - rpc.gssd uses a lot of memory with krb5 mounts
Summary: rpc.gssd uses a lot of memory with krb5 mounts
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: nfs-utils
Version: 6.9
Hardware: Unspecified
OS: Linux
high
high
Target Milestone: rc
: ---
Assignee: Steve Dickson
QA Contact: Yongcheng Yang
URL:
Whiteboard:
: 1544471 (view as bug list)
Depends On:
Blocks: 1807110 1809277
TreeView+ depends on / blocked
 
Reported: 2017-12-21 09:58 UTC by Pierguido Lambri
Modified: 2023-09-07 19:00 UTC (History)
8 users (show)

Fixed In Version: nfs-utils-1.2.3-78.el6_10.2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1807110 (view as bug list)
Environment:
Last Closed: 2020-03-10 10:15:25 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Patch for memleak in find_keytab_entry() (1.18 KB, patch)
2018-09-20 06:22 UTC, Pierguido Lambri
no flags Details | Diff


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 4878451 0 None None None 2020-03-05 10:33:28 UTC
Red Hat Product Errata RHBA-2020:0749 0 None None None 2020-03-10 10:15:40 UTC

Description Pierguido Lambri 2017-12-21 09:58:42 UTC
Description of problem:
After few months rpc.gssd reports a huge amount of used memory:

from top:

 1794 root      20   0 7357m 2.2g 1788 S  0.0 14.1  35:27.29 rpc.gssd

Version-Release number of selected component (if applicable):
Various versions report this problem.
The newest packages where this is seen are:

nfs-utils-1.2.3-64.el6
libtirpc-0.2.1-11.el6

How reproducible:
Very often by customer

Steps to Reproduce:
1. Mount a NFS share that uses krb5
2. Use the share for a while
3. Check the used memory

Actual results:
The used memory is very high

Expected results:
Used Memory shouldn't be that high

Additional info:

This takes some time. It's reported that high values are seen after few weeks (or months)

Comment 5 Pierguido Lambri 2018-02-12 16:10:29 UTC
*** Bug 1544471 has been marked as a duplicate of this bug. ***

Comment 14 Pierguido Lambri 2018-09-20 06:22:48 UTC
Created attachment 1485016 [details]
Patch for memleak in find_keytab_entry()

Comment 15 Pierguido Lambri 2018-09-20 06:25:05 UTC
With this patch I had no more leaks in find_keytab_entry().
This is the valgrind output after around 12 hours:

...
==00:08:42:22.758 2388== 2,048 bytes in 1 blocks are still reachable in loss record 45 of 48
==00:08:42:22.758 2388==    at 0x4A057BB: calloc (vg_replace_malloc.c:593)
==00:08:42:22.758 2388==    by 0x10D6E1: init_client_list (gssd_proc.c:471)
==00:08:42:22.758 2388==    by 0x10D327: gssd_run (gssd_main_loop.c:203)
==00:08:42:22.758 2388==    by 0x10D103: main (gssd.c:222)
==00:08:42:22.758 2388== 
==00:08:42:22.758 2388== 4,096 bytes in 1 blocks are still reachable in loss record 46 of 48
==00:08:42:22.758 2388==    at 0x4A057BB: calloc (vg_replace_malloc.c:593)
==00:08:42:22.758 2388==    by 0x4E2FB69: clnt_vc_create (clnt_vc.c:209)
==00:08:42:22.758 2388==    by 0x1124E5: nfs_get_tcpclient (rpc_socket.c:415)
==00:08:42:22.758 2388==    by 0x10DA36: create_auth_rpc_client (gssd_proc.c:883)
==00:08:42:22.758 2388==    by 0x10EC61: process_krb5_upcall (gssd_proc.c:1017)
==00:08:42:22.758 2388==    by 0x10F5BB: handle_gssd_upcall (gssd_proc.c:1214)
==00:08:42:22.758 2388==    by 0x10D482: gssd_run (gssd_main_loop.c:83)
==00:08:42:22.758 2388==    by 0x10D103: main (gssd.c:222)
==00:08:42:22.758 2388== 
==00:08:42:22.758 2388== 28,672 bytes in 7 blocks are still reachable in loss record 47 of 48
==00:08:42:22.758 2388==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==00:08:42:22.758 2388==    by 0x10D263: gssd_run (gssd_main_loop.c:113)
==00:08:42:22.758 2388==    by 0x10D103: main (gssd.c:222)
==00:08:42:22.758 2388== 
==00:08:42:22.758 2388== 49,152 bytes in 1 blocks are still reachable in loss record 48 of 48
==00:08:42:22.758 2388==    at 0x4A057BB: calloc (vg_replace_malloc.c:593)
==00:08:42:22.758 2388==    by 0x4E2FBB2: clnt_vc_create (clnt_vc.c:219)
==00:08:42:22.758 2388==    by 0x1124E5: nfs_get_tcpclient (rpc_socket.c:415)
==00:08:42:22.758 2388==    by 0x10DA36: create_auth_rpc_client (gssd_proc.c:883)
==00:08:42:22.758 2388==    by 0x10EC61: process_krb5_upcall (gssd_proc.c:1017)
==00:08:42:22.758 2388==    by 0x10F5BB: handle_gssd_upcall (gssd_proc.c:1214)
==00:08:42:22.758 2388==    by 0x10D482: gssd_run (gssd_main_loop.c:83)
==00:08:42:22.758 2388==    by 0x10D103: main (gssd.c:222)
==00:08:42:22.758 2388== 
==00:08:42:22.758 2388== LEAK SUMMARY:
==00:08:42:22.758 2388==    definitely lost: 16 bytes in 1 blocks
==00:08:42:22.758 2388==    indirectly lost: 0 bytes in 0 blocks
==00:08:42:22.758 2388==      possibly lost: 0 bytes in 0 blocks
==00:08:42:22.758 2388==    still reachable: 87,908 bytes in 69 blocks
==00:08:42:22.758 2388==         suppressed: 0 bytes in 0 blocks
...

Comment 23 Steve Whitehouse 2019-10-08 11:31:10 UTC
Steve D, please review the proposed fix in comment #14. If it meets the z-stream criteria, of being small and targetted then we should fix this I think.

Comment 27 Steve Dickson 2020-02-12 18:54:26 UTC
It turns out this was added to upstream last Aug
commit 9a94f8bb8f1eec22a8ce4deb92c7ce42b5730b66
Author: Steve Dickson <steved>
Date:   Thu Aug 22 09:37:55 2019 -0400

    nfs-utils: Removed a number of Coverity Scan RESOURCE_LEAK errors

    Signed-off-by: Steve Dickson <steved>

and part of this patch will fixed the leak you are seeing (hopefully ;-) )

diff --git a/utils/gssd/krb5_util.c b/utils/gssd/krb5_util.c
index 454a6eb..f68be85 100644
--- a/utils/gssd/krb5_util.c
+++ b/utils/gssd/krb5_util.c
@@ -912,6 +912,8 @@ find_keytab_entry(krb5_context context, krb5_keytab kt,
                k5err = gssd_k5_err_msg(context, code);
                printerr(3, "%s while getting keytab entry for '%s'\n",
                     k5err, spn);
+               free(k5err);
+               k5err = NULL;
                /*
                 * We tried the active directory machine account
                 * with the hostname part as-is and failed...
@@ -1231,6 +1233,8 @@ gssd_destroy_krb5_machine_creds(void)
            k5err = gssd_k5_err_msg(context, code);
            printerr(0, "WARNING: %s while destroying credential "
                    "cache '%s'\n", k5err, ple->ccname);
+           free(k5err);
+           k5err = NULL;
        }
    }
    krb5_free_context(context);


I do have a question about this...

+    if (adhostoverride)
+        free(adhostoverride);

 the man page for krb5_appdefault_string says 
"def_val is the value to return if no value is found in krb5.conf(5)"
Is it known that 'def_val' is free-able? 

Also what happens if adhostoverride is freed above.
Would this be free the memory twice? 

It appears upstream has a similar problem... I'll
cc you on my upstream proposal.

Comment 28 Steve Dickson 2020-02-12 19:06:57 UTC
I forgot to cc you but here is what I came up with
commit 6c43c80764e17da1d95eefede45a0edfdf726ac8 
Author: Steve Dickson <steved>
Date:   Wed Feb 12 13:56:53 2020 -0500

    gssd: Closed a memory leak in find_keytab_entry()
    
    When 'adhostoverride' is "not set", which
    is most of the time, adhostoverride is not freed.
    
    Signed-off-by: Steve Dickson <steved>

diff --git a/utils/gssd/krb5_util.c b/utils/gssd/krb5_util.c
index a1c43d2..85f60ae 100644
--- a/utils/gssd/krb5_util.c
+++ b/utils/gssd/krb5_util.c
@@ -799,7 +799,7 @@ find_keytab_entry(krb5_context context, krb5_keytab kt,
        int tried_all = 0, tried_default = 0, tried_upper = 0;
        krb5_principal princ;
        const char *notsetstr = "not set";
-       char *adhostoverride;
+       char *adhostoverride = NULL;
 
 
        /* Get full target hostname */
@@ -827,7 +827,6 @@ find_keytab_entry(krb5_context context, krb5_keytab kt,
                                adhostoverride);
                /* No overflow: Windows cannot handle strings longer than 19 chars */
                strcpy(myhostad, adhostoverride);
-               free(adhostoverride);
        } else {
                strcpy(myhostad, myhostname);
                for (i = 0; myhostad[i] != 0; ++i) {
@@ -836,6 +835,8 @@ find_keytab_entry(krb5_context context, krb5_keytab kt,
                myhostad[i] = '$';
                myhostad[i+1] = 0;
        }
+       if (adhostoverride)
+               krb5_free_string(context, adhostoverride);
 
        if (!srchost) {
                retval = get_full_hostname(myhostname, myhostname, sizeof(myhostname));

Comment 29 Pierguido Lambri 2020-02-13 08:54:28 UTC
Yours looks much better. 
Now I don't remember exactly how I came up with the "free(adhostoverride)", 
I guess I missed the fact that it could be set to def_val 
(and also the fact that you should be using krb5_free_string().

Comment 46 Yongcheng Yang 2020-03-03 09:50:13 UTC
Moving to VERIFIED according to comment #42. I'll include this test into our regression tests.

Thanks Pierguido for the help!

Comment 49 errata-xmlrpc 2020-03-10 10:15:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0749


Note You need to log in before you can comment on or make changes to this bug.